GEMF Map Store Format
This document describes the file format used for GEMF tile sets. A tile set consists of a single large file containing a header followed by all of the tiles concatenated together. The file typically has a .gemf extension. The format is intended to provide a static file (i.e. one that cannot be updated without regenerating it from scratch) containing a large number of tiles, stored in a manner that makes efficient use of SD cards and allows individual tiles to be accessed very quickly. It is intended to overcome the scalability problems with the way tiles are stored in most existing Android map applications.
At the end of this page, there are some links to some proof-of-concept code that generates and uses this tile store format.
All data in the header except offsets are stored in the form of big-endian 32-bit integers. Offsets are stored in the form of big-endian 64-bit integers. The original intention was that these be unsigned, but it is probably irrelevant given the ranges of values involved.
By way of example, the value 695125 is represented in hexadecimal as a 32-bit integer as 0x000A9B55 and would be stored in the file as four bytes: 0x00, 0x0A, 0x9B and 0x55, in that order.
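As a minimal sketch of this encoding, Python's standard struct module can pack and unpack both integer widths. Treating the values as unsigned (the "I" and "Q" format codes) is an assumption that follows the stated original intention.

```python
import struct

# ">" selects big-endian byte order with no padding.
# "I" is an unsigned 32-bit integer, "Q" an unsigned 64-bit integer.
packed = struct.pack(">I", 695125)
print(packed.hex())  # 000a9b55

# Reading the four bytes back gives the original value.
(value,) = struct.unpack(">I", packed)

# Offsets use the 64-bit form and therefore occupy eight bytes.
offset_bytes = struct.pack(">Q", 105)
```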
3. Header Area
The header area contains a simple set of details describing the tiles that are available in the data area. The details contained in this header are documented below (starting from the first byte in the file).
3.1 Overall Header:
|GEMF Version||This is the version of the file format specification. It is set at 4 for the specification documented herein.|
|Tile Size||This is the size along one side of the (square) tile. Currently only 256 is supported.|
3.2 Source Data
After the overall header, there is a set of tile source names. It is intended that the map viewer holds this in memory. There can be several tile sources (providers). The number is stored first as:
|Num Sources||This is the number of tile sources (providers)|
Each source is then:
|Index||Enumeration: first source is 0, second source is 1, etc|
|Name Length||Number of bytes in the name field.|
|Name||ASCII encoded string (not null-terminated) of length "Name Length" bytes for this tile provider, e.g. "OpenStreetMap.org".|
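The overall header and source list could be read along the following lines. This is only a sketch: the function names read_u32 and read_header_and_sources are made up for illustration, and the sample bytes are built from the values used in the example in section 7.

```python
import io
import struct


def read_u32(f):
    """Read one big-endian unsigned 32-bit integer from a binary stream."""
    data = f.read(4)
    if len(data) != 4:
        raise EOFError("truncated GEMF header")
    return struct.unpack(">I", data)[0]


def read_header_and_sources(f):
    """Return (version, tile_size, {source index: source name})."""
    version = read_u32(f)
    tile_size = read_u32(f)
    sources = {}
    for _ in range(read_u32(f)):          # Num Sources
        index = read_u32(f)               # Index
        name_length = read_u32(f)         # Name Length
        sources[index] = f.read(name_length).decode("ascii")
    return version, tile_size, sources


# Sample header: version 4, 256-pixel tiles, one source.
name = b"OpenStreetMap.org"
sample = struct.pack(">IIIII", 4, 256, 1, 0, len(name)) + name
version, tile_size, sources = read_header_and_sources(io.BytesIO(sample))
```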
3.3 Range Data
After the source list, there is a set of ranges of tiles. It is intended that the map viewer holds this in memory. For large tile sets, it is impractical to either buffer the entire list of available tiles or to scan through a file every time trying to find a tile and this range data simplifies this process. There are a number of concatenated ranges. The number is stored first as:
|Num Ranges||This is the number of groups of tiles (see below).|
Each range is then:
|Zoom||A zoom level, as used by openstreetmap etc. I believe the range is 0 - 17.|
|X Min||The minimum OSM X tile in this range|
|X Max||The maximum OSM X tile in this range|
|Y Min||The minimum OSM Y tile in this range|
|Y Max||The maximum OSM Y tile in this range|
|Source Index||Index into the list of tile sources|
|Offset||The offset, from the start of the GEMF file, of the range details (section 3.4) for this range - 64-bit integer.|
For example, a map covering the whole of Bristol, UK at zoom level 15 could be described as:
15, 16134, 16163, 10824, 10850, 0, offset
where offset depends on the number of ranges in the file and which range this particular range set is.
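A range list laid out like this could be parsed as sketched below. The names GemfRange and read_ranges are hypothetical, and the offset value 1000 in the sample is an arbitrary placeholder, not a value taken from a real file.

```python
import io
import struct
from collections import namedtuple

# Hypothetical in-memory form of one range data entry.
GemfRange = namedtuple(
    "GemfRange", "zoom x_min x_max y_min y_max source_index offset")

# Six big-endian 32-bit integers followed by one 64-bit offset: 32 bytes.
RANGE_STRUCT = struct.Struct(">6IQ")


def read_ranges(f):
    """Read Num Ranges and then that many range entries."""
    (num_ranges,) = struct.unpack(">I", f.read(4))
    return [GemfRange(*RANGE_STRUCT.unpack(f.read(RANGE_STRUCT.size)))
            for _ in range(num_ranges)]


# The Bristol range from the text, with a placeholder offset of 1000.
sample = struct.pack(">I", 1) + RANGE_STRUCT.pack(
    15, 16134, 16163, 10824, 10850, 0, 1000)
ranges = read_ranges(io.BytesIO(sample))
```

Because the list is small, a viewer can keep the resulting entries in memory as the text suggests.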
3.4 Range Details
After the range data, there is an address and length associated with each individual tile. This should not be cached in memory but should be read whenever a tile is required. Each tile holds:
|Address||Offset in the GEMF file at which the image data starts - 64-bit integer|
|Length||Length (in bytes) of the image data|
These entries are grouped by range. Within a range they are ordered by 'x' value and then by 'y' value: every 'y' value for the first 'x' value, then every 'y' value for the second 'x' value, etc. There is a detailed example in section 7.
3.5 Handling Non-Rectangular Regions
There are three approaches that can be taken to dealing with non-rectangular regions; GEMF viewers should be able to support any of these.
- Create multiple ranges for each zoom level to build up a set of rectangular regions that, when combined, cover the whole region. This is the approach taken by default by the generation script (generate_efficient_map_file.py, see below)
- Create a rectangular region that covers the whole area (and more) and then for the tiles that are outside of the range, set the data length in the range details (see 3.4) to be 0. This is supported by the generation script when the option "--allow-empty" is passed.
- Create a rectangular region that covers the whole area (and more) and then have a single blank tile in the data area. All tiles that are outside of the actual region can have the address and length in the range details (see 3.4) pointing at the same blank tile. This is not supported by the generation script and is not the preferred option (as the 0-length entry allows viewers to display the missing tile in their own preferred way).
4. Data Area
This is simply a concatenation of all of the tile data in the order specified in the header area and follows immediately after that area.
5. Reading a Tile
Assuming the range data has been read into memory, in order to read a tile, the following steps are carried out.
- Firstly, the range containing the tile is identified by examining the Zoom, X Min, X Max, Y Min and Y Max values for each range, along with the source if required.
- The Range Offset (the Offset field of the range data, section 3.3) is then used to calculate the offset into the GEMF file of the Range Details entry for this tile:
    Bytes in Word = 4
    Bytes in Address = 8
    Number of Y Values = Y Max + 1 - Y Min
    Index of X in Range = Tile X - X Min
    Index of Y in Range = Tile Y - Y Min
    Tile Index in Range = (Index of X in Range * Number of Y Values) + Index of Y in Range
    Offset into Range = Tile Index in Range * (Bytes in Word + Bytes in Address)
    Offset into GEMF File = Offset into Range + Range Offset
- The Address (64-bit integer) and Length (32-bit integer) are then read from the GEMF file at the calculated offset.
- The image data is then read from the GEMF file using that address and length.
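The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the reference reader: GemfRange, find_range, entry_offset and read_tile are hypothetical names, a zero length is reported as a missing tile (see 3.5), and the demo file is a made-up 2x2 range in a single unsplit file, with 10 padding bytes standing in for the rest of the header.

```python
import io
import struct
from collections import namedtuple

# Hypothetical in-memory form of one range data entry (section 3.3).
GemfRange = namedtuple(
    "GemfRange", "zoom x_min x_max y_min y_max source_index offset")

# One range details entry: 64-bit address followed by 32-bit length.
ENTRY = struct.Struct(">QI")


def find_range(ranges, zoom, x, y, source_index=None):
    """Return the first range containing the tile, or None."""
    for r in ranges:
        if (r.zoom == zoom
                and r.x_min <= x <= r.x_max
                and r.y_min <= y <= r.y_max
                and (source_index is None or r.source_index == source_index)):
            return r
    return None


def entry_offset(r, x, y):
    """Offset into the GEMF file of the range details entry for a tile."""
    number_of_y_values = r.y_max + 1 - r.y_min
    tile_index = (x - r.x_min) * number_of_y_values + (y - r.y_min)
    return r.offset + tile_index * ENTRY.size


def read_tile(f, ranges, zoom, x, y, source_index=None):
    """Return the image bytes, or None if the tile is absent or empty."""
    r = find_range(ranges, zoom, x, y, source_index)
    if r is None:
        return None
    f.seek(entry_offset(r, x, y))
    address, length = ENTRY.unpack(f.read(ENTRY.size))
    if length == 0:
        return None
    f.seek(address)
    return f.read(length)


# Made-up demo data: four tiles in x-then-y order, one of them empty.
tiles = [b"x0y0", b"x0y1", b"", b"x1y1"]
demo_range = GemfRange(1, 0, 1, 0, 1, 0, 10)
address = 10 + len(tiles) * ENTRY.size
entries = b""
for tile in tiles:
    entries += ENTRY.pack(address, len(tile))
    address += len(tile)
demo_file = io.BytesIO(b"\0" * 10 + entries + b"".join(tiles))
```

Calling read_tile(demo_file, [demo_range], 1, 1, 1) seeks to the fourth entry of the range and then to the image data it points at.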
6. Split Data Files
It is also possible to split the data file into multiple files in order to overcome any file size limitations of the file system on which the GEMF file is stored. In this case, the header remains unchanged, and the data area is split on an image file boundary into multiple files. If the first file (containing the entire header) has a file name of map_data.gemf, the next file will be map_data.gemf-1, then map_data.gemf-2 etc. To read a tile from split data files, take the length of each file in turn, starting with the first, and subtract it from the required data offset until the remaining offset is less than the length of the current file. Then open that file and read the image data at the remaining offset. The reference implementations (section 9) all support split data files.
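The subtraction procedure could look like the sketch below. The function names locate_in_split_files and split_file_name are made up for illustration, and the file lengths in the usage example are invented values.

```python
def locate_in_split_files(file_lengths, address):
    """Map an address from the header to (file index, offset in that file).

    file_lengths lists the length in bytes of map_data.gemf,
    map_data.gemf-1, map_data.gemf-2, ... in that order.
    """
    for file_index, length in enumerate(file_lengths):
        if address < length:
            return file_index, address
        address -= length
    raise ValueError("address lies beyond the end of the last file")


def split_file_name(base_name, file_index):
    """File 0 keeps the base name; later files get a -N suffix."""
    if file_index == 0:
        return base_name
    return "%s-%d" % (base_name, file_index)


# Invented lengths: three files of 1000, 500 and 700 bytes.
file_index, local_offset = locate_in_split_files([1000, 500, 700], 1200)
file_name = split_file_name("map_data.gemf", file_index)
```

Since the data area is split on image file boundaries, a tile found this way lies entirely within one file.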
7. Simple Example
This example assumes two sets of tiles covering Bristol, UK at zoom levels 14 and 15.
Each value listed is stored as 4 bytes as described, except the HEX byte string for the source name and the range offsets, which are stored as 8 bytes.
4 // GEMF Version
256 // Tile Size
1 // Source Count
// First Source:
0 // Source index
17 // Length of name
HEX:[4F 70 65 6E 53 74 72 65 65 74 4D 61 70 2E 6F 72 67] // OpenStreetMap.org (ASCII encoded)
2 // Number of ranges
// First range:
14 // Zoom level
8067 // Minimum X Value
8081 // Maximum X Value
5412 // Minimum Y Value
5425 // Maximum Y Value
0 // First source
105 // Offset of first range
// This is:
// 4 + # GEMF Version is stored in 4 bytes
// 4 + # Tile size is stored in 4 bytes
// 4 + # Source count is stored in 4 bytes
// (4 + 4 + 17) + # This is the length of the (only) source in this example
// 4 + # Number of ranges is stored in 4 bytes
// (
// 2 * # There are two ranges in this file
// (4*6 + 1*8) # Each range uses 6 integers (zoom, minx, maxx, miny,
// # maxy, source) each 4 bytes and one long (offset), 8 bytes
// )
// Second range
15 // Zoom level
16134 // Minimum X Value
16163 // Maximum X Value
10824 // Minimum Y Value
10850 // Maximum Y Value
0 // Only one source, so first source again
2625 // Offset of second range
// This is:
// 105 + # Offset of the first range
// 2520 # This is the size of the data from the first
// # range: number of tiles is ((8081-8067+1)*(5425-5412+1))
// # = 210 tiles, each storing (address, length) and taking 8+4 bytes
// # so total is 12*210 = 2520
There are then the (address - 64-bit, length - 32-bit) details for each tile, followed by the concatenation of the image files.
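The offset arithmetic in this example can be checked with a short script. This is a sketch for verification only; range_offsets is a hypothetical helper, and the last value it returns is the point where the range details end, which is where the data area begins if it follows immediately as described in section 4.

```python
def range_offsets(source_names, ranges):
    """Return ([offset of each range's details], end of range details).

    ranges holds (zoom, x_min, x_max, y_min, y_max) tuples in file order.
    """
    size = 4 + 4 + 4                        # version, tile size, source count
    for name in source_names:
        size += 4 + 4 + len(name.encode("ascii"))  # index, name length, name
    size += 4                               # number of ranges
    size += len(ranges) * (6 * 4 + 8)       # six integers plus one long each

    offsets = []
    offset = size
    for zoom, x_min, x_max, y_min, y_max in ranges:
        offsets.append(offset)
        tile_count = (x_max - x_min + 1) * (y_max - y_min + 1)
        offset += tile_count * (8 + 4)      # address plus length per tile
    return offsets, offset


offsets, details_end = range_offsets(
    ["OpenStreetMap.org"],
    [(14, 8067, 8081, 5412, 5425), (15, 16134, 16163, 10824, 10850)])
print(offsets)  # [105, 2625]
```

The second range holds 30 * 27 = 810 tiles, so its details occupy 9720 bytes and end at byte 12345.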
8. External Software
This section contains a list of software, written by others, with support for the GEMF file format. There is also some free (public domain) sample software to assist with creating new implementations listed below.
- Locus: an excellent map application for Android with support for online and offline maps, including GEMF maps, vector maps and SQLite maps. Also offers many other features such as track recording, POI handling, import and export of POIs and tracks.
- osmdroid: a library for using openstreetmap data in your Android application. Includes support for the GEMF format as well as Zip files. Also includes a Tile Packager PC application that can generate GEMF files from openstreetmap.
- GEMF Tool SQLite Converter: a tool for converting SQLite map stores into GEMF files.
9. Example Software
All software is released into the public domain.
At this link, there is a simple python script that generates a file called map_data.gemf containing a set of tiles. To use it, create a Maverick format tile store (with directory structure root_folder/zoom/x/y.png.tile) and call it as "python generate_efficient_map_file.py root_folder" from the parent directory of the Maverick tile store. There is also a script here that downloads some tiles from openstreetmap.org and creates a GEMF file automatically. This needs to be placed in the same directory as the (updated) generate_efficient_map_file.py. To change the centre latitude/longitude for download, edit the first few lines of the python script.
At this link, there is the source code for a Qt-based viewer (should work on Windows, Linux or Mac OS X) for viewing the GEMF files. At this link is a Windows binary of the viewer. This assumes that you are in Dursley, UK, so starts the viewer at this location. Scrolling is done by click-and-drag in "Pan Mode" (Tools Menu). Zooming is done by pressing the + and - keys on the keyboard. If you switch to "Select Mode", it will report the tiles required for this area in the attached terminal (if there is one). Scaling to allow access to intermediate zoom levels can be enabled by toggling the option on the Options menu. If you want to look at tile sets starting in another location, you'll have to change the source code (near the top of QMapViewScene.cpp). The code for reading the GEMF file is in src/GEMFReader.cpp.
At this link, there is the source code and binary for a very simple Android application that will view a GEMF file stored in /sdcard/maps/map_data.gemf. It starts up in the current location (or Dursley, UK if no location services are available). If the GEMF file installed on your phone doesn't cover the area surrounding this location, it's likely you'll get a black screen as this simple application does not support downloading maps from online providers at present. Scaling to allow access to intermediate zoom levels can be enabled in the settings (press menu). Scrolling is done by touch-and-drag and zooming by pressing the buttons in the corners of the screen. The code for reading the GEMF file is in src/uk/co/gataki/GEMFReader.java.
Finally, a small (12 MB) example tile set centred on Dursley is available at this link.
10. Version History
Revision 4 - Added support for splitting data files to allow for large files on file systems that do not support them.
Revision 3 - Changed to use 64-bit integers for file offsets and added a list of providers (sources)
Revision 2 - Combined GEMFH and GEMFD into a single GEMF file
Revision 1 - Initial release