* tweaks from running on planet

* add no building merge log

* table format

* update maven descriptions
pull/154/head
Michael Barry 2022-03-31 06:42:28 -04:00 committed by GitHub
parent 827da1912a
commit 4983bbc04e
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
11 changed files with 8781 additions and 67 deletions


@ -1,8 +1,8 @@
# Generating a Map of the World
To generate a map of the world using the built-in [basemap profile](planetiler-basemap), you will need a machine with
Java 16 or later installed and at least 10x as much disk space and 1.5x as much RAM as the `planet.osm.pbf` file you
start from. All testing has been done using Digital Ocean droplets with dedicated
Java 16 or later installed and at least 10x as much disk space and at least 0.5x as much RAM as the `planet.osm.pbf`
file you start from. All testing has been done using Digital Ocean droplets with dedicated
vCPUs ([referral link](https://m.do.co/c/a947e99aab25)) and OpenJDK 17 installed through `apt`. Planetiler splits work
among available CPUs so the more you have, the less time it takes.
@ -28,23 +28,41 @@ First decide where to get the `planet.osm.pbf` file:
Download the [latest release](https://github.com/onthegomap/planetiler/releases/latest) of `planetiler.jar`.
Then run `java -Xms100g -Xmx100g -jar planetiler.jar` (replacing `100g` with 1.5x the `planet.osm.pbf` size)
with these options:
If your system has at least 1.5x as much memory as the input OSM file size, run this command to store node location
cache in-memory:
- `--bounds=world` to set bounding box to the entire planet
- `--nodemap-type=sparsearray` to store node locations in a sparse array instead of a sorted table - `sortedtable` is
more efficient when there are large gaps in ID spaces (i.e. extracts) and `sparsearray` is more efficient with no/few
ID gaps (planet, or renumbered extracts).
- `--nodemap-storage=ram` to store all node locations in RAM instead of a memory-mapped file - when using `ram` give the
JVM 1.5x the input file size instead of 0.5x when using `mmap`
- `--download` to fetch [other data sources](NOTICE.md#data) automatically
- One of these to point planetiler at your data source:
- `--osm-path=path/to/planet.osm.pbf` to point Planetiler at a file you downloaded
- `--osm-url=http://url/of/planet.osm.pbf` to download automatically
- `--osm-url=s3:211011` to download a specific snapshot from the AWS Registry of Open Data or `--osm-url=s3:latest` to
download the latest snapshot
- `--area=planet` to use the file in `./data/sources/planet.osm.pbf` or download the latest snapshot from AWS S3
mirror if missing.
```bash
java -Xmx110g \
`# return unused heap memory to the OS` \
-XX:MaxHeapFreeRatio=40 \
-jar planetiler.jar \
`# Download the latest planet.osm.pbf from s3://osm-pds bucket` \
--area=planet --bounds=planet --download \
`# Accelerate the download by fetching ten 1GB chunks at a time in parallel` \
--download-threads=10 --download-chunk-size-mb=1000 \
`# Also download name translations from wikidata` \
--fetch-wikidata \
--mbtiles=output.mbtiles \
`# Store temporary node locations in memory` \
--nodemap-type=array --storage=ram
```
If your system has less than 1.5x as much memory as the input OSM file size, run this command to store node location
cache in a temporary memory-mapped file by setting `--storage=mmap` and `-Xmx20g` to reduce the JVM's memory usage.
```bash
java -Xmx20g \
-jar planetiler.jar \
`# Download the latest planet.osm.pbf from s3://osm-pds bucket` \
--area=planet --bounds=planet --download \
`# Accelerate the download by fetching ten 1GB chunks at a time in parallel` \
--download-threads=10 --download-chunk-size-mb=1000 \
`# Also download name translations from wikidata` \
--fetch-wikidata \
--mbtiles=output.mbtiles \
`# Store temporary node locations at fixed positions in a memory-mapped file` \
--nodemap-type=array --storage=mmap
```
Run with `--help` to see all available arguments.
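
The choice between the two commands above comes down to comparing available memory with the input file size. A minimal sketch of that rule of thumb, assuming the 1.5x ratio stated in this document; the class name, method name, and example sizes are hypothetical and not part of Planetiler:

```java
public class StorageChoice {
  /**
   * Returns "ram" when memory is at least 1.5x the input size, otherwise "mmap".
   * The 1.5x threshold is the rule of thumb from the text, not a Planetiler API.
   */
  static String chooseStorage(double inputGb, double memoryGb) {
    return memoryGb >= 1.5 * inputGb ? "ram" : "mmap";
  }

  public static void main(String[] args) {
    double inputGb = 67; // assumed size of a planet.osm.pbf file, for illustration only
    for (double memoryGb : new double[] {32, 64, 128}) {
      System.out.println(memoryGb + "GB -> --storage=" + chooseStorage(inputGb, memoryGb));
    }
  }
}
```

The result only suggests a value for `--storage`; the `-Xmx` setting still has to be chosen separately, as in the two commands above.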
@ -69,8 +87,7 @@ Then I added a script `runworld.sh` to run with 100GB of RAM:
```bash
#!/usr/bin/env bash
set -e
java -Xmx100g -Xms100g \
-XX:OnOutOfMemoryError="kill -9 %p" \
java -Xmx100g \
-jar planetiler.jar \
`# Download the latest planet.osm.pbf from s3://osm-pds bucket` \
--area=planet --bounds=world --download \


@ -33,7 +33,7 @@ To generate a map of an area using the [basemap profile](planetiler-basemap), yo
- Java 16+ (see [CONTRIBUTING.md](CONTRIBUTING.md)) or [Docker](https://docs.docker.com/get-docker/)
- at least 1GB of free disk space plus 5-10x the size of the `.osm.pbf` file
- at least 1.5x as much free RAM as the input `.osm.pbf` file size
- at least 0.5x as much free RAM as the input `.osm.pbf` file size
#### To build the map:
@ -116,15 +116,23 @@ See the [planetiler-examples](planetiler-examples) project.
## Benchmarks
Some example runtimes (excluding downloading resources):
Some example runtimes for the Basemap OpenMapTiles-compatible profile (excluding downloading resources):
| Input | Profile | Machine | Time | mbtiles size | Logs |
|-------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------|----------------------------------------------------------|---------------------------|--------------|--------------------------------------------------------------------------------------------------------------------------------|
| s3://osm-pds/2021/planet-211011.osm.pbf (65GB) | Basemap | DO 16cpu 128GB | 3h9m cpu:42h1m avg:13.3 | 99GB | [logs](planet-logs/v0.1.0-planet-do-16cpu-128gb.txt), [VisualVM Profile](planet-logs/v0.1.0-planet-do-16cpu-128gb.nps) |
| [Daylight Distribution v1.6](https://daylightmap.org/2021/09/29/daylight-v16-released.html) with ML buildings and admin boundaries (67GB) | Basemap | DO 16cpu 128GB | 3h13m cpu:43h40m avg:13.5 | 101GB | [logs](planet-logs/v0.1.0-daylight-do-16cpu-128gb.txt) |
| s3://osm-pds/2021/planet-211011.osm.pbf (65GB) | Basemap (without z13 building merge) | Linode 50cpu 128GB | 1h9m cpu:24h36m avg:21.2 | 97GB | [logs](planet-logs/v0.1.0-planet-linode-50cpu-128gb.txt), [VisualVM Profile](planet-logs/v0.1.0-planet-linode-50cpu-128gb.nps) |
| s3://osm-pds/2021/planet-220214.osm.pbf (67GB) | Basemap v0.3.0 (without z13 building merge) | r6g.16xlarge (64cpu/512GB) with ramdisk and write to EFS | 1h1m cpu:24h33m avg:24.3 | 104GB | [logs](planet-logs/v0.3.0-planet-r6g-64cpu-512gb-ramdisk.txt) |
| s3://osm-pds/2021/planet-220307.osm.pbf (67GB) | Basemap v0.3.0 (without z13 building merge) | c5ad.16xlarge (64cpu/128GB) | 47m cpu:26h53m avg:34.2 | 97GB | [logs](planet-logs/v0.3.0-planet-c5ad-128gb.txt) |
| Input | Version | Machine | Time | mbtiles size | Logs |
|-------------------------------------------------------------------------------------------------------------------------------------------|---------|-----------------------------|-----------------------------------|--------------|------------------------------------------------------------------------------------------------------------------------|
| s3://osm-pds/2021/planet-220307.osm.pbf (67GB) | 0.4.0 | c6gd.4xlarge (16cpu/32GB) | 2h43m cpu:34h53m avg:12.8 | 103GB | [logs](planet-logs/v0.4.0-planet-c6gd-32gb.txt) |
| s3://osm-pds/2021/planet-220307.osm.pbf (67GB) | 0.4.0 | c6gd.8xlarge (32cpu/64GB) | 1h30m cpu:35h23m avg:23.5 | 103GB | [logs](planet-logs/v0.4.0-planet-c6gd-64gb.txt) |
| s3://osm-pds/2021/planet-220307.osm.pbf (67GB) | 0.4.0 | c6gd.16xlarge (64cpu/128GB) | 1h1m cpu:38h39m gc:3m39s avg:38.1 | 103GB | [logs](planet-logs/v0.4.0-planet-c6gd-128gb.txt) |
| s3://osm-pds/2021/planet-211011.osm.pbf (65GB) | 0.1.0 | DO 16cpu 128GB | 3h9m cpu:42h1m avg:13.3 | 99GB | [logs](planet-logs/v0.1.0-planet-do-16cpu-128gb.txt), [VisualVM Profile](planet-logs/v0.1.0-planet-do-16cpu-128gb.nps) |
| [Daylight Distribution v1.6](https://daylightmap.org/2021/09/29/daylight-v16-released.html) with ML buildings and admin boundaries (67GB) | 0.1.0 | DO 16cpu 128GB | 3h13m cpu:43h40m avg:13.5 | 101GB | [logs](planet-logs/v0.1.0-daylight-do-16cpu-128gb.txt) |
**Without z13 building merge:**
| Input | Version | Machine | Time | mbtiles size | Logs |
|------------------------------------------------|---------|----------------------------------------------------------|--------------------------|--------------|--------------------------------------------------------------------------------------------------------------------------------|
| s3://osm-pds/2021/planet-220307.osm.pbf (67GB) | 0.4.0 | c6gd.16xlarge (64cpu/128GB) | 47m cpu:24h36m avg:31.4 | 97GB | [logs](planet-logs/v0.4.0-planet-c6gd-128gb-no-z13-building-merge.txt) |
| s3://osm-pds/2021/planet-220214.osm.pbf (67GB) | 0.3.0 | r6g.16xlarge (64cpu/512GB) with ramdisk and write to EFS | 1h1m cpu:24h33m avg:24.3 | 104GB | [logs](planet-logs/v0.3.0-planet-r6g-64cpu-512gb-ramdisk.txt) |
| s3://osm-pds/2021/planet-211011.osm.pbf (65GB) | 0.1.0 | Linode 50cpu 128GB | 1h9m cpu:24h36m avg:21.2 | 97GB | [logs](planet-logs/v0.1.0-planet-linode-50cpu-128gb.txt), [VisualVM Profile](planet-logs/v0.1.0-planet-linode-50cpu-128gb.nps) |
## Alternatives

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long


@ -7,7 +7,7 @@
<artifactId>planetiler-core</artifactId>
<name>Planetiler Core</name>
<description>Planetiler Core</description>
<description>Planetiler is a tool to build planet-scale vector tilesets from OpenStreetMap data fast.</description>
<parent>
<groupId>com.onthegomap.planetiler</groupId>


@ -66,6 +66,9 @@ public class Planetiler {
private final Arguments arguments;
private final Stats stats;
private final Path tmpDir;
private final Path nodeDbPath;
private final Path multipolygonPath;
private final Path featureDbPath;
private final boolean downloadSources;
private final boolean onlyDownloadSources;
private Profile profile = null;
@ -76,8 +79,6 @@ public class Planetiler {
private Path output;
private boolean overwrite = false;
private boolean ran = false;
private Path nodeDbPath;
private Path multipolygonPath;
// most common OSM languages
private List<String> languages = List.of(
"en", "ru", "ar", "zh", "ja", "ko", "fr",
@ -98,6 +99,11 @@ public class Planetiler {
tmpDir = arguments.file("tmpdir", "temp directory", Path.of("data", "tmp"));
onlyDownloadSources = arguments.getBoolean("only_download", "download source data then exit", false);
downloadSources = onlyDownloadSources || arguments.getBoolean("download", "download sources", false);
nodeDbPath = arguments.file("temp_nodes", "temp node db location", tmpDir.resolve("node.db"));
multipolygonPath =
arguments.file("temp_multipolygons", "temp multipolygon db location", tmpDir.resolve("multipolygon.db"));
featureDbPath = arguments.file("temp_features", "temp feature db location", tmpDir.resolve("feature.db"));
}
/** Returns a new empty runner that will get configuration from {@code arguments}. */
@ -487,13 +493,15 @@ public class Planetiler {
}
// in case any temp files are left from a previous run...
FileUtils.delete(tmpDir);
FileUtils.delete(tmpDir, nodeDbPath, featureDbPath, multipolygonPath);
Files.createDirectories(tmpDir);
FileUtils.createParentDirectories(nodeDbPath, featureDbPath, multipolygonPath, output);
if (!toDownload.isEmpty()) {
download();
}
ensureInputFilesExist();
Files.createDirectories(tmpDir);
if (fetchWikidata) {
Wikidata.fetch(osmInputFile(), wikidataNamesFile, config(), profile(), stats());
}
@ -510,13 +518,10 @@ public class Planetiler {
config.bounds().setFallbackProvider(osmInputFile);
}
Files.createDirectories(tmpDir);
nodeDbPath = tmpDir.resolve("node.db");
multipolygonPath = tmpDir.resolve("multipolygon.db");
Path featureDbPath = tmpDir.resolve("feature.db");
featureGroup = FeatureGroup.newDiskBackedFeatureGroup(featureDbPath, profile, config, stats);
stats.monitorFile("nodes", nodeDbPath);
stats.monitorFile("features", featureDbPath);
stats.monitorFile("multipolygons", multipolygonPath);
stats.monitorFile("mbtiles", output);
for (Stage stage : stages) {
@ -554,11 +559,11 @@ public class Planetiler {
long outputSize = profile.estimateOutputBytes(osmSize);
// node locations and multipolygon geometries only needed while reading inputs
readPhase.addDisk(tmpDir, nodeMapSize, "temporary node location cache");
readPhase.addDisk(tmpDir, multipolygonGeometrySize, "temporary multipolygon geometry cache");
readPhase.addDisk(nodeDbPath, nodeMapSize, "temporary node location cache");
readPhase.addDisk(multipolygonPath, multipolygonGeometrySize, "temporary multipolygon geometry cache");
// feature db persists across read/write phase
readPhase.addDisk(tmpDir, featureSize, "temporary feature storage");
writePhase.addDisk(tmpDir, featureSize, "temporary feature storage");
readPhase.addDisk(featureDbPath, featureSize, "temporary feature storage");
writePhase.addDisk(featureDbPath, featureSize, "temporary feature storage");
// output only needed during write phase
writePhase.addDisk(output, outputSize, "mbtiles output");
// if the user opts to remove an input source after reading to free up additional space for the output...
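
The hunks above make the node, multipolygon, and feature temp files individually configurable, each defaulting to a file under `tmpDir`. A minimal sketch of that defaulting pattern, assuming a hypothetical `fileArg` helper as a stand-in for `Arguments.file` (whose real implementation is not shown in this diff):

```java
import java.nio.file.Path;
import java.util.Map;

public class TempPathDefaults {
  /** Hypothetical stand-in for Arguments.file: the override if present, else the default. */
  static Path fileArg(Map<String, String> args, String key, Path defaultValue) {
    String value = args.get(key);
    return value == null ? defaultValue : Path.of(value);
  }

  public static void main(String[] args) {
    // Only temp_nodes is overridden; the path is an invented example.
    Map<String, String> cli = Map.of("temp_nodes", "/mnt/fast-disk/node.db");
    Path tmpDir = fileArg(cli, "tmpdir", Path.of("data", "tmp"));
    Path nodeDbPath = fileArg(cli, "temp_nodes", tmpDir.resolve("node.db"));
    Path featureDbPath = fileArg(cli, "temp_features", tmpDir.resolve("feature.db"));
    System.out.println(nodeDbPath);    // the overridden location
    System.out.println(featureDbPath); // falls back to a file under tmpDir
  }
}
```

Resolving each default against `tmpDir` keeps the previous behavior when no override is given, while allowing, for example, the node cache to live on a different disk.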


@ -88,12 +88,14 @@ public class FileUtils {
}
}
/** Deletes a file or directory recursively, failing silently if missing. */
public static void delete(Path path) {
if (Files.isDirectory(path)) {
deleteDirectory(path);
} else {
deleteFile(path);
/** Deletes files or directories recursively, failing silently if missing. */
public static void delete(Path... paths) {
for (Path path : paths) {
if (Files.isDirectory(path)) {
deleteDirectory(path);
} else {
deleteFile(path);
}
}
}
@ -139,22 +141,24 @@ public class FileUtils {
}
/**
* Ensures all parent directories of {@code path} exist.
* Ensures all parent directories of each path in {@code paths} exist.
*
* @throws IllegalStateException if an error occurs
*/
public static void createParentDirectories(Path path) {
try {
if (Files.isDirectory(path) && !Files.exists(path)) {
Files.createDirectories(path);
} else {
Path parent = path.getParent();
if (parent != null && !Files.exists(parent)) {
Files.createDirectories(parent);
public static void createParentDirectories(Path... paths) {
for (var path : paths) {
try {
if (Files.isDirectory(path) && !Files.exists(path)) {
Files.createDirectories(path);
} else {
Path parent = path.getParent();
if (parent != null && !Files.exists(parent)) {
Files.createDirectories(parent);
}
}
} catch (IOException e) {
throw new IllegalStateException("Unable to create parent directories " + path, e);
}
} catch (IOException e) {
throw new IllegalStateException("Unable to create parent directories " + path, e);
}
}
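
Switching these helpers to varargs lets callers pass several paths in one call while single-path callers keep working unchanged. A self-contained sketch of the same idea using only `java.nio.file`; the class and method names here are illustrative and are not the ones in `FileUtils`:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class VarargsPaths {
  /** Creates the parent directory of each path, if it has one. */
  static void createParents(Path... paths) {
    for (Path path : paths) {
      Path parent = path.getParent();
      if (parent != null) {
        try {
          // createDirectories is a no-op when the directory already exists
          Files.createDirectories(parent);
        } catch (IOException e) {
          throw new IllegalStateException("Unable to create parent directories " + path, e);
        }
      }
    }
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("planetiler-sketch");
    Path nodes = root.resolve("a/node.db");
    Path features = root.resolve("b/feature.db");
    createParents(nodes, features); // several paths in one call
    createParents(nodes);           // a single argument still compiles and runs
    System.out.println(Files.isDirectory(nodes.getParent()));
    System.out.println(Files.isDirectory(features.getParent()));
  }
}
```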


@ -163,14 +163,19 @@ public class Wikidata {
*/
public static WikidataTranslations load(Path path) {
Timer timer = Timer.start();
try (BufferedReader fis = Files.newBufferedReader(path)) {
WikidataTranslations result = load(fis);
LOGGER.info(
"loaded from " + result.getAll().size() + " mappings from " + path.toAbsolutePath() + " in " + timer.stop());
return result;
} catch (IOException e) {
LOGGER.info("error loading " + path.toAbsolutePath() + ": " + e);
if (!Files.exists(path)) {
LOGGER.info("no wikidata translations found, run with --fetch-wikidata to download");
return new WikidataTranslations();
} else {
try (BufferedReader fis = Files.newBufferedReader(path)) {
WikidataTranslations result = load(fis);
LOGGER.info(
"loaded from " + result.getAll().size() + " mappings from " + path.toAbsolutePath() + " in " + timer.stop());
return result;
} catch (IOException e) {
LOGGER.info("error loading " + path.toAbsolutePath() + ": " + e);
return new WikidataTranslations();
}
}
}
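
The new `load` checks for a missing file first, so an absent translations file produces a short hint instead of a logged exception. The same pattern in isolation, as a sketch with hypothetical names that returns plain lines rather than `WikidataTranslations`:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class OptionalFileLoad {
  /** Returns the file's lines, or an empty list if the file is missing or unreadable. */
  static List<String> loadLines(Path path) {
    if (!Files.exists(path)) {
      // Expected case: the optional file was never downloaded
      System.out.println("no file found at " + path + ", continuing without it");
      return new ArrayList<>();
    }
    List<String> lines = new ArrayList<>();
    try (BufferedReader reader = Files.newBufferedReader(path)) {
      String line;
      while ((line = reader.readLine()) != null) {
        lines.add(line);
      }
      return lines;
    } catch (IOException e) {
      // Unexpected case: the file exists but could not be read
      System.out.println("error loading " + path + ": " + e);
      return new ArrayList<>();
    }
  }

  public static void main(String[] args) {
    System.out.println(loadLines(Path.of("missing-translations.json")).size());
  }
}
```

Separating the two cases keeps the log message for a missing optional input distinct from a genuine read error.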


@ -10,7 +10,9 @@
<packaging>pom</packaging>
<name>Planetiler Parent</name>
<description>Planetiler Parent</description>
<description>
Super pom of the Planetiler tool for building planet-scale vector tilesets from OpenStreetMap data fast.
</description>
<url>https://github.com/onthegomap/planetiler</url>
<properties>