Configurable Planetiler Schema
You can define how planetiler turns input sources into vector tiles by running planetiler with a YAML configuration file as the first argument:
# from a java build
java -jar planetiler.jar generate-custom --schema=schema.yml
# or with docker (put the schema in data/schema.yml to include in the attached volume)
docker run -v "$(pwd)/data":/data ghcr.io/onthegomap/planetiler:latest generate-custom --schema=/data/schema.yml
Schema files are in YAML 1.2 format and support anchors and aliases for reusing chunks. This page and accompanying JSON schema describe the required format and available options. See the samples directory for working examples.
🚧 The configuration schema is under active development so the format may change between releases. Only a subset of the Java API is currently exposed so for more complex schemas you should switch to the Java API (see the examples project). Feedback is welcome to help shape the final product!
Root
The root of the schema has the following attributes:
- schema_name - A descriptive name for the schema
- schema_description - A longer description of the schema
- attribution - An attribution string, which may include HTML such as links
- sources - An object where each key is a source ID and each value is the Source definition that points to a file containing geographic features to process
- tag_mappings - Specifies that certain tag keys should have their values treated as a certain data type. See Tag Mappings.
- layers - A list of vector tile Layers to emit and their definitions
- examples - A list of Test Case input features and the vector tile features they should map to, or a relative path to a file with those examples in it. Run planetiler with verify schema_file.yml to see if they work as expected.
- args - Set default values for arguments that can be referenced later in the config and overridden from the command-line or environment variables. See Arguments.
- definitions - An unparsed spot where you can define anchor labels to be used in other parts of the schema
For example:
schema_name: Power Lines
schema_description: A map of power lines from OpenStreetMap
attribution: <a href="https://www.openstreetmap.org/copyright" target="_blank">© OpenStreetMap contributors</a>
sources: { ... }
tag_mappings: { ... }
layers: [...]
args: { ... }
examples: [...]
Source
A description that tells planetiler how to read geospatial objects with tags from an input file.
- type - Enum representing the file format of the data source, one of osm, shapefile, or geopackage.
- local_path - Local path to the file to use, inferred from url if missing. Can be a string or expression that can reference argument values.
- url - Location to download the file from if not present at local_path. For geofabrik named areas, use geofabrik: prefixes, for example geofabrik:rhode-island. Can be a string or expression that can reference argument values.
- projection - Planetiler will try to determine the projection automatically for shapefile/geopackage sources, but if that is not correct you can override the projection by specifying a coordinate reference system authority code like EPSG:3857 or EPSG:4326 here. Can be a string or expression that can reference argument values.
For example:
sources:
  osm:
    type: osm
    url: geofabrik:switzerland
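As a further sketch, a shapefile source with an explicit projection override might look like the following. The source ID, local path, and URL are hypothetical placeholders, not files that planetiler provides:

```yaml
sources:
  # "coastline" is a hypothetical source ID; the path and URL are placeholders
  coastline:
    type: shapefile
    local_path: data/sources/coastline.zip
    url: https://example.com/coastline.zip
    # only needed when the automatically detected projection is wrong
    projection: EPSG:3857
```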
Tag Mappings
Specifies that certain tags should have their values parsed to a certain data type. This can be specified as an object where key is the tag name and value is the data type, for example:
tag_mappings:
  population: integer
If you still want to be able to access the original value, then you can remap the parsed value into a new tag using type and input fields:
tag_mappings:
  population_as_int:
    input: population
    type: integer
Arguments
A map from argument name to its definition. Arguments can be referenced later in the config and overridden from the command-line or environment variables. Argument definitions can either be an object with these properties, or just the default value:
- default - Default value for the argument. Can be an expression that references other arguments.
- description - Description of the argument to print when parsing it.
- type - Data type to use when parsing the value. If missing, then inferred from the default value.
For example:
# Define an "area" argument with default value "switzerland"
# and an "osm_local_path" argument that defaults to the area with .osm.pbf extension
args:
  area:
    description: Geofabrik area to download
    default: switzerland
  osm_local_path: '${ args.area + ".osm.pbf" }'
# Use the value of the "area" and "osm_local_path" arguments to construct the source definition
sources:
  osm:
    type: osm
    url: '${ "geofabrik:" + args.area }'
    local_path: '${ args.osm_local_path }'
You can pass in --area=france from the command line to set the download URL to geofabrik:france and the local path to france.osm.pbf. Planetiler searches for argument values in this order:
- Command-line arguments: --area=france
- JVM properties with "planetiler." prefix: java -Dplanetiler.area=france ...
- Environment variables with "PLANETILER_" prefix: PLANETILER_AREA=france java ...
- Default value from the config
Argument values are available from the args variable in an inline script expression or the arg_value expression.
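The two definition forms and the arg_value lookup can be sketched together as follows. The argument names label and min_voltage and the attribute key are hypothetical, chosen only for illustration:

```yaml
args:
  # object form: description, type, and default (hypothetical argument)
  label:
    description: Label to attach to every output feature
    type: string
    default: unset
  # shorthand form: just the default value, type inferred from it
  min_voltage: 1000

# elsewhere, inside a layer feature definition:
attributes:
  - key: label
    arg_value: label
```

Following the lookup order above, running with --label=draft or PLANETILER_LABEL=draft should replace the default "unset" value.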
Built-in arguments
args can also be used to set the default value for built-in arguments that control planetiler's behavior:
- threads - Default number of threads to use.
- write_threads - Default number of threads to use when writing temp features
- process_threads - Default number of threads to use when processing input features
- feature_read_threads - Default number of threads to use when reading features at tile write time
- minzoom - Minimum tile zoom level to emit
- maxzoom - Maximum tile zoom level to emit
- render_maxzoom - Maximum rendering zoom level up to
- force - Overwrite output file and ignore warnings
- compress_temp - Gzip temporary feature storage (uses more CPU, but less disk space)
- mmap_temp - Use memory-mapped IO for temp feature files
- sort_max_readers - Maximum number of concurrent read threads to use when sorting chunks
- sort_max_writers - Maximum number of concurrent write threads to use when sorting chunks
- nodemap_type - Type of node location map
- nodemap_storage - Storage for node location map
- nodemap_madvise - Use linux madvise(random) for node locations
- multipolygon_geometry_storage - Storage for multipolygon geometries
- multipolygon_geometry_madvise - Use linux madvise(random) for multipolygon geometries
- http_user_agent - User-Agent header to set when downloading files over HTTP
- http_retries - Retries to use when downloading files over HTTP
- download_chunk_size_mb - Size of file chunks to download in parallel in megabytes
- download_threads - Number of parallel threads to use when downloading each file
- min_feature_size_at_max_zoom - Default value for the minimum size in tile pixels of features to emit at the maximum zoom level to allow for overzooming
- min_feature_size - Default value for the minimum size in tile pixels of features to emit below the maximum zoom level
- simplify_tolerance_at_max_zoom - Default value for the tile pixel tolerance to use when simplifying features at the maximum zoom level to allow for overzooming
- simplify_tolerance - Default value for the tile pixel tolerance to use when simplifying features below the maximum zoom level
- skip_filled_tiles - Skip writing tiles containing only polygon fills to the output
- tile_warning_size_mb - Maximum size in megabytes of a tile to emit a warning about
For example:
# Tell planetiler to download sources using 10 threads
args:
  download_threads: 10
Built-in arguments can also be accessed from the config file if desired: ${ args.download_threads }.
Layer
A layer contains a thematically-related set of features from one or more input sources.
- id - Unique name of this layer
- features - A list of features contained in this layer. See Layer Features
- tile_post_process - Optional processing operations to merge features with the same attributes in a rendered tile. See Tile Post Process
For example:
layers:
  - id: power
    features:
      - { ... }
      - { ... }
    tile_post_process:
      merge_line_strings:
        min_length: 1
        tolerance: 1
        buffer: 5
Layer Feature
A feature is a defined set of objects that meet the specified filter criteria.
- source - A string source ID, or list of source IDs from which features should be extracted. If missing, features from all sources are included.
- geometry - A string enum that indicates which geometry types to include, and how to transform them. Can be one of:
  - point, line, or polygon to pass the original feature through
  - any (default) to pass the original feature through regardless of geometry type
  - polygon_centroid to match on polygons, and emit a point at the center
  - line_centroid to match on lines, and emit a point at the centroid of the line
  - line_midpoint to match on lines, and emit a point at the midpoint of the line
  - centroid to match any geometry, and emit a point at the center
  - polygon_point_on_surface to match on polygons, and emit an interior point
  - point_on_line to match on lines, and emit a point somewhere along the line
  - polygon_centroid_if_convex to match on polygons, and if the polygon is convex emit the centroid, otherwise emit an interior point
  - innermost_point to match on any geometry and, for polygons, emit the furthest point from an edge, or for lines emit the midpoint
- include_when - A Boolean Expression which determines the features to include. If unspecified, all features from the specified sources are included.
- exclude_when - A Boolean Expression which determines if a feature that matched the include expression should be skipped. If unspecified, no exclusion filter is applied.
- min_zoom - An Expression that returns the minimum zoom to render this feature at.
- min_size - An Expression that returns the minimum length of line features or square root of the minimum area of polygon features to emit below the maximum zoom-level of the map.
- attributes - An array of Feature Attribute objects that specify the attributes to be included on this output feature.
For example:
source: osm
geometry: line
min_zoom: 7
include_when:
  power:
    - line
attributes:
  - { ... }
  - { ... }
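A second sketch shows a geometry transform combined with an exclusion filter: it matches building polygons, skips those tagged building=no, and emits a point at the center of each remaining polygon. The tag choices and zoom level are illustrative assumptions, not taken from a shipped schema:

```yaml
source: osm
# match polygons, but emit a point at the center instead of the polygon
geometry: polygon_centroid
min_zoom: 12
include_when:
  building: __any__
# applied only to features that already matched include_when
exclude_when:
  building: "no"
attributes:
  - key: name
    tag_value: name
```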
Feature Attribute
Defines an attribute to include on an output vector tile feature and how to compute its value.
- key - ID of this attribute in the tile
- include_when - A Boolean Expression which determines whether to include this attribute. If unspecified, the attribute will be included unless excluded by exclude_when.
- exclude_when - A Boolean Expression which determines whether to exclude this attribute. This rule is applied after include_when. If unspecified, no exclusion filter is applied.
- min_zoom - The minimum zoom at which to render this attribute
- min_zoom_by_value - Minimum zoom to render this attribute depending on the value. Contains an object with <value>: zoom entries that indicate the minimum zoom for each output value.
- type - The Data Type to coerce the value to, or match_key to set this attribute to the key that triggered the match in the include expression, or match_value to set it to the value for the matching key.
- min_tile_cover_size - Include this attribute only on geometries over a certain size at a given zoom level, where 1.0 means the entire width of a tile for lines, or area of a tile for polygons.
To define the value, use one of:
- value - A constant string/number/boolean value, or an Expression that computes the value for this key for each input element.
- coalesce - A Coalesce Expression that sets this attribute to the first non-null match from a list of expressions.
- tag_value - A Tag Value Expression that sets this attribute to the value for a tag.
- arg_value - An Argument Value Expression that sets this attribute to the value of an argument.
For example:
key: voltage
min_zoom: 10
include_when: "${ double(feature.tags.voltage) > 1000 }"
tag_value: voltage
type: integer
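As a sketch of the match_value type together with min_zoom_by_value, the attribute below copies the value of whichever tag triggered the include_when match and delays less important values to higher zooms. It assumes the enclosing feature matches on a highway tag; the key name and zoom levels are illustrative:

```yaml
key: kind
# use the value of the tag that triggered the include_when match
type: match_value
# <value>: zoom entries, one per output value
min_zoom_by_value:
  motorway: 5
  trunk: 6
  primary: 8
```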
Tile Post Process
Specific tile post processing operations for merging features may be defined:
- merge_line_strings - Combines linestrings with the same set of attributes into a multilinestring where segments with touching endpoints are merged.
- merge_polygons - Combines polygons with the same set of attributes into a multipolygon where overlapping/touching polygons are combined into fewer polygons covering the same area.
The following attributes for merge_line_strings may be set:
- min_length - Minimum tile pixel length of features to emit, or 0 to emit all merged linestrings.
- tolerance - After merging, simplify linestrings using this pixel tolerance, or -1 to skip the simplification step.
- buffer - Number of pixels outside the visible tile area to include detail for, or -1 to skip the clipping step.
The following attribute for merge_polygons may be set:
- min_area - Minimum area in square tile pixels of polygons to emit.
For example:
merge_line_strings:
  min_length: 1
  tolerance: 1
  buffer: 5
merge_polygons:
  min_area: 1
Data Type
A string enum that defines how to map from an input. Allowed values:
- boolean - Map 0, "no", or "false" to false and everything else to true
- string - Returns the string representation of the input value
- direction - Maps "-1" to -1, "1" "yes" or "true" to 1, and everything else to 0. See Key:oneway.
- long - Parses an input as a 64-bit signed number
- integer - Parses an input as a 32-bit signed number
- double - Parses an input as a floating point number
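To make the mappings concrete, the sketch below applies three of these types to attributes. The tags chosen (oneway, toll, lanes) are illustrative assumptions; the comments restate the rules listed above:

```yaml
attributes:
  # "-1" becomes -1; "1", "yes", or "true" becomes 1; anything else becomes 0
  - key: oneway
    tag_value: oneway
    type: direction
  # 0, "no", or "false" becomes false; anything else becomes true
  - key: toll
    tag_value: toll
    type: boolean
  # parsed as a 32-bit signed number
  - key: lanes
    tag_value: lanes
    type: integer
```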
Expression
Expressions let you define how to dynamically compute a value (attribute value, min zoom, etc.) at runtime. You can structure data-heavy expressions in YAML (i.e. match or coalesce), or write simpler expressions that require more flexibility as an inline script using ${ expression } syntax.
Constant Value Expression
The simplest expression just returns a constant value from a string, number or boolean, for example:
value: 1
value: 'string'
value: true
Tag Value Expression
Use tag_value: to return the value for each feature's tag at runtime:
# return value for "natural" tag
value:
  tag_value: natural
Argument Value Expression
Use arg_value: to return the value of an argument set in the Arguments section, or overridden from the command-line or environment.
# return value for "attr_value" argument
value:
  arg_value: attr_value
Coalesce Expression
Use coalesce: [expression, expression, ...] to make the expression evaluate to the first non-null result of a list of expressions at runtime:
value:
  coalesce:
    - tag_value: highway
    - tag_value: aerialway
    - tag_value: railway
    - "fallback value"
Match Expression
Use { value1: condition1, value2: condition2, ... } to make the expression evaluate to the value associated with the first matching boolean expression at runtime:
value:
  # returns "farmland" if subclass is farmland, farm, or orchard
  farmland:
    subclass:
      - farmland
      - farm
      - orchard
  ice:
    subclass:
      - glacier
      - ice_shelf
  # "otherwise" keyword means this is the fallback value
  water: otherwise
If the values are not simple strings, then you can use an array of objects with if and value keys and a last object with an else key:
value:
  - if:
      place: city
    value: 100000
  - value: 5000
    if:
      place: town
  - value: 100
    if:
      place: [village, neighborhood]
  # fallback value
  - else: 0
In some cases it is more straightforward to express match logic as a default_value with overrides, for example:
min_zoom:
  default_value: 13
  overrides:
    5:
      # match motorway or motorway_link
      highway: motorway%
    6:
      highway: trunk%
    8:
      highway: primary%
Default values, and values associated with conditions can themselves be an Expression.
Type
Add the type property to any expression to coerce the result to a particular data type:
value:
  tag_value: oneway
type: direction
Inline Script Expression
Use ${ expression } syntax to compute a value dynamically at runtime using an embedded Common Expression Language (CEL) script.
For example, to normalize highway values like "motorway_link" to "motorway":
value: '${ feature.tags.highway.replace("_link", "") }'
If a script's value will never change, planetiler evaluates it once ahead of time, so you can also use this to compute a complex value with no runtime overhead:
value: "${ 8 * 24 - 2 }"
Inline Script Contexts
Scripts are parsed and evaluated inside a "context" that defines the variables available to that script.
Notice: Contexts are nested, so each child context can also access the variables from its parent.
1. Root Context
Available variables:
- args - a map from argument name to value, see also built-in arguments that are always available.
2. Process Feature Context
Context available when processing an input feature, for example testing whether to include it from include_when.
Additional variables, on top of the root context:
- feature.tags - map with key/value tags from the input feature
- feature.id - numeric ID of the input feature
- feature.source - string source ID this feature came from
- feature.source_layer - optional layer within the source the feature came from
- feature.osm_changeset - optional OSM changeset ID for this feature
- feature.osm_version - optional OSM element version for this feature
- feature.osm_timestamp - optional OSM last modified timestamp for this feature
- feature.osm_user_id - optional ID of the OSM user that last modified this feature
- feature.osm_user_name - optional name of the OSM user that last modified this feature
- feature.osm_type - type of the OSM element as a string: "node", "way", or "relation"
On the original feature or any accessor that returns a geometry, you can also use:
- feature.length("unit") - length of the feature if it is a line, 0 otherwise. Allowed units: "meters"/"m", "feet"/"ft", "yards"/"yd", "nautical miles"/"nm", "kilometer"/"km" for units relative to the size in meters, or "z0 tiles"/"z0 ti", "z0 pixels"/"z0 px" for sizes relative to the size of the geometry when projected into a z0 web mercator tile containing the entire world.
- feature.area("unit") - area of the feature if it is a polygon, 0 otherwise. Allowed units: any length unit like "km2", "mi2", or "z0 px2" or also "acres"/"ac", "hectares"/"ha", or "ares"/"a".
- feature.min_lat / feature.min_lon / feature.max_lat / feature.max_lon - returns coordinates from the bounding box of this geometry
- feature.lat / feature.lon - returns the coordinate of an arbitrary point on this shape (useful to get the lat/lon of a point)
- feature.bbox - returns the rectangle bounding box that contains this entire shape
- feature.centroid - returns the weighted center point of the geometry, which may fall outside the shape
- feature.point_on_surface - returns a point that is within the shape (on the line, or inside the polygon)
- feature.validated_polygon - if this is a polygon, fixes any self-intersections and returns the result
- feature.centroid_if_convex - returns point_on_surface if this is a concave polygon, or centroid if convex
- feature.line_midpoint - returns midpoint of this feature if it is a line
- feature.point_along_line(amount) - when amount=0 returns the start of the line, when amount=1 returns the end, otherwise a point at a certain ratio along the line
- feature.partial_line(start, end) - returns a partial line segment from start to end where 0=the beginning of the line and 1=the end
- feature.innermost_point / feature.innermost_point(tolerance) - returns the midpoint of a line, or the pole of inaccessibility if it is a polygon
For example:
key: bbox_area_km2
value: ${ feature.bbox.area('km2') }
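Because the geometry accessors can be chained, a couple more attributes can be sketched along the same lines. The attribute keys here are hypothetical, and the expressions only use accessors listed above:

```yaml
attributes:
  # length of a line feature in kilometers (0 for non-line features)
  - key: length_km
    value: '${ feature.length("km") }'
  # latitude of the centroid, using lat on the geometry returned by centroid
  - key: centroid_lat
    value: '${ feature.centroid.lat }'
```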
3. Post-Match Context
Context available after a feature has matched, for example computing an attribute value.
Additional variables, on top of the process feature context:
- match_key - string tag that triggered a match to include the feature in this layer
- match_value - the tag value associated with that key
4. Configure Attribute Context
Context available after the value of an attribute has been computed, for example: set min zoom to render an attribute.
Additional variable, on top of the post-match context:
- value - the value that was computed for this key
For example:
# return the value associated with the matching tag, converted to lower case:
value: '${ match_value.lowerAscii() }'
Built-In Functions
Inline scripts can use the standard CEL built-in functions plus the following added by planetiler (defined in PlanetilerStdLib).
- coalesce(any, any, ...) returns the first non-null argument
- nullif(arg1, arg2) returns null if arg1 is the same as arg2, otherwise arg1
- min(list<number>) returns the minimum value from a list
- max(list<number>) returns the maximum value from a list
- map extensions:
  - <map>.has(key) returns true if the map contains a key
  - <map>.has(key, value) returns true if the map contains a key and the value for that key is value
  - <map>.has(key, value1, value2, ...) returns true if the map contains a key and the value for that key is in the list provided
  - <map>.get(key) similar to map[key] except it returns null instead of throwing an error if the map is missing that key
  - <map>.getOrDefault(key, default) returns the value for key if it is present, otherwise default
- string extensions:
  - <string>.charAt(number) returns the character at an index from a string
  - <string>.indexOf(string) returns the first index of a substring or -1 if not found
  - <string>.lastIndexOf(string) returns the last index of a substring or -1 if not found
  - <list>.join(separator) returns a string that joins elements together separated by the provided string
  - <string>.lowerAscii() returns the input string transformed to lower-case
  - <string>.upperAscii() returns the input string transformed to upper-case
  - <string>.replace(from, to) returns the input string with all occurrences of from replaced by to
  - <string>.replace(from, to, limit) returns the input string with the first N occurrences of from replaced by to
  - <string>.replaceRegex(pattern, value) replaces every occurrence of regular expression with value from the string it was called on using java's built-in replaceAll behavior
  - <string>.split(separator) returns a list of strings split from the input by a separator
  - <string>.split(separator, limit) splits the string into up to N parts
  - <string>.substring(n) returns a copy of the string with first N characters omitted
  - <string>.substring(a, b) returns a substring from index [a, b)
  - <string>.trim() trims leading and trailing whitespace
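A few of these functions in use, as a sketch: the attribute keys and tag names are illustrative assumptions, and each expression relies only on the functions listed above.

```yaml
attributes:
  # prefer the English name, fall back to the default name
  - key: name
    value: '${ coalesce(feature.tags.get("name:en"), feature.tags.get("name")) }'
  # supply a default when the tag is missing
  - key: surface
    value: '${ feature.tags.getOrDefault("surface", "unknown") }'
  # true when highway is one of the listed values
  - key: is_link
    value: '${ feature.tags.has("highway", "motorway_link", "trunk_link") }'
```

Using get or getOrDefault instead of feature.tags["name"] avoids an error when a feature lacks the tag.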
Boolean Expression
A boolean expression evaluates to true or false for a given input feature. It can be specified as a structured boolean expression, a complex boolean expression, or an inline script.
Structured Boolean Expression
Boolean expressions can be specified as a map from key to value or list of values. For example:
# match features where natural=water, waterway=riverbank, OR waterway=canal
include_when:
  natural: water
  waterway:
    - riverbank
    - canal
Planetiler optimizes runtime performance by pre-processing all of the boolean expressions in each match expression and include_when block in order to evaluate the minimum set of them at runtime based on the tags present on the feature.
To match when a tag is present, use the __any__ keyword:
# match when the feature has a building tag
include_when:
  building: __any__
To match when a feature does not have a tag, use "" as the value:
# exclude features without a name tag
exclude_when:
  name: ""
To match when the value for a key matches a pattern, use the % wildcard character:
# include features where highway tag ends in "_link"
include_when:
  highway: "%_link"
When a feature matches a boolean expression in the include_when field, the first key that triggered the match is available to other expressions as match_key and its value is available as match_value (see Post-Match Context):
include_when:
  highway:
    - motorway%
    - trunk%
    - primary%
  railway: rail
attributes:
  # set "kind" attribute to the value for highway or railway, with trailing "_link" stripped off
  - key: kind
    value: '${ match_value.replace("_link", "") }'
Complex Boolean Expressions
The structured boolean expressions above match when any of the tag conditions are true, but to match only when all of them are true, you can nest them under an __all__ key:
# match when (highway=pedestrian OR highway=service) AND area=yes
__all__:
  highway:
    - pedestrian
    - service
  area: yes
__all__ can take an array as well. By default, each array item matches if any of its children match, and you can make that explicit with the __any__ keyword:
# match when highway=pedestrian OR foot=yes, and area=yes
__all__:
  - highway: pedestrian
    foot: yes
  - area: yes
# equivalent to:
__all__:
  - __any__:
      highway: pedestrian
      foot: yes
  - area: yes
You can also match when the subexpression is false using the __not__ keyword:
# match when place=city AND capital is not 'yes' or '4'
__all__:
  place: city
  __not__:
    capital: [yes, "4"]
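The keywords can be nested, so the pieces above might be combined as in the following sketch. The tags are illustrative, and placing __not__ inside an array item is an assumption by analogy with the __any__ array form shown earlier:

```yaml
# match named roads that are motorways/trunks (including links) OR toll roads,
# AND that are not tunnels
include_when:
  __all__:
    - __any__:
        highway: [motorway%, trunk%]
        toll: "yes"
    - __not__:
        tunnel: "yes"
    - name: __any__
```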
Inline Boolean Expression Script
You can also specify boolean logic with an inline script that evaluates to true or false using the ${ expression } syntax. For example:
# set the `min_zoom` attribute to:
# 2 if area >= 200 million, 3 if >= 70 million, 4 if >= 10 million, or 5 otherwise
min_zoom:
  default_value: 5
  overrides:
    2: "${ double(feature.tags.area) >= 2e8 }"
    3: "${ double(feature.tags.area) >= 7e7 }"
    4: "${ double(feature.tags.area) >= 1e7 }"
⚠️ If you use an expression script in include_when, it will get evaluated against every input element and will not set the match_key or match_value variables. When possible, use structured boolean expressions which are optimized for runtime matching performance.
You can, however, combine a post-filter in an __all__ block, which will only get evaluated if the structured boolean expression matches first:
# Include a feature when place=city or place=town
# AND it has a population tag
# AND the population value is greater than 10000
include_when:
  __all__:
    - place: [city, town]
    - population: __any__
    # only evaluated if previous conditions are true
    - "${ double(feature.tags.population) > 10000 }"
Test Case
An example input source feature, and the expected vector tile features that it produces. Run planetiler with verify schema.yml to test your schema against each of the examples. Or you can add the --watch argument to watch the input file(s) for changes and validate the test cases on each change:
# from a java build
java -jar planetiler.jar verify schema.yml --watch
# or with docker (put the schema in data/schema.yml to include in the attached volume)
docker run -v "$(pwd)/data":/data ghcr.io/onthegomap/planetiler:latest verify /data/schema.yml --watch
- name - Unique name for this test case.
- input - The input feature from a source, with the following attributes:
  - source - ID of the source this feature comes from.
  - geometry - Geometry type of the input feature, one of point, line, polygon, or a WKT encoding of a specific geometry.
  - tags - Key/value attributes on the source feature.
- output - The output vector tile feature(s) this maps to, or [] for no features. Allowed attributes:
  - layer - Vector tile layer of the expected output feature.
  - geometry - Geometry type of the expected output feature.
  - min_zoom - Min zoom level that the output feature appears in.
  - max_zoom - Max zoom level that the output feature appears in.
  - min_size - Minimum length of line features or square root of the minimum area of polygon features to emit below the maximum zoom-level of the map.
  - tags - Attributes expected on the output vector tile feature, or null if the attribute should not be set. Use allow_extra_tags: true to fail if any other tags appear besides the ones specified here.
  - allow_extra_tags - If true, then fail when extra attributes besides tags appear on the output feature. If false or unset then ignore them.
  - at_zoom - Some attributes change by zoom level, so get values at this zoom level for comparison.
For example:
name: Example power=line
input:
  geometry: line
  source: osm
  tags:
    power: line
    voltage: "1200"
output:
  - layer: power
    geometry: line
    min_zoom: 7
    tags:
      power: line
      voltage: 1200
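A negative test case can be sketched with an empty output list. This assumes the schema under test only includes features tagged power=line, as in the Layer Feature example; the test name and tag value are illustrative:

```yaml
name: Example power=minor_line is ignored
input:
  geometry: line
  source: osm
  tags:
    power: minor_line
# [] asserts that this input produces no output features
output: []
```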
See shortbread.spec.yml for more examples.
Anchors and Aliases
Planetiler configs let you define YAML anchors with the & prefix and use them later with the * prefix:
# add attributes to a feature, and also define name_en and name_de anchors that can be reused later
attributes:
  - &name_en
    key: name_en
    tag_value: name:en
  - &name_de
    key: name_de
    tag_value: name:de
# reuse name_en and name_de attributes on another feature
attributes:
  - *name_en
  - *name_de
This can be useful to avoid copy/pasting config, and to make it easier to make changes in bulk.
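Anchors can also be declared up front in the root definitions attribute, which planetiler leaves unparsed, and then referenced from any layer. A minimal sketch, with a hypothetical anchor name and layer:

```yaml
definitions:
  # hypothetical anchor; definitions is not parsed, so any structure works here
  name_attribute: &name_attribute
    key: name
    tag_value: name
layers:
  - id: places
    features:
      - source: osm
        geometry: point
        include_when:
          place: __any__
        attributes:
          # expands to the key/tag_value pair defined above
          - *name_attribute
```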