* Switch OMT to use the new tools v4.0.0
* borders are dynamically generated from the PBF file instead of downloading a prepared CSV file
* all tools are executed as current user instead of root, thus files are easier to modify/delete if needed
* all data is stored in the local file system instead of Docker volumes (Docker currently restricts non-root operation on internal volumes). This also makes the data easier to examine and test.
* New `init-dirs` make target creates all the needed dirs: `build`, `data`, `cache`
* `make clean` deletes the whole `build` dir instead of individual files.
* for backward compatibility, `clean-docker` also deletes the `cache` dir (it used to be a volume)
* all `psql` calls are now done with `ON_ERROR_STOP=1`
* got rid of `pgclimb-*` targets -- the same results can be achieved with `psql` (`pgclimb-list-views` & `pgclimb-list-tables` renamed to `list-views` and `list-tables`)
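As a rough illustration of the `ON_ERROR_STOP=1` item above, here is a minimal Python sketch of how a `psql` call could be assembled with that variable set. The helper names `psql_command` and `run_sql_file` are hypothetical and not part of the Makefile; connection settings are assumed to come from the usual `PG*` environment variables. With `ON_ERROR_STOP` set, `psql` stops at the first failing statement in a script and exits with a non-zero status.

```python
import subprocess


def psql_command(sql_file, extra_args=()):
    """Build a psql argument list that aborts on the first SQL error.

    `-v ON_ERROR_STOP=1` sets the psql variable; without it psql keeps
    executing the remaining statements after a failure.
    """
    return ["psql", "-v", "ON_ERROR_STOP=1", *extra_args, "-f", str(sql_file)]


def run_sql_file(sql_file):
    """Run the file; check=True raises CalledProcessError on a non-zero exit."""
    subprocess.run(psql_command(sql_file), check=True)
```

Keeping the command construction separate from the execution makes the flag handling easy to check without a running database.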
Add a one-liner script that takes the tile lists from the dated subfolders, merges and deduplicates them, and writes the result to `tiles.txt`.
Also update docs to reflect current behavior of `docker-compose run update-osm`.
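The merge-and-deduplicate step described above can be sketched in Python as follows. This is not the one-liner from the change itself; the directory layout (one `tiles.txt` per dated subfolder directly under a root directory) and the function names are assumptions made for illustration.

```python
from pathlib import Path


def merge_tile_lists(root, list_name="tiles.txt"):
    """Return the sorted, deduplicated tile lines found in root/*/list_name."""
    tiles = set()
    # "*/tiles.txt" only matches files one level down, so a merged
    # tiles.txt sitting in the root itself is never read back in.
    for tile_file in sorted(Path(root).glob(f"*/{list_name}")):
        for line in tile_file.read_text().splitlines():
            line = line.strip()
            if line:  # skip blank lines
                tiles.add(line)
    return sorted(tiles)


def write_merged(root, out_name="tiles.txt"):
    """Write the merged list to root/out_name, one tile per line."""
    out = Path(root) / out_name
    out.write_text("".join(f"{tile}\n" for tile in merge_tile_lists(root, out_name)))
    return out
```

The sort is lexicographic, which is enough for deduplication; it does not order tiles numerically by zoom, x and y.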
This PR allows queries to be parallelized on recent versions of Postgres. The `PARALLEL SAFE` modifier has been added to the layer functions, and a PL/pgSQL function that converts strings into numbers has been replaced.
`PARALLEL SAFE` is a modifier for `CREATE FUNCTION` available since Postgres 9.6, so this change does not break the database version currently supported by OpenMapTiles. More details on this topic are available [here](https://www.postgresql.org/docs/current/parallel-safety.html) and in the reference documentation for [`CREATE FUNCTION`](https://www.postgresql.org/docs/current/sql-createfunction.html).
### Testing procedure
The procedure to test this was:
* Imported `spain.pbf` in a clean environment
* Dumped the OpenMapTiles database from the Postgres Docker image
* Created a clean Postgres 12 database using the default Docker image
* Installed `postgis` 3 from the default Debian package and `osml10n` 2.5.8 from the repository (`make`, etc.)
* Restored the dump
* Lowered the Postgres planner cost parameters to make parallel plans more likely:
```sql
set parallel_setup_cost = 5;
set parallel_tuple_cost = 0.005;
```
* Manually added the `PARALLEL SAFE` modifier to each function involved in layer queries (but not to functions that update or insert data).
* For each layer, ran a test query to confirm that parallel workers were created, along these lines:
```sql
explain analyze
select * from layer_aerodrome_label(tilebbox(8,128,95),10,null)
union all
select * from layer_aerodrome_label(tilebbox(8,128,97),10,null);
```
* After all the layers had been processed and confirmed to start parallel execution, a more complete example was run. It retrieves the geometries for all layers of the same tile, without using any MVT-related function.
<details><summary>Testing query</summary>
```sql
-- Retrieve the geometries of every layer function for the same tile
explain analyze
select geometry from layer_water(tilebbox(14,8020,6178),14)
union all
select geometry from layer_waterway(tilebbox(14,8020,6178),14)
union all
select geometry from layer_landcover(tilebbox(14,8020,6178),14)
union all
select geometry from layer_landuse(tilebbox(14,8020,6178),14)
union all
select geometry from layer_mountain_peak(tilebbox(14,8020,6178),14)
union all
select geometry from layer_park(tilebbox(14,8020,6178),14)
union all
select geometry from layer_boundary(tilebbox(14,8020,6178),14)
union all
select geometry from layer_aeroway(tilebbox(14,8020,6178),14)
union all
select geometry from layer_transportation(tilebbox(14,8020,6178),14)
union all
select geometry from layer_building(tilebbox(14,8020,6178),14)
union all
select geometry from layer_water_name(tilebbox(14,8020,6178),14)
union all
select geometry from layer_transportation_name(tilebbox(14,8020,6178),14)
union all
select geometry from layer_place(tilebbox(14,8020,6178),14)
union all
select geometry from layer_housenumber(tilebbox(14,8020,6178),14)
union all
select geometry from layer_poi(tilebbox(14,8020,6178),14)
union all
select geometry from layer_aerodrome_label(tilebbox(14,8020,6178),14);
```
</details>
You can inspect the execution plan and results on [this page](https://explain.dalibo.com/plan/3z). The query and JSON output are also [attached](https://github.com/openmaptiles/openmaptiles/files/3951822/explain-tile-simple.tar.gz) for future reference. The website gives a lot of detail, but you may want to search for nodes mentioning `workers` or `parallel`, as in this area referring to the `osm_border` or `osm_aeroway_linestring` entities:
![image](https://user-images.githubusercontent.com/188264/70647153-9cac9300-1c48-11ea-96ea-ac7a1e2f4a79.png)
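Searching a plan for parallel nodes can also be done programmatically. The sketch below is a minimal Python example that walks the output of `EXPLAIN (ANALYZE, FORMAT JSON)` and lists every node that reports launched workers. The keys `Plans`, `Node Type`, `Workers Planned` and `Workers Launched` are those used by the Postgres JSON plan format; the function names and the `SAMPLE` plan are invented for illustration and are not real output from the test above.

```python
import json


def parallel_nodes(node):
    """Yield (node type, workers planned, workers launched) for Gather-like nodes."""
    if "Workers Launched" in node:
        yield node["Node Type"], node.get("Workers Planned"), node["Workers Launched"]
    for child in node.get("Plans", []):
        yield from parallel_nodes(child)


def summarize(explain_json):
    """Parse EXPLAIN (FORMAT JSON) text and collect the parallel nodes."""
    document = json.loads(explain_json)
    # The JSON output is a one-element list whose item holds the root "Plan".
    return list(parallel_nodes(document[0]["Plan"]))


# Hand-written, heavily trimmed plan for illustration only (not real output).
SAMPLE = json.dumps([{"Plan": {"Node Type": "Append", "Plans": [
    {"Node Type": "Gather", "Workers Planned": 2, "Workers Launched": 2,
     "Plans": [{"Node Type": "Seq Scan", "Parallel Aware": True}]},
    {"Node Type": "Seq Scan", "Parallel Aware": False},
]}}])

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

An empty result would suggest that no part of the query ran in parallel.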
### Next steps
Since the execution plan does not show a parallel append at the top level, meaning the layers are not each run in their own worker, I want to continue experimenting with parameters and queries to see whether the request can be parallelized even further.
I will post my findings here, even if they do not lead to any code change.
cc. @nyurik
Co-authored-by: Yuri Astrakhan <yuriastrakhan@gmail.com>
Buildings from ways and multipolygons are loaded into the table `osm_building_polygon`. An `osm_building_multipolygon` table is also loaded, but its content is used only to check that an `osm_id` comes from a multipolygon. To check whether an object comes from a multipolygon, we only have to check whether `osm_id` is negative. It is the counterpart of e0c8ece375/layers/building/building.sql (L89)
I checked that the objects are the same after this change.
Make a few more mappings declarative, and remove values declared in both the SQL and the YAML file.
Here's the diff comparing `build/tileset.sql` in master vs the new PR. The changes are mostly stylistic, except where a nested `if` statement is expanded into individual `if ... and ...` conditions (logically identical):
```diff
diff --git a/build/tileset.sql b/build/tileset.sql
index 4e59357..7c6c444 100644
--- a/build/tileset.sql
+++ b/build/tileset.sql
@@ -52,7 +52,7 @@ CREATE INDEX IF NOT EXISTS osm_ocean_polygon_gen4_idx ON osm_ocean_polygon_gen4
CREATE OR REPLACE FUNCTION water_class(waterway TEXT) RETURNS TEXT AS $$
SELECT CASE
WHEN "waterway" IN ('', 'lake') THEN 'lake'
- WHEN "waterway"='dock' THEN 'dock'
+ WHEN "waterway" = 'dock' THEN 'dock'
ELSE 'river'
END;
$$ LANGUAGE SQL IMMUTABLE;
@@ -1813,24 +1813,41 @@ CREATE OR REPLACE FUNCTION highway_class(highway TEXT, public_transport TEXT, co
WHEN "highway" IN ('tertiary', 'tertiary_link') THEN 'tertiary'
WHEN "highway" IN ('unclassified', 'residential', 'living_street', 'road') THEN 'minor'
WHEN "highway" IN ('pedestrian', 'path', 'footway', 'cycleway', 'steps', 'bridleway', 'corridor')
- OR "public_transport"='platform'
+ OR "public_transport" = 'platform'
THEN 'path'
- WHEN "highway"='service' THEN 'service'
- WHEN "highway"='track' THEN 'track'
- WHEN "highway"='raceway' THEN 'raceway'
- WHEN highway = 'construction' THEN CASE
- WHEN construction IN ('motorway', 'motorway_link') THEN 'motorway_construction'
- WHEN construction IN ('trunk', 'trunk_link') THEN 'trunk_construction'
- WHEN construction IN ('primary', 'primary_link') THEN 'primary_construction'
- WHEN construction IN ('secondary', 'secondary_link') THEN 'secondary_construction'
- WHEN construction IN ('tertiary', 'tertiary_link') THEN 'tertiary_construction'
- WHEN construction IN ('', 'unclassified', 'residential', 'living_street', 'road') THEN 'minor_construction'
- WHEN construction IN ('pedestrian', 'path', 'footway', 'cycleway', 'steps', 'bridleway', 'corridor')
- OR public_transport = 'platform' THEN 'path_construction'
- WHEN construction = 'service' THEN 'service_construction'
- WHEN construction = 'track' THEN 'track_construction'
- WHEN construction = 'raceway' THEN 'raceway_construction'
- END
+ WHEN "highway" = 'service' THEN 'service'
+ WHEN "highway" = 'track' THEN 'track'
+ WHEN "highway" = 'raceway' THEN 'raceway'
+ WHEN "highway" = 'construction'
+ AND "construction" IN ('motorway', 'motorway_link')
+ THEN 'motorway_construction'
+ WHEN "highway" = 'construction'
+ AND "construction" IN ('trunk', 'trunk_link')
+ THEN 'trunk_construction'
+ WHEN "highway" = 'construction'
+ AND "construction" IN ('primary', 'primary_link')
+ THEN 'primary_construction'
+ WHEN "highway" = 'construction'
+ AND "construction" IN ('secondary', 'secondary_link')
+ THEN 'secondary_construction'
+ WHEN "highway" = 'construction'
+ AND "construction" IN ('tertiary', 'tertiary_link')
+ THEN 'tertiary_construction'
+ WHEN "highway" = 'construction'
+ AND "construction" IN ('', 'unclassified', 'residential', 'living_street', 'road')
+ THEN 'minor_construction'
+ WHEN "highway" = 'construction'
+ AND ("construction" IN ('pedestrian', 'path', 'footway', 'cycleway', 'steps', 'bridleway', 'corridor') OR "public_transport" = 'platform')
+ THEN 'path_construction'
+ WHEN "highway" = 'construction'
+ AND "construction" = 'service'
+ THEN 'service_construction'
+ WHEN "highway" = 'construction'
+ AND "construction" = 'track'
+ THEN 'track_construction'
+ WHEN "highway" = 'construction'
+ AND "construction" = 'raceway'
+ THEN 'raceway_construction'
END;
$$ LANGUAGE SQL IMMUTABLE;
@@ -4073,6 +4090,12 @@ RETURNS TEXT AS $$
WHEN "subclass" IN ('fast_food', 'food_court') THEN 'fast_food'
WHEN "subclass" IN ('park', 'bbq') THEN 'park'
WHEN "subclass" IN ('bus_stop', 'bus_station') THEN 'bus'
+ WHEN ("subclass" = 'station' AND "mapping_key" = 'railway')
+ OR "subclass" IN ('halt', 'tram_stop', 'subway')
+ THEN 'railway'
+ WHEN "subclass" = 'station'
+ AND "mapping_key" = 'aerialway'
+ THEN 'aerialway'
WHEN "subclass" IN ('subway_entrance', 'train_station_entrance') THEN 'entrance'
WHEN "subclass" IN ('camp_site', 'caravan_site') THEN 'campsite'
WHEN "subclass" IN ('laundry', 'dry_cleaning') THEN 'laundry'
@@ -4082,7 +4105,7 @@ RETURNS TEXT AS $$
WHEN "subclass" IN ('hotel', 'motel', 'bed_and_breakfast', 'guest_house', 'hostel', 'chalet', 'alpine_hut', 'dormitory') THEN 'lodging'
WHEN "subclass" IN ('chocolate', 'confectionery') THEN 'ice_cream'
WHEN "subclass" IN ('post_box', 'post_office') THEN 'post'
- WHEN "subclass"='cafe' THEN 'cafe'
+ WHEN "subclass" = 'cafe' THEN 'cafe'
WHEN "subclass" IN ('school', 'kindergarten') THEN 'school'
WHEN "subclass" IN ('alcohol', 'beverages', 'wine') THEN 'alcohol_shop'
WHEN "subclass" IN ('bar', 'nightclub') THEN 'bar'
@@ -4098,9 +4121,6 @@ RETURNS TEXT AS $$
WHEN "subclass" IN ('bag', 'clothes') THEN 'clothing_store'
WHEN "subclass" IN ('swimming_area', 'swimming') THEN 'swimming'
WHEN "subclass" IN ('castle', 'ruins') THEN 'castle'
- WHEN (subclass = 'station' AND mapping_key = 'railway')
- OR (subclass IN ('halt', 'tram_stop', 'subway')) THEN 'railway'
- WHEN (subclass = 'station' AND mapping_key = 'aerialway') THEN 'aerialway'
ELSE subclass
END;
$$ LANGUAGE SQL IMMUTABLE;
@@ -4301,22 +4321,22 @@ $$
COALESCE(NULLIF(name_de, ''), name, name_en) AS name_de,
tags,
CASE
- WHEN "aerodrome"='international'
- OR "aerodrome_type"='international'
+ WHEN "aerodrome" = 'international'
+ OR "aerodrome_type" = 'international'
THEN 'international'
- WHEN "aerodrome"='public'
- OR "aerodrome_type"='civil'
+ WHEN "aerodrome" = 'public'
+ OR "aerodrome_type" = 'civil'
OR "aerodrome_type" LIKE '%public%'
THEN 'public'
- WHEN "aerodrome"='regional'
- OR "aerodrome_type"='regional'
+ WHEN "aerodrome" = 'regional'
+ OR "aerodrome_type" = 'regional'
THEN 'regional'
- WHEN "aerodrome"='military'
+ WHEN "aerodrome" = 'military'
OR "aerodrome_type" LIKE '%military%'
- OR "military"='airfield'
+ OR "military" = 'airfield'
THEN 'military'
- WHEN "aerodrome"='private'
- OR "aerodrome_type"='private'
+ WHEN "aerodrome" = 'private'
+ OR "aerodrome_type" = 'private'
THEN 'private'
ELSE 'other'
END AS class,
```
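To see why expanding the nested `CASE` into flat `... AND ...` branches does not change the result, the following Python sketch models both forms on a trimmed subset of the construction classes and compares them. It is only an illustration: the function names and the reduced `GROUPS` table are assumptions, and `None` stands in for SQL `NULL`. The equivalence relies on what the diff shows, namely that the construction branches come last in the `CASE` and no `ELSE` follows them, so an unmatched `construction` value yields `NULL` in both forms.

```python
# Trimmed subset of the construction groups from highway_class, for illustration.
GROUPS = {
    "motorway": ("motorway", "motorway_link"),
    "trunk": ("trunk", "trunk_link"),
    "service": ("service",),
}


def nested_form(highway, construction):
    """Model of the old code: one outer branch containing an inner CASE."""
    if highway == "construction":
        for name, values in GROUPS.items():
            if construction in values:
                return f"{name}_construction"
        return None  # inner CASE without ELSE yields NULL
    return None  # no outer branch matched


def flat_form(highway, construction):
    """Model of the new code: every branch repeats the highway test."""
    for name, values in GROUPS.items():
        if highway == "construction" and construction in values:
            return f"{name}_construction"
    return None  # falls through to the end of the CASE


if __name__ == "__main__":
    samples = ["construction", "service", "", None]
    values = ["motorway", "trunk_link", "service", "unknown", None]
    assert all(nested_form(h, c) == flat_form(h, c) for h in samples for c in values)
    print("nested and flat forms agree")
```

If further branches or an `ELSE` followed the construction cases, the two forms could differ for unmatched `construction` values, so the check is tied to this layout.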
Quicker and cleaner diagram image generation.
Remove the redundant `etl-graph` and `mapping-graph` targets.
Also, the obsolete "fields" name is still in Imposm's code and both names are accepted,
but "fields" is not documented anywhere (PR submitted) and could be removed at any moment.
Our docs did not support it until this PR, so the rename happens at the same time.
Several images have been updated due to a more inclusive mapping scan.
Requires https://github.com/openmaptiles/openmaptiles-tools/pull/147 (merged)
Minor code cleanup:
A SQL `CASE` expression already returns NULL
when none of its `WHEN` conditions match.
Co-authored-by: Eva Jelinkova <evka.jelinkova@gmail.com>
* Use _resolve_wikidata in layer mapping.yaml
Mark all tables that should not be populated with the Wikidata
international labels with a special OMT-specific flag.
This should be ok to merge even before the new tools version
is used because imposm seems to ignore anything it doesn't understand.
The next tools version will remove it when generating imposm mapping file.
* Migrate to new Wikidata importer
Uses latest tools to populate the wd_names table
during the quickstart. This can be merged already,
or we can wait for the next tools version.
* Move simplified border tables to OMT as MAT VIEWs
Consolidate derived table creation in OMT repository.
Move all non-original simplified geometry tables from import-osmborder
image to this repo, allowing further optimization.
Later we can remove derived table creation from the import-osmborder image.
Move materialized view creation from the tools repo.
This PR should be merged before https://github.com/openmaptiles/openmaptiles-tools/pull/115
Merge the other PR shortly after this one to avoid doing the same work twice: first creating the simplified tables, then dropping them and recreating them as materialized views.
Tag all SQL materialized views with a machine-readable comment
to indicate that this materialized view can be created without
data:
/* DELAY_MATERIALIZED_VIEW_CREATION */
In the next version of tools this comment can be optionally
replaced with the "WITH NO DATA" parameter, thus allowing
a much faster execution of the SQL script. All materialized
views will be populated with data in parallel afterwards
using the `refresh-views` tools script.
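The marker replacement described above could look roughly like the Python sketch below. It is not the tools implementation: the function name, the `no_data` switch and the example view are hypothetical, and the position of the marker right after the view's query is an assumption. `WITH NO DATA` itself is standard `CREATE MATERIALIZED VIEW` syntax in Postgres.

```python
MARKER = "/* DELAY_MATERIALIZED_VIEW_CREATION */"


def delay_view_creation(sql, no_data=True):
    """Optionally swap the marker comment for WITH NO DATA.

    With no_data=False the SQL is returned unchanged, and the comment
    stays harmless to any tool that does not know about it.
    """
    if not no_data:
        return sql
    return sql.replace(MARKER, "WITH NO DATA")


# Hypothetical view definition used only to show the substitution.
EXAMPLE = (
    "CREATE MATERIALIZED VIEW example_view AS (SELECT 1 AS id) "
    + MARKER
    + ";"
)

if __name__ == "__main__":
    print(delay_view_creation(EXAMPLE))
```

A view created this way is empty and cannot be queried until it has been populated with `REFRESH MATERIALIZED VIEW`, which is the step the refresh script is described as performing afterwards.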
Simplify some of the OSM->OMT field value mappings using declarative syntax.
This approach does not fit all cases, but in many it removes
the need to store the same field values in both the .yaml and .sql files.
TODO: support more complex AND/OR cases
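The idea of deriving the SQL from a declarative mapping can be sketched in Python as follows. This is a simplified illustration and not the actual tools code: the function names and the dictionary shape (result value mapped to a list of source values) are assumptions, and only the simple single-field case is handled, not the AND/OR combinations mentioned in the TODO.

```python
def sql_literal(value):
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def case_expression(field, mapping, else_sql=None):
    """Render a CASE expression from {result: [source values]}."""
    lines = ["CASE"]
    for result, values in mapping.items():
        if len(values) == 1:
            condition = f'"{field}" = {sql_literal(values[0])}'
        else:
            options = ", ".join(sql_literal(v) for v in values)
            condition = f'"{field}" IN ({options})'
        lines.append(f"    WHEN {condition} THEN {sql_literal(result)}")
    if else_sql is not None:
        lines.append(f"    ELSE {else_sql}")
    lines.append("END")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical mapping; in practice the values would come from the layer's YAML file.
    print(case_expression("waterway", {"lake": ["", "lake"], "dock": ["dock"]}, "'river'"))
```

Because the values live in one place, the SQL and the documentation generated from the mapping cannot drift apart.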
* Use unified tools version for all images
* do not start postserve as part of quickstart; a help message now explains how to start it
* wait for PostgreSQL to start using `pgwait`