* Add info about using `MID_ZOOM` to optimize tile generation
* Remove params for generation that are now automatically detected
Co-authored-by: Tomas Pohanka <TomPohys@gmail.com>
This PR adds the ability to create SQL tests that verify OSM data is properly imported and updated in the OpenMapTiles data schema. The tests work by injecting test OSM data and updates into the database and using standard SQL statements to check that the data loads correctly. With this framework in place, developers can write small tests that inject known data into the database and confirm that imports and updates work as expected.
In addition to the framework, basic tests are provided for four layers. These initial tests are in no way comprehensive, but they provide a structure and framework for developers to add their own tests or expand the existing ones to cover more cases.
Usage:
`make clean && make sql-test`
## How it works
The SQL tests consist of the following parts:
1. **Test import data**, located in `tests/import`. This test data is in the [OSM XML](https://wiki.openstreetmap.org/wiki/OSM_XML) format and contains the data that should be initially injected into the database. The files are numbered to ensure that each test data file contains OSM ids distinct from those in the other files. For example, the file whose name starts with `100` uses node ids 100000-199999, way ids 1000-1999, and relation ids 100-199.
1. **Test update data**, located in `tests/update`. This test data is in the [osmChange XML](https://wiki.openstreetmap.org/wiki/OsmChange) format and contains the data that will be used to update the test import data (in order to verify that the update process works correctly). These files are numbered using the same scheme as the test import data.
1. **Import SQL test script**, located at `tests/test-post-import.sql`. This script is executed after the test import data has been injected, and runs SQL-based checks to ensure that the data was properly imported. If a test fails, an entry is added to the table `omt_test_failures`, with one record per error that occurs during the import process; a test failure will also fail the build. To inspect the failure messages, run `make psql` and issue the command `SELECT * FROM omt_test_failures`. A minimal sketch of such a check appears after this list.
1. **Update SQL test script**, located at `tests/test-post-update.sql`. This script performs the same function as the import test script, except that it runs after the test update data has been applied to the database. Note that this script only runs if the import script passes all tests.
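For illustration, here is a minimal sketch of what one check in `tests/test-post-import.sql` might look like. The layer table `osm_landuse_polygon`, its columns, the ids, and the `omt_test_failures` column names are all assumptions made for the example, not the framework's actual schema:

```sql
-- Hypothetical post-import check: verify the injected test polygon arrived.
DO $$
DECLARE
  cnt integer;
BEGIN
  SELECT COUNT(*) INTO cnt
  FROM osm_landuse_polygon          -- hypothetical layer table
  WHERE osm_id = 100001;            -- an id from the "100" test file

  IF cnt <> 1 THEN
    -- column names on omt_test_failures are assumed for illustration
    INSERT INTO omt_test_failures (test_id, test_type, error_message)
    VALUES (100, 'import', format('expected 1 polygon with id 100001, found %s', cnt));
  END IF;
END $$;
```

Any rows left in `omt_test_failures` after the scripts run fail the build, and they can be inspected via `make psql` as described above.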
This test applies 2-3 months of weekly updates to the `europe/monaco` extract from [Geofabrik](http://download.geofabrik.de/europe/).
It is worth noting that the contents of the updates may vary, and some SQL update flows may not be tested if the relevant changes did not occur during that period.
Resolves #1226
Fixes #1156. Fixes #810. Fixes #1228.
This PR replaces `osmborder`, which is no longer maintained, with `imposm` mappings and SQL code to generate borders. Key features that were moved into the imposm/SQL layer:
1. Grouping by `osm_id` and aggregating to the lowest `admin_level` value so that there is only one copy of ways that are members of multiple relations (sketched below).
2. Filtering out of point features in boundary relations (typically `admin_centre` and `label` roles).
3. Moving disputed boundary detection logic into SQL.
This will increase the database size slightly because of the limits of what imposm can do: some of the filtering is now done in the SQL layer after import, rather than in `osmborder`.
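As a rough illustration of how items 1-3 above might look in the SQL layer, consider a sketch like the following, where `osm_border_linestring` and its columns are hypothetical stand-ins for the actual imposm mapping tables:

```sql
-- Collapse duplicate ways, keep the lowest admin_level, flag disputed borders.
SELECT
    osm_id,
    MIN(admin_level) AS admin_level,                  -- lowest value wins (item 1)
    BOOL_OR(boundary = 'disputed') AS disputed,       -- disputed detection in SQL (item 3)
    (array_agg(geometry))[1] AS geometry              -- all copies share the same way geometry
FROM osm_border_linestring                            -- hypothetical imposm table
WHERE ST_GeometryType(geometry) = 'ST_LineString'     -- drop point members (item 2)
GROUP BY osm_id;
```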
Fixes #948
This PR does the following:
1. Changes the trigger for the PR comment updater from the cron method to `workflow_run`, triggered on completion of the test cases. This should remove the delay between the completion of the performance tests and the updating of the corresponding comment in the PR.
2. Separates the integrity check and performance check into separate workflows and allows them to run in parallel. This will allow the project to take advantage of multiple CI runners if they're available (which appears to be the case).
In addition, this fixes an issue with branches that are updated or left undeleted after a PR is merged: the current cron method keeps running the pr-update job on them over and over, unnecessarily.
As described in github/docs#799 and the [GitHub docs](https://docs.github.com/en/actions/reference/events-that-trigger-workflows#workflow_run), a `workflow_run` trigger only fires when the workflow file is on the main branch. This change will therefore not fire the PR updater on this PR itself, so there is no way to test it working properly without merging to master and then testing on one of the other PRs.
* Set `MAX_ZOOM` to 7 by default.
* Remove `QUICKSTART_MIN/MAX_ZOOM` -- having two extra env vars was unneeded complexity. We can just use `MIN_ZOOM` and `MAX_ZOOM`. See also #261
* Generate the dc-config YAML file with a new `make generate-dc-config` step. It computes the BBOX based on the downloaded data file. This step is not needed for planet generation.
* Generate the Imposm replication file only when `DIFF_MODE` is `true`; it is not needed otherwise, and if the data source does not support replication, generating it throws an error.
Closes #904
* Make all data-related targets like `download*`, `import-osm`, `import-borders`, and `generate-tiles` `area`-aware, making it possible for multiple data files to coexist inside the `./data` dir.
* Add `make download area=... [url=...]` command to automatically download any kind of area by checking Geofabrik, BBBike, and OSM.fr, optionally from a custom URL. Supports `area=planet` too.
* Do not re-download an area with `make download-*` if it already exists.
* Automatically rename `<area>-latest.osm.pbf` to `<area>.osm.pbf`.
* If `area=...` parameter is not given to `make`, see if there is exactly one `*.osm.pbf` file, and if so, use `*` as the `area`.
* Configure many variables in the .env file, overriding the defaults in tools
* If `<area>.osm.pbf` exists but `<area>.dc-config.pbf` is missing, generate it using the `download-osm make-dc` command.
Also:
* closes #614
* closes #647
* partially addresses #261
* Trims SSD drives and flushes caches before each performance test. Unfortunately these steps are still incomplete -- real hardware machines are needed for all of them to take effect.
* A bit more output in PR updater
* Adds a script to download multiple areas and compute their test parameters
* Adds a large test that uses a combined 76 MB file with equatorial-guinea, liechtenstein, district-of-columbia, and greater-london
* Caches Wikidata downloads
The `master-tools` branch is the same as the `master` branch, except that it uses `latest` from the tools repo. This allows us to quickly track whether master compiles correctly.
Include closed PRs in the update cycle, because a PR could be closed before the job has a chance to finish, and we should still update it.
Results now show a table of how long each step took, as well as the PG database size change.
* use `time` to compute profiling for each step
* call Postgres to get the database size (see the query below)
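The size figure comes from a standard Postgres call, for example:

```sql
-- Report the current database size; running this before and after a step
-- and subtracting gives the size change shown in the results table.
SELECT pg_size_pretty(pg_database_size(current_database())) AS db_size;
```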
It turned out that some update jobs failed due to
```
{
"message": "Bad credentials",
"documentation_url": "https://developer.github.com/v3"
}
```
This is probably due to credentials expiring (long workflow startup?) or some internal GitHub issue.
For now, removing authenticated `curl` calls because most of them can be made anonymously, and keeping authentication only where needed.
* delete output escaping (forgot to remove it earlier -- it was used by the older system)
* stop early if there are no pull requests (e.g. in case this is a fork)
A cron-based approach to find pull requests, possibly from forks, that have finished profiling, and to post their results as comments.
See in-depth explanation of how this works at
https://github.com/nyurik/auto_pr_comments_from_forks
* On pull request and on commit, run the base test followed by the test of the change, comparing the results and publishing them to the pull request.
If the pull request is updated, the resulting comment will be updated.
* also save `quickstart.log` as an artifact
Note that due to GitHub workflow security restrictions, it is not possible to post PR comments if the change originated from a fork. I am still looking for workarounds.
To view what would have been posted, open the `PR performance` details at the bottom of the build results and expand `Comment on Pull Request` (and its subitem).
Optimizations: the process keeps two caches -- one for the test data file, and one for the results of the performance run on the "base" revision. If this or another PR has already been executed for the same revision and the same test data, the performance test runs only for the proposed changes, not for the base.
Co-authored-by: Tomas Pohanka <TomPohys@gmail.com>
This is a partial migration of https://github.com/openmaptiles/openmaptiles/pull/785
* Use `import-data` instead of `import-lakelines`, `import-water`, and `import-natural-earth`
* Upgrade docker-compose.yml to version 2.3 (allows some extra env var usage in the YAML file itself)
* Remove `openmaptiles-tools:latest` usage -- no longer needed, can use current version 4.1
* `db-start` does not recreate the container when the docker-compose.yml definition has changed.
* A few minor cleanups in quickstart.sh