.. large

Splitting Large Datasets
========================

Starting with ODM version ``0.6.0`` you can split very large datasets into manageable chunks (called submodels), run the pipeline on each chunk, and then produce merged DEMs, orthophotos and point clouds. The process is referred to as "split-merge".

Why might you use the split-merge pipeline? If your dataset contains a very large number of images, split-merge makes processing more manageable on a single large machine, because it requires less memory. If you have many machines connected to the same network, you can also process the submodels in parallel, which allows for horizontal scaling and lets you process thousands of images more quickly.

Split-merge works in WebODM out of the box, as long as the processing nodes support it. Enable the ``--split`` option when creating a new task.

Calibrate images
----------------

Image calibration is recommended (but not required) for large datasets, because error propagation due to image distortion can cause a bowl effect on the models. Calibration instructions can be found at `Calibrate Images <using.html#calibrating-the-camera>`_.

.. figure:: images/msimbasi_bowling.png
   :alt: image of lens distortion effect on bowling of data
   :align: center

   Bowling effect on a point cloud from a dataset of more than 13,000 images collected by World Bank Tanzania over the flood-prone Msimbazi Basin, Dar es Salaam, Tanzania.

Local Split-Merge
-----------------

Splitting a dataset into more manageable submodels and processing them sequentially on the same machine is easy. Use ``--split`` and ``--split-overlap`` to set the average number of images per submodel and the overlap (in meters) between submodels, respectively::

    docker run -ti --rm -v /my/project:/datasets/code opendronemap/odm --project-path /datasets --split 400 --split-overlap 100

If you already know how you want to split the dataset, you can provide that information and it will be used instead of the clustering algorithm.

The grouping can be provided by adding a file named ``image_groups.txt`` to the main dataset folder. The file should have one line per image. Each line should have two words: first the name of the image, then the name of the group it belongs to. For example::

    01.jpg A
    02.jpg A
    03.jpg B
    04.jpg B
    05.jpg C

will create 3 submodels. Make sure to pass ``--split-overlap 0`` if you manually provide an ``image_groups.txt`` file.
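
For larger datasets the grouping file can be generated with a short script instead of by hand. The following is a minimal sketch, not part of ODM: the helper names ``group_lines`` and ``write_image_groups`` are hypothetical, and cutting the sorted file names into fixed-size chunks is only an assumption for illustration. In practice you would usually group images by location or by flight.

```python
from pathlib import Path


def group_lines(image_names, group_size):
    """Return "<image> <group>" lines, one per image.

    Hypothetical helper: images are sorted by name and cut into consecutive
    chunks of ``group_size``; each chunk becomes one group (one submodel).
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    lines = []
    for index, name in enumerate(sorted(image_names)):
        # Integer division assigns the same group to each run of group_size images.
        group = f"group{index // group_size + 1}"
        lines.append(f"{name} {group}")
    return lines


def write_image_groups(image_names, group_size, out_path="image_groups.txt"):
    """Write the grouping lines to ``out_path`` and return them."""
    lines = group_lines(image_names, group_size)
    Path(out_path).write_text("".join(line + "\n" for line in lines))
    return lines
```

The output file would go in the main dataset folder, and the run would then use ``--split-overlap 0`` as described above. File names containing spaces would break the two-words-per-line layout, so this sketch assumes there are none.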

Distributed Split-Merge
-----------------------

ODM can also automatically distribute the processing of each submodel to multiple machines via `NodeODM <https://github.com/OpenDroneMap/NodeODM>`_ nodes, orchestrated via `ClusterODM <https://github.com/OpenDroneMap/ClusterODM>`_.

.. figure:: images/clusterodm.png
   :alt: image of ClusterODM and NodeODM nodes
   :align: center

The first step is to start ClusterODM::

    docker run -ti -p 3001:3000 -p 8080:8080 opendronemap/clusterodm

Then, on each machine you want to use for processing, launch a NodeODM instance::

    docker run -ti -p 3000:3000 opendronemap/nodeodm

Connect via telnet to ClusterODM and add the IP address and port of each machine running NodeODM::

    $ telnet <cluster-odm-ip> 8080
    Connected to <cluster-odm-ip>.
    Escape character is '^]'.
    [...]
    # node add <node-odm-ip-1> 3000
    # node add <node-odm-ip-2> 3000
    [...]
    # node list
    1) <node-odm-ip-1>:3000 [online] [0/2] <version 1.5.1>
    2) <node-odm-ip-2>:3000 [online] [0/2] <version 1.5.1>

Make sure you are running version ``1.5.1`` or higher of the NodeODM API.
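
If you script this version check, for example against the versions printed by ``node list``, compare the numeric components rather than the raw strings. The sketch below is an illustration only: the function names are hypothetical, and it assumes purely numeric dotted versions such as ``1.5.1``.

```python
def parse_version(version):
    """Turn a dotted version string such as "1.5.1" into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def supports_split_merge(node_version, minimum="1.5.1"):
    """Return True if node_version is at least the minimum version.

    Tuples compare element by element, so (1, 10, 0) is correctly
    treated as newer than (1, 5, 1).
    """
    return parse_version(node_version) >= parse_version(minimum)
```

A plain string comparison would get this wrong, because ``"1.10.0"`` sorts before ``"1.5.1"`` alphabetically.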

At this point, use the ``--sm-cluster`` option to enable distributed split-merge::

    docker run -ti --rm -v /my/project:/datasets/code opendronemap/odm --project-path /datasets --split 400 --split-overlap 100 --sm-cluster http://<cluster-odm-ip>:3001

Limitations
-----------

3D textured meshes are currently not merged as part of the workflow (only point clouds, DEMs and orthophotos are).

GCPs are fully supported; however, there must be at least 3 GCPs in each submodel for the georeferencing to take place. If a submodel has fewer than 3 GCPs, a combination of the remaining GCPs and EXIF data will be used instead, which is less accurate. We recommend using the ``image_groups.txt`` file to accurately control the submodel split when using GCPs.
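
Before starting a long run, it can help to check that every group in ``image_groups.txt`` sees at least 3 GCPs. The following is a rough sketch under stated assumptions, not an ODM tool: the function names are hypothetical, and the GCP file is assumed to have a projection header on its first line followed by lines of the form ``geo_x geo_y geo_z im_x im_y image_name``. A GCP is identified here by its geographic coordinate triple.

```python
from collections import defaultdict


def parse_image_groups(text):
    """Map image name -> group name from the contents of image_groups.txt."""
    groups = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            groups[parts[0]] = parts[1]
    return groups


def gcps_per_group(gcp_text, groups):
    """Count the distinct GCPs observed in each group.

    Assumption: the first line is the projection header and each later line
    reads "geo_x geo_y geo_z im_x im_y image_name [...]".
    """
    seen = defaultdict(set)
    for line in gcp_text.splitlines()[1:]:
        parts = line.split()
        if len(parts) < 6:
            continue  # skip blank or malformed lines
        image = parts[5]
        if image in groups:
            seen[groups[image]].add(tuple(parts[:3]))
    # Report every group, including those that contain no GCPs at all.
    return {group: len(seen[group]) for group in sorted(set(groups.values()))}


def groups_below_minimum(counts, minimum=3):
    """Return the groups that have fewer than ``minimum`` GCPs."""
    return [group for group, count in counts.items() if count < minimum]
```

Any group returned by ``groups_below_minimum`` would fall back to the less accurate combination of remaining GCPs and EXIF data, so its images are candidates for regrouping.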

Acknowledgments
---------------

Huge props to Pau and the folks at Mapillary for their amazing contributions to OpenDroneMap through their OpenSfM code, which is a key component of the split-merge pipeline. We look forward to further pushing the limits of OpenDroneMap and seeing how big a dataset we can process.