If you don't want to stop the container while it's being backed up, and the container comes with a backup utility (this is true for most databases), you can label the container with commands to run before/after backing it up:
```yml
version: "3"

services:

  database:
    image: influxdb:1.5.4
    volumes:
      - influxdb-data:/var/lib/influxdb # This is where InfluxDB keeps its data
      - influxdb-temp:/tmp/influxdb # This is our temp space for the backup
    labels:
      # These commands will be exec'd (in the same container) before/after the backup starts:
      - docker-volume-backup.exec-pre-backup=influxd backup -portable /tmp/influxdb
      - docker-volume-backup.exec-post-backup=rm -rfv /tmp/influxdb

  backup:
    image: futurice/docker-volume-backup
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro # Allow use of the "pre/post exec" feature
      - influxdb-temp:/backup/influxdb:ro # Mount the temp space so it gets backed up
      - ./backups:/archive # Mount a local folder as the backup archive

volumes:
  influxdb-data:
  influxdb-temp:
```
The above configuration will perform a `docker exec` for the database container with `influxd backup`, right before the backup runs. The resulting DB snapshot is written to a temp volume (`influxdb-temp`), which is then backed up. Note that the main InfluxDB data volume (`influxdb-data`) isn't used at all, as it'd be unsafe to read while the DB process is running.
Similarly, after the temp volume has been backed up, it's cleaned up with another `docker exec` in the database container, this time just invoking `rm`.

| Variable | Default | Notes |
| -------- | ------- | ----- |
| `BACKUP_SOURCES` | `/backup` | Where to read data from. This can be a space-separated list if you need to back up multiple paths, when mounting multiple volumes for example. On the other hand, you can also just mount multiple volumes under `/backup` to have all of them backed up. |
| `BACKUP_CRON_EXPRESSION` | `@daily` | Standard Debian-flavored `cron` expression for when the backup should run. Use e.g. `0 4 * * *` to back up at 4 AM every night. See the [man page](http://man7.org/linux/man-pages/man8/cron.8.html) or [crontab.guru](https://crontab.guru/) for more. |
| `BACKUP_FILENAME` | `backup-%Y-%m-%dT%H-%M-%S.tar.gz` | File name template for the backup file. It is passed through `date` for formatting. See the [man page](http://man7.org/linux/man-pages/man1/date.1.html) for more. |
| `BACKUP_ARCHIVE` | `/archive` | When this path is available within the container (i.e. you've mounted a Docker volume there), a finished backup file will get archived there after each run. |
| `BACKUP_WAIT_SECONDS` | `0` | The backup script will sleep this many seconds between re-starting stopped containers, and proceeding with archiving/uploading the backup. This can be useful if you don't want the load/network spike of a large upload immediately after the load/network spike of container startup. |
| `BACKUP_HOSTNAME` | `$(hostname)` | Name of the host (i.e. Docker container) in which the backup runs. Mostly useful if you want a specific hostname to be associated with backup metrics (see InfluxDB support). |
| `AWS_S3_BUCKET_NAME` | | When provided, the resulting backup file will be uploaded to this S3 bucket after the backup has run. |
| `AWS_ACCESS_KEY_ID` | | Required when using `AWS_S3_BUCKET_NAME`. |
| `AWS_SECRET_ACCESS_KEY` | | Required when using `AWS_S3_BUCKET_NAME`. |
| `AWS_DEFAULT_REGION` | | Optional when using `AWS_S3_BUCKET_NAME`. Allows you to override the AWS CLI default region. Usually not needed. |
| `INFLUXDB_URL` | | When provided, backup metrics will be sent to an InfluxDB instance at this URL, e.g. `https://influxdb.example.com`. |
| `INFLUXDB_DB` | | Required when using `INFLUXDB_URL`; e.g. `my_database`. |
| `INFLUXDB_CREDENTIALS` | | Required when using `INFLUXDB_URL`; e.g. `user:pass`. |
| `INFLUXDB_MEASUREMENT` | `docker_volume_backup` | Required when using `INFLUXDB_URL`. |

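As a rough illustration of how several of these variables fit together, here is a minimal sketch of a `backup` service that runs nightly and uploads to S3. The volume name `some-data`, the bucket name `my-backup-bucket`, and the 60-second wait are placeholders chosen for this example, and the image is referenced without a tag; the AWS credentials are assumed to come from the shell environment (or an `.env` file) via Compose variable substitution.

```yml
version: "3"

services:

  backup:
    image: futurice/docker-volume-backup
    environment:
      BACKUP_CRON_EXPRESSION: "0 4 * * *" # Back up at 4 AM every night
      BACKUP_FILENAME: "backup-%Y-%m-%d.tar.gz" # Passed through `date`, so one file name per day
      BACKUP_WAIT_SECONDS: "60" # Placeholder: pause before archiving/uploading
      AWS_S3_BUCKET_NAME: my-backup-bucket # Placeholder bucket name
      AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID} # Substituted by Compose from your environment
      AWS_SECRET_ACCESS_KEY: ${AWS_SECRET_ACCESS_KEY}
    volumes:
      - some-data:/backup/some-data:ro # Placeholder volume, mounted under /backup so it gets backed up

volumes:
  some-data:
```

Since `BACKUP_SOURCES` defaults to `/backup`, anything mounted under that path should be picked up without setting the variable explicitly.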
If so configured, backup metrics can also be shipped to an InfluxDB instance. This allows you to set up monitoring and/or alerts for them. Here's a sample visualization on Grafana:
Amazon S3 has [Versioning](https://docs.aws.amazon.com/AmazonS3/latest/dev/Versioning.html) and [Object Lifecycle Management](https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html) features that can be useful for backups.
First, you can enable versioning for your backup bucket:
![S3 versioning](doc/s3-versioning.png)
Then, you can change your backup filename to a static one, for example:
```yml
environment:
  BACKUP_FILENAME: latest.tar.gz
```
This allows you to retain previous versions of the backup file, but the _most recent_ version is always available with the same filename:

    $ aws s3 cp s3://my-backup-bucket/latest.tar.gz .
    download: s3://my-backup-bucket/latest.tar.gz to ./latest.tar.gz

To make sure your bucket doesn't continue to grow indefinitely, you can enable some lifecycle rules:
![S3 lifecycle](doc/s3-lifecycle.png)
These rules will:
- Move non-latest backups to a cheaper, long-term storage class ([Glacier](https://aws.amazon.com/glacier/))
- Permanently remove backups after a year
- Still always keep the latest backup available (even after a year has passed)
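
If you'd rather keep the bucket settings in code than click through the console, the same idea can be sketched as a CloudFormation template. This is only an approximation of the rules described above, not an export of the screenshots: the resource name `BackupBucket`, the rule ID, the bucket name, and the 30-day Glacier transition are assumptions made for this example. Only the one-year expiry comes from the list above.

```yml
Resources:
  BackupBucket: # Hypothetical logical name
    Type: AWS::S3::Bucket
    Properties:
      BucketName: my-backup-bucket # Placeholder bucket name
      VersioningConfiguration:
        Status: Enabled # Keep previous versions of latest.tar.gz
      LifecycleConfiguration:
        Rules:
          - Id: rotate-old-backups # Hypothetical rule name
            Status: Enabled
            NoncurrentVersionTransitions:
              - StorageClass: GLACIER
                TransitionInDays: 30 # Assumed value; pick what suits your restore needs
            NoncurrentVersionExpiration:
              NoncurrentDays: 365 # Permanently remove non-latest backups after a year
```

Both rules target only noncurrent versions, which is why the latest backup stays available no matter how old it gets.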
New images can be conveniently built on [Docker Hub](https://hub.docker.com/r/futurice/docker-volume-backup/~/settings/automated-builds/). Update the tag name, save, and use the "Trigger" button: