bellingcat Python script to automatically archive social media posts, videos, and images from a Google Sheets document.
 
 
 
 
 
 
Patrick Robertson 7e360240bf Copy ytdlp code into AA project - seems like ytdlp won't be merged anytime soon 2025-03-17 09:57:05 +00:00
.github/workflows removes fixed oscrypto dependency, it blocked pypi publishing (#195) 2025-02-14 10:51:56 +00:00
docs Remove the global _LAZY_LOADED_MODULES and allow each instance of ArchivingOrchestrator to load its own modules 2025-02-19 12:25:35 +00:00
scripts Update modules for new core structure. 2025-01-30 08:42:23 +00:00
src/auto_archiver Copy ytdlp code into AA project - seems like ytdlp won't be merged anytime soon 2025-03-17 09:57:05 +00:00
tests Create facebook dropin - working for images + text. CAVEAT: only gets the first ~100 chars of the post at the moment 2025-02-25 11:44:35 +00:00
.dockerignore docker initial files 2022-10-31 17:10:55 +00:00
.gitignore Use a script to auto-generate documentation for the core modules from the manifest file 2025-02-10 22:51:04 +00:00
.pylintrc Ignore pylint statements for manifest files 2025-01-21 17:59:52 +01:00
.readthedocs.yaml Readthedocs now requires all packages for running the pre-build scripts 2025-02-11 18:38:41 +00:00
CONTRIBUTING.md Fix links to docs 2025-02-12 11:41:54 +00:00
Dockerfile modifies base docker image to use browsertrix 1.4.2 (#182) 2025-01-24 13:59:29 +00:00
LICENSE Add LICENSE 2021-06-24 16:14:32 +02:00
README.md markdown fixes 2025-02-12 17:37:36 +00:00
docker-compose.yaml Add docker-compose for easy building and running of docker image in dev 2025-01-15 14:36:02 +01:00
poetry.lock Updates tests to use pytest-mock. 2025-02-18 23:32:03 +00:00
pyproject.toml Merge pull request #210 from bellingcat/logger_fix 2025-02-19 15:11:42 +00:00

README.md

Auto Archiver


Auto Archiver is a Python tool to automatically archive content on the web in a secure and verifiable way. It takes URLs from different sources (e.g. a CSV file, Google Sheets, the command line) and archives the content of each one. It can archive social media posts, videos, images and webpages. Content can be enriched, then saved either locally or remotely (S3 bucket, Google Drive). The status of the archiving process can be appended to a CSV report or, if using Google Sheets, written back to the original sheet.
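
The flow described above (read URLs from a source, archive each one, record a status) can be illustrated with a small standalone sketch. This is not Auto Archiver's code or API: every name here (`read_urls`, `archive`, `write_report`, `fake_fetch`, `run`) is hypothetical, the `url` column name is an assumption, and the SHA-256 digest is only one plausible way to make an archived item verifiable. The fetch step is a stub, so nothing touches the network.

```python
import csv
import hashlib
import io


def read_urls(csv_text):
    """Feeder step: yield each non-empty value of the assumed 'url' column."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        url = (row.get("url") or "").strip()
        if url:
            yield url


def archive(url, fetch):
    """Archive one URL: fetch its bytes and record a SHA-256 digest."""
    try:
        content = fetch(url)
    except Exception as exc:  # a failed item becomes a status row, not a crash
        return {"url": url, "status": f"failed: {exc}", "sha256": ""}
    digest = hashlib.sha256(content).hexdigest()
    return {"url": url, "status": "archived", "sha256": digest}


def write_report(results):
    """Report step: serialise the per-URL results as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["url", "status", "sha256"])
    writer.writeheader()
    writer.writerows(results)
    return out.getvalue()


def fake_fetch(url):
    """Hypothetical stand-in for a real downloader; performs no network access."""
    if "broken" in url:
        raise ValueError("unreachable")
    return url.encode("utf-8")


def run(csv_text, fetch=fake_fetch):
    """Run the whole sketch: feed URLs, archive each, collect the results."""
    return [archive(url, fetch) for url in read_urls(csv_text)]


if __name__ == "__main__":
    sample = "url\nhttps://example.com/post/1\nhttps://example.com/broken\n"
    print(write_report(run(sample)))
```

Passing the fetch function in as a parameter keeps the sketch testable offline; in the real tool, extraction, enrichment and storage are handled by its own modules, as covered in the project documentation.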

Read the article about Auto Archiver on bellingcat.com.

Installation

View the Installation Guide for full instructions.

To get started quickly using Docker:

docker pull bellingcat/auto-archiver && docker run bellingcat/auto-archiver

Or pip:

pip install auto-archiver && auto-archiver --help

Contributing

We welcome contributions to the Auto Archiver project! See the Contributing Guide for how to get involved!