Commit graph

960 Commits (cc14e5cb9fa4e66a7d08e4b5fce99d67127a12eb)

Author SHA1 Message Date
erinhmclark ba4b330881 Merge remote-tracking branch 'origin/more_mainifests' into more_mainifests 2025-01-24 08:04:27 +00:00
erinhmclark cbafbfab3f Revert Dockerfile changes 2025-01-24 08:04:09 +00:00
Patrick Robertson 9befb9776c Fix loading modules when entry_point isn't set 2025-01-23 21:08:54 +01:00
Patrick Robertson 06f6e34d9d Revert changes to orchestrator to avoid merge conflicts 2025-01-23 20:38:36 +01:00
Patrick Robertson b27bf8ffeb Fix up loading/storing configs + unit tests 2025-01-23 20:32:19 +01:00
erinhmclark 50f4ebcdc3 Move storage configs into individual manifests, assert format on useage. 2025-01-23 17:01:30 +00:00
erinhmclark c3403ced26 Rename storages for clarity 2025-01-23 16:51:17 +00:00
erinhmclark 1274a1b231 More manifests, base modules and rename from archiver to extractor. 2025-01-23 16:40:48 +00:00
erinhmclark 9db26cdfc2 Merge branch 'load_modules' into more_mainifests
# Conflicts:
#	src/auto_archiver/core/orchestrator.py
2025-01-23 09:19:54 +00:00
erinhmclark 79684f8348 Set up feeder manifests (not merged by source yet) 2025-01-23 09:16:42 +00:00
Patrick Robertson 65ef46d01e Fix loading already loaded modules - don't load them twice 2025-01-23 00:09:39 +01:00
Patrick Robertson 550097ab7b Get module loading working properly 2025-01-22 23:54:21 +01:00
erinhmclark c517d35bdf Merge branch 'load_modules' into more_mainifests
# Conflicts:
#	src/auto_archiver/databases/__init__.py
2025-01-22 18:19:43 +00:00
erinhmclark 99c8c69085 Manifests for databases 2025-01-22 18:18:13 +00:00
Patrick Robertson ade5ea0f6f Tidy up imports + start on loading modules - program now starts much faster 2025-01-22 18:45:58 +01:00
Patrick Robertson b6b085854c Switch back to using yaml with dot notation
(two simple helper functions to convert between dot and dict notation)
2025-01-22 17:40:51 +01:00
Patrick Robertson 54995ad6ab Further tweaks based on __manifest__.py files
Loading configs now works
2025-01-22 13:11:43 +01:00
erinhmclark 7b3a1468cd Create manifest files for archiver modules. 2025-01-22 10:21:27 +01:00
Patrick Robertson 4830f99300 Get parsing of manifest and combining with config file working 2025-01-21 20:03:10 +01:00
Patrick Robertson 241b35002c Initial changes to move to '__manifest__' format 2025-01-21 19:02:38 +01:00
Patrick Robertson 03f3770223 Add __manifest__.py for generic_extractor 2025-01-21 18:00:45 +01:00
Patrick Robertson bdfc855297 Ignore pylint statements for manifest files 2025-01-21 17:59:52 +01:00
Patrick Robertson c41d93a634 Use already implemented helper to get version 2025-01-21 17:53:37 +01:00
Patrick Robertson d4fff0b6eb
Merge pull request #175 from bellingcat/youtubedlp-rewrite
Create generic archiver for all valid youtube-dl URLs, add truthsocial extractor, unit tests for twitter_api extractor, utility methods for cleaning HTML and traversing objects
2025-01-21 17:33:39 +01:00
Patrick Robertson cd2ae3763f
Minor adjustments
Co-authored-by: Miguel Sozinho Ramalho <19508417+msramalho@users.noreply.github.com>
2025-01-21 16:24:37 +00:00
Patrick Robertson d3e3eb7639 unit tests for loading dropins 2025-01-21 16:59:45 +01:00
Patrick Robertson 9dde9b26d0 Patch in upstream changes to ytdlp for now
Seems like ytdlp may not merge https://github.com/yt-dlp/yt-dlp/pull/12098 anytime soon
2025-01-21 16:49:49 +01:00
Patrick Robertson 7c0dcbfd81 Re-add doc string to generic_archiver
(renamed from youtube_archiver)
2025-01-21 16:49:30 +01:00
Patrick Robertson 6388983815 Merge branch 'main' into youtubedlp-rewrite 2025-01-21 16:43:14 +01:00
Patrick Robertson 4bb4ebdf82 Further cleanup, abstracts 'dropins' out into generic files 2025-01-21 16:36:45 +01:00
Erin Clark 113a4db251
Merge pull request #177 from bellingcat/feat/documentation
Add Sphinx documentation and publish to RTD.
2025-01-21 09:54:41 +00:00
erinhmclark e83ccc0d7f Cleaning up configs reference and module level. 2025-01-21 09:48:46 +00:00
Patrick Robertson dff0105659 Small fixups + implement Truth code for posts with multiple media 2025-01-20 18:40:46 +01:00
Patrick Robertson fd2e7f973b Further tidy-ups, also adds some ytdlp utils to 'utils' 2025-01-20 16:31:28 +01:00
Patrick Robertson befc92deb4 Further unit test tidy ups 2025-01-17 17:29:13 +01:00
Patrick Robertson d4893ee05e Fix unit tests for base_archiver->generic_archiver rename 2025-01-17 17:08:00 +01:00
Patrick Robertson 9c5a9e1bcd Rename BaseArchiver to GenericArchiver + some other tidyups 2025-01-17 17:06:04 +01:00
Patrick Robertson 5aa717452e Quick test that the app actually runs in core tests 2025-01-17 17:02:54 +01:00
Patrick Robertson 5b20288d06 Add a 'version' arg to get the current running version 2025-01-17 16:59:57 +01:00
Patrick Robertson 59eb8f7520 Add TWITTER_BEARER_TOKEN to env for running download tests 2025-01-17 12:04:40 +01:00
Patrick Robertson 17c1c9c360 Fix up core unit tests when a twitter api key isn't provided 2025-01-17 12:02:38 +01:00
Patrick Robertson 394bcd8d47 Further refactoring of youtubedl_archiver->base_archiver
* Keep twitter_api_archiver
* Remove unit tests for obsolete archivers
* Guess filename of media using the 'Content-Type' header
* Add mechanism to run 'expensive' tests last (see conftest.py) and also flag expensive tests to fail straight off (pytest.mark.incremental)
2025-01-17 11:56:08 +01:00
erinhmclark 170f8d18a6 Add instructions to README.md, include build directories in .gitignore and do a bit more tidying, 2025-01-16 20:46:10 +00:00
Erin Clark f03ec42026
Merge pull request #174 from bellingcat/version_updates
Update versions for GH Actions and Geckodriver.
2025-01-16 14:31:26 +00:00
erinhmclark 6fabe2a189 Fixed twitter_archiver.py changes. 2025-01-16 09:56:54 +00:00
erinhmclark a6aacfa3fb Add example pre-generated configs.rst 2025-01-16 09:31:50 +00:00
erinhmclark bbb3269c2b Changes from main. 2025-01-16 09:30:32 +00:00
erinhmclark 235da33a1a Update .readthedocs.yaml path 2025-01-16 09:24:46 +00:00
erinhmclark d3eec5d90f Basic docs structure for RTD 2025-01-15 21:45:29 +00:00
Patrick Robertson 3168bed0d9 Add (skipped) test for twitter extraction with youtubedlp 2025-01-15 19:00:57 +01:00