Commit graph

1033 Commits (7e10040bbd7f51efa9359924f247c57aa502e89c)

Author SHA1 Message Date
Patrick Robertson 9b596e59d6 Run expensive download tests once per week, on a month at 2:35pm
(time is offset from the hour to alleviate high load on Github)
2025-01-13 18:33:02 +01:00
Patrick Robertson 528b78db85 Flag tombstone tweets for twitter_syndication method 2025-01-13 18:17:24 +01:00
Patrick Robertson 57eacdc24a Merge branch 'main' into feat/unittest 2025-01-13 18:06:55 +01:00
Patrick Robertson bbef80de4c Add unit tests for html_formatter, csv_db 2025-01-13 17:58:10 +01:00
Patrick Robertson 930d78096a
Merge pull request #162 from bellingcat/small_issues
Fix two small issues
2025-01-13 16:39:59 +01:00
Patrick Robertson 2353f9d6a5 Separate CI for download tests and core tests 2025-01-13 16:27:46 +01:00
Patrick Robertson 63973e2ce7 switch to pytest and pytest-recording 2025-01-13 16:23:20 +01:00
erinhmclark e9a7f435a3 Add package dist directory to .gitignore 2025-01-13 13:33:23 +00:00
Patrick Robertson e2bc84ccb9 Merge branch 'main' into feat/unittest 2025-01-13 13:15:13 +01:00
erinhmclark 72a8e76fbb Update README.md for usage with Poetry. 2025-01-12 20:21:23 +00:00
erinhmclark c69a5fa1c9 Refactor Dockerfile for multi-stage builds.
Combining environment and runtime stages due to Poetry's dependency on source code.
2025-01-12 12:38:12 +00:00
erinhmclark d80b4b7557 Remove snscrape and Python 3.12 restriction. 2025-01-12 12:15:56 +00:00
erinhmclark cc490f9c10 Updated Dockerfile (not optimised yet) 2025-01-12 12:15:56 +00:00
erinhmclark 08e83eb94e Update pyproject.toml configuration for Poetry version 2.0.0. 2025-01-12 12:15:56 +00:00
erinhmclark dd822b8b44 Update poetry.lock 2025-01-12 12:15:56 +00:00
erinhmclark 4a63ca7753 Update PyPi workflow to read python version from pyproject.toml. 2025-01-12 12:15:56 +00:00
erinhmclark 6d5b0090d9 Pull version from pyproject.toml file/ 2025-01-12 12:15:56 +00:00
erinhmclark 26abd6f7ae Added TODO comment for adding a version restriction. 2025-01-12 12:15:56 +00:00
erinhmclark dba8f46016 Replaced comments for python-publish.yaml workflow. 2025-01-12 12:15:56 +00:00
erinhmclark 50e8c93477 Updated workflow for python-publish.yaml to use poetry (untested), and cleanup of pipenv files. 2025-01-12 12:15:56 +00:00
erinhmclark 6da837b374 Add note to update dynamic versioning and references to version. 2025-01-12 12:15:56 +00:00
erinhmclark 660ee82c67 Update Dockerfile for poetry.
Note: Review security with curl installation. Currently locked to known version, but additional checks could be added.
2025-01-12 12:15:56 +00:00
erinhmclark 5490947657 Add packaging to Poetry. 2025-01-12 12:15:56 +00:00
erinhmclark fd9a6c26ed Create Poetry environment.
Required addition of transitive package (pyOpenSSL) and version restrictions on cryptography, boto3.
2025-01-12 12:15:56 +00:00
Patrick Robertson 3546d4ad79 Fix 'download_syndication' method for tweet archiving (now requires a token)
Plus add in unit tests for token generation + download syndication
2025-01-12 12:55:00 +01:00
Patrick Robertson c932fb7416 Improved logging when an invalid/deleted tweet is attempted to be downloaded
Plus: unit tests for non-existent tweet + invalid tweet ID
2025-01-12 12:00:45 +01:00
Patrick Robertson f29950905c Merge branch 'main' into small_issues 2025-01-12 11:47:55 +01:00
Patrick Robertson 8e99d62c97
Merge pull request #165 from bellingcat/fix/snscrape
Remove snscrape from the twitter_archiver
2025-01-09 11:06:14 +01:00
Patrick Robertson 9dc4eb35de Switch to pytest and use vcr for request storing 2025-01-08 11:25:13 +01:00
Patrick Robertson 8c044c15f0 Add base test class for archivers with boilerplate code
Plus: create test class for twitter archiver. Currently WIP
2025-01-08 10:38:56 +01:00
Patrick Robertson ab9335bb7a Merge branch 'main' into feat/unittest 2025-01-08 10:35:45 +01:00
Patrick Robertson add83c9650 Remove snscrape from twitter_archiver
1. snscrape twitter downloader no longer works (ref: https://github.com/JustAnotherArchivist/snscrape/issues/1045)
2. snscrape limits python to versions <3.12
2025-01-07 19:40:19 +01:00
Miguel Sozinho Ramalho a697f0a212
adds an unauthenticated Bluesky archiver (#160)
* adds a TODO for next code iterations

* implements bsky archiver

* adds new archiver to example orchestration file

* Fix downloading media for posts with multiple images

(Images are stored in media/images)

* Setup a basic framework for unit tests

Use 'python -m unittest' from the project root to run

---------

Co-authored-by: Patrick Robertson <robertson.patrick@gmail.com>
2025-01-07 10:28:07 +00:00
Patrick Robertson bffa3a6254
Merge pull request #159 from bellingcat/print_pdf
Add 'print_pdf' option to the screenshot enricher. Fixes #132
2025-01-06 18:13:38 +01:00
Miguel Sozinho Ramalho ef471f41e1
adds better debug for wayback failures (#161) 2025-01-06 16:49:11 +00:00
Patrick Robertson 928518cda7
Allow setting cookies for yt-dl (#158) 2025-01-06 16:19:53 +00:00
Patrick Robertson 1bd017000e Add Github CI test workflow 2024-12-31 15:20:33 +01:00
Patrick Robertson 33e967ce4b Update pipfile for:
- pyopenssl==24.2.1
- youtube-dlp==2024.09.27
- numpy==2.1.3

Fixes building/local installs. Also fixes #155
2024-12-31 15:20:11 +01:00
Patrick Robertson 30d423c8e6 Setup a basic framework for unit tests
Use 'python -m unittest' from the project root to run
2024-12-31 14:29:52 +01:00
Patrick Robertson 0c803f15a5 Fix showing preview images in the .html file when using local storage
Local storage media urls are prefixed with '/', previously only http(s) media preview src were displayed
2024-12-31 09:29:31 +01:00
Patrick Robertson a46f9997ea Better logging when there's a timestamp parse error 2024-12-31 09:28:08 +01:00
msramalho 83da9ae089 adds pdf preview support for html formatter 2024-12-23 18:19:26 +00:00
Patrick Robertson 663c8ad93a Add 'print_pdf' option to the screenshot enricher. Fixes #132 2024-12-20 07:14:03 +01:00
msramalho e49550163f adds proxy_server option to wacz 2024-10-06 10:45:34 +06:00
msramalho e6f5981afc numpy version downgrade 2024-10-06 10:10:04 +06:00
msramalho c62bf1a34d yt-dlp version bump 2024-10-05 17:43:07 +06:00
msramalho b166d57e61 v0.12.0 bump 2024-08-21 13:34:34 +01:00
msramalho 11c3288267 closes #146 2024-08-21 13:33:58 +01:00
msramalho 004143a58a version bump v0.11.6 2024-07-18 11:27:39 +01:00
msramalho 686f0027c4 adds new entries to example orchestration file 2024-07-18 11:27:15 +01:00