Commit graph

21 Commits (7a21ae96afb184b9e06bb19f5e5bf94af4a3396f)

Author SHA1 Message Date
Miguel Sozinho Ramalho 7a21ae96af
V0.9.0 - closes several open issues: new enrichers and bug fixes (#133)
* clean orchestrator code, add archiver cleanup logic

* improves documentation for database.py

* telethon archivers isolate sessions into copied files

* closes #127

* closes #125

* closes #84

* meta enricher applies to all media

* closes #61 adds subtitles and comments

* minor update

* minor fixes to yt-dlp subtitles and comments

* closes #17 but logic is imperfect.

* closes #85 ssl enhancer

* minimifies html, JS refactor for preview of certificates

* closes #91 adds freetsa timestamp authority

* version bump

* simplify download_url method

* skip ssl if nothing archived

* html preview improvements

* adds retrying lib

* manual download archiver improvements

* meta only runs when relevant data available

* new metadata convenience method

* html template improvements

* removes debug message

* does not close #91 yet, will need a few more certificate chaing logging

* adds verbosity config

* new instagram api archiver

* adds proxy support we

* adds proxy/end support and bug fix for yt-dlp

* proxy support for webdriver

* adds socks proxy to wacz_enricher

* refactor recursivity in inner media and display

* infinite recursive display

* foolproofing timestamping authortities

* version to 0.9.0

* minor fixes from code-review
2024-02-20 18:05:29 +00:00
Miguel Sozinho Ramalho e6b6b83007
0.8.0 new features and dependency updates (#119)
* wacz can extract_screenshot only

* new meta enricher

* twitter api can use multiple authentication tokens in sequence

* cleanup non-dup logic

* meta info on archive duration

* minor html report update

* updated dependencies

* new version
2023-12-20 14:13:22 +00:00
Miguel Sozinho Ramalho 3e56ef137d
reduce s3 duplicating while keeping random urls via hash (#112) 2023-12-12 19:12:03 +00:00
Kai 9eb39943c7
Extract text in wacz_enricher (#110) 2023-12-05 22:24:12 +00:00
msramalho b157f9a6b1 renaming variable 2023-09-15 19:52:47 +01:00
msramalho ea38a604bb fixes #96 by not assigning to self.prop 2023-09-15 19:35:35 +01:00
Kai f7839a99cc
Add configs for path to write and read wacz archives (#93)
Co-authored-by: msramalho <19508417+msramalho@users.noreply.github.com>
2023-09-14 17:49:37 +01:00
Miguel Sozinho Ramalho 3ae25e51e7
adds flexibile setup for wacz in docker (#94) 2023-09-12 20:07:21 +01:00
msramalho 0dd45d90f1 fix: docker+wacz troubles 2023-09-08 15:09:50 +01:00
msramalho aa71c85a98 improving ignored content from waczs 2023-07-28 12:19:14 +01:00
msramalho e3a0003a47 adding WACZ screenshots 2023-07-27 21:36:25 +01:00
msramalho dd034da844 feat: WACZ enricher can now be probed for media, and used as an archiver OR enricher 2023-07-27 15:42:10 +01:00
Logan Williams c47da0a46f Fix issue with profiles in browsertrix 2023-05-11 15:08:27 +02:00
Logan Williams ac82764ffc Working, but some cleanup still necessary 2023-05-09 19:34:16 +02:00
Logan Williams 0fae7d96fb Detect running in docker container in WACZ enricher 2023-05-09 19:34:16 +02:00
msramalho 906ed0f6e0 creating global context and refactoring tmp_dir logic 2023-03-23 11:17:38 +00:00
msramalho cd81cae559 auth wall for WACZ 2023-02-20 16:08:45 +00:00
msramalho e758bd076b test 2023-02-02 12:43:23 +00:00
msramalho 9bcca427a0 wacz in gsheets 2023-02-02 12:41:06 +00:00
msramalho d1e4dde3f6 fixing imports 2023-01-27 00:19:58 +00:00
msramalho 753039240f pyproject 2023-01-21 19:01:02 +00:00