Commit graph

37 Commits (main)

Author SHA1 Message Date
Miguel Sozinho Ramalho 7a21ae96af
V0.9.0 - closes several open issues: new enrichers and bug fixes (#133)
* clean orchestrator code, add archiver cleanup logic

* improves documentation for database.py

* telethon archivers isolate sessions into copied files

* closes #127

* closes #125

* closes #84

* meta enricher applies to all media

* closes #61 adds subtitles and comments

* minor update

* minor fixes to yt-dlp subtitles and comments

* closes #17 but logic is imperfect.

* closes #85 ssl enhancer

* minimifies html, JS refactor for preview of certificates

* closes #91 adds freetsa timestamp authority

* version bump

* simplify download_url method

* skip ssl if nothing archived

* html preview improvements

* adds retrying lib

* manual download archiver improvements

* meta only runs when relevant data available

* new metadata convenience method

* html template improvements

* removes debug message

* does not close #91 yet, will need a few more certificate chaing logging

* adds verbosity config

* new instagram api archiver

* adds proxy support we

* adds proxy/end support and bug fix for yt-dlp

* proxy support for webdriver

* adds socks proxy to wacz_enricher

* refactor recursivity in inner media and display

* infinite recursive display

* foolproofing timestamping authortities

* version to 0.9.0

* minor fixes from code-review
2024-02-20 18:05:29 +00:00
Kai 9eb39943c7 Extract text in wacz_enricher (#110) 2023-12-05 22:24:12 +00:00
msramalho 804fcb1204 browsertrix dependencies isolated into dockerfile 2023-08-24 16:57:58 +01:00
msramalho 92a0a92b47 closes #86 2023-08-24 12:43:28 +01:00
msramalho dd034da844 feat: WACZ enricher can now be probed for media, and used as an archiver OR enricher 2023-07-27 15:42:10 +01:00
msramalho 92569ae6be fix: telegram archiver was outdated for images 2023-07-11 12:15:56 +01:00
msramalho 485901da3c security update 2023-06-26 18:15:19 +01:00
msramalho d4f983e575 adds missing lib numpy 2023-06-26 16:55:19 +01:00
Emiel de Heij 3e340b2580 change to old status 2023-06-26 15:37:47 +02:00
Emiel de Heij f6e5a14d75 add dependencies 2023-06-26 15:24:55 +02:00
msramalho 987bbcaad0 removes conflicting unused dep 2023-05-19 11:49:29 +01:00
Logan Williams 2c5b115fbe Fix lock file issue 2023-05-09 19:34:16 +02:00
Logan Williams bda812f850 Clean up comments 2023-05-09 19:34:16 +02:00
Logan Williams ac82764ffc Working, but some cleanup still necessary 2023-05-09 19:34:16 +02:00
msramalho 7497bc08c0 Bump version to v0.4.2 for release 2023-02-23 17:14:29 +01:00
msramalho 7b9483bbf9 yt-dlp update 2023-02-22 18:28:20 +01:00
msramalho 753039240f pyproject 2023-01-21 19:01:02 +00:00
msramalho d4825196f1 html template working with jinja templates 2023-01-10 00:22:16 +00:00
msramalho b3860cfec1 telethon join channels working 2022-12-14 14:01:39 +00:00
msramalho 93be1af93f adds instagram post/profile 2022-10-18 15:45:10 +01:00
msramalho ffe1c425a0 new archiver, new hack, ready 2022-06-27 01:07:55 +02:00
msramalho 88ede91304 refactoring to use vk_url_scraper 2022-06-20 14:44:06 +02:00
msramalho 59afe7fd63 vk-archiver implemented 2022-06-15 16:38:18 +02:00
msramalho dc60bb1558 json -> yaml 2022-06-14 21:18:18 +02:00
msramalho 24544b0fe8 library updates 2022-06-07 17:28:47 +02:00
msramalho 5135e97d3f cleanup auto_archive and config 2022-06-03 18:03:49 +02:00
msramalho b58cbd2e85 package management 2022-05-25 12:19:29 +02:00
Dave Mateer b3599dee71 working 2022-05-11 14:01:22 +01:00
msramalho 0035603bfb telethon-poc 2022-03-15 18:45:53 +01:00
Logan Williams 1eb17e4de5 Add hash and screenshot methods; switch to more recent ytdl fork 2022-02-25 13:54:40 +01:00
msramalho f3ce226665 split into multiple files MVP 2022-02-21 14:19:09 +01:00
Logan Williams 009c0dd8ca Clean up dependencies 2022-02-20 11:06:47 +01:00
Logan Williams 51d448f0cb Refactor archivers to make it easier to add support for new types of URLs 2022-02-20 10:36:53 +01:00
Logan Williams ebafd1a744 Update Pipfile 2021-06-01 09:19:12 +00:00
Logan Williams 339f62fade Update auto archiver docs with new header declaration method 2021-05-12 09:01:45 +02:00
Logan Williams 9070689d95 Thumbnail and metadata extraction 2021-03-18 11:03:13 +01:00
Logan Williams d6cb20dace Combine streaming/non-streaming into one script with CLI arguments 2021-02-09 14:55:26 +01:00