msramalho
5324d562ba
cleanup wacz patch
2024-02-21 18:14:30 +00:00
msramalho
5bf0a0206d
version update
2024-02-21 17:26:07 +00:00
msramalho
4941823565
fix growing volume size in wacz_enricher
2024-02-21 17:25:55 +00:00
msramalho
27310c2911
fixes issue with api requests
2024-02-21 12:25:05 +00:00
msramalho
eb973ba42d
v0.9.1 fixes to bad parsing in ssl certificates
2024-02-20 19:31:19 +00:00
Miguel Sozinho Ramalho
7a21ae96af
V0.9.0 - closes several open issues: new enrichers and bug fixes ( #133 )
...
* clean orchestrator code, add archiver cleanup logic
* improves documentation for database.py
* telethon archivers isolate sessions into copied files
* closes #127
* closes #125
* closes #84
* meta enricher applies to all media
* closes #61 adds subtitles and comments
* minor update
* minor fixes to yt-dlp subtitles and comments
* closes #17 but logic is imperfect.
* closes #85 ssl enhancer
* minifies html, JS refactor for preview of certificates
* closes #91 adds freetsa timestamp authority
* version bump
* simplify download_url method
* skip ssl if nothing archived
* html preview improvements
* adds retrying lib
* manual download archiver improvements
* meta only runs when relevant data available
* new metadata convenience method
* html template improvements
* removes debug message
* does not close #91 yet, will need more certificate chain logging
* adds verbosity config
* new instagram api archiver
* adds proxy support we
* adds proxy/end support and bug fix for yt-dlp
* proxy support for webdriver
* adds socks proxy to wacz_enricher
* refactor recursivity in inner media and display
* infinite recursive display
* foolproofing timestamping authorities
* version to 0.9.0
* minor fixes from code-review
2024-02-20 18:05:29 +00:00
msramalho
5c49124ac6
Merge branch 'main' of https://github.com/bellingcat/auto-archiver
2024-02-13 15:44:53 +00:00
Kai
b9d71d0b3f
Change submit-archive from basic to bearer auth ( #128 )
2024-02-06 15:24:15 +00:00
msramalho
b9b831ce03
v8.0.1
2024-02-01 15:08:55 +00:00
msramalho
2a773a25e8
better handling of telethon data display
2024-02-01 15:08:23 +00:00
msramalho
719645fc2d
minor improvement to html_template
2024-02-01 15:03:00 +00:00
Chu-An, Huang
71fcf5a089
fix: Correct the path of service account in google drive settings ( #123 )
...
* fix: Correct the path of service account in yaml file
* fix: Remove redefined function
* Update src/auto_archiver/storages/gd.py
* fix: remove unwanted drafting code
---------
Co-authored-by: Miguel Sozinho Ramalho <19508417+msramalho@users.noreply.github.com>
2024-02-01 15:02:04 +00:00
Tomas Apodaca
590d3fe824
Fix typo in readme ( #121 )
2024-01-24 21:17:31 +00:00
Miguel Sozinho Ramalho
e6b6b83007
0.8.0 new features and dependency updates ( #119 )
...
* wacz can extract_screenshot only
* new meta enricher
* twitter api can use multiple authentication tokens in sequence
* cleanup non-dup logic
* meta info on archive duration
* minor html report update
* updated dependencies
* new version
2023-12-20 14:13:22 +00:00
msramalho
499832d146
fix datetime parsing
2023-12-13 18:41:48 +00:00
msramalho
fa1163532b
patching now optional value
2023-12-13 13:55:31 +00:00
msramalho
96f6ea8f09
v0.7.8
2023-12-13 13:03:39 +00:00
Miguel Sozinho Ramalho
ff17dfd0aa
enables option to toggle db api writes ( #118 )
2023-12-13 12:54:47 +00:00
msramalho
0a3053bbc7
version update
2023-12-13 11:29:13 +00:00
Miguel Sozinho Ramalho
e69660be82
chooses most complete result from api ( #117 )
2023-12-13 11:28:27 +00:00
Miguel Sozinho Ramalho
a786d4bb0e
chooses most complete result from api ( #116 )
2023-12-13 11:26:46 +00:00
Miguel Sozinho Ramalho
128d4136e3
fixes empty api search results ( #115 )
2023-12-13 10:51:25 +00:00
Miguel Sozinho Ramalho
98fb574d89
fixing older db entries formats ( #114 )
2023-12-12 22:47:54 +00:00
Miguel Sozinho Ramalho
6f36e92e02
enables api_db cache queries if configured with new option ( #113 )
2023-12-12 19:20:26 +00:00
Miguel Sozinho Ramalho
3e56ef137d
reduce s3 duplicating while keeping random urls via hash ( #112 )
2023-12-12 19:12:03 +00:00
Jett Chen
9ee323a654
Set _mimetype for final media of html formatter ( #111 )
2023-12-11 11:47:04 +00:00
Kai
9eb39943c7
Extract text in wacz_enricher ( #110 )
2023-12-05 22:24:12 +00:00
msramalho
8624e9f177
version update 0.7.1
2023-11-13 11:58:43 +01:00
Galen Reich
381940f5a8
Fix Selenium headless invocation ( #106 )
...
Co-authored-by: msramalho <19508417+msramalho@users.noreply.github.com>
2023-11-13 11:56:35 +01:00
msramalho
1382f8b795
version bump and release without commit
2023-09-22 10:18:58 +01:00
Dave Mateer
fac8364762
Updated gd.py to work with shared folders ( #102 )
...
Co-authored-by: msramalho <19508417+msramalho@users.noreply.github.com>
2023-09-22 10:17:54 +01:00
msramalho
0feeb0bd24
Bump version to v0.6.12 for release
2023-09-20 10:18:44 +01:00
msramalho
ddb9dc87d7
unfortunately needed twitter->x
2023-09-20 10:17:31 +01:00
msramalho
e8935b9a80
Bump version to v0.6.11 for release
2023-09-15 19:53:07 +01:00
msramalho
b157f9a6b1
renaming variable
2023-09-15 19:52:47 +01:00
msramalho
ea38a604bb
fixes #96 by not assigning to self.prop
2023-09-15 19:35:35 +01:00
msramalho
53494c961e
Bump version to v0.6.10 for release
2023-09-14 17:50:08 +01:00
Kai
f7839a99cc
Add configs for path to write and read wacz archives ( #93 )
...
Co-authored-by: msramalho <19508417+msramalho@users.noreply.github.com>
2023-09-14 17:49:37 +01:00
msramalho
7a2119e6e9
Bump version to v0.6.9 for release
2023-09-12 20:08:00 +01:00
Miguel Sozinho Ramalho
3ae25e51e7
adds flexible setup for wacz in docker ( #94 )
2023-09-12 20:07:21 +01:00
msramalho
9584193d69
Bump version to v0.6.8 for release
2023-09-08 15:10:02 +01:00
msramalho
0dd45d90f1
fix: docker+wacz troubles
2023-09-08 15:09:50 +01:00
msramalho
edcb2da74a
Bump version to v0.6.7 for release
2023-09-06 17:07:14 +01:00
msramalho
17d9bf694f
fix docker image so as not to remove browsertrix files
2023-09-06 17:07:10 +01:00
Miguel Sozinho Ramalho
368395ffa8
Merge pull request #88 from djhmateer/v6-test
2023-08-28 11:09:28 +01:00
Miguel Sozinho Ramalho
21d7d2e16c
format youtubedl_archiver.py
2023-08-28 11:09:03 +01:00
Dave Mateer
0bbb4c9b08
Added noplaylist true to youtubedl so that videos in playlists will work
2023-08-27 17:26:36 +01:00
msramalho
a30607801f
Bump version to v0.6.6 for release
2023-08-24 17:10:16 +01:00
Miguel Sozinho Ramalho
c75d54a4ec
Merge pull request #87 from bellingcat/fix-wacz
2023-08-24 17:09:49 +01:00
msramalho
804fcb1204
browsertrix dependencies isolated into dockerfile
2023-08-24 16:57:58 +01:00