705 Commits (dev)

Author SHA1 Message Date
msramalho c1506ee1cf
some wayback errors are expected and should be warnings 2025-07-05 18:31:39 +01:00
msramalho 3a34a49822
adds antibot tiktok logic for photos closes #295 2025-07-05 18:31:12 +01:00
msramalho 37c6d97275
new auth wall check logic and escaped CSS selector in selenium 2025-07-05 18:30:31 +01:00
msramalho 7234eda85f
expands Sheets API retries for really large spreadsheets 2025-07-05 18:29:33 +01:00
msramalho a8c1ef3912
generic_extractor config to use proxy only when needed to avoid overzealousness 2025-07-05 16:54:58 +01:00
msramalho 2051e8e491
adds further exponential backoff for Sheets API worksheet enumeration 2025-07-05 16:02:07 +01:00
msramalho 21255db86a
stops using service that is not up for timestamping 2025-07-05 16:00:46 +01:00
msramalho eae0da08b3
fix issue with two runs of anitbot extractor 2025-07-05 16:00:03 +01:00
msramalho 649412053e
exclude non-ready code 2025-06-30 02:27:21 +01:00
msramalho b2648fa3cd
follow docs advice on exponential backoff of SheetsAPI 2025-06-30 01:47:12 +01:00
msramalho 4ad71b3589
adds retry to worksheet read for slow worksheets 2025-06-30 01:42:34 +01:00
msramalho 7c9475cde2
allow for human readable console logs, but defaults to JSON on file logs. 2025-06-30 00:53:10 +01:00
msramalho afd9090a4c
concludes logging standardization refactor 2025-06-26 17:20:04 +01:00
msramalho ad29cb4447
adds post_data to metadata for instagram 2025-06-26 15:48:10 +01:00
msramalho ce4d7ac649
WIP refactor logging 2025-06-21 15:54:51 +01:00
msramalho 12b457706b
closes #166 adds story URL feature to telethon extractor 2025-06-18 17:37:44 +01:00
msramalho 592dc30415
closes #330 2025-06-18 16:40:55 +01:00
msramalho d46eeee9b6
docs improved 2025-06-18 13:35:51 +01:00
msramalho 302e6f4258
logs improved 2025-06-18 13:35:43 +01:00
msramalho 76fd329fe5
twitter tests fix 2025-06-17 23:51:03 +01:00
msramalho a3ae9ebbb3
log level updates 2025-06-17 20:36:33 +01:00
msramalho 23b781c866
new check for edge case 2025-06-17 20:36:22 +01:00
msramalho 2aec240128
thumbnail enricher always run probe by default 2025-06-17 20:28:20 +01:00
msramalho c5a2fd45f9
log levels updated 2025-06-17 20:04:40 +01:00
msramalho ad168785e7
retry for Google API 503s 2025-06-17 19:22:09 +01:00
msramalho 74a1561c3d
logging and clean up 2025-06-17 19:21:40 +01:00
msramalho 55d9ffaacd
typo 2025-06-17 18:51:21 +01:00
msramalho f19fb575a7
logging updates 2025-06-17 18:50:54 +01:00
msramalho f53b2075ba
fixes gdrive error 2025-06-17 18:45:55 +01:00
msramalho 6085a66c58
revert metadata json renaming 2025-06-17 16:10:24 +01:00
msramalho 33cca734d9
original_url changes still constitute empty result 2025-06-17 16:06:25 +01:00
msramalho 2f1a07abbf
renaming and code improvements to json_e richer 2025-06-17 16:06:04 +01:00
msramalho 664ee8d037
fixes bugs and limited configuration of multi-level logs 2025-06-17 14:10:46 +01:00
msramalho 1b260788de
do not add commit comments to code 2025-06-17 13:18:12 +01:00
Dave Mateer b3adc5603a
metadata.json hardcode in storage. add new metadata_json_enricher. log level change in orchestrator 2025-06-17 09:51:19 +01:00
Dave Mateer ba3f1a52e8
Logging each_level_in_separate_file feature 2025-06-16 16:15:54 +01:00
Dave Mateer a60d800b31
Changed log level for media 2025-06-16 15:07:39 +01:00
msramalho dfb361e3a0
reset generic_extractor description in result 2025-06-11 19:55:54 +01:00
msramalho aaa9ead39d
adds documentation for dropins 2025-06-11 17:58:53 +01:00
msramalho 2adcf231f7
new LinkedIn Dropin for Antibot 2025-06-11 16:51:52 +01:00
msramalho cd19181d8f
minor improvements 2025-06-11 16:51:42 +01:00
msramalho b60469767a
more flexibility to antibot dropins media finding process 2025-06-11 16:51:22 +01:00
msramalho d60d02c16e
improves download_from_url 2025-06-11 16:50:31 +01:00
msramalho e567bba6f9
improves docs for how-to and migrations 2025-06-11 13:37:03 +01:00
msramalho 3cf51dd874
adds tracker remove feature and tests 2025-06-11 11:56:42 +01:00
msramalho 1039e9631f
new reddit tests with .env.test 2025-06-11 11:22:23 +01:00
msramalho 8314833ae8
removes exclude_media_extensions option 2025-06-10 18:34:33 +01:00
msramalho fc89d96517
escape sequence 2025-06-10 18:04:33 +01:00
msramalho 54fda9cad4
antibot in docker uses a different user_data_dir 2025-06-10 18:04:27 +01:00
msramalho 71636233cb
adds migration information and VkDropin info. 2025-06-10 17:07:10 +01:00