dgtlmoon
|
4ae27af511
|
Code cleanup - Browser Steps
|
2023-10-28 14:58:12 +02:00 |
dgtlmoon
|
e1860549dc
|
Fetching - Browser Step enabled watches should also identify 404/non-200 status situations (#1907)
|
2023-10-28 14:37:42 +02:00 |
dgtlmoon
|
349111eb35
|
Fetching/BrowserSteps - Going to a page was using slightly different logic from the main way - make them use the same methods (#1890)
|
2023-10-26 20:19:22 +02:00 |
Marcelo Alencar
|
0aef5483d9
|
Upgrade selenium to 4.14.0 (latest) (#1783)
|
2023-10-26 10:09:03 +02:00 |
dgtlmoon
|
7debccca73
|
Fetching - Clarifying how fetchers work with SOCKS5 proxies
|
2023-10-09 16:57:30 +02:00 |
dgtlmoon
|
e30b17b8bc
|
UI + Fetching - Be more helpful when a filter contains no text, suggest ways to deal with images in filters (#1819)
|
2023-09-26 13:59:59 +02:00 |
dgtlmoon
|
57de4ffe4f
|
Page fetching - Fixed possible incorrect browser user-agent header in playwright/puppeteer/browserless fetchers (#1811)
|
2023-09-24 08:42:24 +02:00 |
dgtlmoon
|
7cb7eebbc5
|
Browser Steps - When cleaning up old screenshots, check the file exists
|
2023-07-11 10:44:54 +02:00 |
dgtlmoon
|
f9387522ee
|
Fetching - Be sure that content-type detection works when the headers are a mixed case (#1604)
|
2023-05-29 16:11:43 +02:00 |
dgtlmoon
|
1aeafef910
|
Fetcher - Puppeteer experimental fetcher wasn't returning the status-code (#1585)
|
2023-05-21 23:10:08 +02:00 |
dgtlmoon
|
e4f6d54ae2
|
BrowserSteps - Refactored to re-use playwright context which should solve some errors
|
2023-05-12 15:38:55 +02:00 |
dgtlmoon
|
d939882dde
|
Fetcher - Experimental fetcher improvements (Code TidyUp, Improve tests, revert to old playwright when using BrowserSteps for now) (#1564)
|
2023-05-11 16:36:35 +02:00 |
dgtlmoon
|
5325918f29
|
Puppeteer fetcher, adding disk cache and other fixes (#1563)
|
2023-05-10 23:23:34 +02:00 |
dgtlmoon
|
316f28a0f2
|
Fetcher - Experimental fetcher fixes, now only enabled with 'USE_EXPERIMENTAL_PUPPETEER_FETCH' env var (default off) (#1561)
|
2023-05-07 13:49:53 +02:00 |
dgtlmoon
|
94f38f052e
|
Fetcher - playwright/browserless - Use builtin node puppeteer handler in browserless, scales way better, and is faster (#1559)
|
2023-05-05 21:58:08 +02:00 |
dgtlmoon
|
6e71088cde
|
New feature - Restock / stock / out of stock monitor option/mode
|
2023-03-18 20:36:26 +01:00 |
dgtlmoon
|
41856c4ed8
|
Re #1365 - Playwright - Browser "Service Workers" should be enabled by default but unset via env var PLAYWRIGHT_SERVICE_WORKERS=block (#1367)
|
2023-02-01 20:50:40 +01:00 |
dgtlmoon
|
d47a25eb6d
|
Playwright - Removing old bug fix where playwright needed screenshot called twice to make the full screen screenshot be actually fullscreen (#1356)
|
2023-01-28 15:02:53 +01:00 |
dgtlmoon
|
fcfd1b5e10
|
Ability to configure extra proxies via the UI (#1235)
|
2022-12-19 21:48:01 +01:00 |
dgtlmoon
|
13c4121f52
|
PDF File change detection - Initial PDF fetcher support with basic text extraction (#1244)
|
2022-12-19 17:51:41 +01:00 |
dgtlmoon
|
0c380c170f
|
Playwright - Better error reporting and re-try fetch on fail once (#1238)
|
2022-12-16 18:06:14 +01:00 |
dgtlmoon
|
b76148a0f4
|
Fetcher - CPU usage - Skip processing if the previous checksum and the just-fetched one were the same (#925)
|
2022-12-14 15:08:34 +01:00 |
dgtlmoon
|
93cc30437f
|
Playwright+BrowserSteps - Fetch changes - Fetch simply after page starts rendering + delay seconds, disable service workers
|
2022-12-14 12:16:04 +01:00 |
dgtlmoon
|
69756f20f2
|
VisualSelector & BrowserSteps - Scraper improvements, remove duplicate code
|
2022-11-25 10:45:38 +01:00 |
dgtlmoon
|
fde7b3fd97
|
Remove dupe xpath finder prep code
|
2022-11-25 09:25:05 +01:00 |
dgtlmoon
|
5b530ff61c
|
Configurable "Browser Steps" when Playwright/Chrome is configured (enter text, scroll, wait for text, click button etc) (#478)
|
2022-11-24 20:53:01 +01:00 |
dgtlmoon
|
df6e835035
|
Make VisualSelector show first available multiple selector, refactor to make more maintainable (#1132)
|
2022-11-17 11:52:48 +01:00 |
dgtlmoon
|
359fc48fb4
|
Filters can now accept a list/multiple filters (#1064) #623
|
2022-11-03 12:13:54 +01:00 |
dgtlmoon
|
669fd3ae0b
|
Don't use default Requests `user-agent` and `accept` headers in playwright+selenium requests, as they break sites such as united.com. (#1004)
|
2022-10-09 18:25:36 +02:00 |
dgtlmoon
|
3ebb2ab9ba
|
Selenium fetcher - screenshot should be taken after 'wait' time, not before #873
|
2022-09-25 11:05:07 +02:00 |
dgtlmoon
|
3705ce6681
|
Render Extract Configurable Delay Seconds should also apply after executing any JS #958
|
2022-09-24 23:48:03 +02:00 |
dgtlmoon
|
f7ea99412f
|
Re #958 - remove change screensize, should be in 1280x720 default, was causing "Unable to retrieve content because the page is navigating and changing the content." on some sites
|
2022-09-19 14:02:32 +02:00 |
dgtlmoon
|
1193a7f22c
|
Playwright - Support proxy auth mechanisms (#859)
|
2022-08-18 09:46:28 +02:00 |
dgtlmoon
|
e461c0b819
|
Playwright fetcher didn't report low level HTTP errors correctly (like Connection Refused) (#852)
|
2022-08-17 13:25:08 +02:00 |
dgtlmoon
|
9942107016
|
Massive improvements to error handling - show separate output for non HTTP 200 status replies
|
2022-08-15 18:56:53 +02:00 |
dgtlmoon
|
1eb5726cbf
|
Execute JS should happen after waiting seconds
|
2022-08-15 11:27:04 +02:00 |
dgtlmoon
|
e6173357a9
|
Visual Selector direct element finder fix
|
2022-07-28 09:19:10 +02:00 |
dgtlmoon
|
fae1164c0b
|
Ability to specify JS before running change-detection (#744)
|
2022-07-10 13:56:01 +02:00 |
dgtlmoon
|
169c293143
|
Playwright - log console errors to output
|
2022-07-10 13:55:29 +02:00 |
dgtlmoon
|
6553980cd5
|
Playwright - Use HTTP Request Headers override (Cookie, etc)
|
2022-06-25 23:42:48 +02:00 |
dgtlmoon
|
4a91505af5
|
Playwright screenshots - no need for high-res "bug workaround" screenshot, use lower quality/faster configurable image quality env var
|
2022-06-15 10:52:24 +02:00 |
dgtlmoon
|
82b900fbf4
|
Give more helpful error message when a page doesn't load
|
2022-06-14 08:16:22 +02:00 |
dgtlmoon
|
358a365303
|
Tweaks to playwright fetch code - better timeout handling
|
2022-06-13 23:39:43 +02:00 |
dgtlmoon
|
8294519f43
|
Content fetcher - Handle when a page doesn't load properly
|
2022-06-01 13:12:37 +02:00 |
dgtlmoon
|
8ba8a220b6
|
Playwright - Correctly close browser context/sessions on exceptions
|
2022-06-01 12:59:44 +02:00 |
dgtlmoon
|
5cefb16e52
|
Minor code cleanup
|
2022-05-25 15:38:40 +02:00 |
dgtlmoon
|
341ae24b73
|
Re #616 - content trigger - adding extra test (#620)
|
2022-05-25 15:31:59 +02:00 |
dgtlmoon
|
9d742446ab
|
Playwright - Bypass CSP for more reliable JS scraping, disable accept downloads
|
2022-05-25 11:05:18 +02:00 |
dgtlmoon
|
e3e022b0f4
|
VisualSelector - Better handling of filter targets that are no longer available in the HTML
|
2022-05-25 10:23:43 +02:00 |
dgtlmoon
|
7983675325
|
Visual Selector - be more resilient when sites interfere with the XPath scraping
|
2022-05-24 00:10:38 +02:00 |