Groups - Including "Extract text", "Text to ignore", "Trigger text" and "Text that should not be present" filters

2025-04-04 11:16:12 +02:00 · 2025-04-04 11:16:12 +02:00 · 78b7aee512
commit 78b7aee512
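
The processor changes in this commit combine each watch's own filter lists ("Extract text", "Text to ignore", "Trigger text", "Text that should not be present") with the lists contributed by the watch's tags. The following is a minimal sketch of that kind of merge, not the project's `get_tag_overrides_for_watch` implementation. The function name `merge_filter_lists`, the plain dict standing in for a watch, and the sample rules are all hypothetical. It builds a new list and drops repeated rules, so the list stored on the watch is left untouched.

```python
def merge_filter_lists(watch_rules, tag_rules):
    """Return the watch rules followed by the tag rules, without duplicates.

    A new list is returned; neither input list is modified.
    """
    # dict.fromkeys keeps first-seen order while dropping repeated rules
    return list(dict.fromkeys(list(watch_rules) + list(tag_rules)))


# Hypothetical stand-ins for a watch and for the rules coming from its tags
watch = {"uuid": "example-uuid", "trigger_text": ["in stock"]}
tag_trigger_text = ["/price: [0-9]+/", "in stock"]

merged = merge_filter_lists(watch.get("trigger_text", []), tag_trigger_text)
print(merged)                 # ['in stock', '/price: [0-9]+/']
print(watch["trigger_text"])  # ['in stock']
```

Returning a fresh list matters when the watch object is persisted: extending the stored list in place would copy the tag rules into the watch's own configuration every time a check runs.
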
--- a/changedetectionio/blueprint/tags/templates/edit-tag.html
+++ b/changedetectionio/blueprint/tags/templates/edit-tag.html
@@ -13,6 +13,7 @@
    /*const email_notification_prefix=JSON.parse('{{ emailprefix|tojson }}');*/
 /*{% endif %}*/

+{% set has_tag_filters_extra='' %}

 </script>

@@ -46,59 +47,12 @@
            </div>

            <div class="tab-pane-inner" id="filters-and-triggers">
-                    <div class="pure-control-group">
-                        {% set field = render_field(form.include_filters,
-                            rows=5,
-                            placeholder="#example
-xpath://body/div/span[contains(@class, 'example-class')]",
-                            class="m-d")
-                        %}
-                        {{ field }}
-                        {% if '/text()' in  field %}
-                          <span class="pure-form-message-inline"><strong>Note!: //text() function does not work where the &lt;element&gt; contains &lt;![CDATA[]]&gt;</strong></span><br>
-                        {% endif %}
-                        <span class="pure-form-message-inline">One CSS, xPath, JSON Path/JQ selector per line, <i>any</i> rules that matches will be used.<br>
-                    <div data-target="#advanced-help-selectors" class="toggle-show pure-button button-tag button-xsmall">Show advanced help and tips</div>
-                    <ul id="advanced-help-selectors">
-                        <li>CSS - Limit text to this CSS rule, only text matching this CSS rule is included.</li>
-                        <li>JSON - Limit text to this JSON rule, using either <a href="https://pypi.org/project/jsonpath-ng/" target="new">JSONPath</a> or <a href="https://stedolan.github.io/jq/" target="new">jq</a> (if installed).
-                            <ul>
-                                <li>JSONPath: Prefix with <code>json:</code>, use <code>json:$</code> to force re-formatting if required,  <a href="https://jsonpath.com/" target="new">test your JSONPath here</a>.</li>
-                                {% if jq_support %}
-                                <li>jq: Prefix with <code>jq:</code> and <a href="https://jqplay.org/" target="new">test your jq here</a>. Using <a href="https://stedolan.github.io/jq/" target="new">jq</a> allows for complex filtering and processing of JSON data with built-in functions, regex, filtering, and more. See examples and documentation <a href="https://stedolan.github.io/jq/manual/" target="new">here</a>. Prefix <code>jqraw:</code> outputs the results as text instead of a JSON list.</li>
-                                {% else %}
-                                <li>jq support not installed</li>
-                                {% endif %}
-                            </ul>
-                        </li>
-                        <li>XPath - Limit text to this XPath rule, simply start with a forward-slash. To specify XPath to be used explicitly or the XPath rule starts with an XPath function: Prefix with <code>xpath:</code>
-                            <ul>
-                                <li>Example:  <code>//*[contains(@class, 'sametext')]</code> or <code>xpath:count(//*[contains(@class, 'sametext')])</code>, <a
-                                href="http://xpather.com/" target="new">test your XPath here</a></li>
-                                <li>Example: Get all titles from an RSS feed <code>//title/text()</code></li>
-                                <li>To use XPath1.0: Prefix with <code>xpath1:</code></li>
-                            </ul>
-                            </li>
-                    </ul>
-                    Please be sure that you thoroughly understand how to write CSS, JSONPath, XPath{% if jq_support %}, or jq selector{%endif%} rules before filing an issue on GitHub! <a
-                                href="https://github.com/dgtlmoon/changedetection.io/wiki/CSS-Selector-help">here for more CSS selector help</a>.<br>
-                </span>
-                    </div>
-                <fieldset class="pure-control-group">
-                    {{ render_field(form.subtractive_selectors, rows=5, placeholder="header
-footer
-nav
-.stockticker
-//*[contains(text(), 'Advertisement')]") }}
-                    <span class="pure-form-message-inline">
-                        <ul>
-                          <li> Remove HTML element(s) by CSS and XPath selectors before text conversion. </li>
-                          <li> Don't paste HTML here, use only CSS and XPath selectors </li>
-                          <li> Add multiple elements, CSS or XPath selectors per line to ignore multiple parts of the HTML. </li>
-                        </ul>
-                      </span>
-                </fieldset>
-
+                <p>These settings are <strong><i>added</i></strong> to any existing watch configurations.</p>
+                {% include "edit/include_subtract.html" %}
+                <div class="text-filtering border-fieldset">
+                    <h3>Text filtering</h3>
+                    {% include "edit/text-options.html" %}
+                </div>
            </div>

        {# rendered sub Template #}
--- a/changedetectionio/processors/text_json_diff/processor.py
+++ b/changedetectionio/processors/text_json_diff/processor.py
@@ -252,6 +252,7 @@ class perform_site_check(difference_detection_processor):

        # 615 Extract text by regex
        extract_text = watch.get('extract_text', [])
+        extract_text = extract_text + list(self.datastore.get_tag_overrides_for_watch(uuid=watch.get('uuid'), attr='extract_text'))  # new list, so the watch's stored list is not mutated
        if len(extract_text) > 0:
            regex_matched_output = []
            for s_re in extract_text:
@@ -296,6 +297,8 @@ class perform_site_check(difference_detection_processor):
 ### CALCULATE MD5
        # If there's text to ignore
        text_to_ignore = watch.get('ignore_text', []) + self.datastore.data['settings']['application'].get('global_ignore_text', [])
+        text_to_ignore += self.datastore.get_tag_overrides_for_watch(uuid=watch.get('uuid'), attr='ignore_text')
+
        text_for_checksuming = stripped_text_from_html
        if text_to_ignore:
            text_for_checksuming = html_tools.strip_ignore_text(stripped_text_from_html, text_to_ignore)
@@ -308,8 +311,8 @@ class perform_site_check(difference_detection_processor):

        ############ Blocking rules, after checksum #################
        blocked = False
-
        trigger_text = watch.get('trigger_text', [])
+        trigger_text = trigger_text + list(self.datastore.get_tag_overrides_for_watch(uuid=watch.get('uuid'), attr='trigger_text'))  # new list, so the watch's stored list is not mutated
        if len(trigger_text):
            # Assume blocked
            blocked = True
@@ -324,6 +327,7 @@ class perform_site_check(difference_detection_processor):
                blocked = False

        text_should_not_be_present = watch.get('text_should_not_be_present', [])
+        text_should_not_be_present = text_should_not_be_present + list(self.datastore.get_tag_overrides_for_watch(uuid=watch.get('uuid'), attr='text_should_not_be_present'))  # new list, so the watch's stored list is not mutated
        if len(text_should_not_be_present):
            # If anything matched, then we should block a change from happening
            result = html_tools.strip_ignore_text(content=str(stripped_text_from_html),
--- a/changedetectionio/templates/edit.html
+++ b/changedetectionio/templates/edit.html
@@ -314,61 +314,8 @@ Math: {{ 1 + 1 }}") }}
                                </li>
                            </ul>
                    </div>
-                    <div class="pure-control-group">
-                        {% set field = render_field(form.include_filters,
-                            rows=5,
-                            placeholder=has_tag_filters_extra+"#example
-xpath://body/div/span[contains(@class, 'example-class')]",
-                            class="m-d")
-                        %}
-                        {{ field }}
-                        {% if '/text()' in  field %}
-                          <span class="pure-form-message-inline"><strong>Note!: //text() function does not work where the &lt;element&gt; contains &lt;![CDATA[]]&gt;</strong></span><br>
-                        {% endif %}
-                        <span class="pure-form-message-inline">One CSS, xPath 1 &amp; 2, JSON Path/JQ selector per line, <i>any</i> rules that matches will be used.<br>
-                        <span data-target="#advanced-help-selectors" class="toggle-show pure-button button-tag button-xsmall">Show advanced help and tips</span><br>
-                    <ul id="advanced-help-selectors" style="display: none;">
-                        <li>CSS - Limit text to this CSS rule, only text matching this CSS rule is included.</li>
-                        <li>JSON - Limit text to this JSON rule, using either <a href="https://pypi.org/project/jsonpath-ng/" target="new">JSONPath</a> or <a href="https://stedolan.github.io/jq/" target="new">jq</a> (if installed).
-                            <ul>
-                                <li>JSONPath: Prefix with <code>json:</code>, use <code>json:$</code> to force re-formatting if required,  <a href="https://jsonpath.com/" target="new">test your JSONPath here</a>.</li>
-                                {% if jq_support %}
-                                <li>jq: Prefix with <code>jq:</code> and <a href="https://jqplay.org/" target="new">test your jq here</a>. Using <a href="https://stedolan.github.io/jq/" target="new">jq</a> allows for complex filtering and processing of JSON data with built-in functions, regex, filtering, and more. See examples and documentation <a href="https://stedolan.github.io/jq/manual/" target="new">here</a>. Prefix <code>jqraw:</code> outputs the results as text instead of a JSON list.</li>
-                                {% else %}
-                                <li>jq support not installed</li>
-                                {% endif %}
-                            </ul>
-                        </li>
-                        <li>XPath - Limit text to this XPath rule, simply start with a forward-slash. To specify XPath to be used explicitly or the XPath rule starts with an XPath function: Prefix with <code>xpath:</code>
-                            <ul>
-                                <li>Example:  <code>//*[contains(@class, 'sametext')]</code> or <code>xpath:count(//*[contains(@class, 'sametext')])</code>, <a
-                                href="http://xpather.com/" target="new">test your XPath here</a></li>
-                                <li>Example: Get all titles from an RSS feed <code>//title/text()</code></li>
-                                <li>To use XPath1.0: Prefix with <code>xpath1:</code></li>
-                            </ul>
-                            </li>
-                    <li>
-                        Please be sure that you thoroughly understand how to write CSS, JSONPath, XPath{% if jq_support %}, or jq selector{%endif%} rules before filing an issue on GitHub! <a
-                                href="https://github.com/dgtlmoon/changedetection.io/wiki/CSS-Selector-help">here for more CSS selector help</a>.<br>
-                    </li>
-                    </ul>

-                </span>
-                    </div>
-                <fieldset class="pure-control-group">
-                    {{ render_field(form.subtractive_selectors, rows=5, placeholder=has_tag_filters_extra+"header
-footer
-nav
-.stockticker
-//*[contains(text(), 'Advertisement')]") }}
-                    <span class="pure-form-message-inline">
-                        <ul>
-                          <li> Remove HTML element(s) by CSS and XPath selectors before text conversion. </li>
-                          <li> Don't paste HTML here, use only CSS and XPath selectors </li>
-                          <li> Add multiple elements, CSS or XPath selectors per line to ignore multiple parts of the HTML. </li>
-                        </ul>
-                      </span>
-                </fieldset>
+{% include "edit/include_subtract.html" %}
                <div class="text-filtering border-fieldset">
                <fieldset class="pure-group" id="text-filtering-type-options">
                    <h3>Text filtering</h3>
@@ -396,76 +343,9 @@ nav
                    {{ render_checkbox_field(form.trim_text_whitespace) }}
                    <span class="pure-form-message-inline">Remove any whitespace before and after each line of text</span>
                </fieldset>
-                <fieldset>
-                    <div class="pure-control-group">
-                        {{ render_field(form.trigger_text, rows=5, placeholder="Some text to wait for in a line
-/some.regex\d{2}/ for case-INsensitive regex
-") }}
-                        <span class="pure-form-message-inline">
-                    <ul>
-                        <li>Text to wait for before triggering a change/notification, all text and regex are tested <i>case-insensitive</i>.</li>
-                        <li>Trigger text is processed from the result-text that comes out of any CSS/JSON Filters for this watch</li>
-                        <li>Each line is processed separately (think of each line as "OR")</li>
-                        <li>Note: Wrap in forward slash / to use regex  example: <code>/foo\d/</code></li>
-                    </ul>
-                        </span>
-                    </div>
-                </fieldset>
-                <fieldset class="pure-group">
-                    {{ render_field(form.ignore_text, rows=5, placeholder="Some text to ignore in a line
-/some.regex\d{2}/ for case-INsensitive regex
-") }}
-                    <span class="pure-form-message-inline">
-                        <ul>
-                            <li>Matching text will be <strong>ignored</strong> in the text snapshot (you can still see it but it wont trigger a change)</li>
-                            <li>Each line processed separately, any line matching will be ignored (removed before creating the checksum)</li>
-                            <li>Regular Expression support, wrap the entire line in forward slash <code>/regex/</code></li>
-                            <li>Changing this will affect the comparison checksum which may trigger an alert</li>
-                        </ul>
-                </span>
-
-                </fieldset>
-
-                <fieldset>
-                    <div class="pure-control-group">
-                        {{ render_field(form.text_should_not_be_present, rows=5, placeholder="For example: Out of stock
-Sold out
-Not in stock
-Unavailable") }}
-                        <span class="pure-form-message-inline">
-                            <ul>
-                                <li>Block change-detection while this text is on the page, all text and regex are tested <i>case-insensitive</i>, good for waiting for when a product is available again</li>
-                                <li>Block text is processed from the result-text that comes out of any CSS/JSON Filters for this watch</li>
-                                <li>All lines here must not exist (think of each line as "OR")</li>
-                                <li>Note: Wrap in forward slash / to use regex  example: <code>/foo\d/</code></li>
-                            </ul>
-                        </span>
-                    </div>
-                </fieldset>
-                <fieldset>
-                    <div class="pure-control-group">
-                        {{ render_field(form.extract_text, rows=5, placeholder="/.+?\d+ comments.+?/
- or
-keyword") }}
-                        <span class="pure-form-message-inline">
-                    <ul>
-                        <li>Extracts text in the final output (line by line) after other filters using regular expressions or string match;
-                            <ul>
-                                <li>Regular expression &dash; example <code>/reports.+?2022/i</code></li>
-                                <li>Don't forget to consider the white-space at the start of a line <code>/.+?reports.+?2022/i</code></li>
-                                <li>Use <code>//(?aiLmsux))</code> type flags (more <a href="https://docs.python.org/3/library/re.html#index-15">information here</a>)<br></li>
-                                <li>Keyword example &dash; example <code>Out of stock</code></li>
-                                <li>Use groups to extract just that text &dash; example <code>/reports.+?(\d+)/i</code> returns a list of years only</li>
-                                <li>Example - match lines containing a keyword <code>/.*icecream.*/</code></li>
-                            </ul>
-                        </li>
-                        <li>One line per regular-expression/string match</li>
-                    </ul>
-                        </span>
-                    </div>
-                </fieldset>
+                {% include "edit/text-options.html" %}
                </div>
-            </div>
+              </div>
              <div id="text-preview" style="display: none;" >
                    <script>
                        const preview_text_edit_filters_url="{{url_for('ui.ui_edit.watch_get_preview_rendered', uuid=uuid)}}";
--- a/changedetectionio/templates/edit/include_subtract.html
+++ b/changedetectionio/templates/edit/include_subtract.html
@@ -0,0 +1,55 @@
+                    <div class="pure-control-group">
+                        {% set field = render_field(form.include_filters,
+                            rows=5,
+                            placeholder=has_tag_filters_extra+"#example
+xpath://body/div/span[contains(@class, 'example-class')]",
+                            class="m-d")
+                        %}
+                        {{ field }}
+                        {% if '/text()' in  field %}
+                          <span class="pure-form-message-inline"><strong>Note!: //text() function does not work where the &lt;element&gt; contains &lt;![CDATA[]]&gt;</strong></span><br>
+                        {% endif %}
+                        <span class="pure-form-message-inline">One CSS, XPath 1 &amp; 2, JSONPath/jq selector per line, <i>any</i> rule that matches will be used.<br>
+                        <span data-target="#advanced-help-selectors" class="toggle-show pure-button button-tag button-xsmall">Show advanced help and tips</span><br>
+                    <ul id="advanced-help-selectors" style="display: none;">
+                        <li>CSS - Limit text to this CSS rule, only text matching this CSS rule is included.</li>
+                        <li>JSON - Limit text to this JSON rule, using either <a href="https://pypi.org/project/jsonpath-ng/" target="new">JSONPath</a> or <a href="https://stedolan.github.io/jq/" target="new">jq</a> (if installed).
+                            <ul>
+                                <li>JSONPath: Prefix with <code>json:</code>, use <code>json:$</code> to force re-formatting if required,  <a href="https://jsonpath.com/" target="new">test your JSONPath here</a>.</li>
+                                {% if jq_support %}
+                                <li>jq: Prefix with <code>jq:</code> and <a href="https://jqplay.org/" target="new">test your jq here</a>. Using <a href="https://stedolan.github.io/jq/" target="new">jq</a> allows for complex filtering and processing of JSON data with built-in functions, regex, filtering, and more. See examples and documentation <a href="https://stedolan.github.io/jq/manual/" target="new">here</a>. Prefix <code>jqraw:</code> outputs the results as text instead of a JSON list.</li>
+                                {% else %}
+                                <li>jq support not installed</li>
+                                {% endif %}
+                            </ul>
+                        </li>
+                        <li>XPath - Limit text to this XPath rule, simply start with a forward-slash. To specify XPath to be used explicitly or the XPath rule starts with an XPath function: Prefix with <code>xpath:</code>
+                            <ul>
+                                <li>Example:  <code>//*[contains(@class, 'sametext')]</code> or <code>xpath:count(//*[contains(@class, 'sametext')])</code>, <a
+                                href="http://xpather.com/" target="new">test your XPath here</a></li>
+                                <li>Example: Get all titles from an RSS feed <code>//title/text()</code></li>
+                                <li>To use XPath1.0: Prefix with <code>xpath1:</code></li>
+                            </ul>
+                            </li>
+                    <li>
+                        Please be sure that you thoroughly understand how to write CSS, JSONPath, XPath{% if jq_support %}, or jq selector{%endif%} rules before filing an issue on GitHub! <a
+                                href="https://github.com/dgtlmoon/changedetection.io/wiki/CSS-Selector-help">here for more CSS selector help</a>.<br>
+                    </li>
+                    </ul>
+
+                </span>
+                    </div>
+                <fieldset class="pure-control-group">
+                    {{ render_field(form.subtractive_selectors, rows=5, placeholder=has_tag_filters_extra+"header
+footer
+nav
+.stockticker
+//*[contains(text(), 'Advertisement')]") }}
+                    <span class="pure-form-message-inline">
+                        <ul>
+                          <li> Remove HTML element(s) by CSS and XPath selectors before text conversion. </li>
+                          <li> Don't paste HTML here, use only CSS and XPath selectors </li>
+                          <li> Add multiple elements, CSS or XPath selectors per line to ignore multiple parts of the HTML. </li>
+                        </ul>
+                      </span>
+                </fieldset>
--- a/changedetectionio/templates/edit/text-options.html
+++ b/changedetectionio/templates/edit/text-options.html
@@ -0,0 +1,69 @@
+
+                <fieldset>
+                    <div class="pure-control-group">
+                        {{ render_field(form.trigger_text, rows=5, placeholder="Some text to wait for in a line
+/some.regex\d{2}/ for case-INsensitive regex
+") }}
+                        <span class="pure-form-message-inline">
+                    <ul>
+                        <li>Text to wait for before triggering a change/notification, all text and regex are tested <i>case-insensitive</i>.</li>
+                        <li>Trigger text is processed from the result-text that comes out of any CSS/JSON Filters for this watch</li>
+                        <li>Each line is processed separately (think of each line as "OR")</li>
+                        <li>Note: Wrap in forward slash / to use regex  example: <code>/foo\d/</code></li>
+                    </ul>
+                        </span>
+                    </div>
+                </fieldset>
+                <fieldset class="pure-group">
+                    {{ render_field(form.ignore_text, rows=5, placeholder="Some text to ignore in a line
+/some.regex\d{2}/ for case-INsensitive regex
+") }}
+                    <span class="pure-form-message-inline">
+                        <ul>
+                            <li>Matching text will be <strong>ignored</strong> in the text snapshot (you can still see it, but it won't trigger a change)</li>
+                            <li>Each line processed separately, any line matching will be ignored (removed before creating the checksum)</li>
+                            <li>Regular Expression support, wrap the entire line in forward slash <code>/regex/</code></li>
+                            <li>Changing this will affect the comparison checksum which may trigger an alert</li>
+                        </ul>
+                </span>
+
+                </fieldset>
+
+                <fieldset>
+                    <div class="pure-control-group">
+                        {{ render_field(form.text_should_not_be_present, rows=5, placeholder="For example: Out of stock
+Sold out
+Not in stock
+Unavailable") }}
+                        <span class="pure-form-message-inline">
+                            <ul>
+                                <li>Block change-detection while this text is on the page, all text and regex are tested <i>case-insensitive</i>, good for waiting for when a product is available again</li>
+                                <li>Block text is processed from the result-text that comes out of any CSS/JSON Filters for this watch</li>
+                                <li>All lines here must not exist (think of each line as "OR")</li>
+                                <li>Note: Wrap in forward slash / to use regex  example: <code>/foo\d/</code></li>
+                            </ul>
+                        </span>
+                    </div>
+                </fieldset>
+                <fieldset>
+                    <div class="pure-control-group">
+                        {{ render_field(form.extract_text, rows=5, placeholder="/.+?\d+ comments.+?/
+ or
+keyword") }}
+                        <span class="pure-form-message-inline">
+                    <ul>
+                        <li>Extracts text in the final output (line by line) after other filters using regular expressions or string match;
+                            <ul>
+                                <li>Regular expression &dash; example <code>/reports.+?2022/i</code></li>
+                                <li>Don't forget to consider the white-space at the start of a line <code>/.+?reports.+?2022/i</code></li>
+                                <li>Use <code>//(?aiLmsux))</code> type flags (more <a href="https://docs.python.org/3/library/re.html#index-15">information here</a>)<br></li>
+                                <li>Keyword example &dash; example <code>Out of stock</code></li>
+                                <li>Use groups to extract just that text &dash; example <code>/reports.+?(\d+)/i</code> returns a list of years only</li>
+                                <li>Example - match lines containing a keyword <code>/.*icecream.*/</code></li>
+                            </ul>
+                        </li>
+                        <li>One line per regular-expression/string match</li>
+                    </ul>
+                        </span>
+                    </div>
+                </fieldset>
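
The help text in `text-options.html` describes how each rule line is interpreted: a plain line is a case-insensitive text match, a line wrapped in forward slashes is a regular expression, and each line acts as an "OR". The sketch below illustrates those semantics only. It is not the project's `html_tools` code, and `rule_matches`, `any_rule_matches` and the sample page text are hypothetical names and data. It does not handle the trailing `/i` style flags mentioned for "Extract text".

```python
import re


def rule_matches(rule, line):
    """Test one rule against one line of text, case-insensitively."""
    rule = rule.strip()
    if len(rule) > 2 and rule.startswith("/") and rule.endswith("/"):
        # Wrapped in forward slashes: treat the inside as a regular expression
        return re.search(rule[1:-1], line, re.IGNORECASE) is not None
    # Otherwise fall back to a plain substring test
    return rule.lower() in line.lower()


def any_rule_matches(rules, text):
    """Return True if any non-empty rule matches any line of the text."""
    return any(
        rule_matches(rule, line)
        for rule in rules
        if rule.strip()
        for line in text.splitlines()
    )


page_text = "Widget 3000\nCurrently OUT OF STOCK\nfoo7 units shipped"
print(any_rule_matches(["out of stock"], page_text))  # True
print(any_rule_matches([r"/foo\d/"], page_text))      # True
print(any_rule_matches(["sold out"], page_text))      # False
```

Under these assumptions, a "Trigger text" list would unblock a change when `any_rule_matches` returns True, while a "Text that should not be present" list would block it.
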