Add search_index option to control search indexing of StreamField blocks (#11135)

pull/11163/head
Vedant Pandey 2023-10-26 14:41:49 +05:30 committed by Matt Westcott
parent 5cd46fe203
commit 837d733097
11 changed files with 91 additions and 9 deletions


@ -4,6 +4,7 @@ Changelog
6.0 (xx.xx.xxxx) - IN DEVELOPMENT
~~~~~~~~~~~~~~~~
* Added `search_index` option to StreamField blocks to control whether the block is indexed for searching (Vedant Pandey)
* Fix: Update system check for overwriting storage backends to recognise the `STORAGES` setting introduced in Django 4.2 (phijma-leukeleu)
* Fix: Prevent password change form from raising a validation error when browser autocomplete fills in the "Old password" field (Chiemezuo Akujobi)
* Fix: Ensure that the legacy dropdown options, when closed, do not get accidentally clicked by other interactions on wide viewports (CheesyPhoenix, Christer Jensen)


@ -756,6 +756,7 @@
* scott-8
* phijma-leukeleu
* CheesyPhoenix
* Vedant Pandey
## Translators


@ -66,6 +66,7 @@ All block definitions accept the following optional keyword arguments:
:param max_length: The maximum allowed length of the field.
:param min_length: The minimum allowed length of the field.
:param help_text: Help text to display alongside the field.
:param search_index: If false (default true), the content of this block will not be indexed for searching.
:param validators: A list of validation functions for the field (see `Django Validators <https://docs.djangoproject.com/en/stable/ref/validators/>`__).
:param form_classname: A value to add to the form field's ``class`` attribute when rendered on the page editing form.
@ -79,6 +80,7 @@ All block definitions accept the following optional keyword arguments:
:param max_length: The maximum allowed length of the field.
:param min_length: The minimum allowed length of the field.
:param help_text: Help text to display alongside the field.
:param search_index: If false (default true), the content of this block will not be indexed for searching.
:param rows: Number of rows to show on the textarea (defaults to 1).
:param validators: A list of validation functions for the field (see `Django Validators <https://docs.djangoproject.com/en/stable/ref/validators/>`__).
:param form_classname: A value to add to the form field's ``class`` attribute when rendered on the page editing form.
@ -225,6 +227,7 @@ All block definitions accept the following optional keyword arguments:
:param features: Specifies the set of features allowed (see :ref:`rich_text_features`).
:param required: If true (the default), the field cannot be left blank.
:param max_length: The maximum allowed length of the field. Only text is counted; rich text formatting, embedded content and paragraph / line breaks do not count towards the limit.
:param search_index: If false (default true), the content of this block will not be indexed for searching.
:param help_text: Help text to display alongside the field.
:param validators: A list of validation functions for the field (see `Django Validators <https://docs.djangoproject.com/en/stable/ref/validators/>`__).
:param form_classname: A value to add to the form field's ``class`` attribute when rendered on the page editing form.
@ -267,6 +270,7 @@ All block definitions accept the following optional keyword arguments:
:param choices: A list of choices, in any format accepted by Django's :attr:`~django.db.models.Field.choices` parameter for model fields, or a callable returning such a list.
:param required: If true (the default), the field cannot be left blank.
:param help_text: Help text to display alongside the field.
:param search_index: If false (default true), the content of this block will not be indexed for searching.
:param widget: The form widget to render the field with (see `Django Widgets <https://docs.djangoproject.com/en/stable/ref/forms/widgets/>`__).
:param validators: A list of validation functions for the field (see `Django Validators <https://docs.djangoproject.com/en/stable/ref/validators/>`__).
:param form_classname: A value to add to the form field's ``class`` attribute when rendered on the page editing form.
@ -311,6 +315,7 @@ All block definitions accept the following optional keyword arguments:
:param choices: A list of choices, in any format accepted by Django's :attr:`~django.db.models.Field.choices` parameter for model fields, or a callable returning such a list.
:param required: If true (the default), the field cannot be left blank.
:param help_text: Help text to display alongside the field.
:param search_index: If false (default true), the content of this block will not be indexed for searching.
:param widget: The form widget to render the field with (see `Django Widgets <https://docs.djangoproject.com/en/stable/ref/forms/widgets/>`__).
:param validators: A list of validation functions for the field (see `Django Validators <https://docs.djangoproject.com/en/stable/ref/validators/>`__).
:param form_classname: A value to add to the form field's ``class`` attribute when rendered on the page editing form.
@ -446,6 +451,7 @@ All block definitions accept the following optional keyword arguments:
:param form_classname: An HTML ``class`` attribute to set on the root element of this block as displayed in the editing interface. Defaults to ``struct-block``; note that the admin interface has CSS styles defined on this class, so it is advised to include ``struct-block`` in this value when overriding. See :ref:`custom_editing_interfaces_for_structblock`.
:param form_template: Path to a Django template to use to render this block's form. See :ref:`custom_editing_interfaces_for_structblock`.
:param value_class: A subclass of ``wagtail.blocks.StructValue`` to use as the type of returned values for this block. See :ref:`custom_value_class_for_structblock`.
:param search_index: If false (default true), the content of this block will not be indexed for searching.
:param label_format:
Determines the label shown when the block is collapsed in the editing interface. By default, the value of the first sub-block in the StructBlock is shown, but this can be customised by setting a string here with block names contained in braces - for example ``label_format = "Profile for {first_name} {surname}"``
@ -482,6 +488,7 @@ All block definitions accept the following optional keyword arguments:
:param form_classname: An HTML ``class`` attribute to set on the root element of this block as displayed in the editing interface.
:param min_num: Minimum number of sub-blocks that the list must have.
:param max_num: Maximum number of sub-blocks that the list may have.
:param search_index: If false (default true), the content of this block will not be indexed for searching.
:param collapsed: When true, all sub-blocks are initially collapsed.
@ -538,6 +545,7 @@ All block definitions accept the following optional keyword arguments:
:param required: If true (the default), at least one sub-block must be supplied. This is ignored when using the ``StreamBlock`` as the top-level block of a StreamField; in this case the StreamField's ``blank`` property is respected instead.
:param min_num: Minimum number of sub-blocks that the stream must have.
:param max_num: Maximum number of sub-blocks that the stream may have.
:param search_index: If false (default true), the content of this block will not be indexed for searching.
:param block_counts: Specifies the minimum and maximum number of each block type, as a dictionary mapping block names to dicts with (optional) ``min_num`` and ``max_num`` fields.
:param collapsed: When true, all sub-blocks are initially collapsed.
:param form_classname: An HTML ``class`` attribute to set on the root element of this block as displayed in the editing interface.


@ -14,7 +14,7 @@ depth: 1
### Other features
* ...
* Added `search_index` option to StreamField blocks to control whether the block is indexed for searching (Vedant Pandey)
### Bug fixes


@ -575,6 +575,18 @@ hero_image = my_page.body.first_block_by_name('image')
<div class="hero-image">{{ page.body.first_block_by_name.image }}</div>
```
## Search considerations
Like any other field, content in a StreamField can be made searchable by adding the field to the model's `search_fields` definition - see {ref}`wagtailsearch_indexing_fields`. By default, all text content from the stream will be added to the search index. If you wish to exclude certain block types from being indexed, pass the keyword argument `search_index=False` as part of the block's definition. For example:
```python
body = StreamField([
    ('normal_text', blocks.RichTextBlock()),
    ('pull_quote', blocks.RichTextBlock(search_index=False)),
    ('footnotes', blocks.ListBlock(blocks.CharBlock(), search_index=False)),
], use_json_field=True)
```
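The effect of the flag can be illustrated with a minimal, framework-free sketch. The classes `SketchTextBlock` and `SketchListBlock`, the `block_types` mapping and the `collect` helper are hypothetical stand-ins introduced only for this illustration; they are not Wagtail classes and merely mimic the idea of a block returning no searchable content when `search_index` is false.

```python
# Hypothetical stand-ins for illustration only; these are NOT Wagtail classes.


class SketchTextBlock:
    """A leaf block that contributes its text unless search_index is False."""

    def __init__(self, search_index=True):
        self.search_index = search_index

    def get_searchable_content(self, value):
        return [str(value)] if self.search_index else []


class SketchListBlock:
    """A container that gathers searchable content from each child value."""

    def __init__(self, child_block, search_index=True):
        self.child_block = child_block
        self.search_index = search_index

    def get_searchable_content(self, value):
        # When the container opts out, its children are never consulted.
        if not self.search_index:
            return []
        content = []
        for child_value in value:
            content.extend(self.child_block.get_searchable_content(child_value))
        return content


# Mirrors the shape of the StreamField definition above.
block_types = {
    "normal_text": SketchTextBlock(),
    "pull_quote": SketchTextBlock(search_index=False),
    "footnotes": SketchListBlock(SketchTextBlock(), search_index=False),
}


def collect(stream):
    """Gather searchable text from a list of (block_type, value) pairs."""
    content = []
    for name, value in stream:
        content.extend(block_types[name].get_searchable_content(value))
    return content


stream = [
    ("normal_text", "Wagtail is a CMS."),
    ("pull_quote", "A quote to skip."),
    ("footnotes", ["First note", "Second note"]),
]
indexed = collect(stream)
print(indexed)  # ['Wagtail is a CMS.']
```

In this sketch only the `normal_text` value reaches the result. The `footnotes` list is skipped as a whole, even though its child block would index its own values, because the container itself opts out.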
## Custom validation
Custom validation logic can be added to blocks by overriding the block's `clean` method. For more information, see [](streamfield_validation).


@ -146,10 +146,12 @@ class CharBlock(FieldBlock):
max_length=None,
min_length=None,
validators=(),
search_index=True,
**kwargs,
):
# CharField's 'label' and 'initial' parameters are not exposed, as Block handles that functionality natively
# (via 'label' and 'default')
self.search_index = search_index
self.field = forms.CharField(
required=required,
help_text=help_text,
@ -160,7 +162,7 @@ class CharBlock(FieldBlock):
super().__init__(**kwargs)
def get_searchable_content(self, value):
- return [force_str(value)]
+ return [force_str(value)] if self.search_index else []
class TextBlock(FieldBlock):
@ -171,6 +173,7 @@ class TextBlock(FieldBlock):
rows=1,
max_length=None,
min_length=None,
search_index=True,
validators=(),
**kwargs,
):
@ -182,6 +185,7 @@ class TextBlock(FieldBlock):
"validators": validators,
}
self.rows = rows
self.search_index = search_index
super().__init__(**kwargs)
@cached_property
@ -193,7 +197,7 @@ class TextBlock(FieldBlock):
return forms.CharField(**field_kwargs)
def get_searchable_content(self, value):
- return [force_str(value)]
+ return [force_str(value)] if self.search_index else []
class Meta:
icon = "pilcrow"
@ -482,6 +486,7 @@ class BaseChoiceBlock(FieldBlock):
default=None,
required=True,
help_text=None,
search_index=True,
widget=None,
validators=(),
**kwargs,
@ -489,6 +494,7 @@ class BaseChoiceBlock(FieldBlock):
self._required = required
self._default = default
self.search_index = search_index
if choices is None:
# no choices specified, so pick up the choice defined at the class level
@ -599,6 +605,8 @@ class ChoiceBlock(BaseChoiceBlock):
def get_searchable_content(self, value):
# Return the display value as the searchable value
if not self.search_index:
return []
text_value = force_str(value)
for k, v in self.field.choices:
if isinstance(v, (list, tuple)):
@ -633,6 +641,8 @@ class MultipleChoiceBlock(BaseChoiceBlock):
def get_searchable_content(self, value):
# Return the display value as the searchable value
if not self.search_index:
return []
content = []
text_value = force_str(value)
for k, v in self.field.choices:
@ -657,6 +667,7 @@ class RichTextBlock(FieldBlock):
features=None,
max_length=None,
validators=(),
search_index=True,
**kwargs,
):
if max_length is not None:
@ -670,6 +681,7 @@ class RichTextBlock(FieldBlock):
}
self.editor = editor
self.features = features
self.search_index = search_index
super().__init__(**kwargs)
def get_default(self):
@ -707,8 +719,10 @@ class RichTextBlock(FieldBlock):
return RichText(value)
def get_searchable_content(self, value):
- # Strip HTML tags to prevent search backend from indexing them
+ if not self.search_index:
+ return []
source = force_str(value.source)
+ # Strip HTML tags to prevent search backend from indexing them
return [get_text_for_indexing(source)]
def extract_references(self, value):


@ -138,9 +138,9 @@ class ListValue(MutableSequence):
class ListBlock(Block):
- def __init__(self, child_block, **kwargs):
+ def __init__(self, child_block, search_index=True, **kwargs):
super().__init__(**kwargs)
self.search_index = search_index
if isinstance(child_block, type):
# child_block was passed as a class, so convert it to a block instance
self.child_block = child_block()
@ -343,8 +343,9 @@ class ListBlock(Block):
return format_html("<ul>{0}</ul>", children)
def get_searchable_content(self, value):
if not self.search_index:
return []
content = []
for child_value in value:
content.extend(self.child_block.get_searchable_content(child_value))


@ -75,8 +75,9 @@ class StreamBlockValidationError(ValidationError):
class BaseStreamBlock(Block):
- def __init__(self, local_blocks=None, **kwargs):
+ def __init__(self, local_blocks=None, search_index=True, **kwargs):
self._constructor_kwargs = kwargs
self.search_index = search_index
super().__init__(**kwargs)
@ -340,6 +341,8 @@ class BaseStreamBlock(Block):
)
def get_searchable_content(self, value):
if not self.search_index:
return []
content = []
for child in value:


@ -106,8 +106,9 @@ class PlaceholderBoundBlock(BoundBlock):
class BaseStructBlock(Block):
- def __init__(self, local_blocks=None, **kwargs):
+ def __init__(self, local_blocks=None, search_index=True, **kwargs):
self._constructor_kwargs = kwargs
self.search_index = search_index
super().__init__(**kwargs)
@ -253,6 +254,8 @@ class BaseStructBlock(Block):
}
def get_searchable_content(self, value):
if not self.search_index:
return []
content = []
for name, block in self.child_blocks.items():


@ -252,6 +252,7 @@ class StreamField(models.Field):
return self.get_prep_value(value)
def get_searchable_content(self, value):
return self.stream_block.get_searchable_content(value)
def extract_references(self, value):


@ -125,6 +125,12 @@ class TestFieldBlock(WagtailTestUtils, SimpleTestCase):
self.assertEqual(content, ["Hello world!"])
def test_search_index_searchable_content(self):
block = blocks.CharBlock(search_index=False)
content = block.get_searchable_content("Hello world!")
self.assertEqual(content, [])
def test_charfield_with_validator(self):
def validate_is_foo(value):
if value != "foo":
@ -665,6 +671,18 @@ class TestRichTextBlock(TestCase):
],
)
def test_search_index_get_searchable_content(self):
block = blocks.RichTextBlock(search_index=False)
value = RichText(
'<p>Merry <a linktype="page" id="4">Christmas</a>! &amp; a happy new year</p>\n'
"<p>Our Santa pet <b>Wagtail</b> has some cool stuff in store for you all!</p>"
)
result = block.get_searchable_content(value)
self.assertEqual(
result,
[],
)
def test_get_searchable_content_whitespace(self):
block = blocks.RichTextBlock()
value = RichText("<p>mashed</p><p>po<i>ta</i>toes</p>")
@ -928,6 +946,16 @@ class TestChoiceBlock(WagtailTestUtils, SimpleTestCase):
)
self.assertEqual(block.get_searchable_content("choice-1"), ["Choice 1"])
def test_search_index_searchable_content(self):
block = blocks.ChoiceBlock(
choices=[
("choice-1", "Choice 1"),
("choice-2", "Choice 2"),
],
search_index=False,
)
self.assertEqual(block.get_searchable_content("choice-1"), [])
def test_searchable_content_with_callable_choices(self):
def callable_choices():
return [
@ -1305,6 +1333,16 @@ class TestMultipleChoiceBlock(WagtailTestUtils, SimpleTestCase):
)
self.assertEqual(block.get_searchable_content("choice-1"), ["Choice 1"])
def test_search_index_searchable_content(self):
block = blocks.MultipleChoiceBlock(
choices=[
("choice-1", "Choice 1"),
("choice-2", "Choice 2"),
],
search_index=False,
)
self.assertEqual(block.get_searchable_content("choice-1"), [])
def test_searchable_content_with_callable_choices(self):
def callable_choices():
return [