Mirror of https://github.com/wagtail/wagtail

Remove 'db' and 'postgres_search' search backends

parent 43edd0c187
commit 00582ba35a

@@ -59,18 +59,6 @@ Add a `WAGTAIL_SITE_NAME` - this will be displayed on the main dashboard of the
 WAGTAIL_SITE_NAME = 'My Example Site'
 ```

-<!--- RemovedInWagtail217Warning (wagtail.search.backends.database will be made the default and will not need to be added explicitly here) -->
-
-Add the `WAGTAILSEARCH_BACKENDS` setting to enable full-text searching:
-
-```python
-WAGTAILSEARCH_BACKENDS = {
-    'default': {
-        'BACKEND': 'wagtail.search.backends.database',
-    }
-}
-```
-
 Various other settings are available to configure Wagtail's behaviour - see [Settings](/reference/settings).

 ## URL configuration

@@ -13,7 +13,6 @@ Wagtail ships with a variety of extra optional modules.
     frontendcache
     routablepage
     modeladmin/index
-    postgres_search
     searchpromotions
     simple_translation
     table_block

@@ -1,130 +0,0 @@
.. _postgres_search:

========================
PostgreSQL search engine
========================

.. warning::

    This search backend is deprecated, and has been replaced by ``wagtail.search.backends.database``. See :ref:`wagtailsearch_backends`.


This contrib module provides a search engine backend using
`PostgreSQL full-text search capabilities <https://www.postgresql.org/docs/current/static/textsearch.html>`_.

.. warning::

    You can only use this module to index data from a PostgreSQL database.

**Features**:

- It supports all the search features available in Wagtail.
- Easy to install and adds no external dependency or service.
- Excellent performance for sites with up to 200 000 pages and stays decent for sites up to a million pages.
- Faster to reindex than Elasticsearch, if you use PostgreSQL 9.5 or higher.

**Drawbacks**:

- Partial matching (``SearchField(partial_match=True)``) is not supported.
- ``SearchField(boost=…)`` is only partially respected as PostgreSQL only supports four different boosts.
  So if you use five or more distinct values for the boost in your site, slight inaccuracies may occur.
- When :ref:`wagtailsearch_specifying_fields`, the index is not used,
  so it will be slow on huge sites.
- Also, when :ref:`wagtailsearch_specifying_fields`, you cannot search
  on a specific method.


Installation
============

Add ``'wagtail.contrib.postgres_search',`` anywhere in your ``INSTALLED_APPS``:

.. code-block:: python

    INSTALLED_APPS = [
        ...
        'wagtail.contrib.postgres_search',
        ...
    ]

Then configure Wagtail to use it as a search backend.
Give it the alias ``'default'`` if you want it to be the default search backend:

.. code-block:: python

    WAGTAILSEARCH_BACKENDS = {
        'default': {
            'BACKEND': 'wagtail.contrib.postgres_search.backend',
        },
    }

After installing the module, run ``python manage.py migrate`` to create the necessary ``postgres_search_indexentry`` table.

You then need to index data inside this backend using
the :ref:`update_index` command. You can reuse this command whenever
you want. However, it should not be needed after a first usage since
the search engine is automatically updated when data is modified.
To disable this behaviour, see :ref:`wagtailsearch_backends_auto_update`.
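
For reference, a first full build of the index is triggered with the ``update_index``
management command (a minimal sketch, assuming a standard ``manage.py`` project layout):

.. code-block:: sh

    python manage.py update_index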

Configuration
=============

Language / PostgreSQL search configuration
------------------------------------------

Use the additional ``'SEARCH_CONFIG'`` key to define which PostgreSQL
search configuration should be used. For example:

.. code-block:: python

    WAGTAILSEARCH_BACKENDS = {
        'default': {
            'BACKEND': 'wagtail.contrib.postgres_search.backend',
            'SEARCH_CONFIG': 'english',
        }
    }

As you can deduce, a PostgreSQL search configuration is mostly used to define
rules for a language, English in this case. A search configuration consists
of a compilation of algorithms (parsers & analysers)
and language specifications (stop words, stems, dictionaries, synonyms,
thesauruses, etc.).

A few search configurations are already defined by default in PostgreSQL.
You can list them using ``sudo -u postgres psql -c "\dF"`` in a Unix shell
or by using this SQL query: ``SELECT cfgname FROM pg_catalog.pg_ts_config``.

These already-defined search configurations are decent, but they're basic
compared to commercial search engines.
If you want better support for your language, you will have to create
your own PostgreSQL search configuration. See the PostgreSQL documentation for
`an example <https://www.postgresql.org/docs/current/static/textsearch-configuration.html>`_,
`the list of parsers <https://www.postgresql.org/docs/current/static/textsearch-parsers.html>`_,
and `a guide to use dictionaries <https://www.postgresql.org/docs/current/static/textsearch-dictionaries.html>`_.

Atomic rebuild
--------------

Like the Elasticsearch backend, this backend supports
:ref:`wagtailsearch_backends_atomic_rebuild`:

.. code-block:: python

    WAGTAILSEARCH_BACKENDS = {
        'default': {
            'BACKEND': 'wagtail.contrib.postgres_search.backend',
            'ATOMIC_REBUILD': True,
        }
    }

This is nearly useless with this backend. In Elasticsearch, all data
is removed before rebuilding the index. But in this PostgreSQL backend,
only objects no longer in the database are removed. Then the index is
progressively updated, with no moment where the index is empty.

However, if you want to be extra sure that nothing wrong happens while updating
the index, you can use atomic rebuild. The index will be rebuilt, but nobody
will have access to it until reindexing is complete. If any error occurs during
the operation, all changes to the index are reverted
as if reindexing was never started.
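
Under the hood, atomic rebuild wraps the whole reindexing operation in a database
transaction (see ``PostgresSearchAtomicRebuilder`` in this module's ``backend.py``).
A minimal sketch of the idea, using hypothetical ``backend`` and ``MyPage`` names:

.. code-block:: python

    from django.db import transaction

    # Readers keep seeing the old index entries until the commit;
    # any error during reindexing rolls the index back unchanged.
    with transaction.atomic(using='default'):
        index = backend.get_index_for_model(MyPage)
        index.delete_stale_entries()
        index.add_items(MyPage, MyPage.objects.all())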

@@ -1,8 +0,0 @@
import django


if django.VERSION >= (3, 2):
    # The declaration is only needed for older Django versions
    pass
else:
    default_app_config = 'wagtail.contrib.postgres_search.apps.PostgresSearchConfig'

@@ -1,35 +0,0 @@
import warnings

from django.apps import AppConfig
from django.core.checks import Error, Tags, register

from wagtail.utils.deprecation import RemovedInWagtail217Warning

from .utils import get_postgresql_connections, set_weights


class PostgresSearchConfig(AppConfig):
    name = 'wagtail.contrib.postgres_search'
    default_auto_field = 'django.db.models.AutoField'

    def ready(self):

        warnings.warn(
            "The wagtail.contrib.postgres_search backend is deprecated and has been replaced by "
            "wagtail.search.backends.database. "
            "See https://docs.wagtail.org/en/stable/releases/2.15.html#database-search-backends-replaced",
            category=RemovedInWagtail217Warning
        )

        @register(Tags.compatibility, Tags.database)
        def check_if_postgresql(app_configs, **kwargs):
            if get_postgresql_connections():
                return []
            return [Error('You must use a PostgreSQL database '
                          'to use PostgreSQL search.',
                          id='wagtail.contrib.postgres_search.E001')]

        set_weights()

        from .models import IndexEntry
        IndexEntry.add_generic_relations()

@@ -1,710 +0,0 @@
import warnings

from collections import OrderedDict
from functools import reduce

from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector
from django.db import DEFAULT_DB_ALIAS, NotSupportedError, connections, transaction
from django.db.models import Avg, Count, F, Manager, Q, TextField, Value
from django.db.models.constants import LOOKUP_SEP
from django.db.models.functions import Cast, Length
from django.db.models.sql.subqueries import InsertQuery
from django.utils.encoding import force_str
from django.utils.functional import cached_property

from wagtail.search.backends.base import (
    BaseSearchBackend, BaseSearchQueryCompiler, BaseSearchResults, FilterFieldError)
from wagtail.search.index import AutocompleteField, RelatedFields, SearchField, get_indexed_models
from wagtail.search.query import And, Boost, MatchAll, Not, Or, Phrase, PlainText
from wagtail.search.utils import ADD, MUL, OR
from wagtail.utils.deprecation import RemovedInWagtail217Warning

from .models import IndexEntry
from .query import Lexeme
from .utils import (
    get_content_type_pk, get_descendants_content_types_pks, get_postgresql_connections,
    get_sql_weights, get_weight)


warnings.warn(
    "The wagtail.contrib.postgres_search backend is deprecated and has been replaced by "
    "wagtail.search.backends.database. "
    "See https://docs.wagtail.org/en/stable/releases/2.15.html#database-search-backends-replaced",
    category=RemovedInWagtail217Warning
)


EMPTY_VECTOR = SearchVector(Value('', output_field=TextField()))


class ObjectIndexer:
    """
    Responsible for extracting data from an object to be inserted into the index.
    """
    def __init__(self, obj, backend):
        self.obj = obj
        self.search_fields = obj.get_search_fields()
        self.config = backend.config
        self.autocomplete_config = backend.autocomplete_config

    def prepare_value(self, value):
        if isinstance(value, str):
            return value

        elif isinstance(value, list):
            return ', '.join(self.prepare_value(item) for item in value)

        elif isinstance(value, dict):
            return ', '.join(self.prepare_value(item)
                             for item in value.values())

        return force_str(value)

    def prepare_field(self, obj, field):
        if isinstance(field, SearchField):
            yield (field, get_weight(field.boost),
                   self.prepare_value(field.get_value(obj)))

        elif isinstance(field, AutocompleteField):
            # AutocompleteField does not define a boost parameter, so use a base weight of 'D'
            yield (field, 'D', self.prepare_value(field.get_value(obj)))

        elif isinstance(field, RelatedFields):
            sub_obj = field.get_value(obj)
            if sub_obj is None:
                return

            if isinstance(sub_obj, Manager):
                sub_objs = sub_obj.all()

            else:
                if callable(sub_obj):
                    sub_obj = sub_obj()

                sub_objs = [sub_obj]

            for sub_obj in sub_objs:
                for sub_field in field.fields:
                    yield from self.prepare_field(sub_obj, sub_field)

    def as_vector(self, texts, for_autocomplete=False):
        """
        Converts an array of strings into a SearchVector that can be indexed.
        """
        texts = [(text.strip(), weight) for text, weight in texts]
        texts = [(text, weight) for text, weight in texts if text]

        if not texts:
            return EMPTY_VECTOR

        search_config = self.autocomplete_config if for_autocomplete else self.config

        return ADD([
            SearchVector(Value(text, output_field=TextField()), weight=weight, config=search_config)
            for text, weight in texts
        ])

    @cached_property
    def id(self):
        """
        Returns the value to use as the ID of the record in the index
        """
        return force_str(self.obj.pk)

    @cached_property
    def title(self):
        """
        Returns all values to index as "title". This is the value of all SearchFields that have the field_name 'title'
        """
        texts = []
        for field in self.search_fields:
            for current_field, boost, value in self.prepare_field(self.obj, field):
                if isinstance(current_field, SearchField) and current_field.field_name == 'title':
                    texts.append((value, boost))

        return self.as_vector(texts)

    @cached_property
    def body(self):
        """
        Returns all values to index as "body". This is the value of all SearchFields excluding the title
        """
        texts = []
        for field in self.search_fields:
            for current_field, boost, value in self.prepare_field(self.obj, field):
                if isinstance(current_field, SearchField) and not current_field.field_name == 'title':
                    texts.append((value, boost))

        return self.as_vector(texts)

    @cached_property
    def autocomplete(self):
        """
        Returns all values to index as "autocomplete". This is the value of all AutocompleteFields
        """
        texts = []
        for field in self.search_fields:
            for current_field, boost, value in self.prepare_field(self.obj, field):
                if isinstance(current_field, AutocompleteField):
                    texts.append((value, boost))

        return self.as_vector(texts, for_autocomplete=True)


class Index:
    def __init__(self, backend, db_alias=None):
        self.backend = backend
        self.name = self.backend.index_name
        self.db_alias = DEFAULT_DB_ALIAS if db_alias is None else db_alias
        self.connection = connections[self.db_alias]
        if self.connection.vendor != 'postgresql':
            raise NotSupportedError(
                'You must select a PostgreSQL database '
                'to use PostgreSQL search.')

        # Whether to allow adding items via the faster upsert method available in Postgres >=9.5
        self._enable_upsert = (self.connection.pg_version >= 90500)

        self.entries = IndexEntry._default_manager.using(self.db_alias)

    def add_model(self, model):
        pass

    def refresh(self):
        pass

    def _refresh_title_norms(self, full=False):
        """
        Refreshes the value of the title_norm field.

        This needs to be set to 'lavg/ld' where:
         - lavg is the average length of titles in all documents (also in terms)
         - ld is the length of the title field in this document (in terms)
        """

        lavg = self.entries.annotate(title_length=Length('title')).filter(title_length__gt=0).aggregate(Avg('title_length'))['title_length__avg']

        if full:
            # Update the whole table
            # This is the most accurate option but requires a full table rewrite
            # so we can't do it too often as it could lead to locking issues.
            entries = self.entries

        else:
            # Only update entries where title_norm is 1.0
            # This is the default value set on new entries.
            # It's possible that other entries could have this exact value but there shouldn't be too many of those
            entries = self.entries.filter(title_norm=1.0)

        entries.annotate(title_length=Length('title')).filter(title_length__gt=0).update(title_norm=lavg / F('title_length'))

    def delete_stale_model_entries(self, model):
        existing_pks = (
            model._default_manager.using(self.db_alias)
            .annotate(object_id=Cast('pk', TextField()))
            .values('object_id')
        )
        content_types_pks = get_descendants_content_types_pks(model)
        stale_entries = (
            self.entries.filter(content_type_id__in=content_types_pks)
            .exclude(object_id__in=existing_pks)
        )
        stale_entries.delete()

    def delete_stale_entries(self):
        for model in get_indexed_models():
            # We don't need to delete stale entries for non-root models,
            # since we already delete them by deleting roots.
            if not model._meta.parents:
                self.delete_stale_model_entries(model)

    def add_item(self, obj):
        self.add_items(obj._meta.model, [obj])

    def add_items_upsert(self, content_type_pk, indexers):
        compiler = InsertQuery(IndexEntry).get_compiler(connection=self.connection)
        title_sql = []
        autocomplete_sql = []
        body_sql = []
        data_params = []

        for indexer in indexers:
            data_params.extend((content_type_pk, indexer.id))

            # Compile title value
            value = compiler.prepare_value(IndexEntry._meta.get_field('title'), indexer.title)
            sql, params = value.as_sql(compiler, self.connection)
            title_sql.append(sql)
            data_params.extend(params)

            # Compile autocomplete value
            value = compiler.prepare_value(IndexEntry._meta.get_field('autocomplete'), indexer.autocomplete)
            sql, params = value.as_sql(compiler, self.connection)
            autocomplete_sql.append(sql)
            data_params.extend(params)

            # Compile body value
            value = compiler.prepare_value(IndexEntry._meta.get_field('body'), indexer.body)
            sql, params = value.as_sql(compiler, self.connection)
            body_sql.append(sql)
            data_params.extend(params)

        data_sql = ', '.join([
            '(%%s, %%s, %s, %s, %s, 1.0)' % (a, b, c)
            for a, b, c in zip(title_sql, autocomplete_sql, body_sql)
        ])

        with self.connection.cursor() as cursor:
            cursor.execute("""
                INSERT INTO %s (content_type_id, object_id, title, autocomplete, body, title_norm)
                (VALUES %s)
                ON CONFLICT (content_type_id, object_id)
                DO UPDATE SET title = EXCLUDED.title,
                              title_norm = 1.0,
                              autocomplete = EXCLUDED.autocomplete,
                              body = EXCLUDED.body
                """ % (IndexEntry._meta.db_table, data_sql), data_params)

        self._refresh_title_norms()

    def add_items_update_then_create(self, content_type_pk, indexers):
        ids_and_data = {}
        for indexer in indexers:
            ids_and_data[indexer.id] = (indexer.title, indexer.autocomplete, indexer.body)

        index_entries_for_ct = self.entries.filter(content_type_id=content_type_pk)
        indexed_ids = frozenset(
            index_entries_for_ct.filter(object_id__in=ids_and_data.keys()).values_list('object_id', flat=True)
        )
        for indexed_id in indexed_ids:
            title, autocomplete, body = ids_and_data[indexed_id]
            index_entries_for_ct.filter(object_id=indexed_id).update(title=title, autocomplete=autocomplete, body=body)

        to_be_created = []
        for object_id in ids_and_data.keys():
            if object_id not in indexed_ids:
                title, autocomplete, body = ids_and_data[object_id]
                to_be_created.append(IndexEntry(
                    content_type_id=content_type_pk,
                    object_id=object_id,
                    title=title,
                    autocomplete=autocomplete,
                    body=body
                ))

        self.entries.bulk_create(to_be_created)

        self._refresh_title_norms()

    def add_items(self, model, objs):
        search_fields = model.get_search_fields()
        if not search_fields:
            return

        indexers = [ObjectIndexer(obj, self.backend) for obj in objs]

        # TODO: Delete unindexed objects while dealing with proxy models.
        if indexers:
            content_type_pk = get_content_type_pk(model)

            update_method = (
                self.add_items_upsert if self._enable_upsert
                else self.add_items_update_then_create)
            update_method(content_type_pk, indexers)

    def delete_item(self, item):
        item.index_entries.using(self.db_alias).delete()

    def __str__(self):
        return self.name


class PostgresSearchQueryCompiler(BaseSearchQueryCompiler):
    DEFAULT_OPERATOR = 'and'
    LAST_TERM_IS_PREFIX = False
    TARGET_SEARCH_FIELD_TYPE = SearchField

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

        local_search_fields = self.get_search_fields_for_model()

        # Due to a Django bug, arrays are not automatically converted
        # when we use WEIGHTS_VALUES.
        self.sql_weights = get_sql_weights()

        if self.fields is None:
            # search over the fields defined on the current model
            self.search_fields = local_search_fields
        else:
            # build a search_fields set from the passed definition,
            # which may involve traversing relations
            self.search_fields = {
                field_lookup: self.get_search_field(field_lookup, fields=local_search_fields)
                for field_lookup in self.fields
            }

    def get_config(self, backend):
        return backend.config

    def get_search_fields_for_model(self):
        return self.queryset.model.get_searchable_search_fields()

    def get_search_field(self, field_lookup, fields=None):
        if fields is None:
            fields = self.search_fields

        if LOOKUP_SEP in field_lookup:
            field_lookup, sub_field_name = field_lookup.split(LOOKUP_SEP, 1)
        else:
            sub_field_name = None

        for field in fields:
            if isinstance(field, self.TARGET_SEARCH_FIELD_TYPE) and field.field_name == field_lookup:
                return field

            # Note: Searching on a specific related field using
            # `.search(fields=…)` is not yet supported by Wagtail.
            # This method anticipates by already implementing it.
            if isinstance(field, RelatedFields) and field.field_name == field_lookup:
                return self.get_search_field(sub_field_name, field.fields)

    def build_tsquery_content(self, query, config=None, invert=False):
        if isinstance(query, PlainText):
            terms = query.query_string.split()
            if not terms:
                return None

            last_term = terms.pop()

            lexemes = Lexeme(last_term, invert=invert, prefix=self.LAST_TERM_IS_PREFIX)
            for term in terms:
                new_lexeme = Lexeme(term, invert=invert)

                if query.operator == 'and':
                    lexemes &= new_lexeme
                else:
                    lexemes |= new_lexeme

            return SearchQuery(lexemes, search_type='raw', config=config)

        elif isinstance(query, Phrase):
            return SearchQuery(query.query_string, search_type='phrase')

        elif isinstance(query, Boost):
            # Not supported
            msg = "The Boost query is not supported by the PostgreSQL search backend."
            warnings.warn(msg, RuntimeWarning)

            return self.build_tsquery_content(query.subquery, config=config, invert=invert)

        elif isinstance(query, Not):
            return self.build_tsquery_content(query.subquery, config=config, invert=not invert)

        elif isinstance(query, (And, Or)):
            # If this part of the query is inverted, we swap the operator and
            # pass down the inversion state to the child queries.
            # This works thanks to De Morgan's law.
            #
            # For example, the following query:
            #
            #     Not(And(Term("A"), Term("B")))
            #
            # Is equivalent to:
            #
            #     Or(Not(Term("A")), Not(Term("B")))
            #
            # It's simpler to code it this way as we only need to store the
            # invert status of the terms rather than all the operators.

            subquery_lexemes = [
                self.build_tsquery_content(subquery, config=config, invert=invert)
                for subquery in query.subqueries
            ]

            is_and = isinstance(query, And)

            if invert:
                is_and = not is_and

            if is_and:
                return reduce(lambda a, b: a & b, subquery_lexemes)
            else:
                return reduce(lambda a, b: a | b, subquery_lexemes)

        raise NotImplementedError(
            '`%s` is not supported by the PostgreSQL search backend.'
            % query.__class__.__name__)

    def build_tsquery(self, query, config=None):
        return self.build_tsquery_content(query, config=config)

    def build_tsrank(self, vector, query, config=None, boost=1.0):
        if isinstance(query, (Phrase, PlainText, Not)):
            rank_expression = SearchRank(
                vector,
                self.build_tsquery(query, config=config),
                weights=self.sql_weights
            )

            if boost != 1.0:
                rank_expression *= boost

            return rank_expression

        elif isinstance(query, Boost):
            boost *= query.boost
            return self.build_tsrank(vector, query.subquery, config=config, boost=boost)

        elif isinstance(query, And):
            return MUL(
                1 + self.build_tsrank(vector, subquery, config=config, boost=boost)
                for subquery in query.subqueries
            ) - 1

        elif isinstance(query, Or):
            return ADD(
                self.build_tsrank(vector, subquery, config=config, boost=boost)
                for subquery in query.subqueries
            ) / (len(query.subqueries) or 1)

        raise NotImplementedError(
            '`%s` is not supported by the PostgreSQL search backend.'
            % query.__class__.__name__)

    def get_index_vectors(self, search_query):
        return [
            (F('postgres_index_entries__title'), F('postgres_index_entries__title_norm')),
            (F('postgres_index_entries__body'), 1.0),
        ]

    def get_fields_vectors(self, search_query):
        return [
            (SearchVector(
                field_lookup,
                config=search_query.config,
            ), search_field.boost)
            for field_lookup, search_field in self.search_fields.items()
        ]

    def get_search_vectors(self, search_query):
        if self.fields is None:
            return self.get_index_vectors(search_query)

        else:
            return self.get_fields_vectors(search_query)

    def _build_rank_expression(self, vectors, config):
        rank_expressions = [
            self.build_tsrank(vector, self.query, config=config) * boost
            for vector, boost in vectors
        ]

        rank_expression = rank_expressions[0]
        for other_rank_expression in rank_expressions[1:]:
            rank_expression += other_rank_expression

        return rank_expression

    def search(self, config, start, stop, score_field=None):
        # TODO: Handle MatchAll nested inside other search query classes.
        if isinstance(self.query, MatchAll):
            return self.queryset[start:stop]

        elif isinstance(self.query, Not) and isinstance(self.query.subquery, MatchAll):
            return self.queryset.none()

        search_query = self.build_tsquery(self.query, config=config)
        vectors = self.get_search_vectors(search_query)
        rank_expression = self._build_rank_expression(vectors, config)

        combined_vector = vectors[0][0]
        for vector, boost in vectors[1:]:
            combined_vector = combined_vector._combine(vector, '||', False)

        queryset = self.queryset.annotate(_vector_=combined_vector).filter(_vector_=search_query)

        if self.order_by_relevance:
            queryset = queryset.order_by(rank_expression.desc(), '-pk')

        elif not queryset.query.order_by:
            # Adds a default ordering to avoid issue #3729.
            queryset = queryset.order_by('-pk')
            rank_expression = F('pk')

        if score_field is not None:
            queryset = queryset.annotate(**{score_field: rank_expression})

        return queryset[start:stop]

    def _process_lookup(self, field, lookup, value):
        lhs = field.get_attname(self.queryset.model) + '__' + lookup
        return Q(**{lhs: value})

    def _connect_filters(self, filters, connector, negated):
        if connector == 'AND':
            q = Q(*filters)

        elif connector == 'OR':
            q = OR([Q(fil) for fil in filters])

        else:
            return

        if negated:
            q = ~q

        return q


class PostgresAutocompleteQueryCompiler(PostgresSearchQueryCompiler):
    LAST_TERM_IS_PREFIX = True
    TARGET_SEARCH_FIELD_TYPE = AutocompleteField

    def get_config(self, backend):
        return backend.autocomplete_config

    def get_search_fields_for_model(self):
        return self.queryset.model.get_autocomplete_search_fields()

    def get_index_vectors(self, search_query):
        return [(F('postgres_index_entries__autocomplete'), 1.0)]

    def get_fields_vectors(self, search_query):
        return [
            (SearchVector(
                field_lookup,
                config=search_query.config,
                weight='D',
            ), 1.0)
            for field_lookup, search_field in self.search_fields.items()
        ]


class PostgresSearchResults(BaseSearchResults):
    def get_queryset(self, for_count=False):
        if for_count:
            start = None
            stop = None
        else:
            start = self.start
            stop = self.stop

        return self.query_compiler.search(
            self.query_compiler.get_config(self.backend),
            start,
            stop,
            score_field=self._score_field
        )

    def _do_search(self):
        return list(self.get_queryset())

    def _do_count(self):
        return self.get_queryset(for_count=True).count()

    supports_facet = True

    def facet(self, field_name):
        # Get field
        field = self.query_compiler._get_filterable_field(field_name)
        if field is None:
            raise FilterFieldError(
                'Cannot facet search results with field "' + field_name + '". Please add index.FilterField(\''
                + field_name + '\') to ' + self.query_compiler.queryset.model.__name__ + '.search_fields.',
                field_name=field_name
            )

        query = self.query_compiler.search(self.query_compiler.get_config(self.backend), None, None)
        results = query.values(field_name).annotate(count=Count('pk')).order_by('-count')

        return OrderedDict([
            (result[field_name], result['count'])
            for result in results
        ])


class PostgresSearchRebuilder:
    def __init__(self, index):
        self.index = index

    def start(self):
        self.index.delete_stale_entries()
        return self.index

    def finish(self):
        self.index._refresh_title_norms(full=True)


class PostgresSearchAtomicRebuilder(PostgresSearchRebuilder):
    def __init__(self, index):
        super().__init__(index)
        self.transaction = transaction.atomic(using=index.db_alias)
        self.transaction_opened = False

    def start(self):
        self.transaction.__enter__()
        self.transaction_opened = True
        return super().start()

    def finish(self):
        self.index._refresh_title_norms(full=True)

        self.transaction.__exit__(None, None, None)
        self.transaction_opened = False

    def __del__(self):
        # TODO: Implement a cleaner way to close the connection on failure.
        if self.transaction_opened:
            self.transaction.needs_rollback = True
            self.finish()


class PostgresSearchBackend(BaseSearchBackend):
    query_compiler_class = PostgresSearchQueryCompiler
    autocomplete_query_compiler_class = PostgresAutocompleteQueryCompiler
    results_class = PostgresSearchResults
    rebuilder_class = PostgresSearchRebuilder
    atomic_rebuilder_class = PostgresSearchAtomicRebuilder

    def __init__(self, params):
        super().__init__(params)
        self.index_name = params.get('INDEX', 'default')
        self.config = params.get('SEARCH_CONFIG')

        # Use 'simple' config for autocomplete to disable stemming
        # A good description for why this is important can be found at:
        # https://www.postgresql.org/docs/9.1/datatype-textsearch.html#DATATYPE-TSQUERY
        self.autocomplete_config = params.get('AUTOCOMPLETE_SEARCH_CONFIG', 'simple')

        if params.get('ATOMIC_REBUILD', False):
            self.rebuilder_class = self.atomic_rebuilder_class

    def get_index_for_model(self, model, db_alias=None):
        return Index(self, db_alias)

    def get_index_for_object(self, obj):
        return self.get_index_for_model(obj._meta.model, obj._state.db)

    def reset_index(self):
        for connection in get_postgresql_connections():
            IndexEntry._default_manager.using(connection.alias).delete()

    def add_type(self, model):
        pass  # Not needed.

    def refresh_index(self):
        pass  # Not needed.

    def add(self, obj):
        self.get_index_for_object(obj).add_item(obj)

    def add_bulk(self, model, obj_list):
        if obj_list:
            self.get_index_for_object(obj_list[0]).add_items(model, obj_list)

    def delete(self, obj):
        self.get_index_for_object(obj).delete_item(obj)


SearchBackend = PostgresSearchBackend
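
A note on the ranking arithmetic implemented by `build_tsrank` above: `And`
subqueries combine multiplicatively on shifted ranks, while `Or` subqueries are
averaged. In formula form,

```latex
\operatorname{rank}(q_1 \wedge \dots \wedge q_n) = \prod_{i=1}^{n} \bigl(1 + \operatorname{rank}(q_i)\bigr) - 1,
\qquad
\operatorname{rank}(q_1 \vee \dots \vee q_n) = \frac{1}{n} \sum_{i=1}^{n} \operatorname{rank}(q_i)
```

so an `And` result scores highly only when every subquery matches well, while an
`Or` result degrades gracefully as individual subqueries miss.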

@@ -1,46 +0,0 @@
# -*- coding: utf-8 -*-
# Generated by Django 1.10.1 on 2017-03-22 14:53
import django.db.models.deletion

from django.db import migrations, models

import django.contrib.postgres.fields.jsonb
import django.contrib.postgres.search
from ..models import IndexEntry


table = IndexEntry._meta.db_table


class Migration(migrations.Migration):

    initial = True

    dependencies = [
        ('contenttypes', '0002_remove_content_type_name'),
    ]

    operations = [
        migrations.CreateModel(
            name='IndexEntry',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('object_id', models.TextField()),
                ('body_search', django.contrib.postgres.search.SearchVectorField()),
                ('content_type', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='contenttypes.ContentType')),
            ],
            options={
                'verbose_name_plural': 'index entries',
                'verbose_name': 'index entry',
            },
        ),
        migrations.AlterUniqueTogether(
            name='indexentry',
            unique_together=set([('content_type', 'object_id')]),
        ),
        migrations.RunSQL(
            'CREATE INDEX {0}_body_search ON {0} '
            'USING GIN(body_search);'.format(table),
            'DROP INDEX {}_body_search;'.format(table),
        ),
    ]

@@ -1,49 +0,0 @@
# -*- coding: utf-8 -*-
# Generated by Django 1.11.5 on 2017-10-19 14:53
from __future__ import unicode_literals

import django.contrib.postgres.search
from django.db import migrations

from ..models import IndexEntry


table = IndexEntry._meta.db_table


class Migration(migrations.Migration):

    dependencies = [
        ('postgres_search', '0001_initial'),
    ]

    operations = [
        migrations.RunSQL(
            'DROP INDEX {}_body_search;'.format(table),
            'CREATE INDEX {0}_body_search ON {0} '
            'USING GIN(body_search);'.format(table),
        ),
        migrations.RenameField(
            model_name='indexentry',
            old_name='body_search',
            new_name='body',
        ),
        migrations.AddField(
            model_name='indexentry',
            name='autocomplete',
            field=django.contrib.postgres.search.SearchVectorField(default=''),
            preserve_default=False,
        ),
        migrations.AddIndex(
            model_name='indexentry',
            index=django.contrib.postgres.indexes.GinIndex(
                fields=['autocomplete'],
                name='postgres_se_autocom_ee48c8_gin'),
        ),
        migrations.AddIndex(
            model_name='indexentry',
            index=django.contrib.postgres.indexes.GinIndex(
                fields=['body'],
                name='postgres_se_body_aaaa99_gin'),
        ),
    ]

@@ -1,30 +0,0 @@
# Generated by Django 3.0.6 on 2020-04-24 13:00

import django.contrib.postgres.indexes
import django.contrib.postgres.search
from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ('postgres_search', '0002_add_autocomplete'),
    ]

    operations = [
        migrations.AddField(
            model_name='indexentry',
            name='title',
            field=django.contrib.postgres.search.SearchVectorField(default=''),
            preserve_default=False,
        ),
        migrations.AddIndex(
            model_name='indexentry',
            index=django.contrib.postgres.indexes.GinIndex(fields=['title'], name='postgres_se_title_b56f33_gin'),
        ),
        migrations.AddField(
            model_name='indexentry',
            name='title_norm',
            field=models.FloatField(default=1.0),
        ),
    ]

@@ -1,22 +0,0 @@
# Generated by Django 3.1.3 on 2020-11-13 16:55

from django.db import migrations

from wagtail.contrib.postgres_search.models import IndexEntry

table = IndexEntry._meta.db_table


class Migration(migrations.Migration):

    dependencies = [
        ('postgres_search', '0003_title'),
    ]

    operations = [
        migrations.RunSQL(
            'CREATE INDEX {0}_title_body_concat_search ON {0} '
            'USING GIN(( title || body));'.format(table),
            'DROP INDEX IF EXISTS {0}_title_body_concat_search;'.format(table),
        ),
    ]

@@ -1,90 +0,0 @@
from django import VERSION as DJANGO_VERSION
from django.apps import apps
from django.contrib.contenttypes.fields import GenericForeignKey, GenericRelation
from django.contrib.contenttypes.models import ContentType
from django.contrib.postgres.indexes import GinIndex
from django.contrib.postgres.search import SearchVectorField
from django.db import models
from django.db.models.functions import Cast
from django.db.models.sql.where import WhereNode
from django.utils.translation import gettext_lazy as _

from wagtail.search.index import class_is_indexed

from .utils import get_descendants_content_types_pks


class TextIDGenericRelation(GenericRelation):
    auto_created = True

    def get_content_type_lookup(self, alias, remote_alias):
        field = self.remote_field.model._meta.get_field(
            self.content_type_field_name)
        return field.get_lookup('in')(
            field.get_col(remote_alias),
            get_descendants_content_types_pks(self.model))

    def get_object_id_lookup(self, alias, remote_alias):
        from_field = self.remote_field.model._meta.get_field(
            self.object_id_field_name)
        to_field = self.model._meta.pk
        return from_field.get_lookup('exact')(
            from_field.get_col(remote_alias),
            Cast(to_field.get_col(alias), from_field))

    if DJANGO_VERSION >= (4, 0):
        def get_extra_restriction(self, alias, remote_alias):
            cond = WhereNode()
            cond.add(self.get_content_type_lookup(alias, remote_alias), 'AND')
            cond.add(self.get_object_id_lookup(alias, remote_alias), 'AND')
            return cond
    else:
        def get_extra_restriction(self, where_class, alias, remote_alias):
            cond = where_class()
            cond.add(self.get_content_type_lookup(alias, remote_alias), 'AND')
            cond.add(self.get_object_id_lookup(alias, remote_alias), 'AND')
            return cond

    def resolve_related_fields(self):
        return []


class IndexEntry(models.Model):
    content_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
    # We do not use an IntegerField since primary keys are not always integers.
    object_id = models.TextField()
    content_object = GenericForeignKey()

    # TODO: Add per-object boosting.
    autocomplete = SearchVectorField()
    title = SearchVectorField()
    # This field stores the "Title Normalisation Factor"
    # This factor is multiplied onto the rank of the title field.
    # This allows us to apply a boost to results with shorter titles
    # elevating more specific matches to the top.
    title_norm = models.FloatField(default=1.0)
    body = SearchVectorField()

    class Meta:
        unique_together = ('content_type', 'object_id')
        verbose_name = _('index entry')
        verbose_name_plural = _('index entries')
        # An additional computed GIN index on 'title || body' is created in a SQL migration;
        # it covers the default case of PostgresSearchQueryCompiler.get_index_vectors.
        indexes = [GinIndex(fields=['autocomplete']),
                   GinIndex(fields=['title']),
                   GinIndex(fields=['body'])]

    def __str__(self):
        return '%s: %s' % (self.content_type.name, self.content_object)

    @property
    def model(self):
        return self.content_type.model

    @classmethod
    def add_generic_relations(cls):
        for model in apps.get_models():
            if class_is_indexed(model):
                TextIDGenericRelation(cls).contribute_to_class(model,
                                                               'postgres_index_entries')

@@ -1,87 +0,0 @@
# Originally from https://github.com/django/django/pull/8313
# Resubmitted in https://github.com/django/django/pull/12727

# If that PR gets merged, we should be able to replace this with the version in Django.

from django.contrib.postgres.search import SearchQueryField
from django.db.models.expressions import Expression, Value


class LexemeCombinable(Expression):
    BITAND = '&'
    BITOR = '|'

    def _combine(self, other, connector, reversed, node=None):
        if not isinstance(other, LexemeCombinable):
            raise TypeError(
                'Lexeme can only be combined with other Lexemes, '
                'got {}.'.format(type(other))
            )
        if reversed:
            return CombinedLexeme(other, connector, self)
        return CombinedLexeme(self, connector, other)

    # On Combinable, these are not implemented to reduce confusion with Q. In
    # this case we are actually (ab)using them to do logical combination so
    # it's consistent with other usage in Django.
    def bitand(self, other):
        return self._combine(other, self.BITAND, False)

    def bitor(self, other):
        return self._combine(other, self.BITOR, False)

    def __or__(self, other):
        return self._combine(other, self.BITOR, False)

    def __and__(self, other):
        return self._combine(other, self.BITAND, False)


class Lexeme(LexemeCombinable, Value):
    _output_field = SearchQueryField()

    def __init__(self, value, output_field=None, *, invert=False, prefix=False, weight=None):
        self.prefix = prefix
        self.invert = invert
        self.weight = weight
        super().__init__(value, output_field=output_field)

    def as_sql(self, compiler, connection):
        param = "'%s'" % self.value.replace("'", "''").replace("\\", "\\\\")

        template = "%s"

        label = ''
        if self.prefix:
            label += '*'
        if self.weight:
            label += self.weight

        if label:
            param = '{}:{}'.format(param, label)
        if self.invert:
            param = '!{}'.format(param)

        return template, [param]


class CombinedLexeme(LexemeCombinable):
    _output_field = SearchQueryField()

    def __init__(self, lhs, connector, rhs, output_field=None):
        super().__init__(output_field=output_field)
        self.connector = connector
        self.lhs = lhs
        self.rhs = rhs

    def as_sql(self, compiler, connection):
        value_params = []
        lsql, params = compiler.compile(self.lhs)
        value_params.extend(params)

        rsql, params = compiler.compile(self.rhs)
        value_params.extend(params)

        combined_sql = '({} {} {})'.format(lsql, self.connector, rsql)
        combined_value = combined_sql % tuple(value_params)
        return '%s', [combined_value]

@@ -1,151 +0,0 @@
from django.test import TestCase

from wagtail.search.tests.test_backends import BackendTests
from wagtail.tests.search import models

from ..utils import BOOSTS_WEIGHTS, WEIGHTS_VALUES, determine_boosts_weights, get_weight


class TestPostgresSearchBackend(BackendTests, TestCase):
    backend_path = 'wagtail.contrib.postgres_search.backend'

    def test_weights(self):
        self.assertListEqual(BOOSTS_WEIGHTS,
                             [(10, 'A'), (2, 'B'), (0.5, 'C'), (0.25, 'D')])
        self.assertListEqual(WEIGHTS_VALUES, [0.025, 0.05, 0.2, 1.0])

        self.assertEqual(get_weight(15), 'A')
        self.assertEqual(get_weight(10), 'A')
        self.assertEqual(get_weight(9.9), 'B')
        self.assertEqual(get_weight(2), 'B')
        self.assertEqual(get_weight(1.9), 'C')
        self.assertEqual(get_weight(0), 'D')
        self.assertEqual(get_weight(-1), 'D')

        self.assertListEqual(determine_boosts_weights([1]),
                             [(1, 'A'), (0, 'B'), (0, 'C'), (0, 'D')])
        self.assertListEqual(determine_boosts_weights([-1]),
                             [(-1, 'A'), (-1, 'B'), (-1, 'C'), (-1, 'D')])
        self.assertListEqual(determine_boosts_weights([-1, 1, 2]),
                             [(2, 'A'), (1, 'B'), (-1, 'C'), (-1, 'D')])
        self.assertListEqual(determine_boosts_weights([0, 1, 2, 3]),
                             [(3, 'A'), (2, 'B'), (1, 'C'), (0, 'D')])
        self.assertListEqual(determine_boosts_weights([0, 0.25, 0.75, 1, 1.5]),
                             [(1.5, 'A'), (1, 'B'), (0.5, 'C'), (0, 'D')])
        self.assertListEqual(determine_boosts_weights([0, 1, 2, 3, 4, 5, 6]),
                             [(6, 'A'), (4, 'B'), (2, 'C'), (0, 'D')])
        self.assertListEqual(determine_boosts_weights([-2, -1, 0, 1, 2, 3, 4]),
                             [(4, 'A'), (2, 'B'), (0, 'C'), (-2, 'D')])

    def test_search_tsquery_chars(self):
        """
        Checks that tsquery characters are correctly escaped
        and do not generate a PostgreSQL syntax error.
        """

        # Simple quote should be escaped inside each tsquery term.
        results = self.backend.search("L'amour piqué par une abeille",
                                      models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])
        results = self.backend.search("'starting quote",
                                      models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])
        results = self.backend.search("ending quote'",
                                      models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])
        results = self.backend.search("double quo''te",
                                      models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])
        results = self.backend.search("triple quo'''te",
                                      models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])

        # Now suffixes.
        results = self.backend.search("Something:B", models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])
        results = self.backend.search("Something:*", models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])
        results = self.backend.search("Something:A*BCD", models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])

        # Now the AND operator.
        results = self.backend.search("first & second", models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])

        # Now the OR operator.
        results = self.backend.search("first | second", models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])

        # Now the NOT operator.
        results = self.backend.search("first & !second", models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])

        # Now the phrase operator.
        results = self.backend.search("first <-> second", models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])

    def test_autocomplete_tsquery_chars(self):
        """
        Checks that tsquery characters are correctly escaped
        and do not generate a PostgreSQL syntax error.
        """

        # Simple quote should be escaped inside each tsquery term.
        results = self.backend.autocomplete("L'amour piqué par une abeille",
                                            models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])
        results = self.backend.autocomplete("'starting quote",
                                            models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])
        results = self.backend.autocomplete("ending quote'",
                                            models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])
        results = self.backend.autocomplete("double quo''te",
                                            models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])
        results = self.backend.autocomplete("triple quo'''te",
                                            models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])

        # Backslashes should be escaped inside each tsquery term.
        results = self.backend.autocomplete("backslash\\",
                                            models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])

        # Now suffixes.
        results = self.backend.autocomplete("Something:B", models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])
        results = self.backend.autocomplete("Something:*", models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])
        results = self.backend.autocomplete("Something:A*BCD", models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])

        # Now the AND operator.
        results = self.backend.autocomplete("first & second", models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])

        # Now the OR operator.
        results = self.backend.autocomplete("first | second", models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])

        # Now the NOT operator.
        results = self.backend.autocomplete("first & !second", models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])

        # Now the phrase operator.
        results = self.backend.autocomplete("first <-> second", models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [])

    def test_index_without_upsert(self):
        # Test the add_items code path for Postgres 9.4, where upsert is not available
        self.backend.reset_index()

        index = self.backend.get_index_for_model(models.Book)
        index._enable_upsert = False
        index.add_items(models.Book, models.Book.objects.all())

        results = self.backend.search("JavaScript", models.Book)
        self.assertUnsortedListEqual([r.title for r in results], [
            "JavaScript: The good parts",
            "JavaScript: The Definitive Guide"
        ])

@@ -1,43 +0,0 @@
import unittest

from django.conf import settings
from django.db import connection
from django.test import TestCase

from wagtail.search.backends import get_search_backend
from wagtail.tests.search import models


class TestPostgresStemming(TestCase):
    def setUp(self):
        backend_name = "wagtail.contrib.postgres_search.backend"
        for conf in settings.WAGTAILSEARCH_BACKENDS.values():
            if conf['BACKEND'] == backend_name:
                break
        else:
            raise unittest.SkipTest("Only for %s" % backend_name)

        self.backend = get_search_backend(backend_name)

    def test_ru_stemming(self):
        with connection.cursor() as cursor:
            cursor.execute(
                "SET default_text_search_config TO 'pg_catalog.russian'"
            )

        ru_book = models.Book.objects.create(
            title="Голубое сало", publication_date="1999-05-01",
            number_of_pages=352
        )
        self.backend.add(ru_book)

        results = self.backend.search("Голубое", models.Book)
        self.assertEqual(list(results), [ru_book])

        results = self.backend.search("Голубая", models.Book)
        self.assertEqual(list(results), [ru_book])

        results = self.backend.search("Голубой", models.Book)
        self.assertEqual(list(results), [ru_book])

        ru_book.delete()

@@ -1,113 +0,0 @@
from itertools import zip_longest

from django.apps import apps
from django.db import connections

from wagtail.search.index import Indexed, RelatedFields, SearchField


def get_postgresql_connections():
    return [connection for connection in connections.all()
            if connection.vendor == 'postgresql']


def get_descendant_models(model):
    """
    Returns all descendants of a model, including the model itself.
    """
    descendant_models = {other_model for other_model in apps.get_models()
                         if issubclass(other_model, model)}
    descendant_models.add(model)
    return descendant_models


def get_content_type_pk(model):
    # We import it locally because this file is loaded before apps are ready.
    from django.contrib.contenttypes.models import ContentType
    return ContentType.objects.get_for_model(model).pk


def get_ancestors_content_types_pks(model):
    """
    Returns content types ids for the ancestors of this model, excluding it.
    """
    from django.contrib.contenttypes.models import ContentType
    return [ct.pk for ct in
            ContentType.objects.get_for_models(*model._meta.get_parent_list())
            .values()]


def get_descendants_content_types_pks(model):
    """
    Returns content types ids for the descendants of this model, including it.
    """
    from django.contrib.contenttypes.models import ContentType
    return [ct.pk for ct in
            ContentType.objects.get_for_models(*get_descendant_models(model))
            .values()]


def get_search_fields(search_fields):
    for search_field in search_fields:
        if isinstance(search_field, SearchField):
            yield search_field
        elif isinstance(search_field, RelatedFields):
            for sub_field in get_search_fields(search_field.fields):
                yield sub_field


WEIGHTS = 'ABCD'
WEIGHTS_COUNT = len(WEIGHTS)
# These are filled when apps are ready.
BOOSTS_WEIGHTS = []
WEIGHTS_VALUES = []


def get_boosts():
    boosts = set()
    for model in apps.get_models():
        if issubclass(model, Indexed):
            for search_field in get_search_fields(model.get_search_fields()):
                boost = search_field.boost
                if boost is not None:
                    boosts.add(boost)
    return boosts


def determine_boosts_weights(boosts=()):
    if not boosts:
        boosts = get_boosts()
    boosts = list(sorted(boosts, reverse=True))
    min_boost = boosts[-1]
    if len(boosts) <= WEIGHTS_COUNT:
        return list(zip_longest(boosts, WEIGHTS, fillvalue=min(min_boost, 0)))
    max_boost = boosts[0]
    boost_step = (max_boost - min_boost) / (WEIGHTS_COUNT - 1)
    return [(max_boost - (i * boost_step), weight)
            for i, weight in enumerate(WEIGHTS)]


def set_weights():
    BOOSTS_WEIGHTS.extend(determine_boosts_weights())
    weights = [w for w, c in BOOSTS_WEIGHTS]
    min_weight = min(weights)
    if min_weight <= 0:
        if min_weight == 0:
            min_weight = -0.1
        weights = [w - min_weight for w in weights]
    max_weight = max(weights)
    WEIGHTS_VALUES.extend([w / max_weight
                           for w in reversed(weights)])


def get_weight(boost):
    if boost is None:
        return WEIGHTS[-1]
    for max_boost, weight in BOOSTS_WEIGHTS:
        if boost >= max_boost:
            return weight
    return weight


def get_sql_weights():
    return '{' + ','.join(map(str, WEIGHTS_VALUES)) + '}'
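
To make the boost-to-weight bucketing above concrete, here is a small worked
example (the expected values mirror the `test_weights` test removed in this same
commit):

```python
from wagtail.contrib.postgres_search.utils import determine_boosts_weights

# Four or fewer distinct boosts map directly onto PostgreSQL's A-D weight labels:
assert determine_boosts_weights([0, 1, 2, 3]) == [(3, 'A'), (2, 'B'), (1, 'C'), (0, 'D')]

# With more than four, boosts are bucketed onto four evenly spaced steps:
assert determine_boosts_weights([0, 1, 2, 3, 4, 5, 6]) == [(6, 'A'), (4, 'B'), (2, 'C'), (0, 'D')]
```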

@@ -18,8 +18,7 @@ def get_search_backend_config():

     # Make sure the default backend is always defined
     search_backends.setdefault('default', {
-        # RemovedInWagtail217Warning - will switch to wagtail.search.backends.database
-        'BACKEND': 'wagtail.search.backends.db',
+        'BACKEND': 'wagtail.search.backends.database',
     })

     return search_backends
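
For projects upgrading past this commit, the corresponding settings change is a
one-line swap (a sketch; adapt to your own WAGTAILSEARCH_BACKENDS layout):

```python
# Before (deprecated, emitted RemovedInWagtail217Warning):
WAGTAILSEARCH_BACKENDS = {
    'default': {'BACKEND': 'wagtail.search.backends.db'},
}

# After:
WAGTAILSEARCH_BACKENDS = {
    'default': {'BACKEND': 'wagtail.search.backends.database'},
}
```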

@@ -14,14 +14,9 @@ from wagtail.search.utils import AND, OR

 # This file implements a database search backend using basic substring matching, and no
 # database-specific full-text search capabilities. It will be used in the following cases:
-# * The current default database is SQLite <3.19, or something other than PostgreSQL, MySQL or
-#   SQLite
+# * The current default database is SQLite <3.19, or SQLite built without fulltext
+#   extensions, or something other than PostgreSQL, MySQL or SQLite
 # * 'wagtail.search.backends.database.fallback' is specified directly as the search backend
-# * The deprecated 'wagtail.search.backends.db' backend is active; this is the default when no
-#   WAGTAILSEARCH_BACKENDS setting is present.
-#
-# RemovedInWagtail217Warning - the default will be switched to wagtail.search.backends.database
-# and wagtail.search.backends.db will be dropped.


 MATCH_ALL = "_ALL_"
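
As the comments above note, the fallback implementation can also be selected
explicitly rather than auto-detected; a minimal sketch:

```python
WAGTAILSEARCH_BACKENDS = {
    'default': {
        # Forces plain substring matching, even on PostgreSQL, MySQL or SQLite:
        'BACKEND': 'wagtail.search.backends.database.fallback',
    },
}
```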

@@ -1,13 +0,0 @@
from warnings import warn

from wagtail.search.backends.database.fallback import (  # noqa
    DatabaseSearchBackend, DatabaseSearchQueryCompiler, DatabaseSearchResults, SearchBackend)
from wagtail.utils.deprecation import RemovedInWagtail217Warning


warn(
    "The wagtail.search.backends.db search backend is deprecated and has been replaced by "
    "wagtail.search.backends.database. "
    "See https://docs.wagtail.org/en/stable/releases/2.15.html#database-search-backends-replaced",
    category=RemovedInWagtail217Warning
)

@@ -196,9 +196,8 @@ else:
     WAGTAIL_USER_CUSTOM_FIELDS = ['country', 'attachment']

 if os.environ.get('DATABASE_ENGINE') == 'django.db.backends.postgresql':
-    INSTALLED_APPS += ('wagtail.contrib.postgres_search',)
     WAGTAILSEARCH_BACKENDS['postgresql'] = {
-        'BACKEND': 'wagtail.contrib.postgres_search.backend',
+        'BACKEND': 'wagtail.search.backends.database',
         'AUTO_UPDATE': False,
         'SEARCH_CONFIG': 'english'
     }