datasette/docs/facets.rst

295 wiersze
8.9 KiB
ReStructuredText

.. _facets:
Facets
======
Datasette facets can be used to add a faceted browse interface to any database table.
With facets, tables are displayed along with a summary showing the most common values in specified columns.
These values can be selected to further filter the table.
Here's `an example <https://congress-legislators.datasettes.com/legislators/legislator_terms?_facet=type&_facet=party&_facet=state&_facet_size=10>`__:
.. image:: https://raw.githubusercontent.com/simonw/datasette-screenshots/0.62/non-retina/faceting-details.png
:alt: Screenshot showing facets against a table of congressional legislators. Suggested facets include state_rank and start and end dates, and the displayed facets are state, party and type. Each facet lists values along with a count of rows for each value.
Facets can be specified in two ways: using query string parameters, or in ``metadata.json`` configuration for the table.
Facets in query strings
-----------------------
To turn on faceting for specific columns on a Datasette table view, add one or more ``_facet=COLUMN`` parameters to the URL.
For example, if you want to turn on facets for the ``city_id`` and ``state`` columns, construct a URL that looks like this::
2018-05-16 15:45:12 +00:00
/dbname/tablename?_facet=state&_facet=city_id
This works for both the HTML interface and the ``.json`` view.
When enabled, facets will cause a ``facet_results`` block to be added to the JSON output, looking something like this:
.. code-block:: json
{
"state": {
"name": "state",
"results": [
{
"value": "CA",
"label": "CA",
"count": 10,
"toggle_url": "http://...?_facet=city_id&_facet=state&state=CA",
"selected": false
},
{
"value": "MI",
"label": "MI",
"count": 4,
"toggle_url": "http://...?_facet=city_id&_facet=state&state=MI",
"selected": false
},
{
"value": "MC",
"label": "MC",
"count": 1,
"toggle_url": "http://...?_facet=city_id&_facet=state&state=MC",
"selected": false
}
],
"truncated": false
}
"city_id": {
"name": "city_id",
"results": [
{
"value": 1,
"label": "San Francisco",
"count": 6,
"toggle_url": "http://...?_facet=city_id&_facet=state&city_id=1",
"selected": false
},
{
"value": 2,
"label": "Los Angeles",
"count": 4,
"toggle_url": "http://...?_facet=city_id&_facet=state&city_id=2",
"selected": false
},
{
"value": 3,
"label": "Detroit",
"count": 4,
"toggle_url": "http://...?_facet=city_id&_facet=state&city_id=3",
"selected": false
},
{
"value": 4,
"label": "Memnonia",
"count": 1,
"toggle_url": "http://...?_facet=city_id&_facet=state&city_id=4",
"selected": false
}
],
"truncated": false
}
}
2018-05-16 15:15:56 +00:00
If Datasette detects that a column is a foreign key, the ``"label"`` property will be automatically derived from the detected label column on the referenced table.
The default number of facet results returned is 30, controlled by the :ref:`setting_default_facet_size` setting.
You can increase this on an individual page by adding ``?_facet_size=100`` to the query string, up to a maximum of :ref:`setting_max_returned_rows` (which defaults to 1000).
2021-05-31 02:39:14 +00:00
.. _facets_metadata:
Facets in metadata
------------------
You can turn facets on by default for specific tables by adding them to a ``"facets"`` key in a Datasette :ref:`metadata` file.
Here's an example that turns on faceting by default for the ``qLegalStatus`` column in the ``Street_Tree_List`` table in the ``sf-trees`` database:
.. [[[cog
from metadata_doc import metadata_example
metadata_example(cog, {
2018-05-16 15:45:12 +00:00
"databases": {
"sf-trees": {
"tables": {
"Street_Tree_List": {
"facets": ["qLegalStatus"]
}
2018-05-16 15:45:12 +00:00
}
}
2018-05-16 15:45:12 +00:00
}
})
.. ]]]
.. tab:: metadata.yaml
.. code-block:: yaml
databases:
sf-trees:
tables:
Street_Tree_List:
facets:
- qLegalStatus
.. tab:: metadata.json
.. code-block:: json
{
"databases": {
"sf-trees": {
"tables": {
"Street_Tree_List": {
"facets": [
"qLegalStatus"
]
}
}
}
}
}
.. [[[end]]]
Facets defined in this way will always be shown in the interface and returned in the API, regardless of the ``_facet`` arguments passed to the view.
You can specify :ref:`array <facet_by_json_array>` or :ref:`date <facet_by_date>` facets in metadata using JSON objects with a single key of ``array`` or ``date`` and a value specifying the column, like this:
.. [[[cog
metadata_example(cog, {
"facets": [
{"array": "tags"},
{"date": "created"}
]
})
.. ]]]
.. tab:: metadata.yaml
.. code-block:: yaml
facets:
- array: tags
- date: created
.. tab:: metadata.json
.. code-block:: json
{
"facets": [
{
"array": "tags"
},
{
"date": "created"
}
]
}
.. [[[end]]]
You can change the default facet size (the number of results shown for each facet) for a table using ``facet_size``:
.. [[[cog
metadata_example(cog, {
"databases": {
"sf-trees": {
"tables": {
"Street_Tree_List": {
"facets": ["qLegalStatus"],
"facet_size": 10
}
}
}
}
})
.. ]]]
.. tab:: metadata.yaml
.. code-block:: yaml
databases:
sf-trees:
tables:
Street_Tree_List:
facets:
- qLegalStatus
facet_size: 10
.. tab:: metadata.json
.. code-block:: json
{
"databases": {
"sf-trees": {
"tables": {
"Street_Tree_List": {
"facets": [
"qLegalStatus"
],
"facet_size": 10
}
}
}
}
}
.. [[[end]]]
Suggested facets
----------------
Datasette's table UI will suggest facets for the user to apply, based on the following criteria:
For the currently filtered data are there any columns which, if applied as a facet...
2018-05-18 15:50:37 +00:00
* Will return 30 or less unique options
* Will return more than one unique option
* Will return less unique options than the total number of filtered rows
2018-05-18 15:50:37 +00:00
* And the query used to evaluate this criteria can be completed in under 50ms
That last point is particularly important: Datasette runs a query for every column that is displayed on a page, which could get expensive - so to avoid slow load times it sets a time limit of just 50ms for each of those queries.
This means suggested facets are unlikely to appear for tables with millions of records in them.
2018-05-17 04:08:50 +00:00
Speeding up facets with indexes
-------------------------------
The performance of facets can be greatly improved by adding indexes on the columns you wish to facet by.
Adding indexes can be performed using the ``sqlite3`` command-line utility. Here's how to add an index on the ``state`` column in a table called ``Food_Trucks``::
2018-05-17 04:08:50 +00:00
sqlite3 mydatabase.db
::
2018-05-17 04:08:50 +00:00
SQLite version 3.19.3 2017-06-27 16:48:08
Enter ".help" for usage hints.
sqlite> CREATE INDEX Food_Trucks_state ON Food_Trucks("state");
Or using the `sqlite-utils <https://sqlite-utils.datasette.io/en/stable/cli.html#creating-indexes>`__ command-line utility::
sqlite-utils create-index mydatabase.db Food_Trucks state
2019-05-19 21:24:12 +00:00
.. _facet_by_json_array:
Facet by JSON array
-------------------
If your SQLite installation provides the ``json1`` extension (you can check using :ref:`JsonDataView_versions`) Datasette will automatically detect columns that contain JSON arrays of values and offer a faceting interface against those columns.
This is useful for modelling things like tags without needing to break them out into a new table.
2019-05-25 16:05:52 +00:00
Example here: `latest.datasette.io/fixtures/facetable?_facet_array=tags <https://latest.datasette.io/fixtures/facetable?_facet_array=tags>`__
.. _facet_by_date:
2019-05-21 06:09:22 +00:00
Facet by date
-------------
If Datasette finds any columns that contain dates in the first 100 values, it will offer a faceting interface against the dates of those values.
This works especially well against timestamp values such as ``2019-03-01 12:44:00``.
2019-05-21 06:09:22 +00:00
2019-05-25 16:05:52 +00:00
Example here: `latest.datasette.io/fixtures/facetable?_facet_date=created <https://latest.datasette.io/fixtures/facetable?_facet_date=created>`__