
.. _metadata:

Metadata
========

Data loves metadata. Any time you run Datasette you can optionally include a
JSON file with metadata about your databases and tables. Datasette will then
display that information in the web UI.

Run Datasette like this::

    datasette database1.db database2.db --metadata metadata.json

Your ``metadata.json`` file can look something like this::

    {
        "title": "Custom title for your index page",
        "description": "Some description text can go here",
        "license": "ODbL",
        "license_url": "https://opendatacommons.org/licenses/odbl/",
        "source": "Original Data Source",
        "source_url": "http://example.com/"
    }

The above metadata will be displayed on the index page of your Datasette-powered
site. The source and license information will also be included in the footer of
every page served by Datasette.

Any special HTML characters in ``description`` will be escaped. If you want to
include HTML in your description, you can use a ``description_html`` property
instead.
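
As a minimal sketch, a top-level description containing markup might look like
this (the title and description text are placeholder values)::

    {
        "title": "Custom title for your index page",
        "description_html": "Some <em>description</em> text with <strong>markup</strong>"
    }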

Per-database and per-table metadata
-----------------------------------

Metadata at the top level of the JSON will be shown on the index page and in the
footer on every page of the site. The license and source are expected to apply to
all of your data.

You can also provide metadata at the per-database or per-table level, like this::

    {
        "databases": {
            "database1": {
                "source": "Alternative source",
                "source_url": "http://example.com/",
                "tables": {
                    "example_table": {
                        "description_html": "Custom <em>table</em> description",
                        "license": "CC BY 3.0 US",
                        "license_url": "https://creativecommons.org/licenses/by/3.0/us/"
                    }
                }
            }
        }
    }

Each of the top-level metadata fields can be used at the database and table level.
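
For example, a database could carry its own title and description while one of
its tables sets a different title and source. This is an illustrative sketch; all
of the values are placeholders::

    {
        "databases": {
            "database1": {
                "title": "Custom title for database1",
                "description": "Some description text for this database",
                "tables": {
                    "example_table": {
                        "title": "Custom title for example_table",
                        "source": "Table-specific source",
                        "source_url": "http://example.com/"
                    }
                }
            }
        }
    }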

Specifying units for a column
-----------------------------

Datasette supports attaching units to a column, which will be used when displaying
values from that column. SI prefixes will be used where appropriate.

Column units are configured in the metadata like so::

    {
        "databases": {
            "database1": {
                "tables": {
                    "example_table": {
                        "units": {
                            "column1": "metres",
                            "column2": "Hz"
                        }
                    }
                }
            }
        }
    }

Units are interpreted using Pint_, and you can see the full list of available units in
Pint's `unit registry`_. You can also add `custom units`_ to the metadata, which will be
registered with Pint::

    {
        "custom_units": [
            "decibel = [] = dB"
        ]
    }

.. _Pint: https://pint.readthedocs.io/
.. _unit registry: https://github.com/hgrecco/pint/blob/master/pint/default_en.txt
.. _custom units: http://pint.readthedocs.io/en/latest/defining.html
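
Assuming a custom unit can then be referenced by name in the same way as a
built-in one, the two settings might be combined as in the sketch below. Here
``column3`` is a hypothetical column name used only for illustration::

    {
        "custom_units": [
            "decibel = [] = dB"
        ],
        "databases": {
            "database1": {
                "tables": {
                    "example_table": {
                        "units": {
                            "column3": "decibel"
                        }
                    }
                }
            }
        }
    }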

Setting which columns can be used for sorting
---------------------------------------------

Datasette allows any column to be used for sorting by default. If you need to
control which columns are available for sorting you can do so using the optional
``sortable_columns`` key::

    {
        "databases": {
            "database1": {
                "tables": {
                    "example_table": {
                        "sortable_columns": [
                            "height",
                            "weight"
                        ]
                    }
                }
            }
        }
    }

This will restrict sorting of ``example_table`` to just the ``height`` and
``weight`` columns.

You can also disable sorting entirely by setting ``"sortable_columns": []``.
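
Placed in context, the empty list sits at the same level as the populated list
in the previous example::

    {
        "databases": {
            "database1": {
                "tables": {
                    "example_table": {
                        "sortable_columns": []
                    }
                }
            }
        }
    }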

By default, database views in Datasette do not support sorting. You can use
``sortable_columns`` to enable sorting on specific columns of a view. For a view
called ``name_of_view`` in the database ``my_database``, list it under
``tables`` like so::

    {
        "databases": {
            "my_database": {
                "tables": {
                    "name_of_view": {
                        "sortable_columns": [
                            "clicks",
                            "impressions"
                        ]
                    }
                }
            }
        }
    }

.. _label_columns:

Specifying the label column for a table
---------------------------------------

Datasette's HTML interface attempts to display foreign key references as
labelled hyperlinks. By default, it looks for referenced tables that only have
two columns: a primary key column and one other. It assumes that the second
column should be used as the link label.

If your table has more than two columns you can specify which column should be
used for the link label with the ``label_column`` property::

    {
        "databases": {
            "database1": {
                "tables": {
                    "example_table": {
                        "label_column": "title"
                    }
                }
            }
        }
    }

Hiding tables
-------------

You can hide tables from the database listing view (in the same way that FTS and
Spatialite tables are automatically hidden) using ``"hidden": true``::

    {
        "databases": {
            "database1": {
                "tables": {
                    "example_table": {
                        "hidden": true
                    }
                }
            }
        }
    }

Generating a metadata skeleton
------------------------------

Tracking down the names of all of your databases and tables and formatting them
as JSON can be a little tedious, so Datasette provides a tool to help you
generate a "skeleton" JSON file::

    datasette skeleton database1.db database2.db

This will create a ``metadata.json`` file looking something like this::

    {
        "title": null,
        "description": null,
        "description_html": null,
        "license": null,
        "license_url": null,
        "source": null,
        "source_url": null,
        "databases": {
            "database1": {
                "title": null,
                "description": null,
                "description_html": null,
                "license": null,
                "license_url": null,
                "source": null,
                "source_url": null,
                "queries": {},
                "tables": {
                    "example_table": {
                        "title": null,
                        "description": null,
                        "description_html": null,
                        "license": null,
                        "license_url": null,
                        "source": null,
                        "source_url": null,
                        "units": {}
                    }
                }
            },
            "database2": ...
        }
    }

You can replace any of the ``null`` values with a JSON string to populate that
piece of metadata.
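
As a sketch of what a partially filled-in skeleton might look like, the
fragment below populates a few top-level keys and leaves the rest as ``null``.
The title and description are placeholder values, and the ``databases`` object
is left empty here for brevity::

    {
        "title": "Custom title for your index page",
        "description": "Some description text can go here",
        "description_html": null,
        "license": "ODbL",
        "license_url": "https://opendatacommons.org/licenses/odbl/",
        "source": null,
        "source_url": null,
        "databases": {}
    }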