More documentation unit tests. These ones check that every single **View class
imported into the datasette/app.py module is covered by our documentation.
Just one problem: they aren't documented yet. So I'm using the xfail pytest
decorator to mark these tests as allowed-to-fail. When you run the test suite
you now get a report of how many views still need to be documented, but it
doesn't fail the tests.
The output looks something like this:
$ pytest tests/test_docs.py
collected 31 items
tests/test_docs.py ..........................XXXxx. [100%]
============ 26 passed, 2 xfailed, 3 xpassed in 1.06 seconds ============
Once I have documented all the views I will remove the xfail so any future
views that are added without documentation will cause a test failure.
We can detect that a view is documented by looking for a ReST label in the docs,
for example:
.. _IndexView:
Some view classes can be used to power multiple URLs - the JsonDataView class
for example is used to power /-/metadata and /-/config and /-/plugins.
In this case, the second part of the label can indicate the variety of page, e.g.:
.. _JsonDataView_metadata:
The test will pass as long as there is at least one label that starts with
_JsonDataView.
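A minimal sketch of the label check described above, assuming the docs are
plain ReST text. The names documented_labels and is_view_documented are
hypothetical, not the helpers used in tests/test_docs.py:

```python
import re

# Matches ReST label lines such as ".. _JsonDataView_metadata:"
label_re = re.compile(r"^\.\. _([^\s:]+):[ \t]*$", re.MULTILINE)


def documented_labels(rst_text):
    # Return the set of label names defined in a block of ReST
    return set(label_re.findall(rst_text))


def is_view_documented(view_name, labels):
    # A view counts as documented if any label starts with its class name
    return any(label.startswith(view_name) for label in labels)
```

A test could then loop over the view classes and assert is_view_documented()
for each one, with xfail applied until the docs catch up.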
This change introduces a new plugin hook, publish_subcommand, which can be
used to implement new subcommands for the "datasette publish" command family.
I've used this new hook to refactor out the "publish now" and "publish heroku"
implementations into separate modules. I've also added unit tests for these
two publishers, mocking the subprocess.call and subprocess.check_output
functions.
As part of this, I introduced a mechanism for loading default plugins. These
are defined in the new "default_plugins" list inside datasette/app.py.
Closes #217 (Plugin support for datasette publish)
Closes #348 (Unit tests for "datasette publish")
Refs #14, #59, #102, #103, #146, #236, #347
Unit tests now check that docs/*.txt help examples are all up-to-date.
I ran into a problem here in that the terminal_width needed to be more
accurately defined - so I replaced update-docs-help.sh with
update-docs-help.py, which hard-codes the terminal width.
* table.csv?_stream=1 to download all rows - refs #266
This option causes Datasette to serve ALL rows in the table, by internally
following the _next= pagination links and serving everything out as a stream.
Also added new config option, allow_csv_stream, which can be used to disable
this feature.
* New config option max_csv_mb limiting size of CSV export
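A rough sketch of how ?_stream=1 can follow pagination and emit CSV in chunks.
fetch_page is a hypothetical callable standing in for the table query; it is
an assumption, not the function Datasette uses:

```python
import csv
import io


def stream_csv(fetch_page):
    # fetch_page(next_token) is a hypothetical callable returning
    # (columns, rows, next_token); next_token is None on the last page
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    next_token = None
    first = True
    while True:
        columns, rows, next_token = fetch_page(next_token)
        if first:
            # Only write the header row once, before the first page
            writer.writerow(columns)
            first = False
        writer.writerows(rows)
        # Yield what has been written so far, then reset the buffer
        yield buffer.getvalue()
        buffer.seek(0)
        buffer.truncate()
        if next_token is None:
            break
```

Because this is a generator, only one page of rows is held in memory at a time.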
This is a relatively obscure new command-line argument that helps solve the
problem of showing accurate version information in deployed instances of
Datasette even if they were deployed directly from source code.
You can pass --version-note to datasette publish and package and it will then
in turn be passed to datasette when it starts:
datasette --version-note=hello fixtures.db
Now if you visit /-/versions.json you will see this:
{
"datasette": {
"note": "hello",
"version": "0+unknown"
},
"python": {
"full": "3.6.5 (default, Jun 6 2018, 19:19:24) \n[GCC 6.3.0 20170516]",
"version": "3.6.5"
},
...
}
I plan to use this in some Travis CI configuration, refs #313
The fixtures database created by our unit tests makes for a good "live" demo
of Datasette in action.
I've improved the metadata it ships with to better support this use-case.
I've also improved the mechanism for writing out fixtures. You can do this:
python tests/fixtures.py fixtures.db
To get just the fixtures database written out... or you can do this:
python tests/fixtures.py fixtures.db fixtures.json
To get metadata which you can then serve like so:
datasette fixtures.db -m fixtures.json
Refs #313
These new querystring arguments can be used to request expanded foreign keys
in both JSON and CSV formats.
?_labels=on turns on expansions for ALL foreign key columns
?_label=COLUMN1&_label=COLUMN2 can be used to pick specific columns to expand
e.g. `Street_Tree_List.json?_label=qSpecies&_label=qLegalStatus`
{
"rowid": 233,
"TreeID": 121240,
"qLegalStatus": {
"value": 2,
"label": "Private"
},
"qSpecies": {
"value": 16,
"label": "Sycamore"
},
"qAddress": "91 Commonwealth Ave",
...
}
The labels option also works for the HTML and CSV views.
HTML defaults to `?_labels=on`, so if you pass `?_labels=off` you can disable
foreign key expansion entirely - or you can use `?_label=COLUMN` to request
just specific columns.
If you expand labels on CSV you get additional columns in the output:
`/Street_Tree_List.csv?_label=qLegalStatus`
rowid,TreeID,qLegalStatus,qLegalStatus_label...
1,141565,1,Permitted Site...
2,232565,2,Undocumented...
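A small sketch of the CSV expansion shown above. The labels argument is a
hypothetical lookup from column name to {value: label}, standing in for the
query against the referenced table:

```python
def expand_csv_row(columns, row, labels):
    # labels maps a column name to {value: label}; this is an assumed
    # stand-in for looking up the label in the foreign key table
    out_columns, out_row = [], []
    for column, value in zip(columns, row):
        out_columns.append(column)
        out_row.append(value)
        if column in labels:
            # Add a COLUMN_label column directly after the expanded column
            out_columns.append(column + "_label")
            out_row.append(labels[column].get(value, ""))
    return out_columns, out_row
```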
I also refactored the existing foreign key expansion code.
Closes #233. Refs #266.
The test used to expect CSV to come back like this:
hello
world
""
With the final blank value encoded in quotes.
Judging by Travis failures, this behaviour changed between Python 3.6.3 and
3.6.5:
https://travis-ci.org/simonw/datasette/jobs/392586661
Tables and custom SQL query results can now be exported as CSV.
The easiest way to do this is to use the .csv extension, e.g.
/test_tables/facet_cities.csv
By default this is served as Content-Type: text/plain so you can see it in
your browser. If you want to download the file (using text/csv and with an
appropriate Content-Disposition: attachment header) you can do so like this:
/test_tables/facet_cities.csv?_dl=1
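A sketch of how the two sets of headers could be chosen. The function name,
the charset and the filename format are assumptions, not taken from the
Datasette source:

```python
def csv_response_headers(filename, download=False):
    # download corresponds to the ?_dl=1 querystring argument
    if download:
        return {
            "Content-Type": "text/csv; charset=utf-8",
            "Content-Disposition": 'attachment; filename="{}.csv"'.format(
                filename
            ),
        }
    # Default: plain text so the browser displays the CSV inline
    return {"Content-Type": "text/plain; charset=utf-8"}
```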
We link to the CSV and downloadable CSV URLs from the table and query pages.
The links use ?_size=max and so by default will return 1,000 rows.
Also fixes #303 - table names ending in .json or .csv are now detected and
URLs are generated that look like this instead:
/test_tables/table%2Fwith%2Fslashes.csv?_format=csv
The ?_format= option is available for everything else too, but we link to the
.csv / .json versions in most cases because they are aesthetically pleasing.
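A sketch of the URL rule described above. table_url is a hypothetical helper,
and quoting the whole table name with urllib is an assumption about how the
path is escaped:

```python
from urllib.parse import quote


def table_url(database, table, fmt):
    # Escape everything, including slashes, in the path components
    path = "/{}/{}".format(quote(database, safe=""), quote(table, safe=""))
    if table.endswith((".json", ".csv")):
        # An extension would be ambiguous here, so use ?_format= instead
        return "{}?_format={}".format(path, fmt)
    return "{}.{}".format(path, fmt)
```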
https://github.com/pytest-dev/pytest/issues/1875 made it impossible to declare
a function as a fixture multiple times, which we were doing across different
modules. The fix was to move our @pytest.fixture calls into decorators in the
tests/fixtures.py module.
Removed the --page_size= argument to datasette serve in favour of:
datasette serve --config default_page_size:50 mydb.db
Added new help section:
$ datasette --help-config
Config options:
default_page_size Default page size for the table view
(default=100)
max_returned_rows Maximum rows that can be returned from a table
or custom query (default=1000)
sql_time_limit_ms Time limit for a SQL query in milliseconds
(default=1000)
default_facet_size Number of values to return for requested facets
(default=30)
facet_time_limit_ms Time limit for calculating a requested facet
(default=200)
facet_suggest_time_limit_ms Time limit for calculating a suggested facet
(default=50)
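A sketch of parsing the name:value pairs passed to --config, using the
defaults from the help output above. parse_config is a hypothetical name and
the error handling is an assumption:

```python
DEFAULTS = {
    "default_page_size": 100,
    "max_returned_rows": 1000,
    "sql_time_limit_ms": 1000,
    "default_facet_size": 30,
    "facet_time_limit_ms": 200,
    "facet_suggest_time_limit_ms": 50,
}


def parse_config(pairs):
    # pairs is a list of strings such as "default_page_size:50"
    config = dict(DEFAULTS)
    for pair in pairs:
        name, sep, value = pair.partition(":")
        if not sep:
            raise ValueError("Expected name:value, got {!r}".format(pair))
        if name not in DEFAULTS:
            raise ValueError("Unknown config option: {}".format(name))
        # All of the options listed above are integers
        config[name] = int(value)
    return config
```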
Replaced the --max_returned_rows and --sql_time_limit_ms options to
"datasette serve" with a new --limit option, which supports a larger
list of limits.
Example usage:
datasette serve --limit max_returned_rows:1000 \
--limit sql_time_limit_ms:2500 \
--limit default_facet_size:50 \
--limit facet_time_limit_ms:1000 \
--limit facet_suggest_time_limit_ms:500
New docs: https://datasette.readthedocs.io/en/latest/limits.html
Closes #270
Closes #264
Every now and then a test will fail in Travis CI on Python 3.5 because it hit
the default 20ms SQL time limit.
Test fixtures now default to a 200ms time limit, and we only use the 20ms time
limit for the specific test that tests query interruption. This should make
our tests on Python 3.5 in Travis much more stable.
* Default is now ?_shape=arrays (renamed from lists)
* New ?_shape=array returns an array of objects as the root object
* Changed ?_shape=object to return the object as the root
* Updated docs
* New --plugins-dir=plugins/ option
New option causing Datasette to load and evaluate all of the Python files in
the specified directory and register any plugins that are defined in those
files.
This new option is available for the following commands:
datasette serve mydb.db --plugins-dir=plugins/
datasette publish now/heroku mydb.db --plugins-dir=plugins/
datasette package mydb.db --plugins-dir=plugins/
* Unit tests for --plugins-dir=plugins/
Closes #211
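A sketch of the loading step behind --plugins-dir, using importlib from the
standard library. load_plugin_modules is a hypothetical name, and registering
each module with the plugin manager is left out:

```python
import importlib.util
from pathlib import Path


def load_plugin_modules(plugins_dir):
    # Import every .py file in the directory and return the module objects
    modules = []
    for path in sorted(Path(plugins_dir).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        # Executing the module evaluates the file, defining any hooks in it
        spec.loader.exec_module(module)
        modules.append(module)
    return modules
```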
We now display sort options as a select box plus a descending checkbox, which
means you can apply sort orders even in portrait mode on a mobile phone where
the column headers are hidden.
Closes #199
You can now use metadata.json to explicitly set which columns in a table can
be used for sorting with the _sort and _sort_desc arguments:
{
"databases": {
"database1": {
"tables": {
"example_table": {
"sortable_columns": [
"height",
"weight"
]
}
}
}
}
}
Refs #189
Verifies that they match an existing column, and only one or the other option
is provided - refs #189
Uses a new DatasetteError exception that closes #193
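A sketch of the validation described above. The DatasetteError class here is
a local stand-in, and resolve_sort and the error messages are hypothetical:

```python
class DatasetteError(Exception):
    # Local stand-in for the exception named in the message above
    pass


def resolve_sort(args, sortable_columns):
    # args is a dict of querystring arguments
    sort = args.get("_sort")
    sort_desc = args.get("_sort_desc")
    if sort and sort_desc:
        raise DatasetteError("Cannot use _sort and _sort_desc at the same time")
    column = sort or sort_desc
    if column is None:
        return None
    if column not in sortable_columns:
        raise DatasetteError("Cannot sort table by {}".format(column))
    # Second item is True when descending order was requested
    return column, bool(sort_desc)
```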
New _shape= parameter replacing old .jsono extension
Now instead of this:
/database/table.jsono
We use the _shape parameter like this:
/database/table.json?_shape=objects
Also introduced a new _shape called 'object' which looks like this:
/database/table.json?_shape=object
Returning an object for the rows key:
...
"rows": {
"pk1": {
...
},
"pk2": {
...
}
}
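A sketch of how the two new shapes differ, assuming rows arrive as tuples
alongside a list of column names. shape_rows, the pk argument and the
"columns" key in the fallback are assumptions for illustration:

```python
def shape_rows(columns, rows, shape, pk=None):
    objects = [dict(zip(columns, row)) for row in rows]
    if shape == "objects":
        # A list of {column: value} objects under the rows key
        return {"rows": objects}
    if shape == "object":
        # An object keyed by primary key under the rows key
        return {"rows": {str(obj[pk]): obj for obj in objects}}
    # Fallback: each row as a plain list of values
    return {"columns": columns, "rows": [list(row) for row in rows]}
```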
Refs #122
If you set the source_url/license_url/source/license fields in your root
metadata those values will now be inherited all the way down to the database
and table templates.
The title/description are NOT inherited.
Also added unit tests for the HTML generated by the metadata.
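A sketch of the inheritance rule, assuming metadata is a nested dict in the
metadata.json layout. resolve_metadata is a hypothetical helper, not the
function used by the templates:

```python
INHERITED_KEYS = ("source", "source_url", "license", "license_url")


def resolve_metadata(root, database=None, table=None):
    db_meta = root.get("databases", {}).get(database, {}) if database else {}
    table_meta = db_meta.get("tables", {}).get(table, {}) if table else {}
    resolved = {}
    for key in INHERITED_KEYS:
        # Walk from root down so more specific levels override
        for level in (root, db_meta, table_meta):
            if key in level:
                resolved[key] = level[key]
    # title/description are NOT inherited: only the requested level counts
    most_specific = table_meta if table else db_meta if database else root
    for key in ("title", "description"):
        if key in most_specific:
            resolved[key] = most_specific[key]
    return resolved
```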
Refs #185
Refs #153
Every template now gets CSS classes in the body designed to support custom
styling.
The index template (the top level page at /) gets this:
<body class="index">
The database template (/dbname/) gets this:
<body class="db db-dbname">
The table template (/dbname/tablename) gets:
<body class="table db-dbname table-tablename">
The row template (/dbname/tablename/rowid) gets:
<body class="row db-dbname table-tablename">
The db-x and table-x classes use the database or table names themselves IF
they are valid CSS identifiers. If they aren't, we strip any invalid
characters out and append a 6 character md5 digest of the original name, in
order to ensure that multiple tables which resolve to the same stripped
character version still have different CSS classes.
Some examples (extracted from the unit tests):
"simple" => "simple"
"MixedCase" => "MixedCase"
"-no-leading-hyphens" => "no-leading-hyphens-65bea6"
"_no-leading-underscores" => "no-leading-underscores-b921bc"
"no spaces" => "no-spaces-7088d7"
"-" => "336d5e"
"no $ characters" => "no--characters-59e024"
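A sketch of the rule described above: strip invalid characters, then append
the first 6 characters of the md5 hex digest of the original name. The regular
expressions are assumptions about what counts as a valid identifier here:

```python
import hashlib
import re

# Assumed definition of a "valid" class: letters first, then safe characters
css_class_re = re.compile(r"^[a-zA-Z]+[_a-zA-Z0-9-]*$")
css_invalid_chars_re = re.compile(r"[^a-zA-Z0-9_\-]")


def to_css_class(s):
    if css_class_re.match(s):
        return s
    md5_suffix = hashlib.md5(s.encode("utf8")).hexdigest()[:6]
    # Strip leading underscores and hyphens
    s = s.lstrip("_").lstrip("-")
    # Replace runs of whitespace with single hyphens
    s = "-".join(s.split())
    # Remove any remaining invalid characters
    s = css_invalid_chars_re.sub("", s)
    # Join the cleaned name and suffix, skipping the name if nothing is left
    return "-".join(bit for bit in (s, md5_suffix) if bit)
```

Hashing the original name rather than the stripped one is what keeps two names
that strip to the same text from sharing a class.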
The row page for rows that linked to the same table from more than one
column was displayed incorrectly. Fixed that and added a test.
Also introduced /db/table/row-pk.json?_extras=foreign_key_tables
This is used by the new unit test, but is the first example of a new
?_extras=comma-separated-list pattern I am introducing.
This:
?_filter_column_1=name&_filter_op_1=contains&_filter_value_1=hello
&_filter_column_2=age&_filter_op_2=gte&_filter_value_2=12
Now redirects to this:
?name__contains=hello&age__gte=12
This is needed for the filter editing interface, refs #86
If filter_op contains a __ the value is set to the right hand side.
e.g.
?_filter_column=col&_filter_op=isnull__1&_filter_value=x
Redirects to:
?col__isnull=1
Refs #86
Part of implementing the filters UI (refs #86) - the following:
/trees/Trees?_filter_column=SiteOrder&_filter_op=gt&_filter_value=2
Now redirects to this:
/trees/Trees?SiteOrder__gt=2
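A sketch of the _filter_column / _filter_op / _filter_value rewriting,
covering the numbered form and the __ shortcut. filter_redirect_query is a
hypothetical name, and defaulting a missing op to "exact" is an assumption:

```python
from urllib.parse import parse_qsl, urlencode


def filter_redirect_query(query_string):
    pairs = parse_qsl(query_string, keep_blank_values=True)
    args = dict(pairs)
    # Keep every argument that is not part of the filter form
    new_pairs = [(k, v) for k, v in pairs if not k.startswith("_filter_")]
    # Suffix is "" for the plain form, "_1", "_2", ... for the numbered form
    suffixes = sorted(
        k[len("_filter_column"):] for k in args if k.startswith("_filter_column")
    )
    for suffix in suffixes:
        column = args["_filter_column" + suffix]
        op = args.get("_filter_op" + suffix, "exact")
        value = args.get("_filter_value" + suffix, "")
        if "__" in op:
            # e.g. isnull__1: the right hand side replaces the value
            op, value = op.split("__", 1)
        if column:
            new_pairs.append(("{}__{}".format(column, op), value))
    return urlencode(new_pairs)
```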
Added a unit test for the sql_time_limit_ms option.
To test this, I needed to add a custom SQLite sleep() function. I've added a
simple mechanism to the Datasette class for registering custom functions.
I also had to modify the sqlite_timelimit() function. It makes use of a magic
value, N, which is the number of SQLite virtual machine instructions that
should execute in between calls to my termination decision function.
The value of N was not finely grained enough for my test to work - so I've
added logic that says that if the time limit is less than 50ms, N is set to 1.
This got the tests working.
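A sketch of the time limit mechanism using sqlite3's progress handler. The
50ms threshold comes from the message above; the default N of 1000 and the
connect_with_sleep helper are assumptions for illustration:

```python
import sqlite3
import time
from contextlib import contextmanager


@contextmanager
def sqlite_timelimit(conn, ms):
    deadline = time.monotonic() + ms / 1000
    # N is the number of SQLite VM instructions between handler calls;
    # 1000 is an assumed default, 1 is used for limits under 50ms
    n = 1 if ms < 50 else 1000

    def handler():
        # A non-zero return value tells SQLite to abort the query
        return 1 if time.monotonic() >= deadline else 0

    conn.set_progress_handler(handler, n)
    try:
        yield
    finally:
        # Remove the handler so later queries are not affected
        conn.set_progress_handler(None, n)


def connect_with_sleep():
    # Register a custom sleep() SQL function, as used by the test
    conn = sqlite3.connect(":memory:")
    conn.create_function("sleep", 1, time.sleep)
    return conn
```

A query that runs past the deadline inside the with block raises
sqlite3.OperationalError.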
Refs #95
If someone executes 'select * from table' against a table with a million rows
in it, we could run into problems: just serializing that much data as JSON is
likely to lock up the server.
Solution: we now have a hard limit on the maximum number of rows that can be
returned by a query. If that limit is exceeded, the server will return a
`"truncated": true` field in the JSON.
This limit can be optionally controlled by the new `--max_returned_rows`
option. Setting that option to 0 disables the limit entirely.
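A sketch of one way to enforce the limit: fetch one row more than the maximum
and use the extra row to detect truncation. execute_with_limit is a
hypothetical name, not necessarily how Datasette does it:

```python
import sqlite3


def execute_with_limit(conn, sql, max_returned_rows=1000):
    cursor = conn.execute(sql)
    if max_returned_rows == 0:
        # 0 disables the limit entirely
        return cursor.fetchall(), False
    # Fetch one extra row so we can tell whether more were available
    rows = cursor.fetchmany(max_returned_rows + 1)
    truncated = len(rows) > max_returned_rows
    return rows[:max_returned_rows], truncated
```

The truncated flag can then be copied straight into the JSON response.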
Closes #69