Simon Willison’s Weblog


Running Datasette on Glitch

23rd April 2019

The worst part of any software project is setting up a development environment. It’s by far the biggest barrier for anyone trying to get started learning to code. I’ve been a developer for more than twenty years and I still feel the pain any time I want to do something new.

Glitch is the most promising attempt I’ve ever seen at tackling this problem. It provides an entirely browser-based development environment that allows you to edit code, see the results instantly and view and remix the source code of other people’s projects.

It’s developed into a really fun, super-creative community and a fantastic resource for people looking to get started in the ever-evolving world of software development.

This evening I decided to get Datasette running on it. I’m really impressed with how well it works, and I think Glitch provides an excellent environment for experimenting with Datasette and related tools.

TLDR version: visit https://glitch.com/edit/#!/remix/datasette-csvs right now, drag-and-drop in a CSV file and watch it get served by Datasette on Glitch just a few seconds later.

Running Python on Glitch

The Glitch documentation is all about Node.js and JavaScript, but they actually have very solid Python support as well.

Every Glitch project runs in a container that includes Python 2.7.12 and Python 3.5.2, and you can use pip install --user or pip3 install --user to install Python dependencies.

The key to running non-JavaScript projects on Glitch is the glitch.json file format. You can use this to specify an install script, which sets up your container, and a start script, which starts your application running. Glitch will route HTTP traffic to port 3000, so your application server needs to listen on that port.
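
To make the port requirement concrete, here is a minimal sketch of a Python web application that listens on port 3000 using only the standard library. The names HelloHandler and make_server are made up for this illustration, and the code sticks to features available in Python 3.5 since that is the version mentioned above.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

GLITCH_PORT = 3000  # the port Glitch routes HTTP traffic to, per the text above


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short text body."""

    def do_GET(self):
        body = b"Hello from Python on Glitch\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging to stderr.
        pass


def make_server(port=GLITCH_PORT):
    """Create (but do not start) a server bound to the given port."""
    return HTTPServer(("0.0.0.0", port), HelloHandler)
```

Calling make_server().serve_forever() at the bottom of a script, and pointing the "start" entry of glitch.json at that script, should be enough for a hello-world project. Datasette makes this unnecessary because its -p option does the same job.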

This means the most basic Glitch project to run Datasette looks like this:

https://datasette-basic.glitch.me/ (view source)

It contains a single glitch.json file:

{
    "install": "pip3 install --user datasette",
    "start": "datasette -p 3000"
}

This installs Datasette using pip3, then runs it on port 3000.

Since there’s no actual data to serve, this is a pretty boring demo. The most interesting page is this one, which shows the installed versions of the software:

https://datasette-basic.glitch.me/-/versions

Something more interesting: datasette-csvs

Let’s build one with some actual data.

My csvs-to-sqlite tool converts CSV files into a SQLite database. Since it’s also written in Python we can run it against CSV files as part of the Glitch install script.
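
If the conversion step feels like a black box, the core idea can be sketched with Python’s standard csv and sqlite3 modules. This is not how csvs-to-sqlite is implemented (the real tool does a lot more, such as handling many files at once), and load_csv is a name invented for this example. It assumes the first row of the CSV holds the column names and that every row has the same number of fields.

```python
import csv
import sqlite3


def quote_identifier(name):
    """Quote a table or column name for use in SQLite SQL."""
    return '"{}"'.format(name.replace('"', '""'))


def load_csv(conn, table, csv_path):
    """Create a table named after `table` and fill it from one CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)  # assumption: first row is the header
        columns = ", ".join(quote_identifier(name) for name in header)
        placeholders = ", ".join("?" for _ in header)
        conn.execute(
            "CREATE TABLE {} ({})".format(quote_identifier(table), columns)
        )
        conn.executemany(
            "INSERT INTO {} VALUES ({})".format(
                quote_identifier(table), placeholders
            ),
            reader,
        )
    conn.commit()
```

Every value ends up stored as text in this sketch, which is one of several reasons to prefer the real tool.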

Glitch provides a special directory called .data/ which can be used as a persistent file storage space that won’t be cleared in between restarts. The following "install" script installs datasette and csvs-to-sqlite, then runs the latter to create a SQLite database from all available CSV files:

{
    "install":  "pip3 install --user datasette csvs-to-sqlite && csvs-to-sqlite *.csv .data/csv-data.db",
    "start": "datasette .data/csv-data.db -p 3000"
}

Now we can simply drag and drop CSV files into the root of the Glitch project and they will be automatically converted into a SQLite database and served using Datasette!

We need a couple of extra details. Firstly, we want Glitch to automatically rebuild the database file any time a new CSV file is added or an existing CSV file is changed. We can do that by adding a "watch" block to glitch.json:

"watch": {
    "install": {
        "include": [
            "\\.csv$"
        ]
    }
}

This ensures that our "install" script will run again any time a CSV file changes.
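
The double backslash in that pattern is JSON escaping: once the file is parsed, the pattern is \.csv$, which matches a literal dot followed by csv at the end of the name. Here is a small sketch that checks this, using Python’s re module as a stand-in for whichever regular expression engine Glitch uses. That substitution is an assumption, as is the idea that the pattern is tested against a plain file name, and triggers_install is a made-up helper.

```python
import json
import re

# The same "include" pattern as above, as it appears inside the JSON file.
watch_fragment = r'{"install": {"include": ["\\.csv$"]}}'

# After JSON decoding, the two backslashes collapse into one.
pattern_text = json.loads(watch_fragment)["install"]["include"][0]
pattern = re.compile(pattern_text)


def triggers_install(filename):
    """Return True if a change to this file would match the watch pattern."""
    return pattern.search(filename) is not None
```

With this, triggers_install("example.csv") is true, while "example.csv.bak" and "metadata.json" do not match.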

Let’s also tone down the rate at which the scripts execute, by using throttle to set the polling interval to 1000 milliseconds, or once a second:

"throttle": 1000

The above almost worked, but I started seeing errors if I changed the number of columns in a CSV file, since doing so clashed with the schema that had already been created in the database.
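
The general shape of that problem is easy to reproduce with sqlite3 alone. This sketch is not necessarily the exact error that csvs-to-sqlite raises. It only shows what happens when a row with three values meets a table that was created with two columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The table as it was created from the first version of the CSV file.
conn.execute("CREATE TABLE example (name, score)")
conn.execute("INSERT INTO example VALUES (?, ?)", ("alice", "10"))

# The CSV file has since gained a third column.
try:
    conn.execute("INSERT INTO example VALUES (?, ?, ?)", ("bob", "7", "blue"))
    clash = None
except sqlite3.OperationalError as error:
    clash = str(error)  # SQLite complains about the column count
```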

My solution was to add code to the install script that would delete the SQLite database file before attempting to recreate it—using the rm ... || true idiom to prevent Glitch from failing the installation if the file it attempted to remove did not already exist.
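
Here is a quick way to see what the idiom does, by running small shell commands from Python and comparing exit codes. The exit_code helper and the missing.db file name are invented for this example, and it assumes a Unix-like shell with rm available.

```python
import subprocess
import tempfile


def exit_code(command, cwd):
    """Run a shell command in cwd and return its exit status."""
    result = subprocess.run(
        command,
        shell=True,
        cwd=cwd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


with tempfile.TemporaryDirectory() as workdir:
    # rm fails because there is nothing to delete.
    plain = exit_code("rm missing.db", workdir)
    # "|| true" swallows that failure, so the overall status is 0.
    guarded = exit_code("rm missing.db || true", workdir)
    # Because the status is 0, a following "&&" step still runs.
    chained = exit_code("rm missing.db || true && echo rebuilt", workdir)
```

One thing to be aware of: the shell evaluates && and || left to right with equal precedence, so in a chain like a && rm file || true && b the || true also swallows a failure of a.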

My final glitch.json file looks like this:

{
  "install": "pip3 install --user datasette csvs-to-sqlite && rm .data/csv-data.db || true && csvs-to-sqlite *.csv .data/csv-data.db",
  "start": "datasette .data/csv-data.db -p 3000 -m metadata.json",
  "watch": {
    "install": {
      "include": [
        "\\.csv$"
      ]
    },
    "restart": {
      "include": [
        "^metadata.json$"
      ]
    },
    "throttle": 1000
  }
}

I also set it up to use Datasette’s metadata.json format (via the -m option in the start script), and to automatically restart the server any time the contents of that file change.
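
For anyone who has not seen one, here is a sketch that writes a small metadata.json file. The title and descriptions are placeholder values, and write_metadata is a made-up helper. The "csv-data" and "example" keys assume that Datasette names the database after the csv-data.db file and that the table is named after example.csv.

```python
import json

# Placeholder metadata: every value here is hypothetical.
metadata = {
    "title": "My CSV files",
    "description": "CSV files dropped into a Glitch project",
    "databases": {
        "csv-data": {  # assumed to match .data/csv-data.db
            "tables": {
                "example": {  # assumed to match example.csv
                    "description": "Rows from example.csv"
                }
            }
        }
    },
}


def write_metadata(path="metadata.json"):
    """Write the metadata dictionary to disk as indented JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(metadata, f, indent=2)
```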

https://datasette-csvs.glitch.me/ (view source) shows the results, running against a simple example.csv file I created.

Remixing!

Here’s where things get really fun: Glitch projects support “remixing”, whereby anyone can click a link to create their own editable copy of a project.

Remixing works even if you aren’t logged in to Glitch! Anonymous projects expire after five days, so be sure to sign in with GitHub or Facebook if you want to keep yours around.

Try it out now: Visit https://glitch.com/edit/#!/remix/datasette-csvs to create your own remix of my project. Then drag a new CSV file directly into the editor and within a few seconds Datasette on Glitch will be up and running against a converted copy of your file!

Limitations

The Glitch help center article What technical restrictions are in place? describes their limits. Most importantly, projects are limited to 4,000 requests an hour—and there’s currently no way to increase that limit. They also limit projects to 200MB of disk space—easily enough to get started exploring some interesting CSV files with Datasette.
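
To put those numbers in perspective, here is a bit of arithmetic. The fits_on_disk helper is invented for this example, and it assumes that 1MB means 1,000,000 bytes and that only the listed files count towards the limit. In practice the converted SQLite database takes up space too.

```python
REQUESTS_PER_HOUR = 4000  # request limit quoted above
DISK_LIMIT_MB = 200       # disk limit quoted above

# Averaged over an hour, the request limit is a little over one per second.
average_per_second = REQUESTS_PER_HOUR / 3600
seconds_between_requests = 3600 / REQUESTS_PER_HOUR


def fits_on_disk(file_sizes_bytes, limit_mb=DISK_LIMIT_MB):
    """Return True if the files fit within the limit (1MB = 1,000,000 bytes)."""
    return sum(file_sizes_bytes) <= limit_mb * 1000 * 1000
```

That works out to one request every 0.9 seconds on average, which is plenty for a demo but not for anything that gets real traffic.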

Next steps

I’m delighted at how easy this was to set up, and how much power the ability to remix these Datasette demos provides. I’m tempted to start creating remixable Glitch demos that illustrate other aspects of Datasette’s functionality such as plugins or full-text search.

Glitch is an exceptionally cool piece of software. I look forward to seeing their Python support continue to evolve.