Summary: I made a map of the council’s assets

I made a map showing all land and assets owned by Cheltenham Borough Council1.

You can view it in fullscreen at chrismytton.com/cheltenham-council-land-and-assets.

Getting to that point was a fun learning process, and as a bonus I ended up with a free web interface where you can query the data as well! More details on that below.

Data! But not very visual

I was browsing through Cheltenham Council’s Open data pages the other day and came across their “Local authority land and assets open data” dataset. From what I understand reading that webpage, the council are required to maintain this list of assets, “in accordance with the mandatory requirements of the local government transparency code”.

The list of assets is provided in XLSX and CSV format. These spreadsheet-like formats are fine to quickly go through if you know what you’re looking for. Not so great for browsing the data in a more visual format.

I’ve been wanting to sharpen my GIS skills for a while, so I got to work processing the data into something that I could plot on a map.

Loading the data into Postgis with Ruby

For my first iteration I wanted to use Postgis so I could get more familiar with the various features it offers. I wrote a Ruby script that would take the CSV data, do some mild cleaning on it, load it into Postgis, and then get the data back out of Postgis as GeoJSON.

You can see that script over on GitHub.

Re-projecting British National Grid coordinates to “GPS” coordinates

The coordinates from the CSV’s GeoX and GeoY columns are re-projected in Postgis from EPSG:27700 (British National Grid, which is used on Ordnance Survey maps) to EPSG:4326 (World Geodetic System, which is what GPS uses), which is then stored in the coordinates column. Re-projection isn’t a required step. GeoJSON can handle data with a 27700 projection, and Leaflet can handle 27700 projections, but it requires a 27700 basemap as well. I could get from Ordnance Survey, but it costs money, and I wanted this project to run without an ongoing cost, if possible.

In my experience, working with 4326 data gives you a wider range of options.

This is the SQL I used to re-project the GeoX and GeoY columns and store the result in the coordinates column.

UPDATE land_and_assets
  SET coordinates = ST_Transform(
    ST_SetSRID(
      ST_MakePoint(GeoX, GeoY),
      27700
    ),
    4326
  )

SQL query to get GeoJSON out of Postgis

The other thing of note is that I used a fancy Postgis query to get the GeoJSON directly from the database, rather than having to generate it with Ruby.

SELECT json_build_object(
  'type',     'FeatureCollection',
  'features', json_agg(feature)
) AS geojson
FROM (
SELECT json_build_object(
  'type',       'Feature',
  'id',         AssetCode,
  'geometry',   ST_AsGeoJSON(coordinates)::json,
  'properties', json_build_object(
    'organisation', organisation,
    'organisationlabel', organisationlabel,
    'uprn', uprn,
    'assetcode', assetcode,
    'propertyname', propertyname,
    'streetname', streetname,
    'posttown', posttown,
    'postcode', postcode,
    'tenuretype', tenuretype,
    'tenuredetail', tenuredetail,
    'holdingtype', holdingtype
  )
) AS feature
FROM (SELECT * FROM land_and_assets) row) features;

Pretty cool that you can conjure up a GeoJSON object using SQL without any further processing.

So this Ruby script worked fine and produced some usable GeoJSON output. You can actually view the output directly on GitHub (because GitHub has support for rendering GeoJSON files).

Second attempt with GDAL/OGR

A few weeks later I was working on another project and started to get a deeper understanding and appreciation for the power of ogr2ogr from the GDAL/OGR package. From what I understand GDAL handles raster data, which is essentially geographically-labelled image-type data. OGR handles vector data, which is essentially some kind of coordinates representing points, lines, polygons (and potentially more) in a 2d or 3d space.

This tool allows you to convert between a wide range of vector GIS formats. Among those formats, to my surprise, was CSV! So you can use ogr2ogr to convert a CSV file to GeoJSON directly, without having to load things into Postgres as an intermediate step.

Because ogr2ogr lets you write SQL, we can use this for processing the rows in the input dataset into a format appropriate for the output dataset. In this case replacing some line breaks in a couple of fields, and remapping the names from the CSV to different output names.

You can see the new build script over on GitHub along with the SQL that’s used for processing.

Moving from the Ruby script to ogr2ogr has dramatically reduced the amount of code needed for this process. Now it’s just one (admittedly quite complex) SQL query and a couple of lines of bash to glue things together.

Putting it on a map

As mentioned GitHub has built in support for viewing GeoJSON files. This used to be powered by Mapbox and was quite slick, but it’s now powered by Azure Maps and is quite frankly inferior to the Mapbox version and a bit clunky.

So I wanted to put the markers on a map that I controlled.

The GeoJSON file was acceptably small once gzipped, so I was happy to just pull in the whole file on page load, rather than having to do any dynamic bounding box queries.

I used Leaflet to display the map, and OpenStreetMap’s tile server for the base map.

You can browse the map by visiting chrismytton.com/cheltenham-council-land-and-assets.

Exploring the data with Datasette

Browsing the data visually on the map was the main goal for me, but I still found myself wanting to run some SQL queries on this data. I wanted to know what the distinct values were in the “Tenure” column. I wanted to know what kind of holdings there were. These are things that would be trivial to answer using SQL.

I could have just used ogr2ogr’s built-in SQL to query the data, but this is a bit clunky because it’s designed specifically for looking at geodata.

Enter Datasette. Specifically Datasette Lite, which allows you to load in a CSV file from a URL and run SQL against it.

In this case there was some kind of encoding issue with the CSV file which meant it couldn’t be loaded by Datasette Lite directly. One way to work around this was to convert the CSV to a SQLite database and fix the encodings along the way.

Because I was now using org2ogr, I had everything I needed to convert the CSV data into a multitude of other vector data formats. A simple one line addition to the bin/build script was all it took to output a SQLite database alongside the GeoJSON.

With that change in place it’s now possible to query the data using SQL via Datasette Lite’s web interface, without having to download anything to your computer.

Future enhancements

This has definitely been a learning experience! This process follows the classic ETL (Extract, Transform, Load) pattern, but using tools that are specifically designed for geodata. While I’m pleased with how things are working there are definitely some things I’d like to improve.

  • Have a simpler base map than the OpenStreetMap one, which can look a bit cluttered at times
  • Switch to using Maplibre. It’s fast, it’s fun, and having used Mapbox for a while it’s familiar. It also offers smoother zooming, built-in hash-tracking on location change, as well as changing bearing and angle.
  1. Well, all the land and assets they are legally required to declare, at least. Some things like operational railways, canals and highways, as well as “assets of national security” are excluded from this list. For more details see the “Local authority land” section of the Local government transparency code 2015