Category: File Conversion

Converting a transit agency’s GTFS to shapefile and GeoJSON with QGIS

Many years ago I wrote a tutorial on how to use an ArcGIS plugin to convert a transit agency’s GTFS package – a group of files that describe when and where their buses and trains stop – into files that could easily be manipulated by popular GIS desktop software.

That was so long ago, before I became an expert in using QGIS, a free and open source alternative to ArcGIS.

This tutorial will show you how to convert GTFS to a shapefile and to GeoJSON so you can edit and visualize the transit data in QGIS.

Prerequisites

First you’ll need to have QGIS installed on your computer (it works with Linux, Mac, and Windows). Second you’ll need a GTFS package for the transit agency of your choice (here’s the one for Pace Suburban Bus*, which operates all suburban transit buses in Chicagoland). You can find another transit agency around the world on the GTFS Data Exchange website.

Section 1: Let’s start

  1. Open QGIS.
  2. Load your GTFS data into the QGIS table of contents (also called the Layers Panel). Click Layer>Add Layer>Add Delimited Text Layer. You will be adding one or two files depending on which ones are provided.

    QGIS add delimited text layer

    Add delimited text layer.

  3. Now, here it can get tricky. Not all transit agencies provide a “shapes.txt” file. The shapes.txt file draws out the routes of buses and trains. If it’s not provided, that’s fine, but if you turn them into routes based on the stops.txt data, then you will have funny looking and impossible routes.

    QGIs browse for the stops.txt file

    Browse for the stops.txt file

  4. Click on “Browse…” and find the “stops.txt”. QGIS will read the file very quickly and determine which fields hold the latitude and longitude coordinates. If its determination is wrong, you can choose a different “X field” (longitude) and “Y field” (latitude).
  5. Click “OK”. A new dialog box will appear asking you to choose a coordinate reference system (EPSG). Choose or filter for “WGS 84, EPSG:4326”. Then click “OK”.
  6. The Pace bus stops in the Chicagoland region are now drawn in QGIS!

    Pace bus stops are shown

    Pace bus stops are shown

  7. If the GTFS package you downloaded includes a “shapes.txt” file (that represents the physical routes and paths that the buses or trains take), import that file also by repeating steps 4 and 5.

Section 2: Converting the stops

It’s really easy now to convert the bus or train stops into a shapefile or GeoJSON representing all of those points.

  1. Right-click the layer “stops” in the table of contents (Layers Panel) and click “Save As…”.
  2. In the “Save vector layer as…” dialog box, choose the format you want, either “ESRI Shapefile” or “GeoJSON”. **
  3. Then click “Browse” to tell QGIS where in your computer’s file browser you want to save the file. Leave the “CRS” as-is (EPSG:4326).

    Convert the bus stops to a shapefile or GeoJSON.

    Convert the Pace bus stops to a shapefile or GeoJSON.

  4. Then click “OK” and QGIS will quickly report that the file has been converted and saved where you specified in step 3.

Section 3: Converting the bus or train routes

The “shapes.txt” file is a collection of points that when grouped by their route number, show the physical routes and paths that buses and trains take. You’ll need a plugin to make the lines from this data.

  1. Install the plugin “Points to Paths”. Click on Plugins>Manage and Install Plugins… Then click “All” and search for “points”. Click the “Points to Paths” plugin and then click the “Install plugin” button. Then click “Close”.

    Install the Points to Paths plugin.

    Install the Points to Paths plugin.

  2. Pace bus doesn’t provide the “shapes.txt” file so we’ll need to find a new GTFS package. Download the GTFS package provided by the Chicago Transit Authority, which has bus and rail service in Chicago and the surrounding municipalities.
  3. Load the CTA’s “shapes.txt” file into the table of contents (Layers Panel) by following steps 4 and 5 in the first section of this tutorial.  Note that this data includes both the bus routes and the train routes.

    QGIS load CTA bus and train stops

    Import CTA bus and train stops into QGIS

  4. Now let’s start the conversion process. Click on Plugins>Points to Paths. In the next dialog box choose the “shapes” layer as your “Input point layer”.
  5. Select “shape_id” as the field with which you want to “Point group field”. This tells the plugin how to distinguish one bus route from the next.
  6. Select “shape_pt_sequence” as the field with which you want to “Point order field”. This tells the plugin in what order the points should be connected to form the route’s line.
  7. Click “Browse” to give the converted output shapefile a name and a location with your computer’s file browser.
  8. Make sure all  of the options look like the one in this screenshot and then click “OK”. QGIS and the plugin will start working to piece together the points into lines and create a new shapefile from this work.

    These are the options you need to set to convert the CTA points (stops) to paths (routes).

    These are the options you need to set to convert the CTA points to paths (routes).

  9. You’ll know it’s finished when the hourglass or “waiting” cursor returns to a pointer, and when you see a question asking if you would like the resulting shapefile added to your table of contents (Layers Panel). Go ahead and choose “Yes”.

    QGIS: CTA bus and train points are converted to paths (routes)

    The CTA bus and train points, provided in a GTFS package, have been converted to paths (routes/lines).

  10. Now follow steps 1-4 from Section 3 to convert the routes/lines data to a shapefile or GeoJSON file.**

Notes

* As of this writing, the schedules in Pace’s GTFS package are accurate as of January 18, 2016. It appears their download link always points to the latest version. Transit schedules typically change several times each year. Pace says, “Only one package is posted at any given time, typically representing Pace service from now until a couple of months in the future. Use the Calendar table to see on which days and dates service in the Trips table are effective.”

** Choose GeoJSON if you want to show this data on a web map (like in Leaflet or the Google Maps API), or if you want to share the data on GitHub.

How to convert bike-share JSON data to CSV and then to shapefile

Update January 4, 2013: The easiest way to do this is to use Ian Dees’s Divvy API as it outputs straight to GeoJSON (which QGIS likes). See below.

For Michael Carney’s Divvy bike-share stations + Census tract + unbanked Chicagoans analysis and map he needed the Divvy station locations as a shapefile. I copied the JSON-formatted text of the Divvy real-time station API, converted it to CSV with OpenRefine, and then created a shapefile with QGIS.

Here’s how to create a shapefile of any bike-share system that uses hardware from Public Bike System Co based on Montréal and is operated by Alta Bicycle Share (this includes New York City, Chattanooga, Bay Area, Melbourne, and Chicago): Continue reading

Results of my personal #editathon this weekend

I added a lot of parking lots to OpenStreetMap this weekend, but I also added the Willow Creek Community Church (South Barrington, Illinois) parking lots, driveways, buildings, and retention ponds. I’ll let my before and after screenshots show you what I did. Ian Dees, local organizer for Chicagoland OSM data – he has other roles, too – said there’s an application that can generate these images automatically.

View in OpenStreetMap now.

I’m also writing a draft tutorial on how to convert GIS data stored as a shapefile to a format you can import into AutoCAD. GRASS will take a .shp and convert it to .dxf (a geo-aware CAD file).

How to upload shapefiles to Google Fusion Tables

It is now possible to upload a shapefile (and its companion files SHX, PRJ, and DBF) to Google Fusion Tables (GFT).

Before we go any further, keep in mind that the application that does this will only process 100,000 rows. Additionally, GFT only gives each user 200 MB of storage (and they don’t tell you your current status, that I can see).

  1. Login to your Google account (at Gmail, or at GFT).
  2. Prepare your data. Ensure it has fewer than 100,000 rows.
  3. ZIP up your dataX.shp, dataX.shx, dataX.prj, and dataX.dbf. Use WinZip for Windows, or for Mac, right-click the selection of files and select “Compress 4 items”.
  4. Visit the Shape to Fusion website. You will have to authorize the web application to “grant access” to your GFT tables. It needs this access so that after the web application processes your data, it can insert it into GFT.
  5. If you want a Centroid Geometry column or a Simplified Geometry column added, click “Advanced Options” and check their checkboxes – see notes below for an explanation.
  6. Choose the file to upload and click Upload.
  7. Leave the window open until it says it has processed all of the rows. It will report “Processed Y rows and inserted Y rows”. You will be given a link to the GFT the web application created.

Sample Data

If you’re looking to give this a try and see results quickly, try some sample data from the City of Chicago data portal:

Notes

I had trouble many times while using Shape to Fusion in that after I chose the file to upload and clicked Upload, I had to grant access to the web application again and start over (choose the file and click Upload a second time).

Centroid Geometry – This creates a column with the geographic coordinates of the centroid in a polygon. It lists it in the original projection system. So if your projection is in feet, the value will be in feet. This is a function that can easily be performed in free and open source QGIS, where you can also reproject files to get latitude and longitude values (in WGS84 project, EPSG 4326). The centroid value is surrounded in the field by KML syntax “<Point><coordinates>X,Y</coordinates></Point>”.

Simplified Geometry – A geometry column is automatically created by the web application (or GFT, I’m not sure). This function will create a simpler version of that geometry, with fewer lines and vertices. It also creates columns to list the vertices count for the simple and regular geometry columns.

Free online GIS tools: An introduction to GeoCommons

Read my tutorial on how I created the pedestrian map with GeoCommons. Read on for an introduction to GeoCommons and online GIS tools.

GeoCommons, like Google My Maps and Earth, is part of the “poor man’s GIS package.” It’s another tool that provides (few) of the functions that desktop GIS software offers. But it excels at making simple and somewhat complex maps.

I first used GeoCommons over a year ago. I started using it because it would convert whatever data you uploaded into another format that was probably more useful. I mentioned it in this article about converting files. For example, if you have a KML file, you can upload it and export it as a shapefile for GIS programs, or a CSV file to load into a table editor or spreadsheet application.

After creating the Chicago bike crash maps using Google Fusion Tables, I wanted to try out another map-making web application, one that provided more customization and prettier maps.

I found that web application and created a version of the bike crash maps, with several other data layers, in GeoCommons. I overlaid bike counts and bikeways so you can observe some relationships between each visual dataset. My latest map (screenshot below), created Wednesday, shows pedestrian counts in downtown Chicago overlaid with CTA and downtown Metra stations, as well as the 48 intersections with the most pedestrian collisions (from this UNC study, PDF).

Screenshot of pedestrian count map described above.

How these online GIS tools can be useful to you

I bet there’s a way you can use Google Fusion Tables and GeoCommons for your job or project. They’re extremely simple to use: they can take in data from the spreadsheets you’re already working on and turn them into themed reference maps. With mapping, you can do simple, visual analysis that doesn’t require statistical software or knowledge.

Imagine plotting your client list on a map and grouping them by age to see if perhaps your younger clients tend to live in the same neighborhoods of town, or if they’re more diverse (should you do this, keep the map private, something that you can’t do in GeoCommons – yet).

You may also find it useful if you want to create a route for your salespeople or for visiting church members at their homes. Plot all the addresses on a map, then manually filter them into different groups based on the clusters you see. With Google Fusion Tables, you can easily add a new column with the GROUP information and apply a numbered or lettered group and then re-sort.

Other things you can do in GeoCommons

  • Merge tables with geography – I uploaded two datasets: a table containing census tract IDs and demographic information for Cook County I downloaded from the American FactFinder 2; and a shapefile containing Cook County census tracts boundary information. After merging them, I could download a NEW shapefile that contained both datasets.
  • Make multi-layer maps
  • Symbolize based on frequency/rate
  • Convert data – This is by far the most useful feature. It imports “shapefiles (SHP), comma separated values (CSV), Keyhole Markup Language (KML), and GeoRSS” and exports “Shapefile, CSV, KML, GeoRSS Atom, Spatialite, and JSON” (from the GeoCommons user manual).

Read my tutorial on how I created the pedestrian map with GeoCommons.