Converting a transit agency’s GTFS to shapefile and GeoJSON with QGIS

Many years ago I wrote a tutorial on how to use an ArcGIS plugin to convert a transit agency’s GTFS package – a group of files that describe when and where their buses and trains stop – into files that could easily be manipulated by popular GIS desktop software.

That was so long ago, before I became an expert in using QGIS, a free and open source alternative to ArcGIS.

This tutorial will show you how to convert GTFS to a shapefile and to GeoJSON so you can edit and visualize the transit data in QGIS.


First you’ll need to have QGIS installed on your computer (it works with Linux, Mac, and Windows). Second you’ll need a GTFS package for the transit agency of your choice (here’s the one for Pace Suburban Bus*, which operates all suburban transit buses in Chicagoland). You can find another transit agency around the world on the GTFS Data Exchange website.

Section 1: Let’s start

  1. Open QGIS.
  2. Load your GTFS data into the QGIS table of contents (also called the Layers Panel). Click Layer>Add Layer>Add Delimited Text Layer. You will be adding one or two files depending on which ones are provided.

    QGIS add delimited text layer

    Add delimited text layer.

  3. Now, here it can get tricky. Not all transit agencies provide a “shapes.txt” file. The shapes.txt file draws out the routes of buses and trains. If it’s not provided, that’s fine, but if you turn them into routes based on the stops.txt data, then you will have funny looking and impossible routes.

    QGIs browse for the stops.txt file

    Browse for the stops.txt file

  4. Click on “Browse…” and find the “stops.txt”. QGIS will read the file very quickly and determine which fields hold the latitude and longitude coordinates. If its determination is wrong, you can choose a different “X field” (longitude) and “Y field” (latitude).
  5. Click “OK”. A new dialog box will appear asking you to choose a coordinate reference system (EPSG). Choose or filter for “WGS 84, EPSG:4326”. Then click “OK”.
  6. The Pace bus stops in the Chicagoland region are now drawn in QGIS!

    Pace bus stops are shown

    Pace bus stops are shown

  7. If the GTFS package you downloaded includes a “shapes.txt” file (that represents the physical routes and paths that the buses or trains take), import that file also by repeating steps 4 and 5.

Section 2: Converting the stops

It’s really easy now to convert the bus or train stops into a shapefile or GeoJSON representing all of those points.

  1. Right-click the layer “stops” in the table of contents (Layers Panel) and click “Save As…”.
  2. In the “Save vector layer as…” dialog box, choose the format you want, either “ESRI Shapefile” or “GeoJSON”. **
  3. Then click “Browse” to tell QGIS where in your computer’s file browser you want to save the file. Leave the “CRS” as-is (EPSG:4326).

    Convert the bus stops to a shapefile or GeoJSON.

    Convert the Pace bus stops to a shapefile or GeoJSON.

  4. Then click “OK” and QGIS will quickly report that the file has been converted and saved where you specified in step 3.

Section 3: Converting the bus or train routes

The “shapes.txt” file is a collection of points that when grouped by their route number, show the physical routes and paths that buses and trains take. You’ll need a plugin to make the lines from this data.

  1. Install the plugin “Points to Paths”. Click on Plugins>Manage and Install Plugins… Then click “All” and search for “points”. Click the “Points to Paths” plugin and then click the “Install plugin” button. Then click “Close”.

    Install the Points to Paths plugin.

    Install the Points to Paths plugin.

  2. Pace bus doesn’t provide the “shapes.txt” file so we’ll need to find a new GTFS package. Download the GTFS package provided by the Chicago Transit Authority, which has bus and rail service in Chicago and the surrounding municipalities.
  3. Load the CTA’s “shapes.txt” file into the table of contents (Layers Panel) by following steps 4 and 5 in the first section of this tutorial.  Note that this data includes both the bus routes and the train routes.

    QGIS load CTA bus and train stops

    Import CTA bus and train stops into QGIS

  4. Now let’s start the conversion process. Click on Plugins>Points to Paths. In the next dialog box choose the “shapes” layer as your “Input point layer”.
  5. Select “shape_id” as the field with which you want to “Point group field”. This tells the plugin how to distinguish one bus route from the next.
  6. Select “shape_pt_sequence” as the field with which you want to “Point order field”. This tells the plugin in what order the points should be connected to form the route’s line.
  7. Click “Browse” to give the converted output shapefile a name and a location with your computer’s file browser.
  8. Make sure all  of the options look like the one in this screenshot and then click “OK”. QGIS and the plugin will start working to piece together the points into lines and create a new shapefile from this work.

    These are the options you need to set to convert the CTA points (stops) to paths (routes).

    These are the options you need to set to convert the CTA points to paths (routes).

  9. You’ll know it’s finished when the hourglass or “waiting” cursor returns to a pointer, and when you see a question asking if you would like the resulting shapefile added to your table of contents (Layers Panel). Go ahead and choose “Yes”.

    QGIS: CTA bus and train points are converted to paths (routes)

    The CTA bus and train points, provided in a GTFS package, have been converted to paths (routes/lines).

  10. Now follow steps 1-4 from Section 3 to convert the routes/lines data to a shapefile or GeoJSON file.**


* As of this writing, the schedules in Pace’s GTFS package are accurate as of January 18, 2016. It appears their download link always points to the latest version. Transit schedules typically change several times each year. Pace says, “Only one package is posted at any given time, typically representing Pace service from now until a couple of months in the future. Use the Calendar table to see on which days and dates service in the Trips table are effective.”

** Choose GeoJSON if you want to show this data on a web map (like in Leaflet or the Google Maps API), or if you want to share the data on GitHub.

How to extract highways and subway lines from OpenStreetMap as a shapefile

It’s possible to use Overpass Turbo to extract any object from the OpenStreetMap “planet” and convert it from a GeoJSON or KML file to a shapefile for manipulation and analysis in GIS.

Say you want the subway lines for Mexico City, and you can’t find a GTFS file that you could convert to shapefile, and you can’t find the right files on Sistema de Transporte Colectivo’s website (I didn’t look for it).

Here’s how to extract the subway lines that are shown in OpenStreetMap and save them as a GIS shapefile.

This is my second tutorial to describe using Overpass Turbo. The first extracted places of worship in Cook County. I’ve also used Overpass Turbo to extract a map of campgrounds

Extract free and open source data from OpenStreetMap

  1. Open the Overpass Turbo website and, on the map, search for the city from which you want to extract data. (The Overpass query will be generated in such a way that it’ll only search for data in the current map view.)
  2. Click the “Wizard” button in the top toolbar. (Alternatively you can copy the code below and paste it into the text area on the website and click the “Run” button.)
  3. In the Wizard dialog box, type in “railway=subway” in order to find metro, subway, or rapid transit lines. (If you want to download interstate highways, or what they call motorways in the UK, use “highway=motorway“.) Then click the “build and run query” button.
  4. In a few seconds you’ll see lines and dots (representing the metro or subway stations) on the map, and a new query in the text area. Notice that the query has looked for three kinds of objects: node (points/stations), way (the subway tracks), relation (the subway routes).
  5. If you don’t want a particular kind of object, then delete its line from the query and click the “Run” button. (You probably don’t want relation if you’re just needing GIS data for mapping purposes, and because routes are not always well-defined by OpenStreetMap contributors.)
  6. Download the data by clicking the “Export” button. Choose from one of the first three options (GeoJSON, GPX, KML). If you’re going to use a desktop GIS software, or place this data in a web map (like Leaflet), then choose GeoJSON. Now, depending on what browser you’re using, a couple things could happen after you click on GeoJSON. If you’re using Chrome then clicking it will download a file. If you’re using Safari then clicking it will open a new tab and put the GeoJSON text in there. Copy and paste this text into TextEdit and save the file as “mexico_city_subway.geojson”.
Overpass Turbo screenshot 1 of 2

Screenshot 1: After searching for the city for which you want to extract data (Mexico City in this case), click the “Wizard” button and type “railway=subway” and click run.

Overpass Turbo screenshot 2

Screenshot 2: After building and running the query from the Wizard you’ll see subway lines and stations.

Overpass Turbo screenshot 3

Screenshot 3: Click the Export button and click GeoJSON. In Chrome, a file will download. In Safari, a new tab with the GeoJSON text will open (copy and paste this into TextEdit and save it as “mexico_city_subway.geojson”).

Convert the free and open source data into a shapefile

  1. After you’ve downloaded (via Chrome) or re-saved (Safari) a GeoJSON file of subway data from OpenStreetMap, open QGIS, the free and open source GIS desktop application for Linux, Windows, and Mac.
  2. In QGIS, add the GeoJSON file to the table of contents by either dragging the file in from the Finder (Mac) or Explorer (Windows), or by clicking File>Open and browsing and selecting the file.
  3. Convert it to GeoJSON by right-clicking on the layer in the table of contents and clicking “Save As…”
  4. In the “Save As…” dialog box choose “ESRI Shapefile” from the dropdown menu. Then click “Browse” to find a place to save this file, check “Add saved file to map”, and click the “OK” button.
  5. A new layer will appear in your table of contents. In the map this new layer will be layered directly above your GeoJSON data.
Overpass Turbo screenshot 4

Screenshot 4: The GeoJSON file exported from Overpass Turbo has now been loaded into the QGIS table of contents.

Overpass Turbo screenshot 5

Screenshot 5: In QGIS, right-click the layer, select “Save As…” and set the dialog box to have these settings before clicking OK.

Query for finding subways in your current Overpass Turbo map view

This has been generated by the overpass-turbo wizard.
The original search was:
// gather results
// query part for: “railway=subway”
/*relation is for "routes", which are not always
well-defined, so I would ignore it*/
// print results
out body;
out skel qt;

How to convert bike-share JSON data to CSV and then to shapefile

Update January 4, 2013: The easiest way to do this is to use Ian Dees’s Divvy API as it outputs straight to GeoJSON (which QGIS likes). See below.

For Michael Carney’s Divvy bike-share stations + Census tract + unbanked Chicagoans analysis and map he needed the Divvy station locations as a shapefile. I copied the JSON-formatted text of the Divvy real-time station API, converted it to CSV with OpenRefine, and then created a shapefile with QGIS.

Here’s how to create a shapefile of any bike-share system that uses hardware from Public Bike System Co based on Montréal and is operated by Alta Bicycle Share (this includes New York City, Chattanooga, Bay Area, Melbourne, and Chicago): Continue reading

How many miles of roads are in your ward?

Screenshot 1: Showing how some streets are not being counted. There should be a yellow section of road between the two existing yellow road sections. 

A friend recently asked me how many blocks of road are in his ward. He wanted to know so that he could measure how many blocks of streets would have an older style of street lighting after X number of blocks receive the new style of street lighting. For this project, I used two datasets from Chicago’s open data portal: street center lines and wards. The output data is not very accurate as there may be some overlap and some uncounted street segments; this is likely due to a shortcoming in my process. I will show you how to find the number of blocks per ward using QGIS (download Quantum GIS, a free program for all OSes).

Here’s how I did it

  1. Load in the two datasets. Wards and street center lines (zipped shapefiles). They are projected in EPSG:3435.
  2. Exclude several road classifications in the street center lines by querying only for "CLASS" > '1' AND "CLASS" < '5'. The data dictionary for the road classifications is at the end. We don’t want the river, sidewalks, expressways, and any ramps to be included in the blocks per ward analysis.
  3. Intersect. In QGIS, select Vector>Geoprocessing Tools>Intersect. The input vector layer is “Transportation” (the name of the street center lines dataset) and the intersect layer is “Wards”. Save the resulting shapefile as “streets intersect wards”. Click OK. This will take a while.
  4. Add the “streets intersect wards” shapefile to the table of contents.
  5. You’ll notice some of the issues with the resulting shapefile: missing street segments (see screenshot 1). What should QGIS do if a street is a ward boundary?
  6. Obtain street length information, part 1. Remove all the columns in the “streets intersect wards” shapefile that have something to do with geometry. These are now outdated and will confuse you when you add a geometry column generated by QGIS.
  7. Obtain street length information, part 2. With the “streets intersect wards” shapefile selected in the table of contents, select Vector>Geometry Tools>Export/Add geometry columns. Select “streets intersect wards” shapefile as your input layer, leave CRS as “Layer CRS” and save as new shapefile “streets intersect wards geom”.
  8. Add the “streets intersect wards geom” shapefile to the table of contents.
  9. You will see a new column at the end of the attribute table called LENGTH. Since the data is projected in EPSG:3435 (Illinois StatePlane NAD83 East Feet), the unit is feet.
  10. Simply export “streets intersect wards geom” to a CSV file and open the CSV file in a spreadsheet application. From there you can group the data by Ward number and add the street lengths together. (I thought it would be faster to do this in a database so I imported it into a localhost MySQL database and ran a simple query, SELECT wardNum, sum(`chistreets_classes234`.`LENGTH`) as sum FROM chistreets_classes234 WHERE ward > 0 group by wardNum. I then exported this to a spreadsheet to convert feet to miles.)

Because of the errors described in step 5, you shouldn’t use this analysis for any application where accuracy is important. There are road lengths missing in the output dataset (table with street lengths summed by ward) and I cannot tell if the inaccuracy is equally distributed.

[table id=7 /]

Wards 19 (south side) and 41 (Norwood Park, including O’Hare airport) have the highest portion of street length in the city.

Screenshot 2: Ward 41 is seen. 

Street data dictionary

Column is “CLASS”. The value is a string. This dataset lacks alleys. Adapted from the City’s data dictionary.

1. Expressway

2. Arterials (1 mile grid, no diagonals)

3. Collectors (includes diagonals)

4. Other streets (side streets, neighborhood streets)

5. Named alleys (mostly downtown, like Couch Place and Garland Place)

7. Tiered (lower level streets, including LaSalle, Michigan, Columbus, and Wacker)

9. Ramps (goes along with expressway)

E. Extent (not sure how to describe these; includes riverwalk and lake walk segments, and Navy Pier, also includes some streets, like Mies van der Rohe Way)

RIV. River

S. Sidewalk

99. Unclassified

How to split a bike lane in two and copy features with QGIS

A screenshot of the splash image seen on users with iPad retina displays in landscape mode. 

To make the Chicago Offline Bike Map, I need bikeways data. I got this from the City of Chicago’s data portal, in GIS shapefile format. It has a good attribute table listing the name of the street the bikeway is on and the bikeway’s class (see below). After several bike lanes had been installed, I asked the City’s data portal operators for an updated shapefile. I got it a month later and found that it wasn’t up-to-date. I probably could have received a shapefile with the current bikeway installations marked, but I didn’t have time to wait: every day delayed was one more day I couldn’t promote my app; I make 70 cents per sale.

Since the bikeway lines were already there, I could simply reclassify the sections that had been changed to an upgraded form of bikeway (for example, Wabash Avenue went from a door zone-style bike lane to a buffered bike lane in 2011). I tried to do this but ran into trouble when the line segment was longer than the bikeway segment that needed to be reclassified (for example, Elston Avenue has varying classifications from Milwaukee Avenue to North Avenue that didn’t match the line segments for that street). I had to divide the bikeway into shorter segments and reclassify them individually.

Enter the Split Features tool. QGIS is short on documentation and I had trouble using this feature. I eventually found the trick after a search that took more time than I expected. Here’s how to cut a line:

  1. Select the line using one of the selection tools. I prefer the default one, Select Features, where you have to click on the feature one-by-one. (It’s not required that you select the line, but doing so will ensure you only cut the selected line. If you don’t select the line, you can cut many lines in one go.)
  2. Toggle editing on the layer that contains the line you want to cut.
  3. Click Edit>Split Features to activate that tool, or find its icon in one the toolbars (which may or may not be shown).
  4. Click once near where you want to split the line.
  5. Move the cursor across the line you want to split, in the desired split location.
  6. When the red line indicating your split is where you desire, press the right-click mouse button.

Your line segment has now been split. A new entry has been added to the attribute table. There are now two entries with duplicate attributes representing that together make up the original line segment, before you split it.

This screenshot shows a red line across a road. The red line indicates where the road will be split. Press the right-click mouse button to tell QGIS to “split now”.

After splitting, open the attribute table to see that you now have two features with identical attributes. 

Copying features in QGIS

A second issue I had when creating new bikeways data was when a bikeway didn’t exist and I couldn’t reclassify it. This was the case on Franklin Boulevard: no bikeway had ever been installed there. I solved this problem by copying the relevant street segments from the Transportation (roads) shapefile and pasted them into the bikeways shapefile. New entries were created in the attribute table but with blank attributes. It was simple to fill in the street name, class, and extents.

Chicago bikeways GIS description

Bikeway classes (TYPE in the dataset) in the City of Chicago data portal are:

  1. Existing bike lane
  2. Existing marked shared lane
  3. Proposed on-street bikeway
  4. Recommended bike route
  5. Existing trail
  6. Proposed off-street trail
  7. Access path (to existing trail)
  8. Existing cycle track (also known as protected bike lane)
  9. Existing buffered bike lane

It remains to be seen if the City will identify the “enhanced marked shared lane” on Wells Street between Wacker Drive and Van Buren street differently than “existing marked shared lane” in the data.