Category: GIS

Converting a transit agency’s GTFS to shapefile and GeoJSON with QGIS

Many years ago I wrote a tutorial on how to use an ArcGIS plugin to convert a transit agency’s GTFS package – a group of files that describe when and where their buses and trains stop – into files that could easily be manipulated by popular GIS desktop software.

That was so long ago, before I became an expert in using QGIS, a free and open source alternative to ArcGIS.

This tutorial will show you how to convert GTFS to a shapefile and to GeoJSON so you can edit and visualize the transit data in QGIS.

Prerequisites

First you’ll need to have QGIS installed on your computer (it works with Linux, Mac, and Windows). Second you’ll need a GTFS package for the transit agency of your choice (here’s the one for Pace Suburban Bus*, which operates all suburban transit buses in Chicagoland). You can find another transit agency around the world on the GTFS Data Exchange website.

Section 1: Let’s start

  1. Open QGIS.
  2. Load your GTFS data into the QGIS table of contents (also called the Layers Panel). Click Layer>Add Layer>Add Delimited Text Layer. You will be adding one or two files depending on which ones are provided.

    QGIS add delimited text layer

    Add delimited text layer.

  3. Now, here it can get tricky. Not all transit agencies provide a “shapes.txt” file. The shapes.txt file draws out the routes of buses and trains. If it’s not provided, that’s fine, but if you turn them into routes based on the stops.txt data, then you will have funny looking and impossible routes.

    QGIs browse for the stops.txt file

    Browse for the stops.txt file

  4. Click on “Browse…” and find the “stops.txt”. QGIS will read the file very quickly and determine which fields hold the latitude and longitude coordinates. If its determination is wrong, you can choose a different “X field” (longitude) and “Y field” (latitude).
  5. Click “OK”. A new dialog box will appear asking you to choose a coordinate reference system (EPSG). Choose or filter for “WGS 84, EPSG:4326”. Then click “OK”.
  6. The Pace bus stops in the Chicagoland region are now drawn in QGIS!

    Pace bus stops are shown

    Pace bus stops are shown

  7. If the GTFS package you downloaded includes a “shapes.txt” file (that represents the physical routes and paths that the buses or trains take), import that file also by repeating steps 4 and 5.

Section 2: Converting the stops

It’s really easy now to convert the bus or train stops into a shapefile or GeoJSON representing all of those points.

  1. Right-click the layer “stops” in the table of contents (Layers Panel) and click “Save As…”.
  2. In the “Save vector layer as…” dialog box, choose the format you want, either “ESRI Shapefile” or “GeoJSON”. **
  3. Then click “Browse” to tell QGIS where in your computer’s file browser you want to save the file. Leave the “CRS” as-is (EPSG:4326).

    Convert the bus stops to a shapefile or GeoJSON.

    Convert the Pace bus stops to a shapefile or GeoJSON.

  4. Then click “OK” and QGIS will quickly report that the file has been converted and saved where you specified in step 3.

Section 3: Converting the bus or train routes

The “shapes.txt” file is a collection of points that when grouped by their route number, show the physical routes and paths that buses and trains take. You’ll need a plugin to make the lines from this data.

  1. Install the plugin “Points to Paths”. Click on Plugins>Manage and Install Plugins… Then click “All” and search for “points”. Click the “Points to Paths” plugin and then click the “Install plugin” button. Then click “Close”.

    Install the Points to Paths plugin.

    Install the Points to Paths plugin.

  2. Pace bus doesn’t provide the “shapes.txt” file so we’ll need to find a new GTFS package. Download the GTFS package provided by the Chicago Transit Authority, which has bus and rail service in Chicago and the surrounding municipalities.
  3. Load the CTA’s “shapes.txt” file into the table of contents (Layers Panel) by following steps 4 and 5 in the first section of this tutorial.  Note that this data includes both the bus routes and the train routes.

    QGIS load CTA bus and train stops

    Import CTA bus and train stops into QGIS

  4. Now let’s start the conversion process. Click on Plugins>Points to Paths. In the next dialog box choose the “shapes” layer as your “Input point layer”.
  5. Select “shape_id” as the field with which you want to “Point group field”. This tells the plugin how to distinguish one bus route from the next.
  6. Select “shape_pt_sequence” as the field with which you want to “Point order field”. This tells the plugin in what order the points should be connected to form the route’s line.
  7. Click “Browse” to give the converted output shapefile a name and a location with your computer’s file browser.
  8. Make sure all  of the options look like the one in this screenshot and then click “OK”. QGIS and the plugin will start working to piece together the points into lines and create a new shapefile from this work.

    These are the options you need to set to convert the CTA points (stops) to paths (routes).

    These are the options you need to set to convert the CTA points to paths (routes).

  9. You’ll know it’s finished when the hourglass or “waiting” cursor returns to a pointer, and when you see a question asking if you would like the resulting shapefile added to your table of contents (Layers Panel). Go ahead and choose “Yes”.

    QGIS: CTA bus and train points are converted to paths (routes)

    The CTA bus and train points, provided in a GTFS package, have been converted to paths (routes/lines).

  10. Now follow steps 1-4 from Section 3 to convert the routes/lines data to a shapefile or GeoJSON file.**

Notes

* As of this writing, the schedules in Pace’s GTFS package are accurate as of January 18, 2016. It appears their download link always points to the latest version. Transit schedules typically change several times each year. Pace says, “Only one package is posted at any given time, typically representing Pace service from now until a couple of months in the future. Use the Calendar table to see on which days and dates service in the Trips table are effective.”

** Choose GeoJSON if you want to show this data on a web map (like in Leaflet or the Google Maps API), or if you want to share the data on GitHub.

How to make a map of places of worship in Cook County using OpenStreetMap data

The screenshot shows the configuration you need to find and download places of worship in Cook County, Illinois, using the Overpass Turbo website.

If you’re looking to make a map of churches, mosques, synagogues and other places of worship, you’ll need data.

  • The Yellow Pages won’t help because you can’t download that.
  • And Google Maps doesn’t let you have a slice of their database, either.
  • There’s no open data* for this because churches don’t pay taxes, don’t have business licenses, and aren’t required to register in any way

This where OpenStreetMap (OSM) comes in. It’s a virtual planet that anyone can edit and anyone can have for free.

First we need to figure out what tag people use to identify these places. Sometimes on OSM there are multiple tags that identify the same kind of place. You should prefer the one that’s either more accurate (and mentioned as such in the wiki) or widespread.

The OSM taginfo website says that editors have added over 1.2 million places of worship to the planet using amenity=place_of_worship.

Now that we know which tag to look for, we need an app that will help us get those places, but only within our desired boundary. Open up Overpass Turbo, which is a website that helps construct calls to the Overpass API, which is one way to find and download data from OSM.

Tutorial

In the default Overpass Turbo query, there’s probably a tag in brackets that says [amenity=drinking_fountain]. Change that to say [amenity=place_of_worship] (without the quotes). Now change the viewport of the map to show only the area in which you want Overpass Turbo to look for these places of worship. In the query this argument is listed as ({{bbox}}).

The map has a search bar to find boundaries (cities, counties, principalities, neighborhoods, etc.) so type in “Cook County” and press Enter. The Cook County in Illinois, United States of America, will probably appear first. Select that one and the map will zoom to show the whole county in the viewport.

Now that we’ve set the tag to [amenity=place_of_worship] and moved the map to show Cook County we can click “Run”. In a few seconds you’ll see a circle over each place of worship.

It’s now simple to download: Click on the “Export” button and click “KML” to be able to load the data into Google Earth, “GeoJSON” to load it into a GIS app like QGIS, or “save GeoJSON to gist” to create an instant map within GitHub.

*You could probably get creative and ask a municipality for a list of certificates of occupancy or building permits that had marked “religious assembly” as the zoning use for the property.

Use Turf to perform GIS functions in a web browser

Turf's merge function joins invisible buffers around each Divvy station into a single, super buffer.

Turf’s merge function joins invisible buffers around each Divvy station into a single, super buffer –all client-side, in your web browser.

I’m leading the development of a website for Slow Roll Chicago that shows the distribution of bike lane infrastructure in Chicago relative to key and specific demographics to demonstrate if the investment has been equitable.

We’re using GitHub to store code, publish meeting notes, and host discussions with the issues tracker. Communication is done almost entirely in GitHub issues. I chose GitHub over Slack and Google Groups because:

  1. All of our research and code should be public and open source so it’s clear how we made our assumptions and came to our conclusions (“show your work”).
  2. Using git, GitHub, and version control is a desirable skill and more people should learn it; this project will help people apply that skill.
  3. There are no emails involved. I deplore using email for group communication.*

The website focuses on using empirical research, maps, geographic analysis to tell the story of bike lane distribution and requires processing this data using GIS functions. Normally the data would be transformed in a desktop GIS software like QGIS and then converted to a format that can be used in Leaflet, an open source web mapping library.

Relying on desktop software, though, slows down development of new ways to slice and dice geographic data, which, in our map, includes bike lanes, wards, Census tracts, Divvy stations, and grocery stores (so far). One would have to generate a new dataset if our goals or needs changed .

I’ve built maps for images and the web that way enough in the past and I wanted to move away from that method for this project and we’re using Turf.js to replicate many GIS functions – but in the browser.

Yep, Turf makes it possible to merge, buffer, contain, calculate distance, transform, dissolve, and perform dozens of other functions all within the browser, “on the fly”, without any software.

After dilly-dallying in Turf for several weeks, our group started making progress this month. We have now pushed to our in-progress website a map with three features made possible by Turf:

  1. Buffer and dissolving buffers to show the Divvy station walk shed, the distance a reasonable person would walk from their home or office to check out a Divvy station. A buffer of 0.25 miles (two Chicago blocks) is created around each of the 300 Divvy stations, hidden from display, and then merged (dissolved in traditional GIS parlance) into a single buffer. The single buffer –called a “super buffer” in our source code – is used for another feature. Currently the projection is messed up and you see ellipsoid shapes instead of circles.
  2. Counting grocery stores in the Divvy station walk shed. We use the “feature collection” function to convert the super buffer into an object that the “within” function can use to compare to a GeoJSON object of grocery stores. This process is similar to the “select by location” function in GIS software. Right now this number is printed only to the console as we look for the best way to display stats like this to the user. A future version of the map could allow the user to change the 0.25 miles distance to an arbitrary distance they prefer.
  3. Find the nearest Divvy station from any place on the map. Using Turf’s “nearest” function and the Context Menu plugin for Leaflet, the user can right-click anywhere on the map and choose “Find nearby Divvy stations”. The “nearest” function compares the place where the user clicked against the GeoJSON object of Divvy stations to select the nearest one. The problem of locating 2+ nearby Divvy stations remains. The original issue asked to find the number of Divvy stations near the point; we’ll likely accomplish this by drawing an invisible, temporary buffer around the point and then using “within” to count the number of stations inside that buffer and then destroy the buffer.
Right-click the map and select "Find nearby Divvy stations" and Turf will locate the nearest Divvy station.

Right-click the map and select “Find nearby Divvy stations” and Turf will locate the nearest Divvy station.

* I send one email to new people who join us at Open Gov Hack Night on Tuesdays at the Mart to send them a link to our GitHub repository, and to invite them to a Dropbox folder to share large files for those who don’t learn to use git for file management.

Mapping a campground that doesn’t exist: a before and after view of OpenStreetMap

Pretty soon there will be a campground shown in OpenStreetMap, and added to its geocoding database, when I’m done adding it.

I temporarily become addicted to mapping places on OpenStreetMap. In my quest to find and map all campgrounds in Chicagoland – in order to publish them in the Chicago Bike Guide – I came across a campground that was constructed this year and opened in August 2013. This is the story of figuring out how to map the Big Rock Forest Preserve campground in Big Rock, Illinois.

I found on the Kane County Forest Preserve District website that the organization operated a campground at Big Rock Forest Preserve. I couldn’t locate the campground in Google Maps by the address the website gave. I couldn’t find it in OpenStreetMap, either, because no one had mapped it, but it’s there now.

When I searched for the park by name, Google Maps zoomed me to the main entrance of the park, but I still couldn’t see a campground. I downloaded the forest preserve district’s park map (always as a PDF) and followed the roads in Google Maps until I came across the campgrounds approximate position. There was a new road here so I followed that to find a campground under construction.

Google Maps shows the campground and artificial lake under construction.

Google’s imagery of the under-construction campground was taken on May 23, 2013 (get the date from Google Earth). This was great because now I could open JOSM, a powerful desktop OpenStreetMap editor, and locate the site, load Bing’s imagery and start tracing the campground to upload to OSM.

Bing’s imagery in JOSM, the OpenStreetMap-editing app, doesn’t show the campground.

The problem was that Bing’s imagery – and this is typical – was outdated. I could easily compare the imagery side-by-side and based on other landscape features (like the forest edge) guess where to trace the campground, but OSM needs better quality data. Enter MapWarper.

Read the rest of this post on Web Map Academy.

How to convert bike-share JSON data to CSV and then to shapefile

Update January 4, 2013: The easiest way to do this is to use Ian Dees’s Divvy API as it outputs straight to GeoJSON (which QGIS likes). See below.

For Michael Carney’s Divvy bike-share stations + Census tract + unbanked Chicagoans analysis and map he needed the Divvy station locations as a shapefile. I copied the JSON-formatted text of the Divvy real-time station API, converted it to CSV with OpenRefine, and then created a shapefile with QGIS.

Here’s how to create a shapefile of any bike-share system that uses hardware from Public Bike System Co based on Montréal and is operated by Alta Bicycle Share (this includes New York City, Chattanooga, Bay Area, Melbourne, and Chicago): Continue reading