Tag: QGIS

Converting a transit agency’s GTFS to shapefile and GeoJSON with QGIS

Many years ago I wrote a tutorial on how to use an ArcGIS plugin to convert a transit agency’s GTFS package – a group of files that describe when and where their buses and trains stop – into files that could easily be manipulated by popular GIS desktop software.

That was so long ago, before I became an expert in using QGIS, a free and open source alternative to ArcGIS.

This tutorial will show you how to convert GTFS to a shapefile and to GeoJSON so you can edit and visualize the transit data in QGIS.

Prerequisites

First you’ll need to have QGIS installed on your computer (it works with Linux, Mac, and Windows). Second you’ll need a GTFS package for the transit agency of your choice (here’s the one for Pace Suburban Bus*, which operates all suburban transit buses in Chicagoland). You can find another transit agency around the world on the GTFS Data Exchange website.

Section 1: Let’s start

  1. Open QGIS.
  2. Load your GTFS data into the QGIS table of contents (also called the Layers Panel). Click Layer>Add Layer>Add Delimited Text Layer. You will be adding one or two files depending on which ones are provided.

    Add delimited text layer.

  3. Now, here it can get tricky. Not all transit agencies provide a “shapes.txt” file, which draws out the routes of buses and trains. If it’s not provided, that’s fine, but if you try to build routes by connecting the points in stops.txt instead, you will get funny-looking and impossible routes.

    Browse for the stops.txt file

  4. Click on “Browse…” and find the “stops.txt”. QGIS will read the file very quickly and determine which fields hold the latitude and longitude coordinates. If its determination is wrong, you can choose a different “X field” (longitude) and “Y field” (latitude).
  5. Click “OK”. A new dialog box will appear asking you to choose a coordinate reference system (EPSG). Choose or filter for “WGS 84, EPSG:4326”. Then click “OK”.
  6. The Pace bus stops in the Chicagoland region are now drawn in QGIS!

    Pace bus stops are shown

  7. If the GTFS package you downloaded includes a “shapes.txt” file (that represents the physical routes and paths that the buses or trains take), import that file also by repeating steps 4 and 5.
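If you'd like to see what QGIS is doing when it reads “stops.txt”, here is a minimal Python sketch. It assumes the standard GTFS column names “stop_lat” and “stop_lon”; the two sample rows and the function name `read_stops` are made up for illustration.

```python
import csv
import io

# Hypothetical two-row sample in the layout of a GTFS stops.txt file.
SAMPLE_STOPS = """stop_id,stop_name,stop_lat,stop_lon
1001,Main St & 1st Ave,41.8800,-87.6300
1002,Main St & 2nd Ave,41.8815,-87.6312
"""

def read_stops(text):
    """Return (stop_id, stop_name, x, y) tuples; x is longitude, y is latitude."""
    stops = []
    for row in csv.DictReader(io.StringIO(text)):
        x = float(row["stop_lon"])  # the "X field" in the QGIS dialog
        y = float(row["stop_lat"])  # the "Y field" in the QGIS dialog
        stops.append((row["stop_id"], row["stop_name"], x, y))
    return stops

stops = read_stops(SAMPLE_STOPS)
print(stops)
```

To try it on a real package you would read the file's text first, for example with `open("stops.txt", encoding="utf-8-sig").read()`, and pass that to `read_stops`.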

Section 2: Converting the stops

It’s really easy now to convert the bus or train stops into a shapefile or GeoJSON representing all of those points.

  1. Right-click the layer “stops” in the table of contents (Layers Panel) and click “Save As…”.
  2. In the “Save vector layer as…” dialog box, choose the format you want, either “ESRI Shapefile” or “GeoJSON”. **
  3. Then click “Browse” to tell QGIS where in your computer’s file browser you want to save the file. Leave the “CRS” as-is (EPSG:4326).

    Convert the Pace bus stops to a shapefile or GeoJSON.

  4. Then click “OK” and QGIS will quickly report that the file has been converted and saved where you specified in step 3.
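For the curious, a GeoJSON file of stops is just a list of point features. This is a rough sketch of that structure built with Python's standard library, not what QGIS runs internally; the stop values and the function name `stops_to_geojson` are hypothetical.

```python
import json

# Hypothetical stops as (stop_id, stop_name, longitude, latitude).
stops = [
    ("1001", "Main St & 1st Ave", -87.6300, 41.8800),
    ("1002", "Main St & 2nd Ave", -87.6312, 41.8815),
]

def stops_to_geojson(stops):
    """Build a GeoJSON FeatureCollection of Point features (EPSG:4326)."""
    features = []
    for stop_id, name, lon, lat in stops:
        features.append({
            "type": "Feature",
            # GeoJSON stores coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"stop_id": stop_id, "stop_name": name},
        })
    return {"type": "FeatureCollection", "features": features}

collection = stops_to_geojson(stops)
geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

Saving `geojson_text` to a file ending in “.geojson” gives you something you can drag into QGIS or a web map.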

Section 3: Converting the bus or train routes

The “shapes.txt” file is a collection of points that, when grouped by their shape ID, show the physical routes and paths that buses and trains take. You’ll need a plugin to make the lines from this data.

  1. Install the plugin “Points to Paths”. Click on Plugins>Manage and Install Plugins… Then click “All” and search for “points”. Click the “Points to Paths” plugin and then click the “Install plugin” button. Then click “Close”.

    Install the Points to Paths plugin.

  2. Pace bus doesn’t provide the “shapes.txt” file so we’ll need to find a new GTFS package. Download the GTFS package provided by the Chicago Transit Authority, which has bus and rail service in Chicago and the surrounding municipalities.
  3. Load the CTA’s “shapes.txt” file into the table of contents (Layers Panel) by following steps 4 and 5 in the first section of this tutorial. Note that this data includes both the bus routes and the train routes.

    Import CTA bus and train stops into QGIS

  4. Now let’s start the conversion process. Click on Plugins>Points to Paths. In the next dialog box choose the “shapes” layer as your “Input point layer”.
  5. Select “shape_id” as the “Point group field”. This tells the plugin how to distinguish one bus route from the next.
  6. Select “shape_pt_sequence” as the “Point order field”. This tells the plugin in what order the points should be connected to form the route’s line.
  7. Click “Browse” to give the converted output shapefile a name and a location with your computer’s file browser.
  8. Make sure all of the options look like the ones in this screenshot and then click “OK”. QGIS and the plugin will start working to piece together the points into lines and create a new shapefile from this work.

    These are the options you need to set to convert the CTA points to paths (routes).

  9. You’ll know it’s finished when the hourglass or “waiting” cursor returns to a pointer, and when you see a question asking if you would like the resulting shapefile added to your table of contents (Layers Panel). Go ahead and choose “Yes”.

    The CTA bus and train points, provided in a GTFS package, have been converted to paths (routes/lines).

  10. Now follow steps 1-4 from Section 2 to convert the routes/lines data to a shapefile or GeoJSON file.**
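The grouping and ordering that the plugin performs can be sketched in a few lines of Python. This is my own illustration of the idea, not the plugin's code; it assumes the standard GTFS columns “shape_id”, “shape_pt_sequence”, “shape_pt_lat”, and “shape_pt_lon”, and the sample rows are invented.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; note they are deliberately out of order.
SAMPLE_SHAPES = """shape_id,shape_pt_lat,shape_pt_lon,shape_pt_sequence
A,41.880,-87.630,2
A,41.879,-87.629,1
A,41.881,-87.631,10
B,41.900,-87.650,1
B,41.901,-87.651,2
"""

def shapes_to_lines(text):
    """Group points by shape_id and order them by shape_pt_sequence."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        # Convert the sequence to an integer so that 10 sorts after 2.
        seq = int(row["shape_pt_sequence"])
        point = [float(row["shape_pt_lon"]), float(row["shape_pt_lat"])]
        groups[row["shape_id"]].append((seq, point))
    lines = {}
    for shape_id, points in groups.items():
        points.sort(key=lambda item: item[0])
        lines[shape_id] = [point for _, point in points]
    return lines

lines = shapes_to_lines(SAMPLE_SHAPES)
print(lines["A"])
```

Each value in `lines` is a list of [longitude, latitude] pairs, which is the coordinate layout a GeoJSON LineString expects.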

Notes

* As of this writing, the schedules in Pace’s GTFS package are accurate as of January 18, 2016. It appears their download link always points to the latest version. Transit schedules typically change several times each year. Pace says, “Only one package is posted at any given time, typically representing Pace service from now until a couple of months in the future. Use the Calendar table to see on which days and dates service in the Trips table are effective.”

** Choose GeoJSON if you want to show this data on a web map (like in Leaflet or the Google Maps API), or if you want to share the data on GitHub.

How to extract highways and subway lines from OpenStreetMap as a shapefile

It’s possible to use Overpass Turbo to extract any object from the OpenStreetMap “planet” and convert it from a GeoJSON or KML file to a shapefile for manipulation and analysis in GIS.

Say you want the subway lines for Mexico City, and you can’t find a GTFS file that you could convert to shapefile, and you can’t find the right files on Sistema de Transporte Colectivo’s website (I didn’t look for it).

Here’s how to extract the subway lines that are shown in OpenStreetMap and save them as a GIS shapefile.

This is my second tutorial to describe using Overpass Turbo. The first extracted places of worship in Cook County. I’ve also used Overpass Turbo to extract a map of campgrounds.

Extract free and open source data from OpenStreetMap

  1. Open the Overpass Turbo website and, on the map, search for the city from which you want to extract data. (The Overpass query will be generated in such a way that it’ll only search for data in the current map view.)
  2. Click the “Wizard” button in the top toolbar. (Alternatively you can copy the code below and paste it into the text area on the website and click the “Run” button.)
  3. In the Wizard dialog box, type in “railway=subway” in order to find metro, subway, or rapid transit lines. (If you want to download interstate highways, or what they call motorways in the UK, use “highway=motorway“.) Then click the “build and run query” button.
  4. In a few seconds you’ll see lines and dots (representing the metro or subway stations) on the map, and a new query in the text area. Notice that the query has looked for three kinds of objects: node (points/stations), way (the subway tracks), relation (the subway routes).
  5. If you don’t want a particular kind of object, then delete its line from the query and click the “Run” button. (You probably don’t want relation if you’re just needing GIS data for mapping purposes, and because routes are not always well-defined by OpenStreetMap contributors.)
  6. Download the data by clicking the “Export” button. Choose from one of the first three options (GeoJSON, GPX, KML). If you’re going to use a desktop GIS software, or place this data in a web map (like Leaflet), then choose GeoJSON. Now, depending on what browser you’re using, a couple things could happen after you click on GeoJSON. If you’re using Chrome then clicking it will download a file. If you’re using Safari then clicking it will open a new tab and put the GeoJSON text in there. Copy and paste this text into TextEdit and save the file as “mexico_city_subway.geojson”.
Overpass Turbo screenshot 1 of 2

Screenshot 1: After searching for the city for which you want to extract data (Mexico City in this case), click the “Wizard” button and type “railway=subway” and click run.

Overpass Turbo screenshot 2

Screenshot 2: After building and running the query from the Wizard you’ll see subway lines and stations.

Overpass Turbo screenshot 3

Screenshot 3: Click the Export button and click GeoJSON. In Chrome, a file will download. In Safari, a new tab with the GeoJSON text will open (copy and paste this into TextEdit and save it as “mexico_city_subway.geojson”).

Convert the free and open source data into a shapefile

  1. After you’ve downloaded (via Chrome) or re-saved (Safari) a GeoJSON file of subway data from OpenStreetMap, open QGIS, the free and open source GIS desktop application for Linux, Windows, and Mac.
  2. In QGIS, add the GeoJSON file to the table of contents by either dragging the file in from the Finder (Mac) or Explorer (Windows), or by clicking File>Open and browsing and selecting the file.
  3. Convert it to a shapefile by right-clicking on the layer in the table of contents and clicking “Save As…”
  4. In the “Save As…” dialog box choose “ESRI Shapefile” from the dropdown menu. Then click “Browse” to find a place to save this file, check “Add saved file to map”, and click the “OK” button.
  5. A new layer will appear in your table of contents. In the map this new layer will be layered directly above your GeoJSON data.
Overpass Turbo screenshot 4

Screenshot 4: The GeoJSON file exported from Overpass Turbo has now been loaded into the QGIS table of contents.

Overpass Turbo screenshot 5

Screenshot 5: In QGIS, right-click the layer, select “Save As…” and set the dialog box to have these settings before clicking OK.
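One thing to keep in mind: a shapefile holds a single geometry type, while the Overpass export can mix station points and track lines in one GeoJSON file. If that trips you up, a small sketch like this separates the features by geometry type before converting. The sample features and the function name `split_by_geometry_type` are hypothetical.

```python
from collections import defaultdict

# Hypothetical miniature of an Overpass Turbo GeoJSON export.
export = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"railway": "subway"},
         "geometry": {"type": "LineString",
                      "coordinates": [[-99.14, 19.43], [-99.13, 19.44]]}},
        {"type": "Feature", "properties": {"railway": "station"},
         "geometry": {"type": "Point", "coordinates": [-99.14, 19.43]}},
    ],
}

def split_by_geometry_type(collection):
    """Return one FeatureCollection per geometry type found in the input."""
    groups = defaultdict(list)
    for feature in collection["features"]:
        groups[feature["geometry"]["type"]].append(feature)
    return {
        geometry_type: {"type": "FeatureCollection", "features": features}
        for geometry_type, features in groups.items()
    }

layers = split_by_geometry_type(export)
print(sorted(layers))
```

Each resulting collection can then be saved as its own GeoJSON file and converted to its own shapefile.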

Query for finding subways in your current Overpass Turbo map view

/*
This has been generated by the overpass-turbo wizard.
The original search was:
“railway=subway”
*/
[out:json][timeout:25];
// gather results
(
// query part for: “railway=subway”
node["railway"="subway"]({{bbox}});
way["railway"="subway"]({{bbox}});
relation["railway"="subway"]({{bbox}});
/*relation is for "routes", which are not always
well-defined, so I would ignore it*/
);
// print results
out body;
>;
out skel qt;
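The {{bbox}} placeholder is an Overpass Turbo shortcut for the current map view. If you want to build the same query outside of the website, a sketch like the one below fills in the bounding box yourself, in the order south, west, north, east. The function name `build_overpass_query` and the rough Mexico City coordinates are my own assumptions.

```python
def build_overpass_query(key, value, bbox, kinds=("node", "way")):
    """Build an Overpass QL query for one tag inside a bounding box.

    bbox is (south, west, north, east) in decimal degrees. Relations are
    left out by default because routes are not always well-defined.
    """
    south, west, north, east = bbox
    area = f"({south},{west},{north},{east})"
    lines = ["[out:json][timeout:25];", "("]
    for kind in kinds:
        lines.append(f'  {kind}["{key}"="{value}"]{area};')
    lines += [");", "out body;", ">;", "out skel qt;"]
    return "\n".join(lines)

# Rough, hypothetical bounding box around Mexico City.
query = build_overpass_query("railway", "subway", (19.2, -99.4, 19.6, -98.9))
print(query)
```

Passing `kinds=("node", "way", "relation")` would bring the routes back in.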

How to convert bike-share JSON data to CSV and then to shapefile

Update January 4, 2013: The easiest way to do this is to use Ian Dees’s Divvy API as it outputs straight to GeoJSON (which QGIS likes). See below.

For Michael Carney’s Divvy bike-share stations + Census tract + unbanked Chicagoans analysis and map he needed the Divvy station locations as a shapefile. I copied the JSON-formatted text of the Divvy real-time station API, converted it to CSV with OpenRefine, and then created a shapefile with QGIS.

Here’s how to create a shapefile of any bike-share system that uses hardware from Public Bike System Co, based in Montréal, and is operated by Alta Bicycle Share (this includes New York City, Chattanooga, Bay Area, Melbourne, and Chicago): Continue reading

How many miles of roads are in your ward?

Screenshot 1: Showing how some streets are not being counted. There should be a yellow section of road between the two existing yellow road sections. 

A friend recently asked me how many blocks of road are in his ward. He wanted to know so that he could measure how many blocks of streets would have an older style of street lighting after X number of blocks receive the new style of street lighting. For this project, I used two datasets from Chicago’s open data portal: street center lines and wards. The output data is not very accurate as there may be some overlap and some uncounted street segments; this is likely due to a shortcoming in my process. I will show you how to find the number of blocks per ward using QGIS (download Quantum GIS, a free program for all OSes).

Here’s how I did it

  1. Load in the two datasets. Wards and street center lines (zipped shapefiles). They are projected in EPSG:3435.
  2. Exclude several road classifications in the street center lines by querying only for "CLASS" > '1' AND "CLASS" < '5'. The data dictionary for the road classifications is at the end. We don’t want the river, sidewalks, expressways, and any ramps to be included in the blocks per ward analysis.
  3. Intersect. In QGIS, select Vector>Geoprocessing Tools>Intersect. The input vector layer is “Transportation” (the name of the street center lines dataset) and the intersect layer is “Wards”. Save the resulting shapefile as “streets intersect wards”. Click OK. This will take a while.
  4. Add the “streets intersect wards” shapefile to the table of contents.
  5. You’ll notice some of the issues with the resulting shapefile: missing street segments (see screenshot 1). What should QGIS do if a street is a ward boundary?
  6. Obtain street length information, part 1. Remove all the columns in the “streets intersect wards” shapefile that have something to do with geometry. These are now outdated and will confuse you when you add a geometry column generated by QGIS.
  7. Obtain street length information, part 2. With the “streets intersect wards” shapefile selected in the table of contents, select Vector>Geometry Tools>Export/Add geometry columns. Select “streets intersect wards” shapefile as your input layer, leave CRS as “Layer CRS” and save as new shapefile “streets intersect wards geom”.
  8. Add the “streets intersect wards geom” shapefile to the table of contents.
  9. You will see a new column at the end of the attribute table called LENGTH. Since the data is projected in EPSG:3435 (Illinois StatePlane NAD83 East Feet), the unit is feet.
  10. Simply export “streets intersect wards geom” to a CSV file and open the CSV file in a spreadsheet application. From there you can group the data by Ward number and add the street lengths together. (I thought it would be faster to do this in a database so I imported it into a localhost MySQL database and ran a simple query, SELECT wardNum, sum(`chistreets_classes234`.`LENGTH`) as sum FROM chistreets_classes234 WHERE ward > 0 group by wardNum. I then exported this to a spreadsheet to convert feet to miles.)
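If you would rather skip both the spreadsheet and MySQL, the grouping in step 10 can be sketched with Python's standard library. The column names “wardNum” and “LENGTH” follow the query above; the sample rows are invented and the function name `miles_by_ward` is hypothetical.

```python
import csv
import io
from collections import defaultdict

FEET_PER_MILE = 5280

# Hypothetical rows in the layout of the exported CSV file.
SAMPLE_CSV = """wardNum,CLASS,LENGTH
1,2,2640.0
1,4,5280.0
2,3,1320.0
0,4,999.0
"""

def miles_by_ward(text):
    """Sum street length (feet) per ward and convert the totals to miles."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(text)):
        ward = int(row["wardNum"])
        if ward > 0:  # skip segments that were not assigned to a ward
            totals[ward] += float(row["LENGTH"])
    return {ward: feet / FEET_PER_MILE for ward, feet in sorted(totals.items())}

result = miles_by_ward(SAMPLE_CSV)
print(result)
```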

Because of the errors described in step 5, you shouldn’t use this analysis for any application where accuracy is important. There are road lengths missing in the output dataset (table with street lengths summed by ward) and I cannot tell if the inaccuracy is equally distributed.

[table id=7 /]

Wards 19 (south side) and 41 (Norwood Park, including O’Hare airport) have the highest portion of street length in the city.

Screenshot 2: Ward 41 is seen. 

Street data dictionary

Column is “CLASS”. The value is a string. This dataset lacks alleys. Adapted from the City’s data dictionary.

1. Expressway

2. Arterials (1 mile grid, no diagonals)

3. Collectors (includes diagonals)

4. Other streets (side streets, neighborhood streets)

5. Named alleys (mostly downtown, like Couch Place and Garland Place)

7. Tiered (lower level streets, including LaSalle, Michigan, Columbus, and Wacker)

9. Ramps (goes along with expressway)

E. Extent ( not sure how to describe these; includes riverwalk and lake walk segments, and Navy Pier, also includes some streets, like Mies van der Rohe Way)

RIV. River

S. Sidewalk

99. Unclassified
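Because “CLASS” is a string, the filter in step 2 is a text comparison, not a numeric one. Here is a small sketch of which codes pass it, assuming the filter orders text the same way Python orders strings (the function name `keep_class` is mine).

```python
# Class codes from the data dictionary above, stored as strings.
CLASS_LABELS = {
    "1": "Expressway",
    "2": "Arterials",
    "3": "Collectors",
    "4": "Other streets",
    "5": "Named alleys",
    "7": "Tiered",
    "9": "Ramps",
    "E": "Extent",
    "RIV": "River",
    "S": "Sidewalk",
    "99": "Unclassified",
}

def keep_class(value):
    """Mimic the filter "CLASS" > '1' AND "CLASS" < '5' as a text comparison."""
    return "1" < value < "5"

kept = [code for code in CLASS_LABELS if keep_class(code)]
print(kept)

# Caveat: a hypothetical code such as "10" (not in this dataset) would also
# pass, because text is compared character by character.
print(keep_class("10"))
```

With the codes in this dataset only classes 2, 3, and 4 survive, which is what the analysis needs.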

How to split a bike lane in two and copy features with QGIS

A screenshot of the splash image seen by users with iPad retina displays in landscape mode.

To make the Chicago Offline Bike Map, I need bikeways data. I got this from the City of Chicago’s data portal, in GIS shapefile format. It has a good attribute table listing the name of the street the bikeway is on and the bikeway’s class (see below). After several bike lanes had been installed, I asked the City’s data portal operators for an updated shapefile. I got it a month later and found that it wasn’t up-to-date. I probably could have received a shapefile with the current bikeway installations marked, but I didn’t have time to wait: every day delayed was one more day I couldn’t promote my app; I make 70 cents per sale.

Since the bikeway lines were already there, I could simply reclassify the sections that had been changed to an upgraded form of bikeway (for example, Wabash Avenue went from a door zone-style bike lane to a buffered bike lane in 2011). I tried to do this but ran into trouble when the line segment was longer than the bikeway segment that needed to be reclassified (for example, Elston Avenue has varying classifications from Milwaukee Avenue to North Avenue that didn’t match the line segments for that street). I had to divide the bikeway into shorter segments and reclassify them individually.

Enter the Split Features tool. QGIS is short on documentation and I had trouble using this feature. I eventually found the trick after a search that took more time than I expected. Here’s how to cut a line:

  1. Select the line using one of the selection tools. I prefer the default one, Select Features, where you have to click on the feature one-by-one. (It’s not required that you select the line, but doing so will ensure you only cut the selected line. If you don’t select the line, you can cut many lines in one go.)
  2. Toggle editing on the layer that contains the line you want to cut.
  3. Click Edit>Split Features to activate that tool, or find its icon in one of the toolbars (which may or may not be shown).
  4. Click once near where you want to split the line.
  5. Move the cursor across the line you want to split, in the desired split location.
  6. When the red line indicating your split is where you desire, press the right-click mouse button.

Your line segment has now been split. A new entry has been added to the attribute table. There are now two entries with duplicate attributes that together make up the original line segment as it was before you split it.

This screenshot shows a red line across a road. The red line indicates where the road will be split. Press the right-click mouse button to tell QGIS to “split now”.

After splitting, open the attribute table to see that you now have two features with identical attributes. 
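To make the idea concrete, here is a simplified sketch of splitting a line feature and reclassifying one half. It splits at an existing vertex, whereas QGIS splits wherever your red line crosses, so treat it as an illustration only. The “STREET” field name, the coordinates, and the function name are hypothetical; “TYPE” and its values come from the class list further down.

```python
import copy

def split_line_feature(feature, vertex_index):
    """Split a GeoJSON LineString feature at an interior vertex.

    Returns two new features that share the split vertex and carry
    identical copies of the original attributes.
    """
    coords = feature["geometry"]["coordinates"]
    if not 0 < vertex_index < len(coords) - 1:
        raise ValueError("split vertex must be an interior vertex")
    parts = (coords[: vertex_index + 1], coords[vertex_index:])
    pieces = []
    for part in parts:
        piece = copy.deepcopy(feature)  # duplicate attributes, like QGIS does
        piece["geometry"]["coordinates"] = [list(point) for point in part]
        pieces.append(piece)
    return pieces

# Hypothetical bikeway segment; TYPE "1" is "Existing bike lane".
bikeway = {
    "type": "Feature",
    "properties": {"STREET": "Elston Avenue", "TYPE": "1"},
    "geometry": {"type": "LineString",
                 "coordinates": [[0, 0], [1, 1], [2, 2], [3, 3]]},
}

first, second = split_line_feature(bikeway, 2)
second["properties"]["TYPE"] = "9"  # reclassify as "Existing buffered bike lane"
print(first["geometry"]["coordinates"], second["geometry"]["coordinates"])
```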

Copying features in QGIS

A second issue I had when creating new bikeways data was when a bikeway didn’t exist and I couldn’t reclassify it. This was the case on Franklin Boulevard: no bikeway had ever been installed there. I solved this problem by copying the relevant street segments from the Transportation (roads) shapefile and pasted them into the bikeways shapefile. New entries were created in the attribute table but with blank attributes. It was simple to fill in the street name, class, and extents.

Chicago bikeways GIS description

Bikeway classes (TYPE in the dataset) in the City of Chicago data portal are:

  1. Existing bike lane
  2. Existing marked shared lane
  3. Proposed on-street bikeway
  4. Recommended bike route
  5. Existing trail
  6. Proposed off-street trail
  7. Access path (to existing trail)
  8. Existing cycle track (also known as protected bike lane)
  9. Existing buffered bike lane

It remains to be seen if the City will identify the “enhanced marked shared lane” on Wells Street between Wacker Drive and Van Buren Street differently than “existing marked shared lane” in the data.

Initial intersection crash analysis for Milwaukee Avenue

Slightly upgraded Chicago Crash Browser

This screenshot from the Chicago Crash Browser map shows the location of bike-car collisions at Ogden/Milwaukee, an intersection that exemplifies the yellow trap problem the city hasn’t remedied.

This is a list of the most crash-prone intersections on Milwaukee Avenue in Chicago, using data from 2007-2009 for crashes reported to the Chicago Police Department. Dooring data is not included on the bike crash map. I used QGIS to draw a 50-foot buffer around the point where the intersection center lines meet.

Intersecting street (class 4*) | Bike crashes
Chicago Avenue (see Ogden below) | 12 (17)
California Avenue | 9
Halsted Street & Grand Avenue | 7
Damen Avenue & North Avenue | 6
Western Avenue | 6
Ogden Avenue (see Chicago above) | 5 (17)
Ashland Avenue | 5
Diversey Avenue | 5
Fullerton Avenue | 5
Elston Avenue | 5
Augusta Boulevard (not class 4) | 5

Combine the counts for the six-way intersection (with center triangle) of Ogden, Milwaukee, and Chicago, and you see 17 crashes. Add the 6 just outside the 50-foot buffer and you get 23 crashes. Compare this to the six-way intersection (without center triangle) at Halsted, Milwaukee, and Grand, where there are only 7 crashes.
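Counting crashes inside a buffer boils down to a distance check. This sketch assumes the points are in a projected coordinate system measured in feet (like EPSG:3435); the coordinates are made up and this is not the QGIS procedure itself.

```python
import math

BUFFER_FEET = 50

def crashes_within_buffer(center, crashes, radius=BUFFER_FEET):
    """Count crash points that fall within `radius` feet of the center point."""
    center_x, center_y = center
    return sum(
        1 for x, y in crashes
        if math.hypot(x - center_x, y - center_y) <= radius
    )

# Hypothetical intersection center and crash locations, in feet.
intersection = (1000.0, 1000.0)
crash_points = [
    (1010.0, 1020.0),  # about 22 feet away
    (1030.0, 1030.0),  # about 42 feet away
    (1051.0, 1000.0),  # 51 feet away, just outside the buffer
    (900.0, 900.0),    # about 141 feet away
]

count = crashes_within_buffer(intersection, crash_points)
print(count)
```

The third point shows how a crash just outside the radius drops out of the count, which is why the buffer size matters so much at large intersections.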

What about the two intersections causes such a difference in crashes? Let’s look at some data:

Measure | Ogden, Milwaukee, Chicago | Halsted, Milwaukee, Grand
Automobile traffic | Approx. 58,000 cars per day | Approx. 50,000 cars per day
Bicycle traffic | Not counted, but probably fewer than 3,100 bikes | More than 3,100 bikes per day*
Bus traffic | Two bus routes | Three bus routes
Intersection style | Island; three signal cycles | No island; one signal cycle

*Notes

Traffic counts are assumed estimates. Counts are taken on a single day, either Tuesday, Wednesday, or Thursday. Bike counts at Halsted/Milwaukee/Grand were actually taken on Milwaukee several hundred feet northwest of the intersection so DO NOT include people biking on Halsted or Grand! This means that more than 3,100 people are biking through the intersection each day.

Intersection style tells us which kind of six-way intersection it is. At island styles you’ll find a concrete traffic island separating the three streets. You’ll also find three signal cycles because there are actually three intersections instead of one, making it a 12-way intersection. Also at these intersections you’ll see confusing instructional signage like, “OBEY YOUR SIGNAL ONLY” and “ONCOMING TRAFFIC HAS LONGER GREEN.”

These intersections are more likely to have a “yellow trap” – Ogden/Milwaukee definitely has this problem. The yellow trap occurs at these intersections when northbound, left-turning motorists (from Milwaukee to Ogden) get a red light but still need to vacate the intersection. Thinking that oncoming drivers have a red light and are just being jerks by blowing through it (when in fact they still have a green for 5-10 more seconds), they turn and sometimes hit the southbound traffic. The City of Chicago acknowledged this problem, for bicyclists especially, in summer 2013 but as of November 2014 the issue remains.

Here’s a more lengthy description of one of the problems here as well as an extremely simple solution: install a left-turn arrow for northbound Milwaukee Avenue. The entire intersection is within Alderman Burnett’s Ward 27.

Source and method

I can’t yet tell you how I obtained this data or created the map. I’m still working out the specifics in my procedures log. It involved some manual work at the end because in the resulting table that counted the number of crashes per intersection, every intersection was repeated, but the street names were in opposite columns.
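The mirrored rows could probably be cleaned up with a script instead of by hand. This is a sketch of one way to do it, not a record of what I did: sort the two street names so both orderings produce the same key. The row layout is an assumption; the counts match the table above.

```python
# Hypothetical rows: every intersection appears twice, with the street
# names in opposite columns.
rows = [
    ("Milwaukee Avenue", "California Avenue", 9),
    ("California Avenue", "Milwaukee Avenue", 9),
    ("Milwaukee Avenue", "Western Avenue", 6),
    ("Western Avenue", "Milwaukee Avenue", 6),
]

def dedupe_intersections(rows):
    """Keep one crash count per intersection regardless of street order."""
    unique = {}
    for street_a, street_b, crashes in rows:
        key = tuple(sorted((street_a, street_b)))  # order-independent key
        unique.setdefault(key, crashes)
    return unique

unique = dedupe_intersections(rows)
for streets, crashes in unique.items():
    print(" & ".join(streets), crashes)
```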

Crash data from the Illinois Department of Transportation. Street data from the City of Chicago. Intersection data created with fTools in QGIS. To save time in this initial analysis, I only considered Milwaukee Avenue intersections with streets in the City of Chicago centerline file with a labeled CLASS of 1, 2, or 3.

My essential QGIS plugins

Plugins for QGIS I use most often.

All of these can be installed automatically by QGIS. Click on Plugins>Fetch Python Plugins. Then search for the plugin, click on its name, and click Install Plugin. Few plugins require a restart.

  • MMQGIS – Great for working with CSV files; also merges layers (even if they have differing attributes); has various other useful functions, including converting string data to float data. Has Voronoi diagram function (takes a long time to process).
  • fTools – Replicates some of the most basic geographic tools in ArcGIS, like Clip, Dissolve, and Reproject. Can also add X/Y values to point attribute tables that are missing them (if you want latitude/longitude, you must first reproject into a geographic coordinate reference system, like WGS84 [EPSG:4326]). Unfortunately, there’s little information on what each fTools function does. Below are descriptions:
    • Extract Nodes – Create a point at each vertex of a line or polygon feature.
    • Basic Statistics – Generate arithmetic statistics for fields, same as statistics function in ArcGIS. Great for quickly understanding the extent of values in a field (especially numeric values), like mean, max, min, standard deviation, and number of unique values.
    • Nearest Neighbour Analysis – More details here.
    • Geoprocessing Tools>Dissolve – Combine features based on a shared attribute. For example, all features with an identical STREET_TYPE will be combined into a single feature: all “Avenues” will become one feature and all “Boulevards” will become a second feature. Only works on polygon layers.
    • Description in Portuguese
  • Table Manager
  • Open Layers – Embed Google, Yahoo, Bing, and OpenStreetMap layers in your map. See my example.

Gaps

A map that focuses on striped bikeways in downtown Chicago.

When you look at your bikeways more abstractly, like in the graphic above, do you see deficiencies or gaps in the network? Anything glaring or odd?

It’s a simple exercise: Open up QGIS and load in the relevant geographic data for your city. For Chicago, I added the city boundary, hydrography and parks (for locational reference), and bike lanes and marked-shared lanes*. Symbolize the bikeways to stand out in a bright color. I had the Chicago Transit Authority stations overlaid, but I removed them because it minimized the “black hole of bikeways” I want to show.

What do you see?

Bigger impact map

This exercise could have more impact if it were visualized differently. You have to be familiar with downtown Chicago and the Loop to fully understand why it’s important to notice what’s missing. It’s an extremely office- and job-dense neighborhood. It also has one of the highest densities of students in the country; the number of people residing downtown continues to grow. If I had good data on how many workers and students there were per building, I could indicate that on the map to show just how many people are potentially affected by the lack of bicycle infrastructure that leads them to their jobs (or class) in the morning, and home in the evening. I don’t know how to account for all of the bicycling that goes through downtown just for events, like at Millennium and Grant Parks, the Cultural Center, and other theaters and venues.

*If you cannot find GIS data for your city, please let me know and I will try to help you find it. It should be available for your city as a matter of course.

Trying out uDig, a free, multi-platform GIS application

ArcGIS is the standard in geographic information system applications. I don’t like that it’s expensive, unwieldy to install and update, and its user interface is stymying and slow*. I also use Mac OS X most of the time and ArcGIS is not available for Mac. It doesn’t have to be the standard.

I’ve tried my hand at Cartographica and QGIS. I really like QGIS because there are many plugins, it’s open source, there’s a diverse community supporting it, and best of all, it’s free. I’ve written about Cartographica once – I’m not a fan right now.

My project

  • The data: Bicycle crashes in the City of Chicago as reported to IDOT for 2007-2009
  • Goal: Publish an interactive map of this data using Google Fusion Tables and its instant mapping feature.
  • Visualizing it: Added streets (prepared beforehand to exclude highways), water features, and city boundary (get that here)
  • Process: Combine bike crash data; reproject to WGS84 for Google; remove extraneous information; add latitude/longitude coordinates; export as CSV; upload to Google Fusion Tables; map it!
  • View the final product

Trying out uDig

In reaching my goal I had a task that I couldn’t figure out how to complete with QGIS: I needed to combine three shapefiles with identical table schemes into one shapefile – this one shapefile would eventually be published as one map. The join feature in fTools wasn’t working so I looked for a new solution, uDig, or “User-friendly Desktop Internet GIS.”

The solution was very easy. Highlight all the records in the attribute table of one shapefile, click Edit>Copy, then select the destination table and click Edit>Paste. The new records were added within a couple seconds. I could then bring this data back into QGIS to finish the process (outlined above under Project). I did use fTools later in the process to add lat/long coordinates to my single shapefile.
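The copy-and-paste merge works because the three tables share the same fields. As a rough illustration (this is not what uDig does internally), the same merge on GeoJSON-style layers could look like the sketch below; the sample layers, the “crash_year” field, and the function name are hypothetical.

```python
def merge_layers(layers):
    """Append the features of several layers that share identical fields."""
    expected_fields = None
    merged = []
    for layer in layers:
        for feature in layer["features"]:
            fields = tuple(sorted(feature["properties"]))
            if expected_fields is None:
                expected_fields = fields
            elif fields != expected_fields:
                raise ValueError(f"schema mismatch: {fields} != {expected_fields}")
            merged.append(feature)
    return {"type": "FeatureCollection", "features": merged}

def crash_layer(year, lon, lat):
    """Build a one-feature hypothetical crash layer for the given year."""
    feature = {
        "type": "Feature",
        "properties": {"crash_year": year},
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
    }
    return {"type": "FeatureCollection", "features": [feature]}

crashes_2007 = crash_layer(2007, -87.65, 41.90)
crashes_2008 = crash_layer(2008, -87.66, 41.91)
crashes_2009 = crash_layer(2009, -87.67, 41.92)

all_crashes = merge_layers([crashes_2007, crashes_2008, crashes_2009])
print(len(all_crashes["features"]))
```

The schema check is the part worth keeping: pasting records between tables with different columns is where merges like this usually go wrong.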

After adding more data to better visualize the crashes in Chicago, I noticed that uDig renders maps to look smoother and slightly prettier than QGIS or ArcGIS. See the screenshot below.

A screenshot of the three bicycle crash datasets (2007, 2008, 2009) with the visualization data added.

The end product: three years of police reported bicycle crashes in the City of Chicago on an interactive map powered by Google Fusion Tables, another product in Google’s arsenal of GIS for the poor man. View the final product.

*I haven’t used ArcGIS version 10 yet, which I see and read has an improved user interface; it’s unclear to me and other users if the program’s been updated to take advantage of multi-core processors. ESRI has a roundabout way of describing their support.

How to convert GTFS to GIS shapefiles and KML

This tutorial will teach you how to convert any transit agency’s General Transit Feed Specification (GTFS) data into ESRI ArcGIS-compatible shapefiles (.shp), KML, or XML. This is simple to do because GTFS data is essentially a collection of CSV (comma separated values) text files (really, really large text files).

Note: I don’t know how to do the reverse, converting shapefiles or other geodata into GTFS data. I’m not sure if this is possible and I’m still investigating it. If you have tips, let me know.

Converting GTFS to GIS shapefiles

Instructions require the use of ArcGIS (Windows only) and a free plugin called ET GeoWizards GIS for any version of ArcGIS. I do not have instructions for Mac users at this time.

I wrote these instructions while converting the Chicago Transit Authority’s GTFS files into shapefiles based on a reader’s request. “Field names” are quoted and layer names are italicized.

  1. Download the GTFS data you want. Find data from agencies around the world (although not many from Europe) on GTFS Data Exchange.
  2. Import into ArcGIS the shapes.txt file using Tools>Add XY Data. Specify Y=lat and X=lon
  3. Using ET GeoWizards GIS tools, in the Convert tab, convert the points shapefile to polyline.
  4. Select the shapes layer in the wizard, then create a destination file. Click Next.
  5. Select the “shape_id” field
  6. Click the checkbox next to Order and select the field “shape_pt_sequence” and click Finish.
  7. Depending on the number of records (the CTA has 466,000 shapes), it may take a while.
  8. The new shapefile will be added to your Table of Contents and appear in your map.
  9. Import the trips.txt and routes.txt files. Inspect them for any NULL values in the “route_id” field. You will be using this field to join the routes and trips table. It may be the case that ArcGIS imported them incorrectly; the text files will show the correct data. This happens because ArcGIS inspected some of the data, determined the values were integers, and ignored text values. If NULL values appear, follow steps 10 and 11 and continue. If not, follow steps 10 and 12 and continue.
  10. Export the text files as DBF files so that ArcGIS operates on them better. Then remove the text files from the Table of Contents.
  11. (Only if NULL values appear) Go into editing mode and fix the NULL values you noticed in step 9. You may have to make a new column with a more forgiving data type (string) and then copy the “route_id” column into the new column. Then continue to step 12.
  12. Join routes and trips based on the field “route_id” – export as trips_routes.dbf
  13. Add a new column to shapes.shp called “shape_id2”, with data type double 18, 11. This is so we can perform step 14. Use the field calculator to copy the values from “shape_id” (also known as ET_ID) to “shape_id2”
  14. Join trips_routes with shapes into routes_poly based on the field “shape_id” (and “shape_id2”)
  15. Dissolve routes_poly on “route_id.” Make sure all selections are cleared. Use statistics/summary fields: “route_long,” “route_url.” Save as routes_diss.shp
  16. Inspect the new shapefile to ensure it was created correctly. You may notice that some bus routes don’t have names. Since these routes are well documented on the CTA website, I’m not going to fill in their names.
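The joins in steps 12 through 15 amount to following “shape_id” to trips and then “route_id” to routes. Here is a sketch of that lookup in Python, assuming the standard GTFS column names; the sample rows and the function name `shapes_by_route` are made up. Reading “route_id” as text avoids the NULL problem from step 9 for IDs that contain letters.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows in the layout of GTFS trips.txt and routes.txt.
SAMPLE_TRIPS = """route_id,service_id,trip_id,shape_id
X9,1,t1,300
X9,1,t2,300
X9,1,t3,301
22,1,t4,400
"""

SAMPLE_ROUTES = """route_id,route_long_name
X9,Example Express
22,Example Local
"""

def shapes_by_route(trips_text, routes_text):
    """Collect the distinct shape_ids used by each route, with the route name."""
    names = {
        row["route_id"]: row["route_long_name"]
        for row in csv.DictReader(io.StringIO(routes_text))
    }
    shape_ids = defaultdict(set)
    for trip in csv.DictReader(io.StringIO(trips_text)):
        # route_id stays a string, so "X9" is not lost to an integer cast.
        shape_ids[trip["route_id"]].add(trip["shape_id"])
    return {
        route_id: {
            "route_long_name": names.get(route_id, ""),
            "shape_ids": sorted(ids),
        }
        for route_id, ids in shape_ids.items()
    }

routes = shapes_by_route(SAMPLE_TRIPS, SAMPLE_ROUTES)
print(routes["X9"])
```

Grouping the shapes under one route like this is the same idea as the dissolve on “route_id” in step 15.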

Click on the screenshot to see various steps in the tutorials.

Converting GTFS to KML

After you have it in shapefile form, converting to KML is easy – follow these instructions for using QGIS. Or if you want to skip the shapefile-creation process (quite involved!), you can use KMLWriter, a Python script. Also, I think the latest version of ArcGIS has built-in KML exporting.

Converting GTFS to XML

If you want to convert the GTFS data (which are essentially comma-separated value – CSV – files) to XML, that’s easier and you can avoid using GIS programs.

  • First try Mr. Data Converter (very user friendly).
  • If that doesn’t work, try this website form on Creativyst. I tested it by converting the CTA’s smallest GTFS table, frequencies.txt, and it worked properly. However, it has a data size limit. (User friendly.)
  • Next try csv2xml, a command line tool. (Not user friendly.)
  • You can also use Microsoft Excel, but read these tips and caveats first. (I haven’t found a Microsoft application I like or think is user friendly.)
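If none of those tools suit you, a short script can do the conversion too. This is a minimal sketch using Python's standard library, assuming every column name is also a valid XML tag name (true for GTFS field names such as those in frequencies.txt). The sample rows, the tag names “rows” and “row”, and the function name are my own.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical rows in the layout of a GTFS frequencies.txt file.
SAMPLE_FREQUENCIES = """trip_id,start_time,end_time,headway_secs
t1,06:00:00,09:00:00,600
t1,09:00:00,15:00:00,900
"""

def csv_to_xml(text, root_tag="rows", row_tag="row"):
    """Convert CSV text to XML with one element per row and per column."""
    root = ET.Element(root_tag)
    for record in csv.DictReader(io.StringIO(text)):
        row = ET.SubElement(root, row_tag)
        for field, value in record.items():
            ET.SubElement(row, field).text = value
    return ET.tostring(root, encoding="unicode")

xml_text = csv_to_xml(SAMPLE_FREQUENCIES)
print(xml_text)
```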

© 2016 Steven Can Plan
