Category: Software

How to convert GTFS to GIS shapefiles and KML

This tutorial will teach you how to convert any transit agency’s General Transit Feed Specification (GTFS) data into ESRI ArcGIS-compatible shapefiles (.shp), KML, or XML. This is simple to do because GTFS data is essentially a collection of CSV (comma-separated values) text files (really, really large text files).

Note: I don’t know how to do the reverse, converting shapefiles or other geodata into GTFS data. I’m not sure if this is possible and I’m still investigating it. If you have tips, let me know.

Converting GTFS to GIS shapefiles

Instructions require the use of ArcGIS (Windows only) and a free plugin called ET GeoWizards GIS for any version of ArcGIS. I do not have instructions for Mac users at this time.

I wrote these instructions while converting the Chicago Transit Authority’s GTFS files into shapefiles based on a reader’s request. “Field names” are quoted and layer names are italicized.

  1. Download the GTFS data you want. Find data from agencies around the world (although not many from Europe) on GTFS Data Exchange.
  2. Import the shapes.txt file into ArcGIS using Tools>Add XY Data. Specify Y=lat and X=lon (the “shape_pt_lat” and “shape_pt_lon” fields).
  3. Using ET GeoWizards GIS tools, in the Convert tab, convert the points shapefile to polyline.
  4. Select the shapes layer in the wizard, then create a destination file. Click Next.
  5. Select the “shape_id” field
  6. Click the checkbox next to Order and select the field “shape_pt_sequence” and click Finish.
  7. Depending on the number of records (the CTA has 466,000 shapes), it may take a while.
  8. The new shapefile will be added to your Table of Contents and appear in your map.
  9. Import the trips.txt and routes.txt files. Inspect them for any NULL values in the “route_id” field. You will be using this field to join the routes and trips tables. NULL values usually mean ArcGIS imported the field incorrectly: it inspected some of the data, decided the field held integers, and ignored the text values. The text files will show the correct data. If NULL values appear, follow steps 10 and 11 and continue. If not, follow steps 10 and 12 and continue.
  10. Export the text files as DBF files so that ArcGIS operates on them better. Then remove the text files from the Table of Contents.
  11. (Only if NULL values appear) Go into editing mode and fix the NULL values you noticed in step 9. You may have to make a new column with a more forgiving data type (string) and then copy the “route_id” column into the new column. Then continue to step 12.
  12. Join routes and trips based on the field “route_id” – export as trips_routes.dbf
  13. Add a new column to shapes.shp called “shape_id2”, with data type double 18, 11. This is so we can perform step 14. Use the field calculator to copy the values from “shape_id” (also known as ET_ID) to “shape_id2”
  14. Join trips_routes with shapes into routes_poly based on the field “shape_id” (and “shape_id2”)
  15. Dissolve routes_poly on “route_id.” Make sure all selections are cleared. Use statistics/summary fields: “route_long,” “route_url.” Save as routes_diss.shp
  16. Inspect the new shapefile to ensure it was created correctly. You may notice that some bus routes don’t have names. Since these routes are well documented on the CTA website, I’m not going to fill in their names.
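
If you don’t have ArcGIS, the core of steps 2 through 6 (group the points in shapes.txt by “shape_id” and order them by “shape_pt_sequence”) can be sketched in plain Python. This is only a rough sketch, not what ET GeoWizards does internally. The function name build_polylines and the sample rows are made up for illustration; the column names come from the GTFS spec.

```python
import csv
import io
from collections import defaultdict


def build_polylines(shapes_file):
    """Group shapes.txt rows into one ordered list of (lon, lat) per shape_id."""
    points = defaultdict(list)
    for row in csv.DictReader(shapes_file):
        # shape_id stays a string, so text IDs are never coerced to numbers.
        points[row["shape_id"]].append(
            (
                int(row["shape_pt_sequence"]),
                float(row["shape_pt_lon"]),
                float(row["shape_pt_lat"]),
            )
        )
    # Sort each shape's points by sequence number, then drop the sequence.
    return {
        shape_id: [(lon, lat) for _, lon, lat in sorted(pts)]
        for shape_id, pts in points.items()
    }


# Made-up sample rows. For a real feed you would pass something like
# open("shapes.txt", newline="", encoding="utf-8-sig") instead.
sample = io.StringIO(
    "shape_id,shape_pt_lat,shape_pt_lon,shape_pt_sequence\n"
    "A,41.90,-87.63,2\n"
    "A,41.88,-87.63,1\n"
    "B,41.85,-87.65,1\n"
)
polylines = build_polylines(sample)
print(polylines["A"])
```

Keeping “shape_id” as text sidesteps the kind of data type guessing described in step 9.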

Click on the screenshot to see various steps in the tutorials.

Converting GTFS to KML

After you have it in shapefile form, converting to KML is easy – follow these instructions for using QGIS. Or if you want to skip the shapefile-creation process (quite involved!), you can use KMLWriter, a Python script. Also, I think the latest version of ArcGIS has built-in KML exporting.
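
For the curious, here is a minimal sketch of what a script-based route to KML might look like. I have not checked how KMLWriter works, so treat this as my own illustration: the function polylines_to_kml and its input (a dictionary mapping each shape ID to a list of lon/lat pairs) are assumptions, and it uses only Python’s standard library.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def polylines_to_kml(polylines):
    """Return a KML string with one Placemark/LineString per shape."""
    ET.register_namespace("", KML_NS)  # write KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for shape_id, coords in polylines.items():
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = str(shape_id)
        line = ET.SubElement(placemark, f"{{{KML_NS}}}LineString")
        # KML expects "lon,lat" tuples separated by whitespace.
        ET.SubElement(line, f"{{{KML_NS}}}coordinates").text = " ".join(
            f"{lon},{lat}" for lon, lat in coords
        )
    return ET.tostring(kml, encoding="unicode")


# Made-up coordinates for a two-point line.
kml_text = polylines_to_kml({"A": [(-87.63, 41.88), (-87.63, 41.9)]})
print(kml_text)
```

Note the coordinate order: KML wants longitude first, which is the opposite of how most people say “lat/long.”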

Converting GTFS to XML

If you want to convert the GTFS data (which are essentially comma-separated value – CSV – files) to XML, that’s easier and you can avoid using GIS programs.

  • First try Mr. Data Converter (very user friendly).
  • If that doesn’t work, try this website form on Creativyst. I tested it by converting the CTA’s smallest GTFS table, frequencies.txt, and it worked properly. However, it has a data size limit. (User friendly.)
  • Next try csv2xml, a command line tool. (Not user friendly.)
  • You can also use Microsoft Excel, but read these tips and caveats first. (I haven’t found a Microsoft application I like or think is user friendly.)
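
If none of those tools fit, the conversion is small enough to sketch yourself. The Python below is a minimal example using only the standard library. The function name csv_to_xml, the “table” and “row” tag names, and the sample frequencies.txt row are all made up for illustration, and it assumes every column header is a valid XML tag name (GTFS headers are).

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_file, root_tag="table", row_tag="row"):
    """Turn each CSV record into a <row> element with one child per column."""
    root = ET.Element(root_tag)
    for record in csv.DictReader(csv_file):
        row = ET.SubElement(root, row_tag)
        for field, value in record.items():
            ET.SubElement(row, field).text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample in the shape of a GTFS frequencies.txt table.
sample = io.StringIO(
    "trip_id,start_time,end_time,headway_secs\n"
    "T1,06:00:00,09:00:00,600\n"
)
xml_text = csv_to_xml(sample)
print(xml_text)
```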

How to geocode a single address in QGIS

Since the last time I wrote about how to use BatchGeocode.com to perform pseudo-geocoding tasks in QGIS, there have been considerable improvements in the multi-platform, free, and open source GIS software. Now, geocoding (turning addresses into coordinates) is more automatic, albeit difficult to set up. (Okay, this has been around since June 2009 and I just found out about it in October 2010.)

Once you install all the components, you’ll never have to do this again.

This method can only geocode one address at a time, but it will geocode all of the addresses into a single shapefile.

  1. Download QGIS.
  2. Download and install Python SetupTools. This includes the easy_install function that will download a necessary Python script, simplejson. On Mac you will have to use the Terminal (Applications>Utilities). Email me if you run into problems.
  3. Install simplejson. In the command line (Terminal for Mac; in Windows press Start>Run>”cmd”>Enter), type “easy_install simplejson”.
  4. Download the GeoCode plugin by Alessandro Pasotti via QGIS>Plugins>Fetch Python Plugins. You may have to load additional repositories to see it.
  5. Install geopy. In the command line (like step 3), type “easy_install geopy”.
  6. Specify your project’s projection in File>Project Properties.
  7. Get a Google Maps API key and tell the GeoCode plugin about it (QGIS>Plugins>GeoCode>Settings). You will need a Google account. If you don’t have your own domain name, you can just enter “google.com” when it asks for your domain.
  8. Geocode your first address by clicking on Plugins>GeoCode>Geocode. Type the full address (e.g. 121 N LaSalle Street, Chicago, IL for City Hall).
  9. The geocoded address will then appear in your Layers list as its own shapefile. All addresses geocoded (or reverse geocoded) in this project will appear in the same layer (therefore same attribute table).

Once you install all the components, you’ll never have to do this again. Geocoding will be available each and every time you use QGIS in the future on that workstation.
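
Outside of QGIS, the same idea (geocode addresses one at a time and collect them all into one table) can be sketched in a few lines of Python. This is an illustration only, not how the GeoCode plugin is written. The function geocode_all is a made-up name, and it takes whatever geocoding function you hand it. The commented lines show how geopy’s Nominatim geocoder could be plugged in; that is a different service from the Google one in step 7, and it needs a network connection. To keep the example runnable offline, a stand-in geocoder returns placeholder coordinates roughly in downtown Chicago.

```python
def geocode_all(addresses, geocode):
    """Geocode each address with the supplied callable; collect rows in one list."""
    results = []
    for address in addresses:
        location = geocode(address)
        if location is None:  # nothing matched, so skip this address
            continue
        results.append(
            {"address": address, "lat": location.latitude, "lon": location.longitude}
        )
    return results


# With geopy installed, a real geocoder could be passed in instead, e.g.:
#   from geopy.geocoders import Nominatim
#   geocode = Nominatim(user_agent="my-geocoding-notes").geocode
# The stand-in below is hypothetical and its coordinates are placeholders.
class FakeLocation:
    def __init__(self, latitude, longitude):
        self.latitude = latitude
        self.longitude = longitude


def fake_geocode(address):
    known = {"121 N LaSalle Street, Chicago, IL": FakeLocation(41.8838, -87.632)}
    return known.get(address)


rows = geocode_all(["121 N LaSalle Street, Chicago, IL", "nowhere"], fake_geocode)
print(rows)
```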

Tips

  • When you’re done geocoding, save your results as a shapefile (right click the layer and click “Save as shapefile”). Twice I’ve lost my results after saving the project and quitting QGIS. When I reopened the project, the results layer was still listed, but contained no data.
  • Add a “name” column to the GeoCoding Plugin Results layer’s attribute table (toggle editing first). You can then type in the name of the building or destination at the address you geocoded. Edit the layer’s properties to have that name appear as a label for the point.

A map I made with QGIS showing three geocoded points of interest in Chicago. Data from City of Chicago’s GIS team.

 

Trying out new GIS software

I want to draw 50- and 120-foot buffers around the points of store entrances to show where bike parking should and shouldn’t be installed. I want to follow this example:


Aerial photo of a Tucson, Arizona, Walgreens showing the location of existing bike parking and two buffers (50 and 120 feet) where proposed city rules would allow bike parking. I advocate for ratifying the 50 feet rule, which I’ve discussed on this blog and elsewhere many times.

I want to do this easily and accurately, so I will use GIS software to create a “buffer.” I use QGIS occasionally, but I want to try out other Mac-friendly applications. I’m getting my orthoimagery (geometrically corrected aerial photography) from the United States Geological Survey (USGS) using a web protocol called Web Map Service (WMS). I’m trying:

  • Cartographica, $495, with free trial license.
  • uDig, completely free software. UPDATE: I have had NO success getting any data to load from a WMS connection into uDig. I would like to understand why. Cartographica can obtain some of the WMS-stored data I want, although it messes up often.

I’m having success with neither – both are having issues downloading or maintaining a connection to the USGS orthoimagery. In one case, Cartographica trims the Bing Maps imagery to match the extent of my other objects (the buffer). In another case, it won’t even download the USGS imagery (and gives no indication that anything is happening). uDig hasn’t been able to download anything so far – I hope it’s asking for the current extent, instead of all data, because it’s taking a looong time to do anything (so long that I just quit in the middle of it).

This screenshot shows how to add new WMS connections to Cartographica.

UPDATE: I did it! I successfully used Cartographica (and the integrated Bing Maps) to create this drawing that shows the current (abysmal) bike parking at a Chicago Home Depot outside the 50-foot line.
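
If all you need is to check which buffer a bike rack falls in, and not to draw it, the test behind a buffer is just a distance comparison. Here is a rough Python sketch under my own assumptions: the haversine formula on a spherical Earth (plenty accurate at this scale), made-up function names, and made-up coordinates. It is not what Cartographica or QGIS do internally.

```python
import math

EARTH_RADIUS_FT = 6_371_000 / 0.3048  # mean Earth radius in meters, converted to feet


def distance_ft(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two lat/lon points, in feet."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    return 2 * EARTH_RADIUS_FT * math.asin(math.sqrt(a))


def buffer_zone(entrance, rack, inner_ft=50, outer_ft=120):
    """Say which buffer around the entrance the rack falls in."""
    d = distance_ft(entrance[0], entrance[1], rack[0], rack[1])
    if d <= inner_ft:
        return "inside inner buffer"
    if d <= outer_ft:
        return "inside outer buffer"
    return "outside both buffers"


# Made-up (lat, lon) points: a store entrance and three rack locations.
entrance = (41.0, -87.0)
for rack in [(41.0001, -87.0), (41.0003, -87.0), (41.0005, -87.0)]:
    print(round(distance_ft(*entrance, *rack)), "ft:", buffer_zone(entrance, rack))
```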

Google Maps and Earth is the poor man’s GIS

Over the past four years, Google’s geography products have become the most popular geographic information systems on the Earth (no, the earth). Google is now as much a GIS platform for computers and users as ESRI, the number one GIS software maker.

To continue its corporate goal of organizing the world’s information, Google has made sure to also organize the world’s (and other realms’) geographic information.

Google’s free tools and products manipulate, map, reproduce and analyze geographic information:

  • Maps – the simplest source of satellite imagery for the public, although Microsoft’s TerraServer was probably first
  • Street View
  • Transit – including travel directions for trips on Transit
  • Ocean
  • Earth desktop software – includes Moon, Mars, Sky
  • My Maps
  • Yellow pages-style business listings
  • Driving and Walking Directions – including automobile traffic overlay
  • Keyhole Markup Language (KML) – a file format based on XML that allows for the easy sharing and portability of data about locations. I wrote about it here.
  • Maps API – this allows developers to include maps in their own applications and websites as well as build features on top of maps

These applications now allow anyone in the world with an internet connection* and a computer to start thinking about the world and neighborhood in which they live in terms of space, distance, the environment, land use, and most important of all the relationships between real life places and these greater themes. But not only will these instruments influence the thinking of individuals and the groups to which they belong, but they will give people tools to create.

What have people created with Google’s GIS tools?

I created a map that shows the locations of open grated metal bridges on bikeways (featured in the bike map) in Chicago. This is important to bicyclists because open grated metal bridges can be hazardous to them, especially those with high centers of gravity or narrow tires on their bikes. Bicyclists will most often encounter these bridges on trips into and out of the Central Business District. This map will help bicyclists find routes that avoid these bridges. Precipitation exacerbates the danger, especially if it’s actively raining, or snow isn’t melting.

UPDATE 12-03-10: I was looking for information on an upcoming Chicago Cyclocross meet and I found a great example of using the tools Google has created for everyone. See a screenshot of the map below:

I’m posting this image to show how easy it is to create a map that tells a story. The story here is a guide on how to be a participant or spectator at the meet. It points out places where people can park, cannot park, and where the restrooms are in relation to parking or the race course. See the full map.

What have you created? Leave a comment below.

Evolution of Google’s GIS toolbox

I believe that Google will continue to expand its array of GIS-related applications, and also expand their existing ones. I would like to see them create new connections between the applications they’ve already created. For example:

  • Google can mimic the attribute table essential in desktop GIS software (like ESRI’s ArcGIS, QGIS, or GRASS) by integrating their Docs web application with My Maps. I want to save my information in a Google Docs spreadsheet (either inputted directly online or uploaded from my computer), then create a custom map and assign a location to each of the records in my spreadsheet. Then, using tools shared between Docs and My Maps, I can automate the creation of colored points and lines for the records based on categories or numbers in my spreadsheet, much like the classification and symbology tools of desktop GIS software. For example, on my “open grated metal bridges” custom map discussed above, I want to create a spreadsheet with a column that has a yes or no value to the question, “Is the bridge treated?” All records with “yes” will have green dots, and all “no” values will have blue dots.
  • The reverse situation could also be made possible by an integration between My Maps and Google Docs. Let’s say I’m a clerk at my church and I need to group the congregants into geographically close clusters for purposes of assigning community service work. I’ve inputted all of their addresses into My Maps and added a point for every house. There are only 40 houses on the map and I can see about 5 clusters (to keep it simple I won’t introduce arithmetic means of finding clusters). I use a selection lasso in My Maps and select the points in my first cluster. Using a new Classify function I label these points part of Cluster 1 and color them purple – I also assign Cluster 1 to work at the nearest park. I continue for the remaining four clusters, assigning each cluster to help clean a different park. Once I’ve completed grouping the houses, I tell My Maps to generate for me a spreadsheet that lists the names, phone numbers, and clean-up time for all the congregants. Now I can quickly call everyone in Cluster 1 and give them their community service assignment, which is convenient to where they live.
  • Google should open up its many data layers. Google has many data layers in its table of contents: They recently added real estate data, but they also have the locations of transit stations and bus stops (including timetables and route information), the addresses and phone numbers of businesses (like the Yellow Pages), as well as terrain in some cases and bike trails in others. If the data in these layers were open, map users could perform some basic analysis like counting the number of check cashing businesses within 1 mile for a study of banking behavior in low-income neighborhoods. Or a map user could find the gain in elevation on a bike trail over 4 miles to determine their ride’s difficulty. Another map user could use the transit information to calculate the level of bus service in a neighborhood by counting the number of stops available and the number of buses scheduled.
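
To make the first wish concrete, here is a toy Python sketch of the “Is the bridge treated?” classification: read spreadsheet rows and assign a dot color based on one column. Everything here is hypothetical, including the function name classify, the column names, and the sample bridges. It only shows the kind of rule I would want Docs and My Maps to share.

```python
import csv
import io

# Color rule from the example: "yes" rows get green dots, "no" rows get blue.
TREATED_COLORS = {"yes": "green", "no": "blue"}


def classify(rows, column, colors, default="gray"):
    """Return copies of the rows with a 'color' chosen from the given column."""
    classified = []
    for row in rows:
        key = row[column].strip().lower()
        classified.append(dict(row, color=colors.get(key, default)))
    return classified


# Made-up spreadsheet contents.
sheet = io.StringIO(
    "bridge,treated\n"
    "Bridge A,Yes\n"
    "Bridge B,no\n"
    "Bridge C,unknown\n"
)
dots = classify(list(csv.DictReader(sheet)), "treated", TREATED_COLORS)
for dot in dots:
    print(dot["bridge"], "->", dot["color"])
```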

I’ll have to figure out a way Google can extract revenue from these features if I want to convince Google to produce them, but sometimes the company builds products and features before it figures out how to make money.

The importance of sharing data in KML format

The KML file is an important format in which to share locational data. KML was developed by a company called Keyhole. Google purchased Keyhole in 2004 and subsequently released its flagship product: Earth.

A Keyhole Markup Language file is a way to display on a map (particularly a 3D globe of Earth) a collection of points with a defined style. Google has added more functionality and style to the KML format, expanding the styles that can be applied and the information that can be embedded.

KML, like XML (eXtensible Markup Language), is extremely web-friendly. For a web application I developed at work, I included this PHP class that creates a KML file on demand based on a predefined database query. The file contains locations and attributes of recently installed bike racks in Chicago. EveryBlock imports the file and its information into their location-based service, aggregating many news types around your block.

But a KML file is more important than being the native file for use within Google Earth. It’s a plain text file in an open format that can be manipulated by a number of software programs on any computer system on earth (or read on a printed page). Unlike shapefiles, it isn’t stored in a binary encoding, so I can read the file with my own mind and understand the data it would present in a compatible map viewer. I see lines of organized syntax describing points and polygons, listing their attributes in plain language.

Have you ever tried to see the “inside” of a shapefile? Only GIS programs can read them for you. KML provides data producers and consumers the opportunities to keep data open, available, and easy to use. We need locational data for our work, and we need tools to help us use it, not hide it.
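
To show how approachable the format is, here is a short Python sketch that reads placemark names and coordinates out of a KML document using nothing but the standard library’s XML parser. The sample document, the function name read_points, and the coordinates are made up for illustration. It only handles simple Point placemarks.

```python
import xml.etree.ElementTree as ET

KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# A tiny made-up KML document: readable by eye, no GIS program required.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Bike rack (made-up example)</name>
      <Point><coordinates>-87.632,41.8838,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>"""


def read_points(kml_text):
    """Return (name, lat, lon) for every Placemark that contains a Point."""
    root = ET.fromstring(kml_text)
    points = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        coords = placemark.findtext("kml:Point/kml:coordinates", namespaces=KML_NS)
        if coords is None:  # not a Point placemark, so skip it
            continue
        # KML stores coordinates as lon,lat[,altitude].
        lon, lat = (float(value) for value in coords.strip().split(",")[:2])
        points.append((name, lat, lon))
    return points


print(read_points(SAMPLE_KML))
```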

Converting shapefiles and KML files

Google Earth Pro is a slightly more advanced version than the free edition of the popular satellite imagery application (okay, it does way more, but many people just use that feature). One major additional feature it includes is the ability to import GIS shapefiles and display their features on top of the imagery, including terrain. It’s useful to have your data as KML (Keyhole Markup Language) because KML (or KMZ) is easier to share and Google Earth standard edition is free. But then again, it’s useful to have your KML files as shapefiles because proper GIS software is more powerful at analyzing data. Also, someone might ask you for your data in shapefile format (but they could easily follow these instructions).

Good data management requires options. Options mean your data won’t be locked into a proprietary format. Data want to be free! Read on for ways to convert your KML and shapefile data:

Converting KML files to shapefiles

Like Google Earth Pro, Quantum GIS (QGIS) can convert KML or KMZ to shapefile, and best of all – it doesn’t cost $400 per year (it’s free!). QGIS is a cross-platform application meaning it will run on Windows, Mac OS X, and Linux.

Use QGIS to convert a KML or KMZ file to shapefile:

  1. Click on Layer > Add Vector Layer
  2. Find your KML or KMZ file.
  3. Right-click your new layer and click “Save as shapefile.”

Zonums provides online conversion tools. Or, use ArcGIS and this plugin to convert KML files to shapefiles.

Converting shapefiles to KML files

The freeware Shp2kml 2.0 (Windows only) from Zonums will convert shapefiles to KML files. Want some free, interesting data to try it out? Check my ever expanding repository.

ESRI’s ArcGIS can convert KML files to shapefiles using this plugin and then import the shapefile as a layer onto your map.

Creating KML files online

As I described in this post, BatchGeocode will generate a KML file for you from a list of addresses and names that you input. Additionally, Google Earth (part of the rising Google GIS platform) creates KML files. Google’s My Maps feature also allows you to generate KML files (for sharing or download) by clicking and drawing points and lines on a map and inviting you to describe the features you create. Use this to get a map of your church congregation, or a map of people who voted for your candidate.

GeoCommons Finder lets you upload geodata in many formats, save it to your profile, and then download it in multiple formats. You can upload a shapefile (.shp) and its accompanying files (shx, dbf, and prj), verify that it read your data correctly,

More choices for converting

Additional software with conversion capabilities:

  • MapWindow (another free software choice; Windows only) – An alternative to QuantumGIS and ArcGIS.
  • ExpertGPS (Windows only, not free) – Ideal for GPS device owners, or for researchers using GPS devices in projects. It can also convert GPS and shapefile data into KML, shapefiles, or a spreadsheet, amongst other functions.
  • Zonums, creator of the standalone Shp2kml software converter, now offers many online tools for KML users, including one that reverses the conversion and exports shapefiles from KML files. I found the link on FreeGeographyTools.com.
  • OpenGeo Suite – Commercial software with non-profit licenses.
  • uDig – Free GIS software, but I haven’t had good experiences with it on my computers.

GeoCommunity has a good article, with screenshots, on how some of these programs work.

Need to work with General Transit Feed Spec (GTFS) data?

Geocoding in Quantum GIS – QGIS

Geocoding is the process of turning street addresses into geographic coordinates. You can geocode easily in QGIS using several methods.

If you just want to geocode and you don’t need to see the addresses plotted on a map in QGIS, then follow these instructions. If you don’t need to see them on a map nor do you need the geographic coordinates, then use BatchGeocode.

If you only need to geocode a single address and get its coordinates immediately, use geocoder.us.

An example geocoded address on the map using the “single address” method.

How to geocode multiple addresses in QGIS

UPDATE April 11, 2013: Updated the directions because the “Add delimited text layer” function moved from the Plugins to Layer menu. 

UPDATE March 24, 2011: I updated the directions to use GPS Visualizer instead of BatchGeocode.com because BG stopped giving geographic coordinates in its output.

Get directions on geocoding a single address in QGIS with a plugin.

QGIS is an open-source Geographic Information Systems (GIS) application that has been gaining ground since 2004. It runs on all operating systems (it began as a Linux project) and you can download it for free.

I use it often because ESRI doesn’t make the popular ArcGIS software for Mac. That’s unfortunate, but like I said here, software, technology and mapping issues can be easily overcome – we can use QGIS to create maps. QGIS, though, is missing one major feature for basic map building: geocoding.

Here’s a step-by-step tutorial on how to bring in multiple street addresses and their XY coordinates into your QGIS map en masse:

© 2017 Steven Can Plan
