TagQGIS

How to convert bike-share JSON data to CSV and then to shapefile

Update January 4, 2013: The easiest way to do this is to use Ian Dees’s Divvy API as it outputs straight to GeoJSON (which QGIS likes). See below.

For Michael Carney’s Divvy bike-share stations + Census tract + unbanked Chicagoans analysis and map he needed the Divvy station locations as a shapefile. I copied the JSON-formatted text of the Divvy real-time station API, converted it to CSV with OpenRefine, and then created a shapefile with QGIS.

Here’s how to create a shapefile of any bike-share system that uses hardware from Public Bike System Co based on Montréal and is operated by Alta Bicycle Share (this includes New York City, Chattanooga, Bay Area, Melbourne, and Chicago): Continue reading

How many miles of roads are in your ward?

Screenshot 1: Showing how some streets are not being counted. There should be a yellow section of road between the two existing yellow road sections. 

A friend recently asked me how many blocks of road are in his ward. He wanted to know so that he could measure how many blocks of streets would have an older style of street lighting after X number of blocks receive the new style of street lighting. For this project, I used two datasets from Chicago’s open data portal: street center lines and wards. The output data is not very accurate as there may be some overlap and some uncounted street segments; this is likely due to a shortcoming in my process. I will show you how to find the number of blocks per ward using QGIS (download Quantum GIS, a free program for all OSes).

Here’s how I did it

  1. Load in the two datasets. Wards and street center lines (zipped shapefiles). They are projected in EPSG:3435.
  2. Exclude several road classifications in the street center lines by querying only for "CLASS" > '1' AND "CLASS" < '5'. The data dictionary for the road classifications is at the end. We don’t want the river, sidewalks, expressways, and any ramps to be included in the blocks per ward analysis.
  3. Intersect. In QGIS, select Vector>Geoprocessing Tools>Intersect. The input vector layer is “Transportation” (the name of the street center lines dataset) and the intersect layer is “Wards”. Save the resulting shapefile as “streets intersect wards”. Click OK. This will take a while.
  4. Add the “streets intersect wards” shapefile to the table of contents.
  5. You’ll notice some of the issues with the resulting shapefile: missing street segments (see screenshot 1). What should QGIS do if a street is a ward boundary?
  6. Obtain street length information, part 1. Remove all the columns in the “streets intersect wards” shapefile that have something to do with geometry. These are now outdated and will confuse you when you add a geometry column generated by QGIS.
  7. Obtain street length information, part 2. With the “streets intersect wards” shapefile selected in the table of contents, select Vector>Geometry Tools>Export/Add geometry columns. Select “streets intersect wards” shapefile as your input layer, leave CRS as “Layer CRS” and save as new shapefile “streets intersect wards geom”.
  8. Add the “streets intersect wards geom” shapefile to the table of contents.
  9. You will see a new column at the end of the attribute table called LENGTH. Since the data is projected in EPSG:3435 (Illinois StatePlane NAD83 East Feet), the unit is feet.
  10. Simply export “streets intersect wards geom” to a CSV file and open the CSV file in a spreadsheet application. From there you can group the data by Ward number and add the street lengths together. (I thought it would be faster to do this in a database so I imported it into a localhost MySQL database and ran a simple query, SELECT wardNum, sum(`chistreets_classes234`.`LENGTH`) as sum FROM chistreets_classes234 WHERE ward > 0 group by wardNum. I then exported this to a spreadsheet to convert feet to miles.)

Because of the errors described in step 5, you shouldn’t use this analysis for any application where accuracy is important. There are road lengths missing in the output dataset (table with street lengths summed by ward) and I cannot tell if the inaccuracy is equally distributed.

[table “7” not found /]

Wards 19 (south side) and 41 (Norwood Park, including O’Hare airport) have the highest portion of street length in the city.

Screenshot 2: Ward 41 is seen. 

Street data dictionary

Column is “CLASS”. The value is a string. This dataset lacks alleys. Adapted from the City’s data dictionary.

1. Expressway

2. Arterials (1 mile grid, no diagonals)

3. Collectors (includes diagonals)

4. Other streets (side streets, neighborhood streets)

5. Named alleys (mostly downtown, like Couch Place and Garland Place)

7. Tiered (lower level streets, including LaSalle, Michigan, Columbus, and Wacker)

9. Ramps (goes along with expressway)

E. Extent (not sure how to describe these; includes riverwalk and lake walk segments, and Navy Pier, also includes some streets, like Mies van der Rohe Way)

RIV. River

S. Sidewalk

99. Unclassified

How to split a bike lane in two and copy features with QGIS

A screenshot of the splash image seen on users with iPad retina displays in landscape mode. 

To make the Chicago Offline Bike Map, I need bikeways data. I got this from the City of Chicago’s data portal, in GIS shapefile format. It has a good attribute table listing the name of the street the bikeway is on and the bikeway’s class (see below). After several bike lanes had been installed, I asked the City’s data portal operators for an updated shapefile. I got it a month later and found that it wasn’t up-to-date. I probably could have received a shapefile with the current bikeway installations marked, but I didn’t have time to wait: every day delayed was one more day I couldn’t promote my app; I make 70 cents per sale.

Since the bikeway lines were already there, I could simply reclassify the sections that had been changed to an upgraded form of bikeway (for example, Wabash Avenue went from a door zone-style bike lane to a buffered bike lane in 2011). I tried to do this but ran into trouble when the line segment was longer than the bikeway segment that needed to be reclassified (for example, Elston Avenue has varying classifications from Milwaukee Avenue to North Avenue that didn’t match the line segments for that street). I had to divide the bikeway into shorter segments and reclassify them individually.

Enter the Split Features tool. QGIS is short on documentation and I had trouble using this feature. I eventually found the trick after a search that took more time than I expected. Here’s how to cut a line:

  1. Select the line using one of the selection tools. I prefer the default one, Select Features, where you have to click on the feature one-by-one. (It’s not required that you select the line, but doing so will ensure you only cut the selected line. If you don’t select the line, you can cut many lines in one go.)
  2. Toggle editing on the layer that contains the line you want to cut.
  3. Click Edit>Split Features to activate that tool, or find its icon in one the toolbars (which may or may not be shown).
  4. Click once near where you want to split the line.
  5. Move the cursor across the line you want to split, in the desired split location.
  6. When the red line indicating your split is where you desire, press the right-click mouse button.

Your line segment has now been split. A new entry has been added to the attribute table. There are now two entries with duplicate attributes representing that together make up the original line segment, before you split it.

This screenshot shows a red line across a road. The red line indicates where the road will be split. Press the right-click mouse button to tell QGIS to “split now”.

After splitting, open the attribute table to see that you now have two features with identical attributes. 

Copying features in QGIS

A second issue I had when creating new bikeways data was when a bikeway didn’t exist and I couldn’t reclassify it. This was the case on Franklin Boulevard: no bikeway had ever been installed there. I solved this problem by copying the relevant street segments from the Transportation (roads) shapefile and pasted them into the bikeways shapefile. New entries were created in the attribute table but with blank attributes. It was simple to fill in the street name, class, and extents.

Chicago bikeways GIS description

Bikeway classes (TYPE in the dataset) in the City of Chicago data portal are:

  1. Existing bike lane
  2. Existing marked shared lane
  3. Proposed on-street bikeway
  4. Recommended bike route
  5. Existing trail
  6. Proposed off-street trail
  7. Access path (to existing trail)
  8. Existing cycle track (also known as protected bike lane)
  9. Existing buffered bike lane

It remains to be seen if the City will identify the “enhanced marked shared lane” on Wells Street between Wacker Drive and Van Buren street differently than “existing marked shared lane” in the data.

Initial intersection crash analysis for Milwaukee Avenue

List of the most crash-prone intersections on Milwaukee Avenue in Chicago. Using data from 2007-2009, when reported to the Chicago Police Department. Dooring data not included on the bike crash map. I used QGIS to draw a 50-feet buffer around the point where the intersection center lines meet.

Intersecting street (class 4*) Bike crashes
Chicago Avenue (see Ogden below) 12 (17)
California Avenue 9
Halsted Street & Grand Avenue 7
Damen Avenue & North Avenue 6
Western Avenue 6
Ogden Avenue (see Chicago above) 5 (17)
Ashland Avenue 5
Diversey Avenue 5
Fullerton Avenue 5
Elston Avenue 5
Augusta Boulevard (not class 4) 5

Combine the six-way (with center triangle) intersection of Ogden, Milwaukee, Chicago, and you see 17 crashes. Add the 6 just outside the 50-feet buffer and you get 23 crashes. Compare this to the six-way (without center triangle) at Halsted, Milwaukee, Grand, where there’s only 7 crashes.

What about the two intersections causes such a difference in crashes? Let’s look at some data:

Ogden, Milwaukee, Chicago Halsted, Milwaukee, Grand
Automobile traffic Approx 58,000 cars per day Approx 50,000 cars per day.
Bicycle traffic Not counted, but probably fewer than 3,100 bikes More than 3,100 bikes per day*
Bus traffic Two bus routes Three bus routes
Intersection style Island; three signal cycles No island; one signal cycle

*Notes

Traffic counts are assumed estimates. Counts are taken on a single day, either Tuesday, Wednesday, or Thursday. Bike counts at Halsted/Milwaukee/Grand were actually taken on Milwaukee several hundred feet northwest of the intersection so DO NOT include people biking on Halsted or Grand! This means that more than 3,100 people are biking through the intersection each day.

Intersection style tells us which kind of six-way intersection it is. At island styles you’ll find a concrete traffic island separating the three streets. You’ll also find three signal cycles because there are actually three intersections instead of one, making it a 12-way intersection. Also at these intersections you’ll see confusing instructional signage like, “OBEY YOUR SIGNAL ONLY” and “ONCOMING TRAFFIC HAS LONGER GREEN.”

Here’s a more lengthy description of one of the problems here as well as an extremely simple solution: install a left-turn arrow for northbound Milwaukee Avenue. The entire intersection is within Alderman Burnett’s Ward 27.

Source and method

I can’t yet tell you how I obtained this data or created the map. I’m still working out the specifics in my procedures log. It involved some manual work at the end because in the resulting table that counted the number of crashes per intersection, every intersection was repeated, but the street names were in opposite columns.

Crash data from the Illinois Department of Transportation. Street data from the City of Chicago. Intersection data created with fTools in QGIS. To save time in this initial analysis, I only considered Milwaukee Avenue intersections with streets in the City of Chicago centerline file with a labeled CLASS of 1, 2, or 3.

My essential QGIS plugins

Plugins for QGIS I use most often.

All of these can be installed automatically by QGIS. Click on Plugins>Fetch Python Plugins. Then search for the plugin, click on its name, and click Install Plugin. Few plugins require a restart.

  • MMQGIS – Great for working with CSV files; also merges layers (even if they have differing attributes); has various other useful functions, including converting string data to float data. Has Voronoi diagram function (takes a long time to process).
  • fTools – Replicates some of the most basic geographic tools in ArcGIS, like Clip, Dissolve, and Reproject. Can also add X/Y values to point attribute tables that are missing them (if you want latitude/longitude, you must reproject into a coordinate reference system first, like WGS84 [EPSG: 4326]). Unfortunately, there’s little information on what each fTools function does. Below are descriptions:
    • Extract Nodes – Create a point at each intersection of vertices.
    • Basic Statistics – Generate arithmetic statistics for fields, same as statistics function in ArcGIS. Great for quickly understanding the extent of values in a field (especially numeric values), like mean, max, min, standard deviation, and number of unique values.
    • Nearest Neighbour Analysis – More details here.
    • Geoprocessing Tools>Dissolve – Combine features based on a shared attribute. For example, all features with an identical STREET_TYPE be combined into a single feature. For example, all “Avenues” will become one feature and all “Boulevards” will become a second feature. Only works on polygon layers.
    • Descrição em português
  • Table Manager
  • Open Layers – Embed Google, Yahoo, Bing, and OpenStreetMap layers in your map. See my example.

Gaps

A map that focuses on striped bikeways in downtown Chicago.

When you look at your bikeways more abstractly, like in the graphic above, do you see deficiencies or gaps in the network? Anything glaring or odd?

It’s a simple exercise: Open up QGIS and load in the relevant geographic data for your city. For Chicago, I added the city boundary, hydrography and parks (for locational reference), and bike lanes and marked-shared lanes*. Symbolize the bikeways to stand out in a bright color. I had the Chicago Transit Authority stations overlaid, but I removed them because it minimized the “black hole of bikeways” I want to show.

What do you see?

Bigger impact map

This exercise can have more impact if it was visualized differently. You have to be familiar with downtown Chicago and the Loop to fully understand why it’s important to notice what’s missing. It’s an extremely office and job dense neighborhood. It also has one of the highest densities of students in the country; the number of people residing downtown continues to grow. If I had good data on how many workers and students there were per building, I could indicate that on the map to show just how many people are potentially affected by the lack of bicycle infrastructure that leads them to their jobs (or class) in the morning, and home in the evening. I don’t know how to account for all of the bicycling that goes through downtown just for events, like at Millennium and Grant Parks, the Cultural Center, and other theaters and venues.

*If you cannot find GIS data for your city, please let me know and I will try to help you find it. It should be available for your city as a matter of course.

Trying out uDig, a free, multi-platform GIS application

ArcGIS is the standard in geographic information system applications. I don’t like that it’s expensive, unwieldy to install and update, and its user interface is stymying and slow*. I also use Mac OS X most of the time and ArcGIS is not available for Mac. It doesn’t have to be the standard.

I’ve tried my hand at Cartographica and QGIS. I really like QGIS because there’re many plugins, it’s open source, there’s a diverse community supporting it, and best of all, it’s free. I’ve written about Cartographica once – I’m not a fan right now.

My project

  • The data: Bicycle crashes in the City of Chicago as reported to IDOT for 2007-2009
  • Goal: Publish an interactive map of this data using Google Fusion Tables and its instant mapping feature.
  • Visualizing it: Added streets (prepared beforehand to exclude highways), water features, and city boundary (get that here)
  • Process: Combine bike crash data; reproject to WGS84 for Google; remove extraneous information; add latitude/longitude coordinates; export as CSV; upload to Google Fusion Tables; map it!
  • View the final product

Trying out uDig

In reaching my goal I had a task that I couldn’t figure out how to complete with QGIS: I needed to combine three shapefiles with identical table schemes into one shapefile – this one shapefile would eventually be published as one map. The join feature in fTools wasn’t working so I looked for a new solution, uDig, or “User-friendly Desktop Internet GIS.”

The solution was very easy. Highlight all the records in the attribute table of one shapefile, click Edit>Copy, then select the destination table and click Edit>Paste. The new records were added within a couple seconds. I could then bring this data back into QGIS to finish the process (outlined above under Project). I did use fTools later in the process to add lat/long coordinates to my single shapefile.

After adding more data to better visualize the crashes in Chicago, I noticed that uDig renders maps to look smoother and slightly prettier than QGIS or ArcGIS. See the screenshot below.

A screenshot of the three bicycle crash datasets (2007, 2008, 2009) with the visualization data added.

The end product: three years of police reported bicycle crashes in the City of Chicago on an interactive map powered by Google Fusion Tables, another product in Google’s arsenal of GIS for the poor man. View the final product.

*I haven’t used ArcGIS version 10 yet, which I see and read has an improved user interface; it’s unclear to me and other users if the program’s been updated to take advantage of multi-core processors. ESRI has a roundabout way of describing their support.

How to convert GTFS to GIS shapefiles and KML

This tutorial will teach how you to convert any transit agency’s General Transit Feed Specification (GTFS) data into ESRI ArcGIS-compatible shapefiles (.shp), KML, or XML. This is simple to do because GTFS data is essentially a collection of CSV (comma separated values) text files (really, really large text files).

Note: I don’t know how to do the reverse, converting shapefiles or other geodata into GTFS data. I’m not sure if this is possible and I’m still investigating it. If you have tips, let me know.

Converting GTFS to GIS shapefiles

Instructions require the use of ArcGIS (Windows only) and a free plugin called ET GeoWizards GIS for any version of ArcGIS. I do not have instructions for Mac users at this time.

I wrote these instructions while converting the Chicago Transit Authority’s GTFS files into shapefiles based on a reader’s request. “Field names” are quoted and layer names are italicized.

  1. Download the GTFS data you want. Find data from agencies around the world (although not many from Europe) on GTFS Data Exchange.
  2. Import into ArcGIS the shapes.txt file using Tools>Add XY Data. Specify Y=lat and X=lon
  3. Using ET GeoWizards GIS tools, in the Convert tab, convert the points shapefile to polyline.
  4. Select the shapes layer in the wizard, then create a destination file. Click Next.
  5. Select the “shape_id” field
  6. Click the checkbox next to Order and select the field “shape_pt_sequence” and click Finish.
  7. Depending on the number of records (the CTA has 466,000 shapes), it may take a while.
  8. The new shapefile will be added to your Table of Contents and appear in your map.
  9. Import the trips.txt and routes.txt files. Inspect them for any NULL values in the “route_id” field. You will be using this field to join the routes and trips table. It may be a case that ArcGIS imported them incorrectly; the text files will show the correct data. If NULL values appear, follow steps 10 and 11 and continue. If not, follow steps 10 and 12 and continue. This happens because ArcGIS inspected some of the data and determined they were integers and ignored text. However, this is not the case.
  10. Export the text files as DBF files so that ArcGIS operates on them better. Then remove the text files from the Table of Contents.
  11. (Only if NULL values appear) Go into editing mode and fix the NULL values you noticed in step 9. You may have to make a new column with a more forgiving data type (string) and then copy the “route_id” column into the new column. Then continue to step 12.
  12. Join routes and trips based on the field “route_id” – export as trips_routes.dbf
  13. Add a new column to shapes.shp called “shape_id2″, with data type double 18, 11. This is so we can perform step 14. Use the field calculator to copy the values from “shape_id” (also known as ET_ID) to “shape_id2″
  14. Join routes_trips with shapes into routes_poly based on the field “shape_id” (and “shape_id2″)
  15. Dissolve routes_poly on “route_id.” Make sure all selections are cleared. Use statistics/summary fields: “route_long,” “route_url.” Save as routes_diss.shp
  16. Inspect the new shapefile to ensure it was created correctly. You may notice that some bus routes don’t have names. Since these routes are well documented on the CTA website, I’m not going to fill in their names.

Click on the screenshot to see various steps in the tutorials.

Converting GTFS to KML

After you have it in shapefile form, converting to KML is easy – follow these instructions for using QGIS. Or if you want to skip the shapefile-creation process (quite involved!), you can use KMLWriter, a Python script. Also, I think the latest version of ArcGIS has built-in KML exporting.

Converting GTFS to XML

If you want to convert the GTFS data (which are essentially comma-separated value – CSV – files) to XML, that’s easier and you can avoid using GIS programs.

  • First try Mr. Data Converter (very user friendly).
  • If that doesn’t work, try this website form on Creativyst. I tested it by converting the CTA’s smallest GTFS table, frequencies.txt, and it worked properly. However, it has a data size limit. (User friendly.)
  • Next try csv2xml, a command line tool. (Not user friendly.)
  • You can also use Microsoft Excel, but read these tips and caveats first. (I haven’t found a Microsoft application I like or think is user friendly.)

How to geocode a single address in QGIS

Since the last time I wrote about how to use BatchGeocode.com to perform pseudo-geocoding tasks in QGIS, there have been considerable improvements in the multi-platform, free, and open source GIS software. Now, geocoding (turning addresses into coordinates) is more automatic, albeit difficult to setup. (Okay, this has been around June 2009 and I just found out about it in October 2010.)

Once you install all the components, you’ll never have to do this again.

This method can only geocode one address at a time, but it will geocode all of the addresses into a single shapefile.

  1. Download QGIS.
  2. Download and install Python SetupTools. This includes the easy_install function that will download a necessary Python script, simplejson. On Mac you will have to use the Terminal (Applications>Utilities). Email me if you run into problems.
  3. Install simplejson. In the command line (Terminal for Mac; in Windows press Start>Run>”cmd”>Enter), type “easy_install simplejson”.
  4. Download the GeoCode plugin by Alessandro Pasotti via QGIS>Plugins>Fetch Python Plugins. You may have to load additional repositories to see it.
  5. Install geopy. In the command line (like step 3), type “easy_install geopy”.
  6. Specify your project’s projection in File>Project Properties.
  7. Get a Google Maps API key and tell the GeoCode plugin about it (QGIS>Plugins>GeoCode>Settings). You will need a Google account. If you don’t have your own domain name, you can just enter “google.com” when it asks for your domain.
  8. Geocode your first address by clicking on Plugins>GeoCode>Geocode. Type the full address (e.g. 121 N LaSalle Street, Chicago, IL for City Hall).
  9. The geocoded address will then appear in your Layers list as its own shapefile. All addresses geocoded (or reverse geocoded) in this project will appear in the same layer (therefore same attribute table).

Once you install all the components, you’ll never have to do this again. Geocoding will be available each and every time you use QGIS in the future on that workstation.

Tips

  • When you’re done geocoding,  save your results as a shapefile (right click the layer and click “Save as shapefile”). Twice I’ve lost my results after saving the project and quitting QGIS. When I reopened the project, the results layer was still listed, but contained no data.
  • Add a “name” column to the GeoCoding Plugin Results layer’s attribute table (toggle editing first). You can then type in the name of the building or destination at the address you geocoded. Edit the layer’s properties to have that name appear as a label for the point.

A map I made with QGIS showing three geocoded points of interest in Chicago. Data from City of Chicago’s GIS team.

 

Trying out new GIS software

I want to draw 50 and 120 feet buffers around the points of store entrances to show where bike parking should and shouldn’t be installed. I want to follow this example:

walgreens with bike parking buffers

Aerial photo of a Tucson, Arizona, Walgreens showing the location of existing bike parking and two buffers (50 and 120 feet) where proposed city rules would allow bike parking. I advocate for ratifying the 50 feet rule, which I’ve discussed on this blog and elsewhere many times.

I want to do this easily and accurately, so I will use GIS software to create a “buffer.” I use QGIS occasionally, but I want to try out other Mac-friendly applications. I’m getting my orthoimagery (geometrically corrected aerial photography) from the United States Geological Survey (USGS) using a web protocol called Web Map Server. I’m trying:

  • Cartographica, $495, with free trial license.
  • uDig, completely free software. UPDATE: I have had NO success getting any data to load from a WMS connection into uDig. I would like to understand why. Cartographica can obtain some of the WMS-stored data I want, although it messes up often.

I’m having success with neither – both are having issues downloading or maintaining a connection to the USGS orthoimagery. In one case, Cartographica trims the Bing Maps imagery to match the extent of my other objects (the buffer). In another case, it won’t even download the USGS imagery (and gives no indication that anything is happening). uDig hasn’t been able to download anything so far – I hope it’s asking for the current extent, instead of all data because it’s taking a looong time to do anything (so long that I just quit in the  middle of it).

This screenshot shows how to add new WMS connections to Cartographica.

UPDATE: I did it! I successfully used Cartographica (and the integrated Bing Maps) to create this drawing that shows the current (abysmal) bike parking at a Chicago Home Depot outside the 50 feet line.

© 2014 Steven Can Plan

Theme by Anders NorenUp ↑