Use Turf to perform GIS functions in a web browser

Turf's merge function joins invisible buffers around each Divvy station into a single, super buffer.

Turf’s merge function joins invisible buffers around each Divvy station into a single, super buffer –all client-side, in your web browser.

I’m leading the development of a website for Slow Roll Chicago that shows the distribution of bike lane infrastructure in Chicago relative to key and specific demographics to demonstrate if the investment has been equitable.

We’re using GitHub to store code, publish meeting notes, and host discussions with the issues tracker. Communication is done almost entirely in GitHub issues. I chose GitHub over Slack and Google Groups because:

  1. All of our research and code should be public and open source so it’s clear how we made our assumptions and came to our conclusions (“show your work”).
  2. Using git, GitHub, and version control is a desirable skill and more people should learn it; this project will help people apply that skill.
  3. There are no emails involved. I deplore using email for group communication.*

The website focuses on using empirical research, maps, geographic analysis to tell the story of bike lane distribution and requires processing this data using GIS functions. Normally the data would be transformed in a desktop GIS software like QGIS and then converted to a format that can be used in Leaflet, an open source web mapping library.

Relying on desktop software, though, slows down development of new ways to slice and dice geographic data, which, in our map, includes bike lanes, wards, Census tracts, Divvy stations, and grocery stores (so far). One would have to generate a new dataset if our goals or needs changed .

I’ve built maps for images and the web that way enough in the past and I wanted to move away from that method for this project and we’re using Turf.js to replicate many GIS functions – but in the browser.

Yep, Turf makes it possible to merge, buffer, contain, calculate distance, transform, dissolve, and perform dozens of other functions all within the browser, “on the fly”, without any software.

After dilly-dallying in Turf for several weeks, our group started making progress this month. We have now pushed to our in-progress website a map with three features made possible by Turf:

  1. Buffer and dissolving buffers to show the Divvy station walk shed, the distance a reasonable person would walk from their home or office to check out a Divvy station. A buffer of 0.25 miles (two Chicago blocks) is created around each of the 300 Divvy stations, hidden from display, and then merged (dissolved in traditional GIS parlance) into a single buffer. The single buffer –called a “super buffer” in our source code – is used for another feature. Currently the projection is messed up and you see ellipsoid shapes instead of circles.
  2. Counting grocery stores in the Divvy station walk shed. We use the “feature collection” function to convert the super buffer into an object that the “within” function can use to compare to a GeoJSON object of grocery stores. This process is similar to the “select by location” function in GIS software. Right now this number is printed only to the console as we look for the best way to display stats like this to the user. A future version of the map could allow the user to change the 0.25 miles distance to an arbitrary distance they prefer.
  3. Find the nearest Divvy station from any place on the map. Using Turf’s “nearest” function and the Context Menu plugin for Leaflet, the user can right-click anywhere on the map and choose “Find nearby Divvy stations”. The “nearest” function compares the place where the user clicked against the GeoJSON object of Divvy stations to select the nearest one. The problem of locating 2+ nearby Divvy stations remains. The original issue asked to find the number of Divvy stations near the point; we’ll likely accomplish this by drawing an invisible, temporary buffer around the point and then using “within” to count the number of stations inside that buffer and then destroy the buffer.
Right-click the map and select "Find nearby Divvy stations" and Turf will locate the nearest Divvy station.

Right-click the map and select “Find nearby Divvy stations” and Turf will locate the nearest Divvy station.

* I send one email to new people who join us at Open Gov Hack Night on Tuesdays at the Mart to send them a link to our GitHub repository, and to invite them to a Dropbox folder to share large files for those who don’t learn to use git for file management.

How many miles of roads are in your ward?

Screenshot 1: Showing how some streets are not being counted. There should be a yellow section of road between the two existing yellow road sections. 

A friend recently asked me how many blocks of road are in his ward. He wanted to know so that he could measure how many blocks of streets would have an older style of street lighting after X number of blocks receive the new style of street lighting. For this project, I used two datasets from Chicago’s open data portal: street center lines and wards. The output data is not very accurate as there may be some overlap and some uncounted street segments; this is likely due to a shortcoming in my process. I will show you how to find the number of blocks per ward using QGIS (download Quantum GIS, a free program for all OSes).

Here’s how I did it

  1. Load in the two datasets. Wards and street center lines (zipped shapefiles). They are projected in EPSG:3435.
  2. Exclude several road classifications in the street center lines by querying only for "CLASS" > '1' AND "CLASS" < '5'. The data dictionary for the road classifications is at the end. We don’t want the river, sidewalks, expressways, and any ramps to be included in the blocks per ward analysis.
  3. Intersect. In QGIS, select Vector>Geoprocessing Tools>Intersect. The input vector layer is “Transportation” (the name of the street center lines dataset) and the intersect layer is “Wards”. Save the resulting shapefile as “streets intersect wards”. Click OK. This will take a while.
  4. Add the “streets intersect wards” shapefile to the table of contents.
  5. You’ll notice some of the issues with the resulting shapefile: missing street segments (see screenshot 1). What should QGIS do if a street is a ward boundary?
  6. Obtain street length information, part 1. Remove all the columns in the “streets intersect wards” shapefile that have something to do with geometry. These are now outdated and will confuse you when you add a geometry column generated by QGIS.
  7. Obtain street length information, part 2. With the “streets intersect wards” shapefile selected in the table of contents, select Vector>Geometry Tools>Export/Add geometry columns. Select “streets intersect wards” shapefile as your input layer, leave CRS as “Layer CRS” and save as new shapefile “streets intersect wards geom”.
  8. Add the “streets intersect wards geom” shapefile to the table of contents.
  9. You will see a new column at the end of the attribute table called LENGTH. Since the data is projected in EPSG:3435 (Illinois StatePlane NAD83 East Feet), the unit is feet.
  10. Simply export “streets intersect wards geom” to a CSV file and open the CSV file in a spreadsheet application. From there you can group the data by Ward number and add the street lengths together. (I thought it would be faster to do this in a database so I imported it into a localhost MySQL database and ran a simple query, SELECT wardNum, sum(`chistreets_classes234`.`LENGTH`) as sum FROM chistreets_classes234 WHERE ward > 0 group by wardNum. I then exported this to a spreadsheet to convert feet to miles.)

Because of the errors described in step 5, you shouldn’t use this analysis for any application where accuracy is important. There are road lengths missing in the output dataset (table with street lengths summed by ward) and I cannot tell if the inaccuracy is equally distributed.

[table id=7 /]

Wards 19 (south side) and 41 (Norwood Park, including O’Hare airport) have the highest portion of street length in the city.

Screenshot 2: Ward 41 is seen. 

Street data dictionary

Column is “CLASS”. The value is a string. This dataset lacks alleys. Adapted from the City’s data dictionary.

1. Expressway

2. Arterials (1 mile grid, no diagonals)

3. Collectors (includes diagonals)

4. Other streets (side streets, neighborhood streets)

5. Named alleys (mostly downtown, like Couch Place and Garland Place)

7. Tiered (lower level streets, including LaSalle, Michigan, Columbus, and Wacker)

9. Ramps (goes along with expressway)

E. Extent (not sure how to describe these; includes riverwalk and lake walk segments, and Navy Pier, also includes some streets, like Mies van der Rohe Way)

RIV. River

S. Sidewalk

99. Unclassified

Get out of Googleville: my presentation on web mapping

Alternate headlines: Google Maps versus OpenStreetMap; why OpenStreetMap is better than Google Maps

I presented to the Chicago GIS Network Meetup group on February 5,2013, about alternatives to Google when it comes to mapping on the web. I created the presentation and outline a couple hours before giving it and came up with this slideshow with three frames.

Googleville 1 of 3

Google Maps and its data is a one-way street (or many one-way streets). Google will take data but won’t give it back.

Googleville 2 of 3

Google Maps has all of these features, but they’re easier to manipulate when you use an alternative. Alternatives like: MapBox, TileMill, OpenLayers, OpenStreetMap (made easy with JOSM), GeoCommons – I’m sure there are plenty more.

Googleville 3 of 3

OpenStreetMap is the Wikipedia of online mapping and geographic data. Considering switching to OSM.

How I created a map of Illinois Amtrak routes in TileMill in less than 30 minutes

This interactive map was created for a Grid Chicago article to show the cities and Amtrak routes mentioned. Click and drag it around or hover your mouse on the red train station markers. 

Want to create a map like that and publish it on your own website? It’s easy. I’ll show you how to do it in less than 30 minutes. First, download the following files:

All shapefiles are from the United States Department of Transportation, Bureau of Transportation Statistics’s National Transportation Atlas 2012 edition except for Illinois places, which comes from the Census Bureau’s TIGER project.

At the end of this tutorial, you’ll have a good introduction on how to find geographic data, build a map with TileMill, style the map, and publish it for the public. Your map will not look like mine as this tutorial doesn’t describe how to add labels or use the hover/info feature.

Tutorial to make Amtrak Illinois map

  1. Unzip the four ZIP files you downloaded and move their contents into a folder, like /Documents/GIS/Amtrak Illinois/shapefiles. This is your project folder.
  2. Install TileMill and open it.
  3. Set up a project. In the Projects pane, click “New Project”. In the filename field, title it “amtrak_illinois”. Ensure that the checkbox next to “Default data” is checked – this shows a world map and helps you get your bearings (but it’s not absolutely necessary).
  4. Get familiar with TileMill’s layout. Your new project will open with the map on the left side and your Carto style code on the right side. There are four buttons aligning the left edge of your map. From top to bottom they are: Templates, Font list, Carto guide, and Layers.
  5. Add a layer. We’re going to add the four shapefile layers you downloaded. Click the “Layers” button and then click “Add layer”. In the ID field, type in “amtrak_routes”. For Datasource, browse to your project folder and find “amtrak.shp” – this file has the Amtrak route lines. Then click “Done”. Click “Save & Style”.
  6. Style that layer. When you click “Save & Style” after adding a layer, your attention will be called to the Carto style code on the right side of TileMill. A section of code with the “amtrak_routes” #selector will have been inserted with some default colors and styles. If you know CSS, you will be familiar with how to change the Amtrak routes line styles. Change the “line-color” to “#000”. After “line-color”, add a new line and insert “line-opacity: 0.5;”. This will add some transparency to the line. Press the “Save” button above the code.
  7. Add remaining layers. Repeat Step 5 and add 3 more layers: “amtrk_sta.shp” (ID field: “amtrak_stations”), “state.shp” (ID field: “states”), and “tl_2012_17_place.shp” (ID field: “illinois_cities”).
  8. Hide bus stations. The Amtrak stations layer shows bus and ferry stations as part of Amtrak’s Thruway connections. You probably don’t want to show these. In your Carto style code, rename the #selector from “#amtrak_stations” to “#amtrak_stations[STNTYPE=’RAIL’]”. That makes the following style code only apply to stations with the “rail” type. Since there’s no style definition for things that aren’t of that type, they won’t appear.

Screenshot of my map.

Prepare your map for uploading

TileMill has many exporting options. You can save it as MBTiles and publish the map for free using MapBox (TileMill’s parent), or you can export it as image files (but it won’t be interactive), or you can display the map using the Leaflet JavaScript map library (which I use for the Chicago Bike Map app). This tutorial will explain how to export MBTiles and upload to MapBox, the server I’m using to display the map at the top of this page.

  1. Change project settings. To upload to MapBox, you’ll have to export your project as MBTiles, a proprietary format. Click the “Export” button above your Carto style code and click “MBTiles”. You’ll be asked to provide a name, description, attribution, and version. Input appropriate text for all but version.
  2. Adjust the zoom levels. Adjust the number of zoom levels you want (the more you have the longer it takes to export and upload your project, and you might exceed MapBox’s free 50 MB account limit). My map has zoom levels 8-11.
  3. Adjust the bounds. You’ll then want to draw your bounds: how much of the map’s geographic extents you want to export. Zoom to a level where you can see the entire state of Illinois in your map. Hold down the Shift key and drag a box around the state, plus a buffer (so viewers don’t fall of your map when they pan to the edges).
  4. Export your map. Click Export and watch the progress! On a four-year-old MacBook it took less than one minute to export the project.
  5. Bring the export to your project folder. When export finishes, click the “Save” button and browse to your project folder. Click the file browser’s save button.
  6. Upload to MapBox. Login to MapBox’s website and click “Upload Layer”. Browse to your project folder, select the .mbtiles folder, and click “Upload file”. Upon a successful upload, your map will display.
  7. Embed it in your website. Click the “Share” button in the upper left corner of your map and copy the embed code. Paste this into the HTML source code of a webpage (or in a WordPress post) and save that (I’m not going to provide instructions on how to do that).

Now you know how to find geographic data, build a custom map using the TileMill application, begin to understand how to style it, and embed your map for the public on a website or blog.

N.B. I was originally going to use QGIS to build a map and then publish a static image before I realized that TileMill + MapBox (the website) can build a map but publish an interactive feature instead of a static image. I’m happy I went that route. However, I did use QGIS to verify the data and even create a new shapefile of just a few of the key train stations on the Lincoln Service (the centerpiece of my Grid Chicago article).

Crashes by bike or by foot at different intersections

While working on a private web application that I call Chicago Crash Browser, I added some code to show the share of pedestrian and pedalcyclist crashes. The site offers users (sorry I don’t have a web server that can make it public) a list of the “Top 10” intersections in terms of bike crash frequency (that’s bike+auto crash). You can click on the intersection and a list will populate showing all the pedestrian and pedalcyclist crashes there, sorted by date. At the bottom of the list is a simple sentence that tells what percentage pedestrian and pedalcyclists made up at that intersection.

I’m still developing ideas on how this information may be useful, and what it’s saying about the intersection or the people using it.

Let me tell you about a few:

Milwaukee Avenue and Ogden Avenue

I mentioned in my article Initial intersection crash analysis for Milwaukee Avenue that this intersection is the most bike crash-frequent.

23 crashes within 150 feet of the center, 2005-2010

82.61% bike crashes **

17.39% ped crashes.

Ashland Avenue and Division Street

28 crashes within 150 feet of the center, 2005-2010

46.43% bike crashes

53.57% ped crashes **

Milwaukee, North and Damen Avenues

46 crashes within 150 feet of the center, 2005-2010

39.13% bike crashes

60.87% ped crashes **

Halsted Street, Lincoln and Fullerton Avenues

38 crashes within 150 feet of the center, 2005-2010

42.11% bike crashes

57.89% ped crashes **

Montrose Avenue and Marine Drive (Lake Shore Drive ramps)

11 crashes within 150 feet of the center, 2005-2010

90.91% bike crashes **

9.09% ped crashes

Why do you think some intersections have more of one kind of crash than the other?

People walking at Milwaukee-North-Damen.

The Chicago Crash Browser can be made public if I have a host that offers the PostgreSQL database. Do you have one to offer?

Are protected bike lanes going in the right places?

Bike crash map of Ogden, Milwaukee, Chicago

Common bike-car crash locations in West Town. The bottom blue circle identifies Ogden/Milwaukee, where there is a yellow trap for northbound, left-turning motorists (from Milwaukee to Ogden) that makes them run into southbound bicyclists who have a green light.

My contribution to a discussion on The Chainlink, Are protected bike lanes going in the right places?

Kelvin, Milwaukee/Ogden/Chicago is the intersection along Milwaukee Avenue with the highest number of bicycle crashes. I created this table and map to show them, using data from 2007-2009.

The blue rings on the map are called, in GIS parlance, “buffers” and are circles used to select things (in this case, bike crashes) within a certain distance of the circle center. In this map I used 50 feet radius buffers (100 feet diameter). While this distance encompasses the intersection from center to all four curbs, it doesn’t encompass the crashes that happened just outside the buffer that were still most likely influenced by the intersection (like drivers’ turning movements).

I am working on a project with three friends to create a better map and “crash browser”. I mentioned it in the last story on Grid Chicago in this post. For this project, we are using 200 feet radius (400 feet diameter) buffers to ensure we encompass the entire intersection and the area in which it still has an effect. This also grabs the bike lane “pinch points”, places where a bike lane doesn’t start until 100-200 feet beyond the intersection.

I am also concerned with the strategy and approach CDOT is using to choose locations. It’s not transparent; at MBAC, CDOT said they were choosing locations “without controversy and that could be implemented quickly”.

Read more about Kinzie Street, Chicago’s first protected bike lane, and my other thoughts on protected bike lanes

How to upload shapefiles to Google Fusion Tables

It is now possible to upload a shapefile (and its companion files SHX, PRJ, and DBF) to Google Fusion Tables (GFT).

Before we go any further, keep in mind that the application that does this will only process 100,000 rows. Additionally, GFT only gives each user 200 MB of storage (and they don’t tell you your current status, that I can see).

  1. Login to your Google account (at Gmail, or at GFT).
  2. Prepare your data. Ensure it has fewer than 100,000 rows.
  3. ZIP up your dataX.shp, dataX.shx, dataX.prj, and dataX.dbf. Use WinZip for Windows, or for Mac, right-click the selection of files and select “Compress 4 items”.
  4. Visit the Shape to Fusion website. You will have to authorize the web application to “grant access” to your GFT tables. It needs this access so that after the web application processes your data, it can insert it into GFT.
  5. If you want a Centroid Geometry column or a Simplified Geometry column added, click “Advanced Options” and check their checkboxes – see notes below for an explanation.
  6. Choose the file to upload and click Upload.
  7. Leave the window open until it says it has processed all of the rows. It will report “Processed Y rows and inserted Y rows”. You will be given a link to the GFT the web application created.

Sample Data

If you’re looking to give this a try and see results quickly, try some sample data from the City of Chicago data portal:


I had trouble many times while using Shape to Fusion in that after I chose the file to upload and clicked Upload, I had to grant access to the web application again and start over (choose the file and click Upload a second time).

Centroid Geometry – This creates a column with the geographic coordinates of the centroid in a polygon. It lists it in the original projection system. So if your projection is in feet, the value will be in feet. This is a function that can easily be performed in free and open source QGIS, where you can also reproject files to get latitude and longitude values (in WGS84 project, EPSG 4326). The centroid value is surrounded in the field by KML syntax “<Point><coordinates>X,Y</coordinates></Point>”.

Simplified Geometry – A geometry column is automatically created by the web application (or GFT, I’m not sure). This function will create a simpler version of that geometry, with fewer lines and vertices. It also creates columns to list the vertices count for the simple and regular geometry columns.

My essential QGIS plugins

Plugins for QGIS I use most often.

All of these can be installed automatically by QGIS. Click on Plugins>Fetch Python Plugins. Then search for the plugin, click on its name, and click Install Plugin. Few plugins require a restart.

  • MMQGIS – Great for working with CSV files; also merges layers (even if they have differing attributes); has various other useful functions, including converting string data to float data. Has Voronoi diagram function (takes a long time to process).
  • fTools – Replicates some of the most basic geographic tools in ArcGIS, like Clip, Dissolve, and Reproject. Can also add X/Y values to point attribute tables that are missing them (if you want latitude/longitude, you must reproject into a coordinate reference system first, like WGS84 [EPSG: 4326]). Unfortunately, there’s little information on what each fTools function does. Below are descriptions:
    • Extract Nodes – Create a point at each intersection of vertices.
    • Basic Statistics – Generate arithmetic statistics for fields, same as statistics function in ArcGIS. Great for quickly understanding the extent of values in a field (especially numeric values), like mean, max, min, standard deviation, and number of unique values.
    • Nearest Neighbour Analysis – More details here.
    • Geoprocessing Tools>Dissolve – Combine features based on a shared attribute. For example, all features with an identical STREET_TYPE be combined into a single feature. For example, all “Avenues” will become one feature and all “Boulevards” will become a second feature. Only works on polygon layers.
    • Descrição em português
  • Table Manager
  • Open Layers – Embed Google, Yahoo, Bing, and OpenStreetMap layers in your map. See my example.

Using Google Refine to get the stories out of your data

Let’s say you’re perusing the 309,425 crash reports for automobile crashes in Chicago from 2007 to 2009 and you want to know a few things quickly.

Like how many REAR END crashes there were in January 2007 that had more than 1 injury in the report. With Google Refine, you could do that in about 60 seconds. You just need to know which “facets” to setup.

By the way, there are 90 crash reports meeting those criteria. Look at the screenshot below for how to set that up.

Facets to choose to filter the data

  1. Get your January facet
  2. Add your 2007 facet
  3. Select the collision type of “REAR END” facet
  4. Choose to include all the reports where injury is greater than 1 (click “include” next to each number higher than 1)

After we do this, we can quickly create a map using another Google tool, Fusion Tables.

Make a map

  1. Click Export… and select “Comma-separated value.” The file will download. (Make sure your latitude and longitude columns are called latitude and longitude instead of XCOORD and YCOORD or sometimes Fusion Tables will choke on the location and try to geocode your records, which is redundant.)
  2. Go to Google Fusion Tables and click New Table>Import Table and select your file.
  3. Give the new table a descriptive title, like “January 2007 rear end crashes with more than 1 injury”
  4. In the table view, click Visualize>Map.
  5. BAM!

I completed all the tasks on this page in under 5 minutes and then spent 5 more minutes writing this blog. “The power of Google.”


A map that focuses on striped bikeways in downtown Chicago.

When you look at your bikeways more abstractly, like in the graphic above, do you see deficiencies or gaps in the network? Anything glaring or odd?

It’s a simple exercise: Open up QGIS and load in the relevant geographic data for your city. For Chicago, I added the city boundary, hydrography and parks (for locational reference), and bike lanes and marked-shared lanes*. Symbolize the bikeways to stand out in a bright color. I had the Chicago Transit Authority stations overlaid, but I removed them because it minimized the “black hole of bikeways” I want to show.

What do you see?

Bigger impact map

This exercise can have more impact if it was visualized differently. You have to be familiar with downtown Chicago and the Loop to fully understand why it’s important to notice what’s missing. It’s an extremely office and job dense neighborhood. It also has one of the highest densities of students in the country; the number of people residing downtown continues to grow. If I had good data on how many workers and students there were per building, I could indicate that on the map to show just how many people are potentially affected by the lack of bicycle infrastructure that leads them to their jobs (or class) in the morning, and home in the evening. I don’t know how to account for all of the bicycling that goes through downtown just for events, like at Millennium and Grant Parks, the Cultural Center, and other theaters and venues.

*If you cannot find GIS data for your city, please let me know and I will try to help you find it. It should be available for your city as a matter of course.

© 2017 Steven Can Plan

Theme by Anders NorénUp ↑