Tag: howto

How to upload shapefiles to Google Fusion Tables

It is now possible to upload a shapefile (and its companion files SHX, PRJ, and DBF) to Google Fusion Tables (GFT).

Before we go any further, keep in mind that the application that does this will only process 100,000 rows. Additionally, GFT only gives each user 200 MB of storage (and they don’t tell you your current status, that I can see).

  1. Login to your Google account (at Gmail, or at GFT).
  2. Prepare your data. Ensure it has fewer than 100,000 rows.
  3. ZIP up your dataX.shp, dataX.shx, dataX.prj, and dataX.dbf. Use WinZip for Windows, or for Mac, right-click the selection of files and select “Compress 4 items”.
  4. Visit the Shape to Fusion website. You will have to authorize the web application to “grant access” to your GFT tables. It needs this access so that after the web application processes your data, it can insert it into GFT.
  5. If you want a Centroid Geometry column or a Simplified Geometry column added, click “Advanced Options” and check their checkboxes – see notes below for an explanation.
  6. Choose the file to upload and click Upload.
  7. Leave the window open until it says it has processed all of the rows. It will report “Processed Y rows and inserted Y rows”. You will be given a link to the GFT the web application created.

Sample Data

If you’re looking to give this a try and see results quickly, try some sample data from the City of Chicago data portal:

Notes

I had trouble many times while using Shape to Fusion in that after I chose the file to upload and clicked Upload, I had to grant access to the web application again and start over (choose the file and click Upload a second time).

Centroid Geometry – This creates a column with the geographic coordinates of the centroid in a polygon. It lists it in the original projection system. So if your projection is in feet, the value will be in feet. This is a function that can easily be performed in free and open source QGIS, where you can also reproject files to get latitude and longitude values (in WGS84 project, EPSG 4326). The centroid value is surrounded in the field by KML syntax “<Point><coordinates>X,Y</coordinates></Point>”.

Simplified Geometry – A geometry column is automatically created by the web application (or GFT, I’m not sure). This function will create a simpler version of that geometry, with fewer lines and vertices. It also creates columns to list the vertices count for the simple and regular geometry columns.

Introduction to DIY bike ridership research

A lot of people ask me how many people are out there bicycling.

“Not a lot”, I tell them.

And I explain why: the primary source of data is the American Community Survey, which is a questionnaire that asks people questions about how they got to work in a specific week. (More details on how it does this below.) We don’t have data, except in rare “Household Travel Surveys”, about trips by bike to school, shopping, and social activities.

It’s comparable across the country – you can get this data for any city.

Here’s how:

  1. Visit the “legacy” American FactFinder and select American Community Survey, operated by the United States Census Bureau.
  2. Select 2005-2009 American Community Survey 5-Year Estimates (or the latest 5-year estimate). This is the most accurate data.
  3. In the right-side menu that appears, click on “Enter a table number”.
  4. In the new window, input the table number ” S0801″ (“Commuting Characteristics by Sex”) and submit the form. The new window will close and the other window will go to that table.
  5. Now it’s time to select your geography. In the left-side menu, under “Change…” click on “geography (state, county, place…)”
  6. In the window to change your geography, select “Place” as your “Geographic Type”.
  7. Then select the state.
  8. Then select your city and click “Show Result”.
Notes:
  • This data shows all modes people take to work, who live in that city. It’s highly probable that people are leaving the city to their jobs on these modes. For example, someone who lives in Rogers Park may ride their bike to work in Evanston.
  • The URL is a permanent link to this dataset. Each city has a unique URL. You should save these as bookmarks so you can easily reference the data later.
  • The question on the survey doesn’t allow multiple choices: “People who used more than one means of transportation to get to work each day were asked to report the one used for the longest distance during the work trip”.

Using Google Fusion Tables to create individual Chicago Ward maps

I wanted to create a map of the 35th Ward boundaries using Google My Maps for a story on Grid Chicago. I planned to create this by taking the Chicago Wards boundary shapefile and exporting just the 35th Ward using QGIS into a KML file. I ran into many problems and ended up using Google Fusion Tables as the final solution.

The problems

First, QGIS creates invalid KML files. Google Earth will tell you this. I opened the KML file in a text editor and removed the offending parts (Google Earth mildly tells you what these are; you can use this validator to get more information).

Second, Google My Maps would not import the KML file. I tried a different browser and a different KML file; a friend ran into the same issue. I reported this problem to Google.

The solution

I uploaded to Google Fusion Tables a KML file containing all wards. I did this instead of uploading the single Ward because, like a database, I can filter values in the column, selecting only the row I want with “ward=35”.

After applying the filter, the map will show the boundary for just that ward. I grab the HTML code for an embeddable map and voila, the article now displays an interactive map of the 35th Ward.

Whenever I want to create a map for a different ward, I go back to this Fusion Table, make a new filter and copy the new HTML code.

A screenshot of the embedded map, showing just 1 of 50 wards, in the Grid Chicago article. 

Elsewhere

I had the same problems with QGIS exporting and uploading the KML files to My Maps the other day when I was creating maps for the abandoned railroads for Monday’s Grid Chicago article. Not thinking about Fusion Tables, I drew on the map with my mouse the lines.

Screenshot of the map of abandoned railroads. 

How to convert GTFS to GIS shapefiles and KML

This tutorial will teach how you to convert any transit agency’s General Transit Feed Specification (GTFS) data into ESRI ArcGIS-compatible shapefiles (.shp), KML, or XML. This is simple to do because GTFS data is essentially a collection of CSV (comma separated values) text files (really, really large text files).

Note: I don’t know how to do the reverse, converting shapefiles or other geodata into GTFS data. I’m not sure if this is possible and I’m still investigating it. If you have tips, let me know.

Converting GTFS to GIS shapefiles

Instructions require the use of ArcGIS (Windows only) and a free plugin called ET GeoWizards GIS for any version of ArcGIS. I do not have instructions for Mac users at this time.

I wrote these instructions while converting the Chicago Transit Authority’s GTFS files into shapefiles based on a reader’s request. “Field names” are quoted and layer names are italicized.

  1. Download the GTFS data you want. Find data from agencies around the world (although not many from Europe) on GTFS Data Exchange.
  2. Import into ArcGIS the shapes.txt file using Tools>Add XY Data. Specify Y=lat and X=lon
  3. Using ET GeoWizards GIS tools, in the Convert tab, convert the points shapefile to polyline.
  4. Select the shapes layer in the wizard, then create a destination file. Click Next.
  5. Select the “shape_id” field
  6. Click the checkbox next to Order and select the field “shape_pt_sequence” and click Finish.
  7. Depending on the number of records (the CTA has 466,000 shapes), it may take a while.
  8. The new shapefile will be added to your Table of Contents and appear in your map.
  9. Import the trips.txt and routes.txt files. Inspect them for any NULL values in the “route_id” field. You will be using this field to join the routes and trips table. It may be a case that ArcGIS imported them incorrectly; the text files will show the correct data. If NULL values appear, follow steps 10 and 11 and continue. If not, follow steps 10 and 12 and continue. This happens because ArcGIS inspected some of the data and determined they were integers and ignored text. However, this is not the case.
  10. Export the text files as DBF files so that ArcGIS operates on them better. Then remove the text files from the Table of Contents.
  11. (Only if NULL values appear) Go into editing mode and fix the NULL values you noticed in step 9. You may have to make a new column with a more forgiving data type (string) and then copy the “route_id” column into the new column. Then continue to step 12.
  12. Join routes and trips based on the field “route_id” – export as trips_routes.dbf
  13. Add a new column to shapes.shp called “shape_id2”, with data type double 18, 11. This is so we can perform step 14. Use the field calculator to copy the values from “shape_id” (also known as ET_ID) to “shape_id2”
  14. Join routes_trips with shapes into routes_poly based on the field “shape_id” (and “shape_id2”)
  15. Dissolve routes_poly on “route_id.” Make sure all selections are cleared. Use statistics/summary fields: “route_long,” “route_url.” Save as routes_diss.shp
  16. Inspect the new shapefile to ensure it was created correctly. You may notice that some bus routes don’t have names. Since these routes are well documented on the CTA website, I’m not going to fill in their names.

Click on the screenshot to see various steps in the tutorials.

Converting GTFS to KML

After you have it in shapefile form, converting to KML is easy – follow these instructions for using QGIS. Or if you want to skip the shapefile-creation process (quite involved!), you can use KMLWriter, a Python script. Also, I think the latest version of ArcGIS has built-in KML exporting.

Converting GTFS to XML

If you want to convert the GTFS data (which are essentially comma-separated value – CSV – files) to XML, that’s easier and you can avoid using GIS programs.

  • First try Mr. Data Converter (very user friendly).
  • If that doesn’t work, try this website form on Creativyst. I tested it by converting the CTA’s smallest GTFS table, frequencies.txt, and it worked properly. However, it has a data size limit. (User friendly.)
  • Next try csv2xml, a command line tool. (Not user friendly.)
  • You can also use Microsoft Excel, but read these tips and caveats first. (I haven’t found a Microsoft application I like or think is user friendly.)

How to geocode a single address in QGIS

Since the last time I wrote about how to use BatchGeocode.com to perform pseudo-geocoding tasks in QGIS, there have been considerable improvements in the multi-platform, free, and open source GIS software. Now, geocoding (turning addresses into coordinates) is more automatic, albeit difficult to setup. (Okay, this has been around June 2009 and I just found out about it in October 2010.)

Once you install all the components, you’ll never have to do this again.

This method can only geocode one address at a time, but it will geocode all of the addresses into a single shapefile.

  1. Download QGIS.
  2. Download and install Python SetupTools. This includes the easy_install function that will download a necessary Python script, simplejson. On Mac you will have to use the Terminal (Applications>Utilities). Email me if you run into problems.
  3. Install simplejson. In the command line (Terminal for Mac; in Windows press Start>Run>”cmd”>Enter), type “easy_install simplejson”.
  4. Download the GeoCode plugin by Alessandro Pasotti via QGIS>Plugins>Fetch Python Plugins. You may have to load additional repositories to see it.
  5. Install geopy. In the command line (like step 3), type “easy_install geopy”.
  6. Specify your project’s projection in File>Project Properties.
  7. Get a Google Maps API key and tell the GeoCode plugin about it (QGIS>Plugins>GeoCode>Settings). You will need a Google account. If you don’t have your own domain name, you can just enter “google.com” when it asks for your domain.
  8. Geocode your first address by clicking on Plugins>GeoCode>Geocode. Type the full address (e.g. 121 N LaSalle Street, Chicago, IL for City Hall).
  9. The geocoded address will then appear in your Layers list as its own shapefile. All addresses geocoded (or reverse geocoded) in this project will appear in the same layer (therefore same attribute table).

Once you install all the components, you’ll never have to do this again. Geocoding will be available each and every time you use QGIS in the future on that workstation.

Tips

  • When you’re done geocoding,  save your results as a shapefile (right click the layer and click “Save as shapefile”). Twice I’ve lost my results after saving the project and quitting QGIS. When I reopened the project, the results layer was still listed, but contained no data.
  • Add a “name” column to the GeoCoding Plugin Results layer’s attribute table (toggle editing first). You can then type in the name of the building or destination at the address you geocoded. Edit the layer’s properties to have that name appear as a label for the point.

A map I made with QGIS showing three geocoded points of interest in Chicago. Data from City of Chicago’s GIS team.