Tag: open data

How to ascertain the area of Chicago beach parking lots to find the largest one

This tutorial is a direct response to a question about which Chicago beach has the largest parking lot. Matt Nardella of Moss Design, in a response to a Twitter-based conversation about Alderman Cappleman’s suggestion that perhaps Montrose beach has too much parking, researched on Wikipedia to find the answer. This is where it said that Montrose beach has the largest parking lot of any of Chicago’s 27 beaches.

Now we’re going to try and prove which beach has the largest associated parking lot.

This tutorial will teach how you to (1) display Chicago beaches, (2) download data held in OpenStreetMap, (3) find the parking lots within the OpenStreetMap data, (4) find the parking lots near the beaches, and (5) calculate each parking lot’s area (in square feet). You can use this tutorial to accomplish any one of these three tasks, or the same tasks but on a different part of OpenStreetMap data (like the area of indoor shopping malls).

You’ll need the QGIS software before starting. You’ll also need at least 500 MB of free space. Start a project folder called “Biggest Parking Lots in Chicago” and make two more folders, within this folder, called “origdata” and “data”.

First, let’s get some data about beaches

Since we only want to know about the parking lots near Chicago beaches we need to get a dataset that locates them. This data is presumably within the same OpenStreetMap extract we’re waiting for, but it’s best to go to the most reliable source.

  1. Download the Parks – Facilities & Features shapefile from the City of Chicago open data portal. I’ve already verified that it has all the beaches (as points).
  2. Open the parks shapefile in a new document in QGIS (call it “map01a.qgs”). You might not see the data so right-click the parks layer and select “Zoom to layer extent”.
  3. Filter out all the points that aren’t beaches by using the query builder. Right-click the layer and select “Filter…” and input this filter expression: “FACILITY_N” = ‘BEACH’
  4. Your map will now show 26 points along an invisible lakefront and then the beach at Humboldt Park.
  5. For the rest of this tutorial we’ll reference the beaches layer as ParkFacilities.

Second, let’s get some data from OpenStreetMap

The easiest way to grab data from OpenStreetMap is by using QGIS, a free, open source desktop GIS application that has myriad plugins that match the capabilities of the heavyweight ESRI ArcGIS line of software. We can download OpenStreetMap data straight into QGIS.

  1. Click on the Vector menu and select OpenStreetMap>Download data.
  2. We want as much data as will cover the beaches information so in the Extent section of the dialog box choose “From layer” and select the beaches layer (called ParkFacilities).
  3. Browse to the “origdata” folder you created in the first task and choose the filename “chicago.osm”.
  4. Click OK and watch the progress meter tell you how much data you’ve downloaded from OpenStreetMap.
  5. Once it’s completed downloading, click “Close”. Now we want to add this data to our map.
  6. Drag the chicago.osm file from your file system into the QGIS Layers list. A dialog box will appear asking which layers you want to add.
  7. Select the layer that has the type “MultiPolygon”. This represents areas like buildings and parking lots.

Third, display the OpenStreetMap data and eliminate everything but the parking lots

We only want to compare parking lots in this dataset with beaches in the previous dataset so we need to eliminate everything from the OpenStreetMap data that’s not a parking lot. Since OSM data depends on tags we can easily select and show all the objects where “amenity” = “parking”.

  1. Filter out all the polygons that aren’t parking lots by using the query builder. Right-click the layer and select “Filter…” and input this filter expression: “amenity” = ‘parking’. Hopefully all the parking lots have been drawn so we can analyze a complete dataset!
  2. Your map will now show little squares, rectangles, and myriad odd shapes that represent parking lots around Chicagoland. (Most of these have been drawn by hand.) It should look like Image XXX.
  3. Since this data is stored in a projection with the codename of EPSG:3435 and the OpenStreetMap data is stored with codename of EPSG:4326 we need to convert the beaches to match the beaches (because we’re going to be using feet as a  measuring distance instead of degrees).
  4. Right-click the layer and select “Save As…” and choose the format “ESRI Shapefile”. Then click the top Browse button and select a location on your hard drive for the converted file.
  5. For “CRS” choose “Selected CRS”. Then click the bottom Browse button and search for the EPSG with the codename 3435. Select the checkbox named “Add saved file to map” so the new layer will be immediately added to our map.

Fourth, select all the parking lots near a beach

This task will select all the parking lots near the beaches. I chose 2,000 feet but you could easily choose a different distance. You might want to measure on Google Earth some minimum and maximum distances between beaches and their respective, associated parking lots.

(This task is easier using PostGIS which has a ST_DWithin function to find objects within a certain distance because we can avoid having to create the buffer in QGIS.)

  1. Create a 2,000 feet buffer. Select Vector>Geoprocessing tools>Buffer.
  2. In the Buffer(s) dialog box, select ParkFacilities (which has your beaches) as the “Input vector layer”. Choose a distance of 2000 (the units are pre-chosen by the projection and since we’re using a projection that’s in feet, the distance unit will be feet).
  3. Browse to your project folder’s “data” folder and save the “Output shapefile” as “beaches buffer 2000ft.shp”.
  4. Click “Add results to canvas” and then click OK.
  5. Double check that 2,000 feet was enough to select the parking lots. In my case, I see that the point representing Montrose beach was further than 2,000 feet away from a parking lot.
  6. Let’s do it again but with 3,000 feet this time, and saving the “Output shapefile” as “beaches buffer 3000ft.shp”.
  7. This time it worked and the nearest parking lots are now in the 3,000 feet radius buffer. You can see in Image XXX how the two concentric circles stretch out from the beach point towards the parking lots.

We’re not done. We’re next going to use our newly created 3,000 feet buffers to tell us which parking lots are in them. These will be presumed to be our beach parking lots.

  1. Use the “Select by location” tool to find the beaches that intersect our 3,000 feet buffers. Select Vector>Research Tools>Select by location.
  2. Follow me: we want to select features in parking 3435 [our parking lots] that intersect features in beaches buffer 3000ft [our beaches]. We’ll modify the current selection by creating a new selection so that we don’t accidentally include any features previously selected.
  3. You’ll now see a bunch of parking lots turn yellow meaning they are actively selected.
  4. Let’s save our selected parking lots as a new file so it will be easier to analyze just them. Right-click “parking 3435” and select “Save Selection As…” (it’s important to choose “Save Selection As” instead of “Save As” because the former will save just the parking lots we’ve selected).
  5. Save it as “selected parking 3435.shp” in your “data” folder. The CRS should be EPSG:3435 (NAD83 Illinois StatePlane East Feet). Check off “Add saved file to map” and click OK.
  6. Turn off all other layers except ParkFacilities to see what we’re left with and you’ll see what I show in Image XXX.

Fifth, let’s calculate

Calculating the area is probably the easiest part of this tutorial.

  1. Close all attribute tables you may have opened.
  2. Select Vector>Geometry Tools>Export/Add geometry columns and choose “selected parking 3435” as your input vector layer.
  3. Leave all other options as-is and press OK. When told about how QGIS can’t access something simultaneously, choose “Yes”.
  4. QGIS should have told you that “selected parking 3435” has been updated. Right-click the layer and choose “Open Attribute Table”.
  5. Scroll to the far right and you’ll see a new column called AREA. This represents the parking lot’s area in square feet.
  6. Click on the AREA column heading to sort it from smallest to largest. Scroll to the bottom of the list and you’ll find the parking lot with the largest area. Double check – is it near a beach?

Conclusion

With my analysis, and with the data available from OpenStreetMap when I created this tutorial, there are three abnormally large parking lots:

  1. A linear lot near the Lincoln Park Zoo and North Avenue beach (6.8 acres)
  2. A curving lot near Montrose Beach (4.75 acres)
  3. An irregularly shaped lot near Montrose Beach (4.5 acres)

There’s one major caveat in this analysis and that’s the missing parking lots on beaches south of Navy Pier. This means that no one has drawn them into OpenStreetMap so it’s time to start editing!

Chicago wards with the most landmarked places

Montgomery Ward Complex

People float by the Montgomery Ward Complex on Kayaks. Photo by Michelle Anderson.

Last week I met with the passionate staff at Landmarks Illinois to talk about Licensed Chicago Contractors. I wanted to understand the legality for historic preservation and determine ways to highlight landmarked structures on the website and track any modifications or demolitions to them.

I incorporated two new geographies over the weekend: Chicago landmark districts, and properties and areas on the National Register of Historic Places (both available on the City of Chicago open data portal).

I used pgShapeLoader to import them to my DigitalOcean-hosted PostgreSQL database and modified some existing code to start looking at these two new datasets. Voila, you can now track what’s going on in the Montgomery Ward Company Complex – currently occupied by “600 W” (at 600 W Chicago Avenue) hosting Groupon among other businesses and restaurants.

Today I was messing around with some queries after I saw that the ward containing this place on the National Register – the 27th – also had a bunch of other listed spots.

I wrote a query to see which wards have the most places on the National Register. The table below lists the top three wards, with links to their page on Licensed Chicago Contractors. You’ll find that many have no building permits associated with them. This is because of two reasons: the listing’s small geography to look within for permits may not include the geography of issued permits (they’re a few feet off); we don’t have a copy of all permits yet.

[table id=15 /]

4 wards don’t have any listings on the National Register of Historic Places and nine wards have one listing.

Top 20 most active general contractors in Chicago this year so far

River Point tower is under construction at 444 W Lake Street by James McHugh Construction. Its associated building permit, which only comprises the base/train cover structure depicted, has an estimated project cost of $27,050,000. Photo by Bart Shore.

Here I go again not talking about urban planning on Steven Can Plan. With my Licensed Chicago Contractors & Construction Activity website I’ve ingested a lot of data from the City of Chicago’s open data portal that has a LOT to say about what’s going on.

Nearly all permits include a reasonable value that estimates the project’s cost – like $13,150,000.00 for converting Lafayette Elementary School into a performing arts high school. (All demolition permits show $1 and some permits cost over $6 billion, which we know is false.)

I recently reorganized the data (migrating from MySQL to PostgreSQL which supports the JSON datatype) to expand the ways I can extract info and I’ve sorted it by general contractor’s aggregate project value for this year.

This is a list of the 20 most active general contractors in Chicago for 2014 until May 13, 2014. I define “activity” purely by the companies’ involvement in a Chicago-based project and corresponds to that project’s estimated cost. (The data doesn’t specify a level of their involvement in a project.)

[table id=11 /]

Do any names ring a bell, or surprise you?

Finding interesting data in the building permits dataset

I had several great conversations with fellow #chihacknight visitors at the 1871 tech hub (222 W Merchandise Mart Plaza) about how to reveal more information about what’s being built in Chicago. I had introduced Licensed Chicago Contractors at the previous week’s hack night and tonight I showed site changes I made like how much faster it is now that I use DataTables’s server-side processing function.

Some of the discussions resulted in suggestions to try new tools and methods that would make processing the data more efficient, or more revealing. What are the ways I can aggregate the data, or connect to similar data from other sources?

One of the new features I announced I’ll be adding is statistics on building activity by neighborhood. I started testing some queries to see the results, and to find the query that outputs that information in a way that’ll pique users’ interests.

I calculated the aggregate estimated costs of all building permit activity for the past 90 days in select neighborhoods. All of the data was automatically generated using a simple MySQL query, but one that will get faster after switching to Postgres. (I eliminated any project whose estimated cost was less than $1,000 because there are many project types that are $0 to several hundred dollars.)

  • Logan Square: 77 projects, totaling $16,295,997.50 at a $211,636.33 average cost
  • West Loop: 30 projects, totaling $27,646,899.00 at a $921,563.30 average cost
  • Andersonville: 6 projects, totaling $358,770.00 at a $59,795.00 average cost
  • Bronzeville: 34 projects, totaling $17,050,662.00 at a $501,490.06 average cost
  • Hyde Park: 20 projects, totaling $13,492,265.00 at a $674,613.25 average cost
  • Humboldt Park: 35 projects, totaling $41,917,988.00 at a $1,197,656.80 average cost

How does Humboldt Park double the other neighborhoods’ average? I think it’s pretty simple: this $40 million Salvation Army residence that’s going to be built at 825 N Christiana Avenue.

The results for Bronzeville were higher than I expected because this is a distressed neighborhood that has lost of lot of population and has seen little development in the past several years. This isn’t to say the neighborhood is poor – I saw a report last fall that highlighted how the purchasing power of Bronzeville residents was quite high relative to neighboring communities.

Ronnie Harris showed me the report when I participated in the Center for Neighborhood Technology’s civic app competition and hackathon. We, along with Josh Engel, designed Build It! Bronzeville, although my participation was really pushing them to develop Josh’s game idea more and construct a paper version of it. Our team won the competition and Ronnie and Josh have kept working on it (I saw them at last week’s hack night).

Projects that pushed up Bronzeville’s average included several multi-family homes at around $1.4 million each on the blocks of 4700 and 4800 S Calumet Avenue.

Code discussion

I can’t test for the “Loop” right now in the way I have my data structured because a LIKE ‘%loop%’ query of the database will include “West Loop” records.

I need to change how the building permit data is stored – in my database – a little so that my site’s PHP codebase and MySQL queries can sift through the data faster. For example, I’m storing several key-value pairs as a JSON-encoded string in a TEXT field. One #chihacknight developer suggested I switch from MySQL to PostgreSQL because Postgres has native JSON-parsing functions.

I looked up how to use Postgres’s JSON functions and realized that, yes, I probably should do that, but that I also need to change the array structure of the data I’m encoding to JSON. In other words, with a tiny change now, I can be better prepared for the eventual migration to Postgres.

Using open data: Showing what projects licensed Chicago contractors are working on

The New City developer recently received permits for over $50 million of construction work across from the Lincoln Park REI.

The New City developer recently received permits for nearly $50 million of construction work across from the Lincoln Park REI.

I wrote in my last post that I found “pain” in the process of finding a licensed contractor in the city (the pain of finding one who can install in the public way remains unmedicated).

I wanted to provide more than a list (and a map) and EveryBlock has already answered “What’s going on across the street from my house?”. I wanted to add value by helping people answer the question, “What contractor should I choose?”

Several other sites help you do this, like BuildZoom, Angie’s List, and the Better Business Bureau, by showing you customer reviews or complaints. I needed something different from mimicking a review site (a lot of the businesses are also on Yelp) so I decided to answer the question, “What projects have these companies done?”

That’s where the City of Chicago’s open data portal comes in: it has a dataset for Building Permits.

Check out 180 Properties, LLC from Skokie, Illinois. They’ve had two permits issued within the last three months. One project, at 3705 N Hoyne Avenue, is for interior renovation: “Remove/replace cabinets, countertops, flooring, patch & repair drywall”. The estimated cost for the project is $80,000. Sound like the kind of contractor you’re looking for? Call them up or keep researching.

You can even see who else is working on this project. Burnham Nationwide is listed as an expeditor on this project which means they’re likely acting as the intermediary between the Chicago Department of Buildings and the companies actually doing the work. Burnham will do site plans, drawings, occupancy, and ensure everything is in order. The property owner is also listed in the permit information.

For people who want to explore construction activity the other way around, finding projects before contractors, I created a “Permits explorer” page. This page searches the Building Permits dataset to show the most recently issued permits for the most expensive projects. Right now a project to alter and renovate Chicago Vocational High School at 2100 E 87th Street has an estimated cost of $40 million. I didn’t realize how much the Department of Buildings is funded by permits until I saw the permit fees.

The permit fee for the school renovation would have been $372,598 fee but the dataset said the entirety was waived (likely because it’s a Chicago Public School). Other projects I reviewed had permit fees between $30,000 and $75,000.

Real estate speculators, development watchers, and editors of Curbed Chicago should find browsing permits useful. The list includes two projects associated with the New City development at Halsted Street and Clybourn Avenue, across from the Lincoln Park REI store. The two permits are held by 1515 N Halsted, LLC. The first is for a “3 story steel framed mixed-use retail, restuarant, assembly (movie theater) building” at 1500 N Clybourn Avenue (for an estimated cost of $26,403,193), and the second permit describes a 7 story parking garage at 710 W Schiller Street (for $21,518,012).

How it works

I used my programming magic – I prefer PHP – to query the Socrata Open Data API (or SODA) to look for the given contractor’s name in one of eight name fields (there are 16 name fields) and then return information about the most recent permits. The Building Permits dataset gives the project location, work description, and its estimated cost. I figured you could use the project’s estimated cost to gauge the kind of work the contractor does – is the contractor more familiar with big jobs, or little jobs?

This method isn’t the best. Ideally there’d be a relational database where the “Contractor ID” in the licensed contractors dataset would match a “Contractor ID” field in the permit dataset. But the licensed contractors dataset doesn’t have a unique ID field, and isn’t even on the data portal.

Instead, I’m finding contractor-to-project matches by finding the first two or three words of the contractor’s name at the beginning of eight of the 16 name fields in the permit field. SODA works quickly on the query and it passes the results back to PHP in no time.

In the future I’d like to pull in scores and reviews from Yelp and other sites that have APIs (Angies List and Better Business Bureau don’t), as well as try to determine the name of the building – if it has one – by querying OpenStreetMap Nominatim.