Category: Data

Chicago Crash Browser updates with stats, filters, and news article links

Today I’m adding a bunch of new features to the Chicago Crash Browser, which lives on Chicago Cityscape.

But first…special access is no longer required. Anyone can create a free Cityscape account and access the map. However, only those with special access or a Cityscape Real Estate Pro account will be able to download the data.


The new features include:

  • Statistics that update weekly to summarize what happened in the past week: the number of crashes, the number of people killed in crashes, and the number of people with the two worst tiers of injuries. The statistics are viewable to everyone, including those without access to the crash browser.
[Screenshot: the new weekly statistics]
The statistics will update every Sunday. The numbers may change throughout the week as Chicago police officers upload crash reports.
  • For data users, the crash record ID is now viewable. The crash record ID links details about the same crash across the Chicago data portal’s three tables: Crashes, Vehicles, and People. (The Chicago Crash Browser currently uses only the Crashes table.) To reveal the ID, click the “More details” arrow in the first table column.
[Screenshot: the data table with the crash record ID revealed]
The crash record ID is hidden by default but can be exposed. Use this ID to locate details in the data portal’s Vehicles and People tables.
  • Filter crashes by location. There are currently two location filters: (1) on a “Pedestrian Street” (a zoning designation meant to reduce, over time, the prevalence of car-oriented land uses and to improve building design so that buildings are more appealing to walk next to); (2) within one block of a CTA or Metra station, places people commonly walk to. Select a filter’s radio button and then click “Apply filters”.
  • Filter crashes by availability of a news article or a note. I intend to attach news articles to every crash where a pedestrian or bicyclist was killed (the majority of these will be Streetsblog Chicago articles, where I am still “editor at large”). Notes will include explanations about data changes [1] (the “map editor” mentioned in some of the notes is me) and victims’ names.
[Screenshot: the two types of filters]
After choosing a filter’s radio button, click “Apply filters” and the map and data table will update.
  • Filter by hit and run status. If the officer filling out the crash report marked it as a hit and run crash, you can filter by choosing “Yes” in the options list. “No” is another option, as is “not recorded”, which means the officer didn’t select yes or no.
  • Search by address. Use the search bar inside the map view to center the map and show crashes that occurred within one block (660 feet) of that point. The default is one block and users can increase that amount using the dropdown menu in the filter.
[Screenshot: the map after the search by address function has been used]
Use the search bar within the map view to show crashes near a specific address in Chicago.

Footnotes

[1] The most common data change as of this writing is when a crash’s “most severe injury” is upgraded from non-fatal to fatal, but the crash report in the city’s data portal does not receive that update. This data pipeline/publishing issue is described in the browser’s “Crash data notes” section.

The “map editor” (me) will change a crash’s “most severe injury” to fatal to ensure it appears when someone filters for fatal crashes. This change to the data will be noted.

Chicagoland’s massive parking footprint – as measured on September 16, 2018

Using the footprints of parking lots and garages drawn into OpenStreetMap as a data source, the area of land in Chicagoland occupied by parking lots and garages is 247,539,968 square feet. (The data was exported using HOT Export Tool; you can replicate my export.)

That converts to:

  • 5,682.71 acres
  • 8.88 mi^2 (square miles)
  • 22.99 km^2 (square kilometers)
  • ≈ 0.26 × area of Manhattan (≈ 87 km^2 )
  • ≈ 3.9% of the area of Chicago (Chicago is ~589.56 km^2 )
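
The conversions above can be reproduced with standard unit factors. This is a minimal sketch: the variable names are illustrative, the Manhattan and Chicago areas are the approximate figures quoted in the list, and the rounded results may differ in the last digit from the figures above depending on the conversion tool used.

```python
# Standard conversion factors.
SQ_FT_PER_ACRE = 43_560
ACRES_PER_SQ_MILE = 640
SQ_M_PER_SQ_FT = 0.09290304  # exact: 0.3048 m squared

# Total parking footprint from the OpenStreetMap export, in square feet.
parking_sq_ft = 247_539_968

acres = parking_sq_ft / SQ_FT_PER_ACRE
sq_miles = acres / ACRES_PER_SQ_MILE
sq_km = parking_sq_ft * SQ_M_PER_SQ_FT / 1_000_000

# Approximate comparison areas, as quoted in the list above.
manhattan_sq_km = 87
chicago_sq_km = 589.56

share_of_manhattan = sq_km / manhattan_sq_km
percent_of_chicago = sq_km / chicago_sq_km * 100

print(f"{acres:,.2f} acres")
print(f"{sq_miles:.2f} square miles")
print(f"{sq_km:.2f} square kilometers")
print(f"{share_of_manhattan:.2f} x area of Manhattan")
print(f"{percent_of_chicago:.1f}% of the area of Chicago")
```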

(I forgot to measure the portion of this within Chicago, and now the data snapshot is gone. I fixed this in the 2019 report.)

Why Jefferson Park residents should allow more housing

Short answer: To provide more shoppers for the local businesses. Read on for the longer answer. 

Over on Chicago Cityscape I added a new feature called “market analysis” which measures the number of people who live within specific walking areas (measured by time) and driving areas (measured by distance). 

I am in favor of removing apartment & condo bans in Chicago, especially in areas where they were previously allowed and near train stations.

Jefferson Park is centered around two co-located train stations, served by CTA and Metra respectively. There have been multiple proposals for multi-family housing near the stations (collectively called the Jefferson Park Transit Center) and some have been approved. 

Always, however, there are residents who resist these proposals and the number of originally proposed apartments or condos gets reduced in the final version (classic NIMBYism). 

There are at least four reasons why more housing should be allowed near the Jefferson Park Transit Center:

  • Locally owned businesses require a significant number of shoppers who live nearby, plus walk-up traffic
  • More people should have the opportunity to live near low-cost transportation
  • It will include more affordable housing, through Chicago’s inclusionary zoning rules (the Affordable Requirements Ordinance, ARO)
  • There will be less driving, and therefore lower household transportation costs and less neighborhood pollution

To support the first reason, I used the “market analysis” tool to see just how many people live in a walkable area centered around Veterans Square, a mixed-use office and retail development adjacent to the train stations. 

Only 9,368 people live within a 10-minute walk of Veterans Square (get the Address Snapshot). 

By comparison, 19,707 people live within a 10-minute walk of The Crotch, or the center of Wicker Park, at the intersection of Milwaukee/North/Damen (get the Address Snapshot). The Blue Line station is about 75 feet south of the center point.

I would grant the low Veterans Square number a small discount based on the proximity to the Kennedy Expressway, which severely truncates walking areas up and down the northwest side. Still, even with that discount, ending up with fewer than half as many residents as the Wicker Park walk shed is disturbing. Wicker Park is hardly characterized by high-density housing. In fact, all of the new high-rises are just outside the 10-minute walk shed!
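
The "fewer than half" comparison can be checked with a few lines of Python. This is only a sketch of the arithmetic: the variable names are illustrative, and the population counts are the two figures quoted above.

```python
# Residents within a 10-minute walk, as reported by the market analysis tool.
veterans_square_residents = 9_368
wicker_park_residents = 19_707

# Share of the Wicker Park walk-shed population found around Veterans Square.
ratio = veterans_square_residents / wicker_park_residents

# How many more residents Veterans Square would need to reach half of Wicker Park.
shortfall_to_half = wicker_park_residents / 2 - veterans_square_residents

print(f"Veterans Square has {ratio:.1%} of the Wicker Park walk-shed population")
print(f"Residents needed to reach half: {shortfall_to_half:.0f}")
```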

Don’t ban apartments on this vacant lot if you want more affordable housing – a case study

A vacant lot is for sale near the 606’s Bloomingdale Trail, a popular amenity that’s now known to increase home values. It’s zoned RS-3, which means it bans apartments. If the zoning stays the same, then the vacant lot will only allow a rich family to move in here. If the lot’s zoning is changed to allow apartments or condos, then the vacant lot could welcome families that earn median incomes.

You can build multi-family housing on the lot if you can get a zoning change, but you’ll have to pay the city a fee, convince your future neighbors that they shouldn’t oppose it, convince the alder that he should support it, and you’ll have to hire a lawyer.

Let’s say that zoning changes in Chicago were free and frictionless*. What should be built on this lot?

If the lot allowed multi-family housing, we could build several units for less money per unit than if we built a single-family house. That means that three families (let’s stick with three, which requires a zoning change to RM-4.5) could be housed for less money per family than one family would pay for a single-family house.

How’s that? The sticker price for this lot is $425,000 right now, and if one family is paying for that plus the cost of building a house, then your minimum investment is pretty massive. (I suspect the lot will sell for something closer to $400,000.)

I looked at new construction costs on Chicago Cityscape, as indicated on building permits issued within 1 mile of the vacant lot, took the average, and added it to the cost of land per unit.

Construction costs

The average construction cost of a new single-family house, from the 10 most recent permits, is $304,052.78.

The average construction cost of new multi-family housing, from the 10 most recent permits, is $230,192.13 per unit.

Total cost per unit (land + construction)

Add in the land cost per unit ($425,000 for the single-family house and $141,666.67 per unit for the 3-flat) and you end up with the total costs of:

  • $729,052.78 for the single-family house
  • $371,858.80 per unit in the 3-flat
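
The totals above follow from splitting the land price across the units and adding the average construction cost per unit. Here is a minimal sketch of that arithmetic; the function name is hypothetical, and the second pair of results only illustrates the suspected $400,000 sale price mentioned earlier.

```python
def total_cost_per_unit(land_price, units, construction_cost_per_unit):
    """Land cost shared equally across units, plus construction cost per unit."""
    return land_price / units + construction_cost_per_unit

LAND_ASKING_PRICE = 425_000
SINGLE_FAMILY_CONSTRUCTION = 304_052.78   # average of the 10 most recent permits
MULTI_FAMILY_CONSTRUCTION = 230_192.13    # average per unit, 10 most recent permits

single_family = total_cost_per_unit(LAND_ASKING_PRICE, 1, SINGLE_FAMILY_CONSTRUCTION)
three_flat = total_cost_per_unit(LAND_ASKING_PRICE, 3, MULTI_FAMILY_CONSTRUCTION)

print(f"Single-family house: ${single_family:,.2f}")
print(f"3-flat, per unit:    ${three_flat:,.2f}")

# Same calculation if the lot sells for the suspected $400,000 instead.
single_family_at_400k = total_cost_per_unit(400_000, 1, SINGLE_FAMILY_CONSTRUCTION)
three_flat_at_400k = total_cost_per_unit(400_000, 3, MULTI_FAMILY_CONSTRUCTION)
print(f"At $400,000: ${single_family_at_400k:,.2f} vs ${three_flat_at_400k:,.2f} per unit")
```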

Add in the profit or “cap rate” that a builder wants to make and the price is even higher, but the people who would buy in the multi-family house would be paying much less for their homes.

Takeaways

The city can generate more affordable housing if it “upzones” vacant land and stops banning multi-family housing. (Many of the city’s parcels have been “downzoned” to ban multi-family housing in a process that creates “exclusionary zoning” and allows only – expensive – single-family housing.)

The city and the Chicago Transit Authority will earn more real property transfer taxes (RPTT) from the sales of the units as condos than from a single-family house.

Three families instead of one would enjoy living next to the wonderful amenity of the Bloomingdale Trail and the parks that the 606 offers.

Want this kind of analysis for a property in Chicago? You can order a zoning report from me.

* The City of Chicago charges a zoning change fee of $1,025, and you will most likely have to hire a lawyer, and it will take about 3-6 months, depending on the complexity of the proposal that requires the zoning change. You can use Chicago Cityscape to see actual approval times (excluding the time meeting the alder for the ward of the proposed project).

The genius of using ST_Subdivide to speed up PostGIS intersection comparisons

You should use ST_Subdivide to break up large shapes into smaller ones

This infographic visually compares the difference between running a PostGIS comparison query like ST_Intersects on a large shape versus a subdivided version of that large shape.

Hundreds of GIS intersection comparisons are completed every hour on Chicago Cityscape.*

People are looking at, say, a map of the South Shore community area. That “Place” page then grabs all of the building permits, building violations, business licenses, and other “feature layers” that are stored as points.

A classic “point in polygon” comparison is made using the ST_Intersects(place_geometry, permits_geometry) function.

This has worked very well for several years.

The problem

But as Chicago Cityscape handles larger shapes – they come from users drawing their own, large shapes, and from large shapes like the downtown Chicago area – this query doesn’t cut it.

Setting indexes on the geometry is imperative, but it’s not the end of the work to optimize performance. That’s because the index stores only a rectangular bounding box (also called an “envelope” in GIS) that contains the entire shape of, say, the South Shore community area.

The downtown Chicago area, however, is not even the largest shape I have. That belongs to the new Place, “Neighborhood Opportunity Fund investment zones” (NOF). Combined, they cover 75 square miles of Chicago. Downtown is only 7.7 square miles.

After I added the NOF map and tested its Place page, it “crashed” my server, metaphorically speaking. The query to just count the number of building permits in the area would take about five minutes.

There had to be a better way; in the meantime, however, I divided the NOF map into the West and South sections. This hardly improved the counting time.

The solution

Thankfully, today, I saw a tweet from Paul Ramsey linking to his blog that linked to his slides from a recent presentation about the use of PostgreSQL to store and manipulate GIS data.

In it he explained how the ST_Subdivide function works. I’m going to demonstrate it using graphics from my own maps.

A normal intersection comparison, using ST_Intersects(place_geometry, permits_geometry) in a query creates a bounding box (envelope) around each geometry and quickly determines whether the two envelopes overlap. If they do, then it checks again to see if the actual geometries overlap. If they do, that data is returned as a response to your query.
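
The two-step check described above can be sketched in plain Python. This is not how PostGIS is implemented internally; it is a simplified illustration that assumes a single point tested against a simple polygon, with hypothetical function names and a made-up L-shaped polygon standing in for a place boundary.

```python
def envelope(polygon):
    """Bounding box of a polygon as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))

def in_envelope(point, env):
    """Cheap first pass: is the point inside the bounding box?"""
    x, y = point
    min_x, min_y, max_x, max_y = env
    return min_x <= x <= max_x and min_y <= y <= max_y

def point_in_polygon(point, polygon):
    """Exact second pass using the even-odd ray casting rule."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def intersects(point, polygon):
    """Envelope test first; run the exact test only if the envelopes overlap."""
    if not in_envelope(point, envelope(polygon)):
        return False
    return point_in_polygon(point, polygon)

# Hypothetical L-shaped place: its envelope is 10 x 10 but most of that is empty.
l_shape = [(0, 0), (10, 0), (10, 2), (2, 2), (2, 10), (0, 10)]

print(intersects((1, 5), l_shape))    # inside the vertical arm of the L
print(intersects((8, 8), l_shape))    # inside the envelope, outside the shape
print(intersects((20, 20), l_shape))  # rejected by the envelope test alone
```

The point at (8, 8) is the expensive case: it passes the cheap envelope test, so the exact geometry test has to run even though the point is nowhere near the shape.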

When your two datasets are massive, like the NOF zones, which collectively cover about one-third of Chicago, and the building permits, which are found across the entire city…well, that led to the five-minute counting time.

Enter ST_Subdivide. To use it properly you would run it against your existing geometry and store the much smaller shapes, derived from the big shape, in a new table. I applied the function to all the 22,203 maps that Chicago Cityscape has and stored their unique IDs and subdivided geometries in a new “lookup” table.

Now, any time I want to compare the building permits against the NOF, the building permits are instead compared to the small shapes that were subdivided.
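
To see why the smaller shapes help, compare how many points survive the cheap envelope test before and after splitting. The sketch below is an illustration under stated assumptions: the L-shaped place and its two rectangular pieces are made up and split by hand (ST_Subdivide itself splits by vertex count), and the random points stand in for building permits.

```python
import random

def area(env):
    """Area of an envelope given as (min_x, min_y, max_x, max_y)."""
    min_x, min_y, max_x, max_y = env
    return (max_x - min_x) * (max_y - min_y)

def contains(env, point):
    """Envelope test: is the point inside the bounding box?"""
    x, y = point
    min_x, min_y, max_x, max_y = env
    return min_x <= x <= max_x and min_y <= y <= max_y

# Envelope of a hypothetical L-shaped place, and the envelopes of its two pieces.
whole_envelope = (0, 0, 10, 10)
pieces = [(0, 0, 10, 2), (0, 2, 2, 10)]

# Stand-in "permits": random points scattered across the whole envelope.
random.seed(0)
points = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(1000)]

# Candidates that would need the expensive exact test in each approach.
candidates_whole = [p for p in points if contains(whole_envelope, p)]
candidates_pieces = [p for p in points if any(contains(e, p) for e in pieces)]

print(f"Envelope area, whole shape: {area(whole_envelope)}")
print(f"Envelope area, pieces:      {sum(area(e) for e in pieces)}")
print(f"Candidates against whole:   {len(candidates_whole)}")
print(f"Candidates against pieces:  {len(candidates_pieces)}")
```

With the whole shape, every point passes the envelope test and needs the exact check. With the pieces, most points are discarded by the envelope test alone.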

The query

Chicago Cityscape uses a single table (created as a materialized view) to combine all 22,203 maps. Each map is stored in a source table (for example, there’s a table to hold the 77 community areas) and the materialized view runs once a day to combine all of the maps in the source tables. This ensures our data is managed well: different source tables can hold different information, and the single table holds only the name, type, and geometry of the source tables, for faster comparison. Each entry in the single table also has a “slug”, its unique identifier.

Thus, the materialized view of the subdivided maps is created from the aforementioned single table, using this query:

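-- st_subdivide() returns one row per piece, so each map's slug repeats;
-- appending random() to the original gid gives every piece its own ID.
create materialized view view_places_subdivided as
select gid || '_' || random() as gid, slug, st_subdivide(geom) as geom
from view_places;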

The “gid” expression creates a new unique ID field, because the slug will be repeated for every subdivided piece of each map. A unique ID field (with a unique index on it) is necessary if you want to refresh the materialized view concurrently, which lets other queries read the materialized view while it’s being refreshed.

* The results are cached for a few hours, because the feature layers change 1-2 times per day and at different times each day, so the limited duration cache accommodates that. Ideally I would code a way to invalidate the cache when the feature layer data is updated.

Update 12/31/19: ST_Subdivide will fail if your geometries have errors (I don’t know whether any kind of error, or only certain kinds, makes the function fail). Chicago Cityscape has over 37,000 features that ST_Subdivide is attempting to process, and there is a lot of room for error in managing that many features from dozens of sources.