Category: GIS

Why Jefferson Park residents should allow more housing

Short answer: To provide more shoppers for the local businesses. Read on for the longer answer. 

Over on Chicago Cityscape I added a new feature called “market analysis” which measures the number of people who live within specific walking areas (measured by time) and driving areas (measured by distance). 

I am in favor of removing apartment & condo bans in Chicago, especially in areas where they were previously allowed and near train stations.

Jefferson Park is centered around two co-located train stations, serviced by CTA and Metra respectively. There have been multiple proposals for multi-family housing near the stations (collectively called the Jefferson Park Transit Center) and some have been approved. 

Always, however, there are residents who resist these proposals and the number of originally proposed apartments or condos gets reduced in the final version (classic NIMBYism). 

There are at least four reasons why more housing should be allowed near the Jefferson Park Transit Center:

  • Locally owned businesses require a significant number of shoppers who live nearby, plus walk-up traffic
  • More people should have the opportunity to live near low-cost transportation
  • It will include more affordable housing, through Chicago’s inclusionary zoning rules (the Affordable Requirements Ordinance, ARO)
  • There will be less driving, and therefore lower household transportation costs and less neighborhood pollution

To support the first reason, I used the “market analysis” tool to see just how many people live in a walkable area centered around Veterans Square, a mixed-use office and retail development adjacent to the train stations. 

Only 9,368 people live within a 10 minute walk to Veterans Square (get the Address Snapshot). 

Comparatively, 19,707 people live within a 10 minute walk to The Crotch, or the center of Wicker Park, at the intersection of Milwaukee/North/Damen (get the Address Snapshot). The Blue Line station is about 75 feet south of the center point.

I would grant the low Veterans Square number a small discount based on the proximity to the Kennedy Expressway, which severely truncates walking areas up and down the northwest side. Still, even with that discount, ending up with fewer than half as many people as in Wicker Park is disturbing. Wicker Park is hardly characterized by high-density housing. In fact, all of the new high-rises are just outside the 10-minute walk shed!

The genius of using ST_Subdivide to speed up PostGIS intersection comparisons

You should use ST_Subdivide to break up large shapes into smaller ones

This infographic visually compares the difference between running a PostGIS comparison query like ST_Intersects on a large shape versus a subdivided version of that large shape.

Hundreds of GIS intersection comparisons are completed every hour on Chicago Cityscape.*

People are looking at, say, a map of the South Shore community area. That “Place” page then grabs all of the building permits, building violations, business licenses, and other “feature layers” that are stored as points.

A classic “point in polygon” comparison is made using the ST_Intersects(place_geometry, permits_geometry) function.

This has worked very well for several years.

The problem

But as Chicago Cityscape handles larger shapes – they come from users drawing their own, large shapes, and from large shapes like the downtown Chicago area – this query doesn’t cut it.

Setting indexes on the geometry is imperative, but it’s not the end of the work to optimize performance. That’s because the index of the geometry is a rectangular bounding box (also called an “envelope” in GIS) that contains the entire shape, such as the South Shore community area. The larger and more irregular the shape, the more points fall inside its bounding box without being inside the shape itself.

The downtown Chicago area, however, is not even the largest shape I have. That belongs to the new Place, “Neighborhood Opportunity Fund investment zones” (NOF). Combined, they cover 75 square miles of Chicago. Downtown is only 7.7 square miles.

After I added the NOF map and tested its Place page, it “crashed” my server, metaphorically speaking. The query to just count the number of building permits in the area would take about five minutes.

There had to be a better way; in the meantime, however, I divided the NOF map into the West and South sections. This hardly improved the counting time.

The solution

Thankfully, today, I saw a tweet from Paul Ramsey linking to his blog that linked to his slides from a recent presentation about the use of PostgreSQL to store and manipulate GIS data.

In it he explained how the ST_Subdivide function works. I’m going to demonstrate it using graphics from my own maps.

A normal intersection comparison, using ST_Intersects(place_geometry, permits_geometry) in a query creates a bounding box (envelope) around each geometry and quickly determines whether the two envelopes overlap. If they do, then it checks again to see if the actual geometries overlap. If they do, that data is returned as a response to your query.
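The two-step check described above can be sketched in plain Python. This is only an illustration of the idea, not how PostGIS implements it internally; the function names, the L-shaped "place", and the permit points are all made up for the example.

```python
# Illustrative sketch only: hypothetical names and data, not PostGIS internals.

def envelope(polygon):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def in_envelope(point, env):
    """Cheap first pass: is the point inside the rectangular bounding box?"""
    x, y = point
    min_x, min_y, max_x, max_y = env
    return min_x <= x <= max_x and min_y <= y <= max_y


def in_polygon(point, polygon):
    """Slower exact pass: ray-casting point-in-polygon test."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def intersects(point, polygon, env):
    """Run the exact test only when the envelope test passes."""
    return in_envelope(point, env) and in_polygon(point, polygon)


# A made-up L-shaped place and a handful of made-up permit locations.
place = [(0, 0), (10, 0), (10, 2), (2, 2), (2, 10), (0, 10)]
place_env = envelope(place)
permits = [(1, 1), (1, 9), (9, 1), (8, 8), (20, 20)]

matches = [p for p in permits if intersects(p, place, place_env)]
print(matches)
```

The point (8, 8) is the interesting one: it passes the envelope test, so it needs the exact test, which then rejects it. The bigger the shape, the more points end up in that situation.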

When your two datasets are massive, like the NOF zones, which collectively cover one-third of Chicago, and the building permits, which are found across the entire city…well, that led to the five-minute counting time.

Enter ST_Subdivide. To use it properly you would run it against your existing geometry and store the much smaller shapes, derived from the big shape, in a new table. I applied the function to all the 22,203 maps that Chicago Cityscape has and stored their unique IDs and subdivided geometries in a new “lookup” table.

Now, any time I want to compare the building permits against the NOF, the building permits are instead compared to the small shapes that were subdivided.
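To get a feel for why smaller shapes help, here is a rough Python sketch that counts how many points survive the cheap envelope test before and after splitting a shape. The L-shaped place, the two hand-picked pieces, and the random "permits" are assumptions for illustration; ST_Subdivide chooses its own cuts based on a maximum vertex count, so real pieces will look different.

```python
import random

# Illustrative sketch only: the shapes and points below are made up.

def in_box(point, box):
    """Is the point inside the rectangle (min_x, min_y, max_x, max_y)?"""
    x, y = point
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y


# One envelope around a whole L-shaped place...
whole_envelope = (0, 0, 10, 10)
# ...versus the envelopes of two hand-made pieces (bottom arm, left arm).
piece_envelopes = [(0, 0, 10, 2), (0, 2, 2, 10)]

# Pretend permits are scattered evenly across the city.
random.seed(42)
permits = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(10_000)]

# Points that pass the envelope test still need the expensive exact test.
candidates_whole = sum(in_box(p, whole_envelope) for p in permits)
candidates_pieces = sum(
    any(in_box(p, box) for box in piece_envelopes) for p in permits
)

print("exact tests needed, whole shape:", candidates_whole)
print("exact tests needed, subdivided: ", candidates_pieces)
```

In this toy case the two pieces cover about 36 percent of the big envelope's area, so far fewer points need the expensive exact comparison once the shape is subdivided.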

The query

Chicago Cityscape uses a single table (created as a materialized view) to combine all 22,203 maps. Each map is stored in a source table (for example, there’s a table to hold the 77 community areas) and the materialized view runs once a day to combine all of the maps in the source tables. This ensures our data is managed well: different source tables can hold different information, and the single table holds only the name, type, and geometry of the source tables, for faster comparison. Each entry in the single table also has a “slug”, its unique identifier.

Thus, the materialized view of the subdivided maps is created from the aforementioned single table, using this query:

-- st_subdivide returns a set, so each place yields one row per subdivided piece
create materialized view view_places_subdivided as
select gid || '_' || random() as gid, slug, st_subdivide(geom) as geom
from view_places;

The “gid” is designed to create a new unique ID field, as the slug field will be repeated for every subdivided piece of each map. A unique ID field is necessary if you want to refresh the materialized view concurrently (to allow other queries to access the materialized view while it’s being refreshed).

* The results are cached for a few hours, because the feature layers change 1-2 times per day and at different times each day, so the limited duration cache accommodates that. Ideally I would code a way to invalidate the cache when the feature layer data is updated.

Update 12/31/19: ST_Subdivide will fail if your geometries have geometry errors (I don’t know whether any kind of error makes the function fail, or only certain kinds). Chicago Cityscape has over 37,000 features that ST_Subdivide is attempting to process, and there is a lot of room for error in managing that many features from dozens of sources.

At least 2.5 percent of the land area in Chicago is covered in parking lots and garages

Here’s how I know that at least 2.5 percent of the land area in Chicago is covered in parking lots and garages, as of February 5, 2017.

That’s a lot of polluted water runoff.

I grabbed the land area of 227.3 square miles from the Wikipedia page.

I grabbed all the parking lots from OpenStreetMap via Metro Extracts, which is going to be the most complete map of parking lots and garages.

Volunteer mappers, including me, drew these by tracing satellite imagery.

With the parking lots data in GIS, I can count their area in square feet, which comes out to 160,075,942.42. Convert that to square miles and you get 5.74.

5.74/227.3*100 = 2.5 percent
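The same arithmetic as a small Python check, using only the figures quoted above; the variable names are mine.

```python
# Reproduces the conversion and percentage from the post's own numbers.
SQ_FT_PER_SQ_MILE = 5280 ** 2  # 27,878,400 square feet in one square mile

parking_sq_ft = 160_075_942.42   # total parking lot area from the GIS data
chicago_land_sq_mi = 227.3       # land area quoted from Wikipedia

parking_sq_mi = parking_sq_ft / SQ_FT_PER_SQ_MILE
share_pct = parking_sq_mi / chicago_land_sq_mi * 100

print(f"{parking_sq_mi:.2f} square miles, {share_pct:.1f} percent of land area")
```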

The last snapshot of parking lot data I have is from February 2016, when only 3.39 square miles of parking lots had been drawn.

There are still many more parking lots to be drawn!

Oh, how Chicago land use is controlled by spot zoning

If you only had a zoning map to try and understand how the different blocks in the City of Chicago relate to their neighborhoods and the city at large, you might have the idea that the city has no neighborhoods, but is actually a collection of tiny, randomly dispersed zones of differing land uses.

And then when you walked those areas you’d find that the zones, which attempt to prescribe a land use, at least nominally, don’t have anything to do with the mix of restaurants, housing, and commercial buildings actually present.

No plan would have been devised to create a map like this.

Over the last five years, and surely over the last 14, the City of Chicago has been divided (really, split) into an increasing number of distinct zoning districts.

The city’s zoning map is updated after each monthly city council meeting, to reflect the numerous changes that the 50 alders have approved individually. (Their collective approval occurs unanimously in an omnibus bill.)

Every few months I ask the Chicago Department of Innovation and Technology (DoIT) for the latest zoning map, in the form of a shapefile (a kind of file that holds geographic information that can be analyzed by many computer programs). While Chicago has one of the country’s best open data offerings, some datasets, like zoning, don’t get updated in the catalog.

There are two ways I can analyze and present the data about the quantity of zoning districts. Both, however, show that the number of distinct zoning districts has increased. This means that the city is divided even more finely than it was just six months ago.

Analysis 1: Period snapshots

I have the zoning shapefile for five periods, snapshots of the city’s zoning map at that time. From August 2012 to now, May 2016, the number of discrete zoning districts (the sum of all B3-5, RS-1, DX-7, etc. zoning classes) has increased 7.8 percent.

Table: period, zoning districts, change (rows: August 2012, September 2014, June 2015, November 2015, May 2016)

I collect the period snapshots to show the history of zoning at a specific address or building in Chicago, which is listed on Chicago Cityscape. For example, the zoning for the site of the new mixed-use development in Bucktown that includes a reconstructed Aldi has changed four times in four years.

aldi zoning history

Analysis 2: Creation date

The zoning shapefiles also have the date at which a zoning district was split or combined to create a new district, either with a different zoning class (RT-4, C1-1, etc.) or a different shape.

With the most recent zoning shapefile I can tell how many new zoning districts were created by splitting or combining, since a record representing each one was added to the list. The records start in 2002, and by the end of that year 7,717 records had been created.

The following year, only 14 records were added, and in 2004, only 6. The Chicago City Council adopted a rewritten zoning code in 2004, and I guess that the zoning map was modified prior to adoption. After 2004, the number of new zoning districts picks up:

Table: year, zoning districts added by splitting/combining, cumulative change

It seems there’s a slight relationship between the recession that started in 2008 and the number of zoning changes made. More changes were made annually before the recession than after it. It actually seems to track with building permits (sorry, no chart handy).

The U.S. DOT should collaborate with existing “National Transit Maps” makers

The U.S. DOT demonstrated one idea for how a National Transit Map might look and work at a conference in February.

The Washington Post reported this month that the United States Department of Transportation is going to develop a “National Transit Map” because, frankly, one doesn’t exist. The U.S. DOT said such a map could reveal “transit deserts” (the screen capture above shows one example from Salt Lake City, discussed below).

Secretary Anthony Foxx wrote in an open letter that the department and the nation’s transit agencies “have yet to recognize the full potential” of a data standard called the General Transit Feed Specification, which Google promoted in order to integrate transit routing on its maps. Foxx described two problems that arose out of not using “GTFS”.

  1. Transit vehicles have significantly greater capacity than passenger cars, but are often considered just vehicles because we are unable to show where and when the transit vehicles are scheduled to operate. The realistic treatment of transit for planning, performance measures, and resiliency requires real data on transit system operations.
  2. One of the most important social values of transit is that it makes transportation available to people who do not have access to private automobiles, and provides transportation options for those who do. Yet, we cannot describe this value at a national level and in many regions because we do not have a national map of fixed transit routes.

“The solution is straightforward”, Foxx continued, “[is] a national repository of voluntarily provided, public domain GTFS feed data that is compiled into a common format with data from fixed route systems.”

The letter went on to explain exactly how the DOT would compile the GTFS files, and said the first “collection day” will be March 31, this week. As of this writing, the website to which transit agencies must submit their GTFS files is unavailable.

What Foxx is asking for has already been done to some degree. Two national transit maps and one data warehouse already exist, and the DOT should engage those producers, and others who would use the map, to determine the best way to build a useful but inexpensive map and database. The existing maps and databases were created by volunteers or are already-funded projects, so it would make sense to maximize the use of existing projects and data.

“Transitland” is a project to host transit maps and timetables for transit systems around the world. It was created by Mapzen, a company funded by Samsung to build open source mapping and geodata tools. Transitland is also built upon GTFS data from agencies all over the world. Its data APIs and public map can help answer the question: How many transit operators serve Bay Area residents, and what areas does each service?

For the United States, Transitland hosts and queries data from transit agencies in 31 states and the District of Columbia. In Washington, D.C., Transitland is aware of four transit agencies. It’s a great tool in that respect: Not all of the four transit agencies are headquartered in D.C. or primarily serve that city. The app is capable of understanding spatial overlaps between municipal and regional geographies and transit agencies.

Transitland has a “GUI” to show you how much transit data it has around the world.

“Transit Explorer” is an interactive map of all rail transit and bus rapid transit lines in the United States, Mexico, and Canada. Yonah Freemark, author of The Transport Politic, created the map using data culled from OpenStreetMap, the National Transit Atlas Database (administered by the DOT and which shows fixed-guideway transit), and his own research. I wrote the custom JavaScript code for the Leaflet-powered map.

No other agency or project has collected this much data about fixed-guideway transit lines in any of the three countries, since the map includes detailed information about line lengths, ridership, and other characteristics that are not included in GTFS data. Transit Explorer, though, does not include local bus service or service frequencies, which the DOT’s map may if it incorporates the full breadth of GTFS data.

Transit Explorer also goes a step further by providing data about under construction and proposed fixed-guideway transit lines, which is information that is very relevant to understanding future neighborhood accessibility to transit, but which is not available through GTFS sources.

Finally, “GTFS Data Exchange” is a website that has been storing snapshots of GTFS feeds from agencies around the world for almost a decade, or about as long as GTFS has been used in Google Maps. The snapshots allow for service comparisons of a single agency across time. For example, there are over 100 versions of the GTFS data for the Chicago Transit Authority, stretching back to November 2009; new versions are added – by “cta-archiver” – twice a month.

Josh Cohen, writing in Next City, highlighted the significance of Google’s invention of GTFS, saying, “Prior to the adoption of GTFS, creating such a map would’ve been unwieldy and likely produced an out-of-date product by the time it was completed.” The DOT’s own National Transit Atlas Database includes only fixed-guideway (a.k.a. trains) routes, and hasn’t been updated since 2004.

Not all GTFS feeds are created equal, though. Some transit agencies don’t include all of the data that would make the National Transit Map useful for the spatial analysis the DOT intends, because some of it is optional for Google Maps’ purposes. Many agencies don’t include the “route shapes”, or the geographic lines between train stations and bus stops. Researchers are able to see where the vehicles stop, but not which streets or routes they take. Foxx’s letter doesn’t acknowledge this. It does, however, mention that transit agencies can use some federal funds to create the GTFS data.
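Route shapes live in an optional file called shapes.txt inside the GTFS zip, so a feed can be checked for them with a few lines of Python. This is a minimal sketch: the helper names and the two tiny in-memory feeds are made up for the example, and it assumes the files sit at the top level of the zip, which may not hold for every published feed.

```python
import io
import zipfile

# Illustrative sketch only: helper names and sample feeds are hypothetical.

def has_route_shapes(feed):
    """Return True if the GTFS zip has a shapes.txt with at least one data row."""
    with zipfile.ZipFile(feed) as archive:
        if "shapes.txt" not in archive.namelist():
            return False
        with archive.open("shapes.txt") as handle:
            lines = handle.read().decode("utf-8-sig").splitlines()
    rows = [line for line in lines if line.strip()]
    return len(rows) > 1  # a header line plus at least one shape point


def make_feed(files):
    """Build a tiny in-memory zip standing in for a downloaded GTFS feed."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, text in files.items():
            archive.writestr(name, text)
    buffer.seek(0)
    return buffer


stops = "stop_id,stop_name,stop_lat,stop_lon\nS1,Example Stop,41.97,-87.76\n"
shapes = (
    "shape_id,shape_pt_lat,shape_pt_lon,shape_pt_sequence\n"
    "A,41.97,-87.76,1\n"
)

feed_with_shapes = make_feed({"stops.txt": stops, "shapes.txt": shapes})
feed_without_shapes = make_feed({"stops.txt": stops})

print(has_route_shapes(feed_with_shapes))
print(has_route_shapes(feed_without_shapes))
```

With a real feed you would pass the path to the downloaded zip file instead of an in-memory buffer.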

David Levinson, professor at the University of Minnesota, believes the map will bias coverage (geographic reach of transit service) over frequency (how many buses are run each day that someone could ride).

The U.S. DOT’s chief data officer, Dan Morgan, whom I met at Transportation Camp 2015 in Washington, D.C., presented at the FedGIS Conference this year one idea to demonstrate coverage and frequency in Salt Lake City, using the GTFS data from the Utah Transit Authority.

Levinson also tweeted that it will be difficult for a national map to show service because of the struggles individual transit providers have symbolizing their own service patterns.

Foxx’s letter doesn’t describe how planners will be able to download the data in the collection, but whichever app they build or modify will cost money. Before going much further, and before spending any significant funds, Foxx should consult potential users and researchers to avoid duplicating existing projects that may ultimately be superior resources.

Foxx can also take advantage of “18F”, a new agency within the General Services Administration, to overcome government’s reputation for creating costly and difficult-to-use apps. The GSA procures all kinds of things the federal government needs, and 18F may be able to help the DOT create the National Transit Map (and database) in a modern, tech- and user-friendly way – or write a good RFP for someone else to make it.

Look for the National Transit Map this summer.