Tag: Chicago Cityscape

The genius of using ST_Subdivide to speed up PostGIS intersection comparisons

You should use ST_Subdivide to break up large shapes into smaller ones

This infographic visually compares the difference between running a PostGIS comparison query like ST_Intersects on a large shape versus a subdivided version of that large shape. Click to embiggen.

Hundreds of GIS intersection comparisons are completed every hour on Chicago Cityscape.*

People are looking at, say, a map of the South Shore community area. That “Place” page then grabs all of the building permits, building violations, business licenses, and other “feature layers” that are stored as points.

A classic “point in polygon” comparison is made using the ST_Intersects(place_geometry, permits_geometry) function.

This has worked very well for several years.

The problem

But as Chicago Cityscape handles larger shapes – they come from users drawing their own, large shapes, and from large shapes like the downtown Chicago area – this query doesn’t cut it.

Setting indexes on the geometry is imperative, but it’s not the end of the to optimize performance. That’s because the index of the geometry is a rectangular bounding box (which is also called an “envelope” in GIS) that contains the entire shape of the South Shore community area.

The downtown Chicago area, however, is not even the largest shape I have. That belongs to the new Place, “Neighborhood Opportunity Fund investment zones” (NOF). Combined, they cover 75 square miles of Chicago. Downtown is only 7.7 square miles.

After I added the NOF map and tested its Place page, it “crashed” my server, metaphorically speaking. The query to just count the number of building permits in the area would take about five minutes.

There had to be a better way; in the meantime, however, I divided the NOF map into the West and South sections. This hardly improved the counting time.

The solution

Thankfully, today, I saw a tweet from Paul Ramsey linking to his blog that linked to his slides from a recent presentation about the use of PostgreSQL to store and manipulate GIS data.

In it he explained how the ST_Subdividefunction worked. I’m going to demonstrate it using graphics from my own maps.

A normal intersection comparison, using ST_Intersects(place_geometry, permits_geometry) in a query creates a bounding box (envelope) around each geometry and quickly determines whether the two envelopes overlap. If they do, then it checks again to see if the actual geometries overlap. If they do, that data is returned as a response to your query.

When your two datasets are massive, like the NOF zones, which collectively cover 1/3rd of Chicago, and the building permits, which are found across the entire city…well, that led to the five minutes counting time.

Enter ST_Subdivide. To use it properly you would run it against your existing geometry and store the much smaller shapes, derived from the big shape, in a new table. I applied the function to all the 22,203 maps that Chicago Cityscape has and stored their unique IDs and subdivided geometries in a new “lookup” table.

Now, any time I want to compare the building permits against the NOF, the building permits are instead compared to the small shapes that were subdivided.

The query

Chicago Cityscape uses a single table (created as a materialized view) to combine all 22,203 maps. Each map is stored in a source table (for example, there’s a table to hold the 77 community areas) and the materialized view runs once a day to combine all of the maps in the source tables. This ensures our data is managed well: different source tables can hold different information, and the single table holds only the name, type, and geometry of the source tables, for faster comparison. Each entry in the single table also has a “slug”, its unique identifier.

Thus, the materialized view of the subdivided maps is created from the aforementioned single table, using this query:

create materialized view view_places_subdivided as
select gid || '_' || random() as gid, slug, st_subdivide(geom) as geom
from view_places;

The “gid” is designed to create a new unique ID field, as the slug field will be repeated for every subdivided of each map. A unique ID field is necessary if you want to refresh the materialized view concurrently (to allow for other queries to access the materialized view while it’s being refreshed).

* The results are cached for a few hours, because the feature layers change 1-2 times per day and at different times each day, so the limited duration cache accommodates that. Ideally I would code a way to invalidate the cache when the feature layer data is updated.

Update 12/31/19: ST_Subdivide will fail if your geometries have any or certain geometry errors (I don’t know if it’s any kind of error, or certain kinds of errors that make the function fail). Chicago Cityscape has over 37,000 features that ST_Subdivide is attempting to process, and there is a lot of room for error in managing that many features from dozens of sources.

Upzone the 606

Map of the single family-only zoning around the Bloomingdale Trail

The area in green only allows single-family houses to be built.

Something’s gotta give.

This is all of the land area within two blocks of the Bloomingdale Trail that allows only single-family housing to be built (view full map). This isn’t to say that multi-family housing doesn’t exist here; it definitely does, and there’s probably a handful of two-flats on a majority of the blogs.

All of the five parks of the 606 are within this two block radius, and 49.6 percent of the land allows only single-family housing to be built.

But why build a transportation corridor, a park, a new, expensive, public amenity, and not change the kind of housing – which often determines the kind of family and makeup of a household – that can afford to buy a home near here.

It’s already been shown that detached single-family housing prices have grown intensely the closer you get to the trail. That price growth has meant displacement for some, and “no chance to buy or build a house here” for many others.

There are still plenty of vacant lots within the mapped area; lots that should have a 2-4 unit building built on them, but where only a 1-unit building is allowed.

This map was made possible by the new Zoning Assessment tool on Chicago Cityscape. Read about it or use it now.

Fun with stats: Building permits by street name and number edition

John Hancock Center

The John Hancock Center. Photo by Kevin Dickert.

 

On which street are the most building permits issued?

Michigan Avenue!

But where on Michigan Avenue are the most building permits issued?

Take a guess!

First, can you answer: Are most building permits issued to North Michigan Avenue (between Madison Street, 0 north/south, and Oak Street, 1000 north), or South Michigan Avenue (between Madison Street, 0 north/south, and um, somewhere south of 130th Street, 13000 south)?

Here’s the answer…

Even though South Michigan Avenue is at least 13x longer than North Michigan Avenue, South Michigan Avenue has 39 percent fewer building permits!

From 2006 to yesterday (Saturday), there were 7,828 building permits issued to projects on North Michigan Avenue and 4,714 building permits issued on South Michigan Avenue.

The most common address on North Michigan Avenue to receive building permits was 875 N Michigan Avenue. It’s also the most common address to receive building permits on all Chicago streets.

What’s there? The John Hancock Center (tower)!

The average building address number on North Michigan Avenue is 540.6. That means that building permits on North Michigan Avenue concentrate around Grand Avenue, which is near the city’s biggest Marriott hotel, and is where the Under Armor flagship store is.

The next most common street – after South Michigan Avenue – is North Clark Street, which extends from Madison Street (0 north/south) to the northern edge of the city at Howard Street, which is 7600 north, about 7.6 times longer than North Michigan Avenue.

S. Clark Street Signs

Businesses in the 400 block of South Clark Street, as of when the photo was taken in November 2008. I believe the hotel is still there. This is the busiest block of South Clark Street, for building permits. Photo by Bruce Laker.

South Clark Street doesn’t register in the top 10 or even the top 100. It comes it at number 162, with 772 building permits. This is surprising to me because South Clark Street runs from Madison Street (0 north/south) in downtown and goes to 2200 south, and has a lot of downtown office buildings.

South LaSalle Street (3,613 building permits), South Wabash Avenue (2,916), and South Dearborn (1,611) are all in the top 50. The data could be wrong somehow.

A map of maps

The map of maps.

Over on my website Chicago Cityscape I’ve assembled a map of maps: There are 20,432 maps in 36 layers. You might say there are 36 maps, and each of those maps has an arbitrary number of boundaries within. I say there are 20,000+ maps because there’s a unique webpage for each of them that can tell you even more information about that map.

This post is to throw out some analysis of these maps, in addition to the simple counts above.

The data comes from the City of Chicago, Cook County, and the U.S. Census Bureau. Some layers have come from bespoke sources, including the entrances of CTA and Metra stations drawn by Yonah Freemark and me for Transit Explorer. The sections of the Chicago River were divided and sliced by the Metropolitan Planning Council. The neighborhood and business organizations layers were drawn by me, by interpreting textual descriptions of the organizations’ boundaries, or by visually copying an organization’s own map.

There are 6,879 unique words longer than 2 characters, in the metadata of this map of maps. The most common word is “annexation”, which makes sense, given that the layer with the most maps shows the 10,668 Cook County annexation actions since 1830 – the first known plat was incorporated in the City of Chicago.

The GeoJSON file, an open source, human readable GIS format, comes out to 30 MB, and it make break your browser when you try to display this layer.

The next group of words are also generic, like “planned” and “development”, related to the Planned Development kind of zoning process in Chicago – called Planned Unit Development in other jurisdictions.

After that, some names of municipalities that traded back and forth between unincorporated Cook County and incorporated municipalities are on the list.

Working down the list, however, it gets really boring and I’m going to stop. I bet if you’re a smarter data science person you can find more interesting patterns in the words, but I’ve also increased the number of generic words (like planned development) by adding these as keywords to each map’s “full text search” index, to ensure that they would respond to a variety of search phrases from users.

Designing a new static map style for Chicago Cityscape

I redesigned the static maps that are shown on Chicago Cityscape’s Place pages to tone down their harsh hues, and change what data (which comes from OpenStreetMap) is shown.

All 2,800 maps are automatically generated using a program called MOATP (“Map of all the places”) which is based on Neil Freeman’s svgis program. Both programs are open source.

The map now shows all roads; it was awkward to see so many empty spaces between buildings. Secondary* and residential roads are shown with slightly less thickness than primary and motorway roads. Also included are multi-use trails in parks.

Parks and grass are shown in different hues of green, although I don’t think it’s distinctive enough to know there’s a difference. Cemeteries remain a darker green.

I’ve changed the building color to soften the harsh brown. Only named buildings and schools appear, which is why you see a lot of gaps. Most buildings outside downtown aren’t named.

Retail areas have been added in a soft, salmon and tan-like color to show where “activity” areas in each Place.

I’ll be uploading the new maps soon.

* These road categories come from the OpenStreetMap “highway” tag.