
Working with ZIP code data (and alternatives to using sketchy ZIP code data)

1711 North Kimball Avenue, built 1890

This building at 1711 N Kimball no longer receives mail and the local mail carrier would mark it as vacant. After a minimum length of time the address will appear in the United States Postal Service’s vacancy dataset, provided by the federal Department of Housing and Urban Development. Photo: Gabriel X. Michael.

Getting accurate ZIP code data for your geographic publication (website or report) or demographic analysis can be problematic. The most accurate dataset – perhaps the only one that could be called reliably accurate – is one that you purchase from one of the United States Postal Service’s (USPS) authorized resellers. If you want to skip the introduction on what ZIP codes really represent, jump to “ZIP-code related datasets”.

Understanding what ZIP codes are

In other words, the post office’s ZIP code data, which it uses to deliver mail and not to locate people the way your publication or analysis does, is not free. It is also, unbeknownst to many, a dataset that lists mail carrier routes. It’s not a boundary or polygon, although many of the authorized resellers transform it into a boundary so buyers can geocode the location of their customers (retail companies might use this for customer tracking and profiling, and petition-creating websites for determining your elected officials).

The Census Bureau has its own issues using ZIP code data. For one, the ZIP code data changes as routes change and as delivery points change. Census boundaries need to stay somewhat constant so that geographies can be compared over time, and Census tracts stay the same for a period of 10 years (between the decennial surveys).

Understanding that ZIP codes are well known (everybody has one and everybody knows theirs) and that it would be useful to present data on that level, the Bureau created “ZIP Code Tabulation Areas” (ZCTA) for the 2000 Census. Each one is a collection of Census blocks that approximates a ZIP code’s area (ZCTAs also often share the same 5-digit identifiers). A ZCTA and the area representing its ZIP code overlap heavily, but they are not the same thing. ZCTA data is freely downloadable from the Census Bureau’s TIGER shapefiles website.

There’s a good discussion about what ZIP codes are and aren’t on the GIS StackExchange.

Chicago example of the problem

Here’s a real world example of the kinds of problems that ZIP code data availability and comprehension cause: Those working on the Chicago Health Atlas ran into trouble because they were using two different datasets: ZCTA from the Census Bureau and ZIP codes as prepared by the City of Chicago and published on its open data portal. Their solution, which is really a stopgap measure and needs further review not just by those involved in the app but by a diverse group of data experts, was to add a disclaimer that they use ZCTAs instead of the USPS’s ZIP code data.

ZIP-code related datasets

Fast forward to why I’m telling you all of this: The U.S. Department of Housing and Urban Development (HUD) has two ZIP-code based datasets that may prove useful to mappers and researchers.

1. ZIP code crosswalk files

This is a collection of eight datasets that link a level of Census geography to ZIP codes (and the reverse). The most useful to me is ZIP to Census tract. This dataset tells you in which ZIP code a Census tract lies (including if it spans multiple ZIP codes). HUD is using data from the USPS to create this.
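As a rough sketch of how a ZIP-to-tract crosswalk might be used, the Python below splits a ZIP-level count across the Census tracts that overlap each ZIP code. The field names (zip, tract, res_ratio), the tract identifiers, and every number are made up for illustration; check HUD's documentation for the real file layout before relying on any of it.

```python
from collections import defaultdict

# Hypothetical crosswalk rows: each one says what share of a ZIP code's
# residential addresses falls inside a given Census tract.
# Field names and values are illustrative, not copied from HUD's file.
crosswalk = [
    {"zip": "60647", "tract": "17031220100", "res_ratio": 0.25},
    {"zip": "60647", "tract": "17031220200", "res_ratio": 0.75},
    {"zip": "60622", "tract": "17031220200", "res_ratio": 0.5},
    {"zip": "60622", "tract": "17031240300", "res_ratio": 0.5},
]

# Hypothetical ZIP-level counts (for example, customers per ZIP code).
counts_by_zip = {"60647": 400, "60622": 100}


def apportion_to_tracts(crosswalk_rows, zip_counts):
    """Spread each ZIP-level count across tracts using the address ratio."""
    totals = defaultdict(float)
    for row in crosswalk_rows:
        count = zip_counts.get(row["zip"], 0)
        totals[row["tract"]] += count * row["res_ratio"]
    return dict(totals)


tract_counts = apportion_to_tracts(crosswalk, counts_by_zip)
for tract, value in sorted(tract_counts.items()):
    print(tract, value)
```

A tract that spans two ZIP codes simply receives a share from each, which is why the totals are accumulated per tract rather than assigned once.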

The dataset is documented well on their website and updated quarterly, going back to 2010. The most recent file comes as a 12 MB Excel spreadsheet.

2. Vacant addresses

The USPS employs thousands of mail carriers to deliver things to the millions of households across the country, and it keeps track of when a mail carrier cannot deliver something because no one lives in the apartment or house anymore. The address vacancy data tells you the following characteristics at the Census tract level:

  • total number of addresses the USPS knows about
  • number of addresses on urban routes to which the mail carrier hasn’t been able to deliver for 90 days or longer
  • “no-stat” addresses: undeliverable rural addresses, places under construction, urban addresses unlikely to be active

You must register to download the vacant addresses data and be a governmental entity or non-profit organization*, per the agreement** HUD has with USPS. Learn more and download the vacancy data, which HUD updates quarterly.

Tina Fassett Smith is a researcher at DePaul University’s Institute of Housing Studies and reviewed part of this blog post. She urges readers to ignore the “no-stat” addresses in the USPS’s vacancy dataset. She said that research by her and her colleagues at the IHS concluded this section of the data is unreliable. Tina also said that the methodology mail carriers use to identify vacant addresses and places under change (construction or demolition) isn’t made public and that mail carriers have an incentive to collect the data instead of being compensated normally. Tina further explained the issues with no-stat:

“We have seen instances of a relationship between the number of P.O. boxes (i.e., the presence of a post office) and the number of no-stats in an area. This is one reason we took it off of the IHS Data Portal. We have not found it to be a useful data set for better understanding neighborhoods or housing markets.”

The Institute of Housing Studies provides vacancy data on their portal for those who don’t want to bother with the HUD sign-up process to obtain it.

* It appears that HUD doesn’t verify your eligibility.

** This agreement also states that one can only use the vacancy data for the “stated purpose”: “measuring and forecasting neighborhood changes, assessing neighborhood needs, and measuring/assessing the various HUD programs in which Users are involved”.

I’ve got property tax data for Chicago Cityscape

Wrigley Field Ahead of a Seemingless Meaningless Game, September 2011

Wrigley Field is an old baseball stadium in Chicago’s Lakeview neighborhood. Photo by Dan X. O’Neil

1. Licensed Chicago Contractors, my website that tracks what developers and the city are proposing to build or demolish in your neighborhood, is now called Chicago Cityscape.

2. I’m grateful to Ian Dees who helped me get property tax data for 2009-2013 for over 1.4 million PINs (property identification numbers) in Cook County.

I’m going through various parts of the property tax data and figuring out how to integrate it with Chicago Cityscape. The first time Ian got the data I found out I didn’t tell him to get the right PINs. I think I’ve fixed that now.

As part of this process I’m checking properties somewhat randomly, based on the permits I’m browsing. I most recently viewed a Wrigley Field building permit at 1060 W Addison Street – for a Zac Brown concert – so I looked up its PIN to see how much the property is “worth”. Here goes:

Year   Amount Billed    Assessed Value
2013   $1,517,665.09    $8,049,996
2012   $1,498,971.03    $8,049,996
2011   $1,493,002.47    $8,865,636
2010   $1,489,160.89    $8,865,636
2009   $1,360,673.45    $10,613,423

Notice how the assessed value dropped by about $1.7 million from 2009 to 2010, and by more than $2.5 million from 2009 to 2012. Even though the property had three different assessed values over these five years, the amount billed rose every year because the tax rate changes annually. You can see this information on the Cook County Property Info portal.
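To check the arithmetic, here is a small Python sketch that works only from the five rows in the table above. It prints the year-over-year change in both columns, plus the amount billed as a share of assessed value. That last figure is just a ratio of the two columns, computed here for illustration, and should not be read as the official tax rate.

```python
# Amount billed and assessed value for 1060 W Addison St, from the table above.
records = {
    2009: (1_360_673.45, 10_613_423),
    2010: (1_489_160.89, 8_865_636),
    2011: (1_493_002.47, 8_865_636),
    2012: (1_498_971.03, 8_049_996),
    2013: (1_517_665.09, 8_049_996),
}

years = sorted(records)

# Year-over-year change in the bill and in the assessed value.
for prev, curr in zip(years, years[1:]):
    billed_change = records[curr][0] - records[prev][0]
    assessed_change = records[curr][1] - records[prev][1]
    print(f"{prev}->{curr}: billed {billed_change:+,.2f}, assessed {assessed_change:+,}")

# Billed amount as a share of assessed value (a plain ratio of the two
# columns, not the official tax rate).
for year in years:
    billed, assessed = records[year]
    print(year, f"{billed / assessed:.1%}")
```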

Finding teardowns in Chicago

1923 South Allport Avenue, built 1884

A recent suspected teardown, at 1923 S Allport in Pilsen (25th Ward, 19th place for teardowns from 2006 to now). The demolition permit was issued August 7 and the new construction permit was issued August 5. The new building will have an increase in density, with three dwelling units. Photo by Gabriel Michael.

From Wikipedia, a teardown is a “process in which a real estate company or individual buys an existing home and then demolishes and replaces it with a new one”.

You can find suspected* teardowns in the building permits data on Licensed Chicago Contractors by looking for demolition permits and new construction permits for the same address. I limited my search to situations where the demolition permit was issued within 60 days before or after the new construction permit. This shows properties that have a quick turnaround (and are thus more likely to get built). I didn’t want to include buildings that may have been demolished one year and replaced with a new building two years later.
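The matching rule can be sketched in a few lines of Python. This is not the query I ran; the addresses, dates, permit type labels, and field names below are all hypothetical, and real permit data would need address cleanup before an exact string match like this would work.

```python
from datetime import date

MAX_GAP_DAYS = 60  # demolition within 60 days before or after new construction

# Hypothetical permit records; field names and values are illustrative only.
permits = [
    {"address": "1 N EXAMPLE AVE", "type": "DEMOLITION", "issued": date(2014, 8, 7)},
    {"address": "1 N EXAMPLE AVE", "type": "NEW CONSTRUCTION", "issued": date(2014, 8, 5)},
    {"address": "2 S SAMPLE ST", "type": "DEMOLITION", "issued": date(2012, 1, 10)},
    {"address": "2 S SAMPLE ST", "type": "NEW CONSTRUCTION", "issued": date(2014, 3, 1)},
]


def suspected_teardowns(rows, max_gap_days=MAX_GAP_DAYS):
    """Pair demolition and new construction permits at the same address."""
    demolitions = [r for r in rows if r["type"] == "DEMOLITION"]
    constructions = [r for r in rows if r["type"] == "NEW CONSTRUCTION"]
    matches = []
    for demo in demolitions:
        for build in constructions:
            if demo["address"] != build["address"]:
                continue
            # abs() lets the new construction permit come first or second.
            gap = abs((build["issued"] - demo["issued"]).days)
            if gap <= max_gap_days:
                matches.append((demo["address"], demo["issued"], gap))
    return matches


teardowns = suspected_teardowns(permits)
print(teardowns)
```

The second address is skipped because its two permits were issued more than two years apart, which is the kind of slow turnaround the 60-day window is meant to exclude.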

Analysis

This analysis is based on data since January 1, 2006, the start of the first complete year of building permits data in the Chicago open data portal, and ends today. The first demolition permit in this analysis was issued January 10, 2006, and its associated new construction permit was issued five days prior. There may be cases where the demolition permit and the new construction permit were issued in different years, but for this analysis I only consider the year in which the demolition permit was issued. (In my review of permits since March I believe that new construction permits are issued most often after the demolition permit.)

Suspected teardowns

The number of teardowns decreased dramatically as the economic crisis approached.

Results

There were 1,717 suspected teardowns in Chicago distributed across 57 community areas (of 77, whose boundaries don’t change) and 45 wards (of 50, whose boundaries changed in 2012).

West Town, Lake View, and North Center share top billing, with the most teardowns each year, but Lake View was #1 for seven of 10 years. Other community areas that made the top five include Logan Square (thrice), Lincoln Square (thrice), Bridgeport (twice), McKinley Park (once), and Near West Side (once).

From 2012 to current, the most teardowns occurred in Wards 32 (Waguespack), 47 (Pawar), 1 (Moreno), 44 (Tunney), and 43 (Smith). All of those wards include parts of the top three community areas mentioned above.

The ward with the sixth-most teardowns in this period was the 2nd (Fioretti), but its current boundary no longer includes any part of the pre-2012 boundary, which covered almost the entire South Loop. That means Ward 2 now covers the west side. The 2nd Ward took sixth place with 28 teardowns, while the fifth-place 43rd Ward had 60.

The South Loop, represented by the Near South Side community area, has had 0 suspected teardowns from 2012 to now. There was one teardown in the entire time period, where a three-story commercial building was demolished at 1720 S Michigan Ave and replaced with a 32-story residential tower.

What else do you want to know about teardowns in Chicago?

* Notes

I use “suspected” because it’s impossible to know from the data if buildings were actually demolished and constructed.

Download the data as CSV for yourself.

Morgan CTA station ranks highly in rail system for building permits

Let Your Conscious Be Your Guide

The gutted cold storage warehouse in the background is within a quarter mile of the Morgan CTA station. Photo by Seth Anderson.

Excluding all of the Chicago Transit Authority stations in the central business district you’ll find that the new Morgan station ranks highly in the number of building permits issued within a quarter mile. It has a top spot when you calculate those permits’ estimated project costs. The CTA recently discussed with DNAInfo the results of a preliminary study it conducted that showed how the Morgan station is at the center of a lot of construction growth in the West Loop/Fulton Market area, and a contributing factor to this growth.

Now that Licensed Chicago Contractors shows you the two nearest CTA and Metra rail stations to each building permit, and I’ve become well-versed in writing PostGIS queries on the fly, I wrote a query that lists the CTA stations with the most building permits within a quarter mile (“nearby”).

First, though, let’s count how many stations don’t have permits nearby. With the query at the bottom you get a list of station names, the number of permits nearby, and a sum of the estimated costs of those permits sorted by the number of permits. Since I used a “LEFT JOIN” I also get a count of all the permits (the table on the LEFT) that don’t have a match with CTA stations (the table on the right).

There are 127 rows returned and a previous count of the table told me there are 145 stations, including ones outside the Chicago city limits. (There are stations in Cicero, Wilmette, Evanston, Rosemont, Oak Park, Forest Park, and Skokie.) The first row represents NULL, which collects all of the permits that don’t have a station nearby. That leaves 126 rows, one for each station with at least one permit nearby, so 145 − 126 = 19 stations have no permits nearby; these should be the stations outside the City of Chicago.
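Here is a toy version of that counting, using Python's built-in sqlite3 module instead of PostGIS. The tables and the three stations are made up, and a plain station_id column stands in for the quarter-mile spatial match. It shows two things: permits with no station all land in a single NULL row, and a station with no permits never appears in the output, so it has to be counted by subtraction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stations (id INTEGER, name TEXT);
    CREATE TABLE permits (id INTEGER, station_id INTEGER);
    INSERT INTO stations VALUES (1, 'Station A'), (2, 'Station B'), (3, 'Station C');
    -- station_id stands in for the spatial match; NULL means no station nearby.
    INSERT INTO permits VALUES (10, 1), (11, 1), (12, 2), (13, NULL), (14, NULL), (15, NULL);
""")

# Permits are on the left, so every permit is kept even without a station.
rows = conn.execute("""
    SELECT s.name, COUNT(p.id) AS permit_count
    FROM permits p
    LEFT JOIN stations s ON p.station_id = s.id
    GROUP BY s.id
    ORDER BY permit_count DESC
""").fetchall()

total_stations = conn.execute("SELECT COUNT(*) FROM stations").fetchone()[0]
stations_with_permits = sum(1 for name, _ in rows if name is not None)
stations_without = total_stations - stations_with_permits

print(rows)
print(stations_without)
```

Station C has no permits, so it is missing from the result rows entirely and only shows up in the subtraction.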

I verified this by eyeballing it. I looked at a map and counted roughly 19 stations that wouldn’t have the 1/4 mile overlap with a Chicago building permit. The two Austin stations, on the Blue Line Forest Park branch and the Green Line Oak Park branch, are near Chicago and also showed up as a discrete station in the query results. Austin on the Blue Line was dead last, actually!

Let’s get back on track and look at Morgan now. I don’t think it’s fair to compare the Morgan station area with an expected, higher-activity area like the Loop and Central Business District so I eyeballed the list and started the #1 ranking with the first station outside the CBD.

  1. Armitage (Brown, Purple Express) is the station outside the CBD with the most building permits nearby.
  2. Damen-Milwaukee (Blue)
  3. North/Clybourn (Red)
  4. Addison (Red)
  5. Morgan (Green, Pink)

There you have it: from 2009 to today, the Morgan station had the fifth highest number of building permits outside of the Chicago Central Business District. It beat Fullerton (Red, Brown, Purple) in Lincoln Park, and Roosevelt (elevated and subway combined) in the South Loop. The station’s construction began in 2010 and the grand opening occurred May 24, 2012. During this period Morgan had the second highest amount of aggregated estimated costs at $199,911,953.00, behind North/Clybourn, at $218,118,037.37.

Take this analysis with several grains of Morton salt, though, because the following caveats are important to consider: building permits are really speculative development; many of these may be for kitchen renovations or porch reconstructions; I didn’t look up when it was “for sure” that the station was being built so I don’t know when developers would have become interested.

Looking at a longer period

I will, however, run a few more queries to find how Morgan’s position changes, starting with expanding the query to “all time” data (really the end of 2006 to today). It turns out that when looking through all available years Morgan’s position remains at #5 but other stations change position.

  1. Fullerton
  2. Armitage
  3. Damen-Milwaukee
  4. Addison
  5. Morgan

During this period, which covers the end of 2006 until today, Morgan had the highest aggregated estimated costs of the above five stations, at $236,707,083.00. It beat Fullerton’s amount of $160,825,680.30.

Looking only at “new construction”

Since these counts include all permit types, including water heater installations and window replacements, they don’t give us a good look at economic expansion in the areas surrounding CTA stations. I’ve filtered the data so only “new construction” building permits come through. I’m still interested in stations outside the CBD. Here’s how Morgan performed when looking purely at the quantity of new construction permits issued from 2009 to today:

  1. Armitage, 46 new construction building permits
  2. Southport, 38
  3. Addison (Red), 34
  4. North/Clybourn
  5. Wellington
  6. California-Milwaukee
  7. Belmont (Red)
  8. Ashland (Green, Pink)
  9. Irving Park (Brown)
  10. Fullerton
  11. Damen (Brown)
  12. Division-Milwaukee
  13. Western-Milwaukee
  14. Ashland (Orange)
  15. Damen-Milwaukee
  16. Western-Congress
  17. Paulina
  18. Addison (Brown)
  19. Diversey
  20. Sedgwick
  21. Loyola
  22. Montrose (Brown)
  23. Sox-35th-Dan Ryan
  24. Morgan, 13 new construction building permits

Let’s remove that date filter and look at the whole building permits period of late 2006 to today.

  1. Southport (Brown Line), 80 new construction permits, all-time
  2. Armitage (Brown, Purple), 72
  3. Western-Congress (Blue), 66
  4. Addison (Red), 64
  5. Belmont (Red, Brown), 63
  6. Western-Milwaukee, 59
    Damen-Milwaukee, 59
  7. North/Clybourn, 55
    Diversey, 55
  8. Division-Milwaukee, 53
  9. Sox-35th-Dan Ryan, 51
  10. Wellington, 50
  11. 35-Bronzeville-IIT, 48
  12. Irving Park (Brown), 44
  13. Morgan, 43 new construction permits

Now, switching the sort order, Morgan looks better when you rank stations by aggregated estimated costs, from 2009 to today.

  1. Illinois Medical District, $236,020,000.00
  2. North/Clybourn, $172,373,335.00
  3. Loyola, $161,744,075.00
  4. Polk, $106,000,000.00
  5. Grand-Milwaukee, $77,224,500.00
  6. Wellington, $72,802,300.00
  7. Belmont (Red), $71,300,302.00
  8. Morgan, $68,300,800.00

Last query – remove the date filter and look at aggregated costs for the whole building permits period, where Morgan maintains a top 10 position.

  1. North/Clybourn, $277,029,045.00
  2. Illinois Medical District, $236,020,000.00 (same as 2009 to today period)
  3. Polk, $188,794,975.00
  4. Loyola, $185,444,075.00
  5. Belmont (Red), $163,500,085.00
  6. Fullerton, $129,444,051.00
  7. Wellington, $111,335,051.00
  8. Granville, $99,356,702.00
  9. Morgan, $83,995,800.00

The data I’d really like to have, though, is sales tax receipts for the same years.

The following is not a valid PostgreSQL query as written. The brackets indicate the options I switched in and out to retrieve the above results. The geometries are in or transformed to EPSG 3435 (Illinois StatePlane East Feet) and 1,320 feet is a quarter mile.

SELECT
  COUNT(P.permit_) AS count,
  MIN(C.longname) AS name,
  MIN(lines) AS lines,
  SUM(_estimated_cost) AS sum
FROM
  permits P
  LEFT JOIN stations_cta C
    ON ST_DWithin(
      ST_Transform(P.geometry, 3435),
      C.geom,
      1320
    )
[WHERE] [EXTRACT(YEAR FROM issue_date) >= 2009] [_permit_type = 'PERMIT - NEW CONSTRUCTION']
GROUP BY
  C.gid
ORDER BY
  [count,sum] DESC
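For readers who don't use PostGIS, the core of the query can be approximated in plain Python: once both layers are in a projected coordinate system measured in feet, "nearby" is just a straight-line distance of 1,320 feet or less. The station names, coordinates, and costs below are invented, and this sketch skips the reprojection step that ST_Transform handles in the query.

```python
import math

QUARTER_MILE_FEET = 1320  # 5,280 feet per mile, divided by 4

# Hypothetical projected coordinates in feet; not real station locations.
stations = {"Station A": (1000.0, 1000.0), "Station B": (9000.0, 9000.0)}
permits = [
    {"xy": (1500.0, 1000.0), "cost": 250_000.0},   # 500 ft from Station A
    {"xy": (1000.0, 2320.0), "cost": 50_000.0},    # exactly 1,320 ft from Station A
    {"xy": (9000.0, 10500.0), "cost": 75_000.0},   # 1,500 ft from Station B
]


def summarize(station_points, permit_rows, radius=QUARTER_MILE_FEET):
    """Count permits and sum their costs within `radius` feet of each station."""
    summary = {}
    for name, (sx, sy) in station_points.items():
        nearby = [
            p for p in permit_rows
            if math.hypot(p["xy"][0] - sx, p["xy"][1] - sy) <= radius
        ]
        if nearby:  # stations with no nearby permits are left out
            summary[name] = (len(nearby), sum(p["cost"] for p in nearby))
    return summary


nearby_summary = summarize(stations, permits)
print(nearby_summary)
```

A permit sitting exactly on the quarter-mile line is counted, which matches the inclusive distance test that ST_DWithin performs.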

History of Chicago streets

New bridge as seen from Division Street bridge

A new bridge over Halsted Street, opened in 2012, spanning the North Branch Channel on the west and north sides of Goose Island.

I found this document listing what appears to be all Chicago streets, their locations (relative to the Chicago grid whose origin is State/Madison), previous names, and namesakes. I doubt it’s hard to find, but I wasn’t looking for it. In fact, I was searching with queries people used before they came across Licensed Chicago Contractors.

“where is 435 w hobbie st chicago ill” was the specific query and I found Chicago Streets on the first page of results hosted on the Chicago History Museum’s website. (You’ll notice the hosting domain name has the acronym for the museum’s previous name, Chicago Historical Society.)

Halsted Street, which I often tell people is my favorite street because it goes through so many neighborhoods (with lots of gaps and railroads in between), was named after “two New York brothers William and Caleb who helped to develop the west end of the Loop.” William Butler Ogden, the first mayor of Chicago, named it.

Halsted Street was previously known as Egyptian Road from 1830 to 1837.

Halsted from Chicago Streets document

A screenshot of Halsted Street in the document.