Category: Data

How much vacant housing is there in Chicago?

How many houses, buildings, units, and lots are vacant in Chicago depends on how you measure them. I’m aware of at least five ways to measure housing vacancy, using publicly available data, each with varying degrees of accuracy and coverage.

Each measurement method has its own TAKEAWAY so you can quickly scan to see how many units are potentially vacant. The bottom line, I believe, is that the vacancy rate in Chicago is quite low and vacant units are not a viable and scalable source for the additional housing that Chicago needs. However, I am compelled to add that it’s unclear how many additional homes Chicago needs because there are no analyses that uncover the shortfall or recommend a goal production number.

1. Vacant units, per Census Bureau

The US Census Bureau says there are 1,258,704 dwelling units in Chicago, and that 10.2% of them are vacant (128,796 units). The Census Bureau has two types of vacant, which generally break down into “listed for rent or for sale and temporarily unoccupied” and “all other reasons”. This data comes from the American Community Survey (ACS) 5-year estimates.

The Census says 37.3% of Chicago’s vacant units are “temporarily unoccupied” (38,450 units) and 62.7% are vacant for “other reasons” (64,589 units). This data comes from ACS 1-year data, which has larger margins of error than the 5-year survey data and counts 103,039 vacant units in total, which is why these two figures do not add up to the 5-year figure of 128,796. The “other reasons” are what I am calling “truly vacant”.

The most common survey response within “other reasons” is that the unit is currently being renovated or repaired. The top five “other reasons” are:

  • Currently being renovated/repaired: 20.2%
  • Personal/family reasons: 15.6%
  • Needs repairs: 14.6%
  • Preparing to rent/sell: 13.1%
  • Abandoned/possibly condemned: 10.1%

Up For Growth, a national policy organization, has assessed that Chicagoland has an underproduction of 129,218 homes (using 2019 data), or about 3.3 percent of the existing housing stock in the region. Assuming that 80 percent of this underproduction is “assigned” to the City of Chicago, then this underproduction (about 103,374 homes) represents about 1.6 times the number of “truly vacant” units (the 64,589 units vacant for “other reasons” in the Census Bureau’s survey) that likely exist in Chicago.

TAKEAWAY from #1: Fewer than all 64,589 units are “truly” vacant, however. About 21,508 of those units (33.3%) are purportedly being renovated or repaired, or being prepared for rent or sale, and will be rented or sold soon.

A large number of the remaining 43,081 “truly” vacant units are unlikely to join the rental or sale market soon. Even assuming that all of these 43,081 vacant units are rented or sold, Chicago, using that 80 percent assignment of the 129,218 underproduction, would still have an underproduction of about 60,293 homes.
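The arithmetic behind this takeaway can be retraced in a few lines. This is a minimal sketch that uses only figures quoted in this section. The variable names are mine, and treating the 21,508 units as the sum of the “currently being renovated/repaired” (20.2%) and “preparing to rent/sell” (13.1%) responses is my reading of the numbers rather than something the Census states. The remaining shortfall comes out at roughly 60,293 homes, which agrees with the figure above to within rounding.

```python
# Figures quoted in the text above.
metro_underproduction = 129_218  # Up For Growth estimate for Chicagoland (2019 data)
city_share = 0.80                # assumed share "assigned" to the City of Chicago
other_reasons_vacant = 64_589    # ACS 1-year, vacant for "other reasons"

# Assumption: units being renovated/repaired (20.2%) plus units being
# prepared to rent/sell (13.1%) are the ones expected on the market soon.
soon_available = round(other_reasons_vacant * (0.202 + 0.131))
remaining_vacant = other_reasons_vacant - soon_available

city_underproduction = metro_underproduction * city_share
ratio_to_truly_vacant = city_underproduction / other_reasons_vacant
shortfall_after_filling = city_underproduction - remaining_vacant

print(f"soon available:            {soon_available:,}")
print(f"remaining 'truly' vacant:  {remaining_vacant:,}")
print(f"city underproduction:      {city_underproduction:,.0f}")
print(f"ratio to truly vacant:     {ratio_to_truly_vacant:.1f}")
print(f"shortfall if all filled:   {shortfall_after_filling:,.0f}")
```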

2. Vacant units, per Chicago’s Vacant Building Registry

From a February 2023 snapshot of the Chicago Vacant Building Registry (VBR), which requires landlords to register buildings as vacant once they are vacant for more than 30 days, there were 6,521 dwelling units that were reported by owners as vacant. Comparing this to the above Census Bureau figure that nearly 65,000 units are truly vacant, this would mean that about 90% of truly vacant units are not registered in the VBR.
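As a quick check of that 90% figure, here is a small sketch that divides the registry count by the Census count of units vacant for “other reasons”. Both inputs are the numbers quoted in this post; the variable names are mine.

```python
registered_vacant_units = 6_521   # Vacant Building Registry, February 2023 snapshot
truly_vacant_units = 64_589       # ACS 1-year, vacant for "other reasons"

registered_share = registered_vacant_units / truly_vacant_units
unregistered_share = 1 - registered_share

print(f"{registered_share:.1%} registered, {unregistered_share:.1%} not registered")
```

The comparison is only as good as the two inputs: the registry counts are self-reported by owners and the Census figure is a survey estimate with its own margin of error.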

The reasons for vacancy were not included in the VBR data I received from the city. Also, these units are likely already included in the Census figure above rather than being in addition to it.

TAKEAWAY from #2: There are at least 6,500 vacant dwelling units in Chicago. 

3. Buildings reported as vacant to 311

Chicagoans can report to 311 that they suspect a building to be vacant. There is no link from these reports to building violation citations, and no feedback as to whether a city worker confirmed the suspicion.

Notice how the number of reports dropped by about half from 2022 to 2023. I don’t know if there are fewer suspected buildings to be reported, fewer people are reporting buildings, or there are barriers to reporting and collecting the reports.

TAKEAWAY from #3: This data is probably not reliable to understand the number of vacant buildings in Chicago.

4. Buildings cited as needing to be registered in the Vacant Building Registry

In 2023, Chicago Department of Buildings inspectors cited 24 buildings with a violation for not being registered in the Vacant Building Registry.

TAKEAWAY from #4: If about 90% of vacant units are not registered then far too few citations are being issued to force registration.

5. Vacant lots, per Cook County Assessor’s Office

The CCAO classifies nearly every property in Cook County. The classification 1-00 represents vacant lots. At present there are 32,207 vacant lots in Chicago. 22,645 of these (70.3%) are in “R” zoning districts and allow only residential uses. Another 4,566 lots (14.2%) are in “B” and “C” neighborhood mixed-use zoning districts. 

TAKEAWAY from #5: Tens of thousands of new construction homes could be built on vacant lots.
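To see why “tens of thousands” is a fair order of magnitude, here is a rough sketch using the lot counts above. The one-home-per-lot figure is a deliberately conservative assumption of mine, not a zoning calculation, and it ignores whether any given lot is actually buildable or for sale.

```python
vacant_lots = 32_207        # CCAO class 1-00 lots in Chicago
r_district_lots = 22_645    # lots in "R" zoning districts
bc_district_lots = 4_566    # lots in "B" and "C" zoning districts

other_lots = vacant_lots - r_district_lots - bc_district_lots

r_share = r_district_lots / vacant_lots
bc_share = bc_district_lots / vacant_lots
other_share = other_lots / vacant_lots

# Hypothetical floor: a single home on every residentially zoned vacant lot.
homes_per_lot_assumed = 1
homes_floor = r_district_lots * homes_per_lot_assumed

print(f"R lots: {r_share:.1%}, B/C lots: {bc_share:.1%}, other: {other_share:.1%}")
print(f"At {homes_per_lot_assumed} home per R lot: {homes_floor:,} homes")
```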

Addendum: In its Chicago monthly market update for the multifamily housing sector, Colliers brokerage reported that the Chicago MSA “has a vacancy rate of 5.3%, which is below the national rate of 7.6%”.

A kludge to build a rental registry in Cook County 

Chicago should have a rental registry, a database of dwelling units that are rented to tenants, for at least two reasons:

  1. The city can know things about the rental units, including how much they cost, where they are, and if any are vacant and could be occupied if only people knew they were available and how to get in touch with the owner.
  2. The city can know who the owners are and contact them to issue citations, or to advise them about, or fill out on their behalf, applications for emergency rental assistance during pandemics and other times of necessity.

Building and administering a rental registry from scratch would be very expensive – probably tens of millions to start and more than one million annually.

I propose a kludge that uses existing databases and modifies existing standard operating procedures amongst a small group of Cook County and Chicago agencies. A kludge is a workaround. It has other meanings and an uncertain etymology.

An ideal rental registry helps solve at least four problems:

  1. Identify who owns a rental home
  2. How many rental units are in a building
  3. Rental price
  4. Rental unit availability [see my other blog post about counting vacant units]

A 9-unit apartment building in Little Italy is undergoing renovation.

The kludge has four parts

1. Incorporate data about the number of units declared on Real Estate Transfer Tax forms (which in Cook and many other counties are transmitted to the Illinois Department of Revenue digitally).

There is already a city office that reviews or audits these forms looking for instances where the buyer or seller incorrectly claimed certain exemptions from RETT, because the city can lose revenue that way. That office can also enforce that the number of units was correctly entered on the form.

2. For banks that hold city deposits, amend legislation to require that their newly issued or refinanced mortgages specify the number of units in the required submitted documentation. The ordinance that regulates banks that hold city deposits was amended a few years ago to require that they report how many loans they issue in Chicago for both commercial and residential properties.

Databases 1 and 2 are checks for each other. 
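What “checks for each other” could look like in practice is sketched below. Everything here is hypothetical: the parcel identifiers, the unit counts, and the `cross_check` helper are made up for illustration, and neither agency publishes data in this shape as far as I know.

```python
# Hypothetical records: parcel identifier -> unit count declared in each source.
rett_units = {"pin-A": 6, "pin-B": 3, "pin-C": 12}      # transfer tax forms
mortgage_units = {"pin-A": 6, "pin-B": 2, "pin-D": 4}   # bank mortgage filings


def cross_check(source_a, source_b):
    """Split parcels into those that agree, conflict, or appear in only one source."""
    agree, conflict, single_source = {}, {}, {}
    for pin in sorted(source_a.keys() | source_b.keys()):
        a, b = source_a.get(pin), source_b.get(pin)
        if a is None or b is None:
            single_source[pin] = a if a is not None else b
        elif a == b:
            agree[pin] = a
        else:
            conflict[pin] = (a, b)
    return agree, conflict, single_source


agree, conflict, single_source = cross_check(rett_units, mortgage_units)
print("agree:", agree)
print("conflict (needs review):", conflict)
print("only one source:", single_source)
```

Parcels in the conflict group are the ones an auditor would want to look at first; parcels seen in only one source still add coverage, just without confirmation.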

3. “Hire” the Cook County Assessor’s Office to create and operate the database for the unit count data from 1 and 2 (likely as an augmentation of their existing database).

The database would also store any data the CCAO collects through the commercial valuation data they obtain from third party sources as well as from the owners who volunteer it (Assessor Kaegi is already collecting and publicly publishing this information). 

At this point, with features 1, 2, and 3, we are assembling a pretty broad but incomplete record of where rental units are. It will become more complete over time as properties transfer (sell) and the details of the transfer (sale), and the properties themselves, are recorded.

It doesn’t have a clue as to the rental prices, though.

4. The Cook County Assessor’s Office creates new property classifications. Property classifications allow for the comparison of like buildings for the purpose of establishing assessed values for all properties that are not tax exempt.

One of the most common classifications in Chicago is “2-11”, for apartment buildings with two to six units. This means that, generally, the values of the ubiquitous two-flats and three-flats get compared to each other and sometimes to four-flats, etc.

I suggest that there should be a few new property classifications, but I have only one idea so far: classify limited equity and Chicago Housing Trust properties differently. 

Bickerdike is one organization that built a lot of limited equity row houses and detached houses in the 1990s and 2000s but I am not aware of a publicly accessible database identifying them.

These houses represent permanently affordable housing and we should have a better system to track them!

This screenshot of part of a spreadsheet is the apartments data that the Cook County Assessor’s Office collected for the 2021 tax year. 

How broad is the kludge?

  • Using the Real Estate Transfer Tax data from 2022 Q1 to Q3, there were 3,550 buildings in Chicago having 22,217 units transferred. (I don’t know how many were arms length transactions, meaning they were sold to new owners.)
  • In the CCAO’s apartments data collected for the Rogers Park Township, there is semi-detailed information about 715 buildings that have seven or more apartments comprising 18,541 units. Details include the unit size breakdown by bedroom count.

Chicago has 556,099 rented dwelling units in buildings with two or more units (according to the ACS 2021 1-year estimate). In my limited analysis I’ve already found data about roughly 7.3 percent of them (40,758 units), and that’s only for part of the city [1].
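The coverage figure is simply the two unit counts from the bullets above divided by the citywide total. A minimal sketch, with variable names of my own choosing, which puts coverage at a little over 7 percent before accounting for any overlap between the two sources:

```python
rett_units = 22_217              # units in buildings transferred, 2022 Q1 to Q3
ccao_units = 18_541              # units in the CCAO apartments data, Rogers Park Township
rented_units_citywide = 556_099  # ACS 2021 1-year, rented units in 2+ unit buildings

found_units = rett_units + ccao_units
coverage = found_units / rented_units_citywide

print(f"{found_units:,} units found, {coverage:.1%} of rented units")
```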

Notes, limitations, and updates

[1] There may also be duplicates between the buildings in the RETT database and the CCAO apartments dataset.

These databases would not have information about detached (“single family”), single-unit semi-detached (rowhouses and townhouses), and condos used as rentals. This severely limits the coverage of information. As it stands, Chicago Cityscape has data coverage of unit count information for about 37 percent of multi-family (apartment) buildings.

5th Ward Alderperson Desmond Yancy proposed an ordinance that would establish a rental registry (O2023-0004085). The rationale for such is shown in the screenshot below. (Go directly to the ordinance’s PDF.)

Screenshot of the proposed rental registry benefits.

Creating a PostgreSQL PostGIS function to get around a DataTables Editor limitation

DataTables is fantastic software that turns any HTML table into an interactive and editable one.

Chicago Cityscape uses DataTables on pretty much every page. DataTables also provides server-side processing for PHP to grab the right data from a database and show it to the user, 10 records at a time (the number of records can be changed by the user to show more records at a time).

Screenshot showing my new function, using the function, and the results.


One of the problems I’ve had to get around is that the DataTables Editor script recognizes a SELECT statement with only one function per field. If there’s a second function that’s manipulating a field in the SELECT statement then the ALIAS won’t work and the JavaScript will not be able to read and show the data properly.

I almost always use the two functions ST_AsGeoJSON and ST_Transform on my geographic data to (1) transform it from SRID 3435 to 4326 for display on Web Mercator maps, and (2) convert the GIS data into GeoJSON strings for easy transfer to the Leaflet JavaScript library.

This is a known issue – see this thread on the DataTables forums – that Allan, the DataTables creator, has acknowledged and provided an alternative solution for.


It turns out that it’s easy to write a function that combines both functions.

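-- Reproject a geometry to `srid` and return it as a GeoJSON string in one call.
-- The SQL-standard RETURN body requires PostgreSQL 14 or later.
CREATE OR REPLACE FUNCTION ST_TAGJP (field geometry, srid int = 4326, simplify int = 5)
  RETURNS text
  RETURN ST_AsGeoJSON(ST_Transform(field, srid), simplify);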

The code above is a function I call ST_TAGJP that combines the two functions I already described. It is flexible because the caller can specify, as arguments, the geometry field, the SRID to transform (reproject) to, and the “simplify” value that ST_AsGeoJSON uses as the maximum number of decimal digits in coordinates.

Avoid “VARCHAR” columns in PostgreSQL [PGSQL Phriday #006]

This is a response to Grant.

This is one thing I wish I knew while learning PostgreSQL: Don’t ever use the VARCHAR column type. I have never encountered a situation where that would have been superior to using TEXT.

The problem with the VARCHAR column type is that it can be declared with a length limit, which doesn’t allow text longer than a certain number of characters. This maximum character length can be introduced by the person who creates the table and adds the column.

The maximum character length can also be created by a GIS program (specifically, QGIS and the ogr2ogr command line tool) as well as some online tools that can convert spreadsheets to SQL and even write the table definition and “CREATE TABLE” statement. In both cases, the program tried to guess the column type and seemed to default to VARCHAR(y character length) instead of TEXT.

The maximum character length is a problem because trying to insert more text than allowed into that field will encounter an error and the INSERT or UPDATE statement will fail. I think it’s also a bad idea to create these columns because it makes it expensive to change your mind later about what length of text will go into that field.

Other things I wish I knew earlier

  • Using transactions to test your query and its effect on your data so that you can easily undo any changes.
  • How and when to use indexes beyond the “btree” index type (for example, “gin” for array and JSONB columns).
  • Add “time_created_at” and “time_updated_at” columns to basically every dataset. This makes it easy to sort things later; sometimes when creating a new dataset it’s not always obvious how you might want to display or manipulate the data until days, weeks, or months later. Sorting data by “when was the information changed last” is a common use case for me.
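The first bullet is the one I’d illustrate. The sketch below uses Python’s built-in sqlite3 module purely as a stand-in so it can run anywhere; the BEGIN and ROLLBACK statements are the same ones you would type in psql against PostgreSQL. The table and its three rows are made up.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the explicit
# BEGIN / ROLLBACK statements below are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE parcels (pin14 TEXT PRIMARY KEY, property_class TEXT)")
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?)",
    [("pin-A", "1-00"), ("pin-B", "2-11"), ("pin-C", "2-11")],
)


def count_vacant(connection):
    """Number of parcels currently classified as vacant lots (class 1-00)."""
    return connection.execute(
        "SELECT count(*) FROM parcels WHERE property_class = '1-00'"
    ).fetchone()[0]


before = count_vacant(conn)

conn.execute("BEGIN")
# A careless UPDATE with no WHERE clause: every parcel becomes a vacant lot.
cursor = conn.execute("UPDATE parcels SET property_class = '1-00'")
rows_touched = cursor.rowcount
during = count_vacant(conn)
conn.execute("ROLLBACK")  # undo; use COMMIT instead once the result looks right

after = count_vacant(conn)
print(before, rows_touched, during, after)
```

Inside the transaction the mistake is visible (all three rows changed), and after the rollback the table is back to where it started.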

How I used ST_ClusterDBSCAN to locate clusters of multiple, similar parcels

Alternative headline: A practical example of how to use ST_ClusterDBSCAN to find similar real estate properties.

Oftentimes a developer wants to acquire several adjacent lots for a single redevelopment. Each standard sized lot in Chicago is about 3,125 square feet (25 feet wide and 125 feet deep). Because of downzoning in 2004, and since, the zoning rules for many lots allow only about 3-4 dwelling units each. Multiple lots are required to develop buildings with 6-9 dwelling units, which is a sweet spot in Chicago for design and avoiding having to get an upzone.
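The lot math in that paragraph can be sketched quickly. The dimensions and unit counts come from the paragraph above; the `lots_needed` helper is my own and treats zoning as a flat units-per-lot allowance, which is a simplification.

```python
from math import ceil

lot_width_ft, lot_depth_ft = 25, 125
lot_area_sqft = lot_width_ft * lot_depth_ft  # one standard Chicago lot


def lots_needed(target_units, units_per_lot):
    """Smallest number of standard lots that allows the target unit count."""
    return ceil(target_units / units_per_lot)


for target_units in (6, 9):
    for units_per_lot in (3, 4):
        lots = lots_needed(target_units, units_per_lot)
        print(
            f"{target_units} units at {units_per_lot} per lot: "
            f"{lots} lots, {lots * lot_area_sqft:,} sq ft"
        )
```

In every combination the answer is two or three lots, which is why finding adjacent parcels matters.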

Chicago Cityscape has long had Property Finder, a tool to locate parcels that meet exacting specifications given existing lot size, current zoning district, distance to transit, and other criteria.

Now, Chicago Cityscape can locate parcels that are adjacent or near each other that all meet the user’s specified criteria (what the website calls “filters”). This is possible because of the PostGIS function ST_ClusterDBSCAN.

ST_ClusterDBSCAN considers all geospatial features in your result set (whatever matches the WHERE clause) and assigns them to a cluster ID according to two inputs: minimum cluster size, and maximum distance each feature can be from any other feature in order to be considered in the same cluster as that other feature.

The function can also assign a feature a cluster ID of NULL, indicating that the feature did not meet the clustering criteria and is alone.
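To make the two inputs concrete, here is a small pure-Python sketch of the DBSCAN idea. It is not the PostGIS implementation: parcels are simplified to axis-aligned rectangles, the helper names are mine, and the 66-foot street and 16-foot alley are illustrative widths. A parcel is a “core” member if at least `minpoints` parcels (itself included) lie within `eps` of it; parcels near a core member join its cluster; anything else gets `None`, playing the role of NULL.

```python
from math import hypot


def rect_distance(a, b):
    """Shortest distance between two rectangles given as (xmin, ymin, xmax, ymax)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return hypot(dx, dy)


def cluster_dbscan(rects, eps, minpoints):
    """Return one cluster ID per rectangle, or None when it is in no cluster."""
    n = len(rects)
    # Each neighbor list includes the rectangle itself (distance 0).
    neighbors = [
        [j for j in range(n) if rect_distance(rects[i], rects[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nearby) >= minpoints for nearby in neighbors]
    labels = [None] * n
    next_id = 0
    for i in range(n):
        if labels[i] is not None or not is_core[i]:
            continue
        labels[i] = next_id
        stack = [i]  # only core rectangles are expanded
        while stack:
            current = stack.pop()
            for j in neighbors[current]:
                if labels[j] is None:
                    labels[j] = next_id
                    if is_core[j]:
                        stack.append(j)
        next_id += 1
    return labels


# Standard 25 x 125 foot lots, coordinates in feet.
lots = [
    (0, 0, 25, 125),       # three side-by-side lots
    (25, 0, 50, 125),
    (50, 0, 75, 125),
    (0, -141, 25, -16),    # behind the first lot, across a 16-foot alley
    (141, 0, 166, 125),    # across an assumed 66-foot street
]
cluster_ids = cluster_dbscan(lots, eps=25, minpoints=3)
print(cluster_ids)
```

With `eps` at 25 feet, the lot across the alley joins the cluster while the lot across the street is left on its own.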

Show me what that looks like

Chicago Cityscape gives the user three options to cluster: Small, compact clusters with at least 3 properties each; small, compact clusters with at least 5 properties each; large, loose clusters with at least 10 properties each.

Additionally, Chicago Cityscape lets the user choose between showing parcels that weren’t found in a cluster, or hiding parcels that weren’t found in a cluster. The reason to show parcels that weren’t found in a cluster is to visualize where there are and aren’t clusters of parcels in the same map.

A map of Chicago’s Near West Side community area is shown with clusters of vacant lots. The “show all properties” mode is used, which shows clusters with a thick, black outline. Properties that were not in a cluster are still shown but without the thick black outline (enlarge the photo to see the difference).

Sample query

This query looks at all of the vacant lots within 1 mile of the intersection of Washington Boulevard and Karlov Avenue in the West Garfield Park community area of Chicago. The query looks for clusters of at least 3 features (“minpoints”) that are no more than 25 feet apart (“eps”). (The data are projected in Illinois StatePlane East Feet, rather than a projection that’s in meters because it’s easier for me to work with feet.)

I posted another sample query below that’s used to exclude all of the features that were not assigned to a cluster.

SELECT pin14, ST_ClusterDBSCAN(geom, eps := 25, minpoints := 3) OVER () AS cid, geom
FROM parcels
WHERE property_class = '1-00'
	AND ST_DWithin(geom,
            ST_Transform(ST_GeomFromText('POINT(-87.7278 41.8819)', 4326), 3435),
            5280); -- 1 mile in feet, the unit of SRID 3435

The screenshot below shows clusters of vacant lots that resulted from the query above. The parcels symbolized in a gray gradient were not assigned to a cluster. Notice how clusters will form across the alleys but not across streets; this is because the streets are wider than 25 feet but most alleys are only 16 feet wide.

The map shows various groups (clusters) of vacant properties in West Garfield Park. Each cluster is symbolized in QGIS using a different color. Properties that are not in a cluster are symbolized by a gray gradient.

Exclusion sample query

This query is the same as above except that a Common Table Expression (CTE) is used (CTEs have the “WITH” keyword at the beginning) to create a subquery. The “WITH” subquery is the one that clusters the parcels and the following query (“SELECT *”) throws out any features returned by the subquery that don’t have a cluster ID (the “cid” field).

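-- Cluster inside the CTE, then keep only the parcels that landed in a cluster.
WITH clustered AS (
SELECT pin14, ST_ClusterDBSCAN(geom, eps := 25, minpoints := 3) OVER () AS cid, geom
FROM parcels
WHERE property_class = '1-00'
	AND ST_DWithin(geom,
            ST_Transform(ST_GeomFromText('POINT(-87.7278 41.8819)', 4326), 3435),
            5280) -- 1 mile in feet, the unit of SRID 3435
) SELECT *
FROM clustered WHERE cid IS NOT NULL;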

I would also recommend Dan Baston’s blog post from six years ago which has more commentary and explanation, and additional examples of how to use the function.