U.S. Real Estate Market Cooling Down
An inflation-adjusted analysis of same-home sales in the mainland U.S. shows cooling across Michigan and Massachusetts. For your enjoyment I have generated an animated heat map indicating which states are burning above (red) and below (blue) the historical price trend for that area. The chart runs from 1975 through September 2007.
For every region I have done a regression analysis to determine the long-term historical growth rate. When local prices are way above sustainable long-term averages for that metropolitan area they contribute to a state being colored bright red. When home prices have fallen to be far below what they should be, they contribute to their state being colored deep blue.
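For the coders, here is a minimal Python sketch of how a price-to-trend ratio could be turned into a red or blue shade. The function name `trend_color`, the log scale and the `limit` cutoff are all assumptions made for illustration. The post does not say how the colors were actually scaled, or how metropolitan areas were combined into a single state color.

```python
import math

def trend_color(ratio, limit=0.5):
    """Map a price/trend ratio to an (R, G, B) tuple.

    ratio > 1 means prices are above trend (shades of red),
    ratio < 1 means below trend (shades of blue), ratio == 1 is white.
    `limit` is a hypothetical cutoff: log-ratios beyond +/- limit
    saturate to full red or full blue.
    """
    x = max(-1.0, min(1.0, math.log(ratio) / limit))
    fade = round(255 * (1 - abs(x)))
    return (255, fade, fade) if x >= 0 else (fade, fade, 255)

for r in (0.5, 0.9, 1.0, 1.1, 2.0):
    print(r, trend_color(r))
```

Working on the log of the ratio treats "twice the trend" and "half the trend" as equally far from normal, which seems a natural fit for a trend that is itself fitted in log space.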
Here’s how I did it. I built a mathematical model around the notion that a home price is dependent on volatility within a larger framework of inflation and regional factors such as population and income. To simplify the model I decided to reduce the regional factors into a single growth rate and set out to calculate the long-term growth rate for every region.
Of course I needed data. I really liked that the U.S. government’s Office of Federal Housing Enterprise Oversight (OFHEO) calculates a quarterly index based on same-home sales. To calculate its trends the OFHEO relies entirely on homes that have sold two or more times. Suppose there are two houses on the same street. One sold for $100,000 in 1980 and again for $100,000 in 1985. The other sold for $100,000 in 1980 and for $105,000 in 1986. From those two pairs it can be inferred that home prices in that area did not change between 1980 and 1985 but increased 5% between 1985 and 1986. Clearly having access to lots of data helps establish trends, and the OFHEO has plenty of it.
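The two-house inference above can be sketched as a tiny repeat-sales regression. This is a textbook-style, unweighted version written for illustration and is not the OFHEO's actual procedure. The function names `solve` and `repeat_sales_index` are hypothetical, and the only data is the two sale pairs from the example.

```python
import math

def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def repeat_sales_index(pairs, periods):
    """Estimate a price index from (t_first, p_first, t_second, p_second) pairs.

    The first entry of `periods` is the base period (index 1.0). Each pair
    contributes log(p_second / p_first) = level[t_second] - level[t_first],
    and the levels are found by least squares via the normal equations.
    Every non-base period needs at least one sale, or the system is singular.
    """
    base, later = periods[0], periods[1:]
    col = {t: i for i, t in enumerate(later)}
    k = len(later)
    xtx = [[0.0] * k for _ in range(k)]
    xty = [0.0] * k
    for t1, p1, t2, p2 in pairs:
        row = [0.0] * k
        if t1 != base:
            row[col[t1]] = -1.0
        if t2 != base:
            row[col[t2]] = 1.0
        y = math.log(p2 / p1)
        for i in range(k):
            xty[i] += row[i] * y
            for j in range(k):
                xtx[i][j] += row[i] * row[j]
    beta = solve(xtx, xty)
    index = {base: 1.0}
    for t in later:
        index[t] = math.exp(beta[col[t]])
    return index

pairs = [(1980, 100_000, 1985, 100_000), (1980, 100_000, 1986, 105_000)]
index = repeat_sales_index(pairs, [1980, 1985, 1986])
print(index)  # 1985 comes out flat against 1980, 1986 about 5% higher
```

With only two pairs the regression just reproduces the arithmetic in the text. The approach earns its keep when thousands of overlapping sale pairs disagree slightly and have to be reconciled into one index.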
Next I needed to remove inflation from the picture. To do that I chose to use the CPI data published by the Federal Reserve Bank of St. Louis to adjust prices into real dollars.
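As a rough sketch of that adjustment: a nominal price is converted into real dollars by scaling it with the ratio of the CPI in a chosen base period to the CPI in the period of the sale. The function name `to_real_dollars` is hypothetical, and the CPI and price figures below are made-up round numbers for illustration, not actual published values.

```python
def to_real_dollars(nominal, cpi, base_period):
    """Convert nominal values to real dollars of `base_period`.

    nominal and cpi are dicts keyed by period;
    real = nominal * CPI[base] / CPI[period].
    """
    base = cpi[base_period]
    return {t: value * base / cpi[t] for t, value in nominal.items()}

# Illustrative numbers only, NOT real CPI or price data.
cpi = {"1980Q1": 100.0, "2007Q3": 250.0}
nominal = {"1980Q1": 100_000.0, "2007Q3": 300_000.0}

real = to_real_dollars(nominal, cpi, base_period="2007Q3")
print(real)
```

In this made-up case the nominal price tripled, but the 1980 price is worth $250,000 in 2007 dollars, so the real gain is only 20%.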
Then I took the logarithm of the data and ran a least-squares linear regression using the Perl Regression.pm package from CPAN, with a constant weight for every data point. For each data point I then calculated the ratio of the price to the trend line.
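The original work was done in Perl, so what follows is only a rough Python equivalent of the same idea: an equal-weight least-squares line through the log of the real prices, then the ratio of each price to that line. The function names and the synthetic series are assumptions for illustration.

```python
import math

def fit_log_trend(times, prices):
    """Ordinary least squares of log(price) on time, every point weighted equally.

    Returns (slope, intercept) of the line log(price) = intercept + slope * t.
    """
    logs = [math.log(p) for p in prices]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    slope = sxy / sxx
    return slope, mean_l - slope * mean_t

def ratio_to_trend(times, prices):
    """Ratio of each observed price to the fitted trend value at that time."""
    slope, intercept = fit_log_trend(times, prices)
    return [p / math.exp(intercept + slope * t) for t, p in zip(times, prices)]

# Synthetic series: 2% growth per period, with the last point pushed 30% higher.
times = list(range(20))
prices = [100.0 * 1.02 ** t for t in times]
prices[-1] *= 1.3

slope, intercept = fit_log_trend(times, prices)
ratios = ratio_to_trend(times, prices)
print(f"growth per period: {math.exp(slope) - 1:.2%}")
print(f"last point vs trend: {ratios[-1]:.2f}")
```

A straight line in log space corresponds to a constant compound growth rate, so a ratio above 1 means prices are running ahead of the long-term trend and a ratio below 1 means they have fallen behind it.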
This was great. I had a massive spreadsheet and could look at any raw number I wanted. After a while of that I decided, “Hey, wouldn’t it be great to plot these trends?” Which I did. But it turned out that viewing a tangle of hundreds of metropolitan areas amounted to viewing a spaghetti of multicolored wiggles. Nonetheless I plowed ahead and graphed the discrete first derivative of every metropolitan area and noticed an unexpected smoothness in the data. So I analyzed the discrete second derivative and said to myself, “Wow, the time-variant trends are so predictable that this would make a great macroeconomic animation!!!”
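For a quarterly series, the discrete first and second derivatives mentioned above are just successive differences. A minimal sketch, with a hypothetical helper `diff` and a made-up series of price-to-trend ratios:

```python
def diff(series):
    """Discrete first derivative: the change between consecutive points."""
    return [b - a for a, b in zip(series, series[1:])]

# Made-up ratios for one metropolitan area, one value per quarter.
ratios = [1.00, 1.02, 1.05, 1.09, 1.12, 1.13, 1.12]

first = diff(ratios)   # how fast the ratio is moving each quarter
second = diff(first)   # whether that movement is speeding up or slowing down
print([round(x, 2) for x in first])
print([round(x, 2) for x in second])
```

Each pass shortens the series by one point. In this made-up series the second difference goes negative while the ratio is still rising, which is the kind of early sign of cooling a second derivative can reveal.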
But before I could make an animation I had to create even a single frame. I looked for charting software and found this amazingly helpful tutorial with sample code from IBM (Thank you IBM!!) that relies on the Cooperative Association for Internet Data Analysis (CAIDA) plot-latlong software (Thank you UCSD and NSF!!) to plot data onto maps.
I then annotated and animated the frames using ImageMagick convert. At this point I had an animated GIF, but I really wanted to let users move forward and back through the frames. So I converted the animation from GIF to FLV format using FFmpeg because that’s how all the cool cats like YouTube do it.
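The post does not give the exact command lines, so the following is only a guess at what the two steps might look like, written as small Python helpers that build the argument lists. The file names and the delay are hypothetical. The ImageMagick `-delay` and `-loop` options and the FFmpeg `-i` option are standard. The annotation step is left out.

```python
import subprocess

def gif_command(frames, output="animation.gif", delay=50):
    """ImageMagick: stitch frame images into a looping animated GIF.

    -delay is in hundredths of a second per frame; -loop 0 repeats forever.
    """
    return ["convert", "-delay", str(delay), "-loop", "0", *frames, output]

def flv_command(gif="animation.gif", output="animation.flv"):
    """FFmpeg: transcode the GIF; the output format follows the .flv extension."""
    return ["ffmpeg", "-i", gif, output]

# Hypothetical frame names, one per quarter.
frames = [f"frame_{i:03d}.png" for i in range(3)]
for cmd in (gif_command(frames), flv_command()):
    print(" ".join(cmd))
    # To actually run a step (needs ImageMagick / FFmpeg installed and the files present):
    # subprocess.run(cmd, check=True)
```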
It was then I realized that I needed a bit more glue to make it all work. I researched for a couple of hours and in the end decided to use the same JavaScript code as YouTube: Geoff Stearns’ SWFObject to play embedded Shockwave Flash files. To finish off what had started as a little project, I installed Jeroen Wijering’s FLV Media Player to provide the tidy controls and way cool full screen functionality.
All in all it was a nice “all nighter” project. I hope you find the animation educational, and perhaps a few coders out there can learn from some of the inside tricks on how I did it.
December 20th, 2007 at 7:55 pm
Wow, a nice little analysis!
One thing I wonder about is the OFHEO data: what about when property is bought by a developer and “repartitioned” into condos or something? And during the boom in California, it was very common in my area to buy a house, raze it to the ground, and build a bigger house… which means that “same home sales” is a weird number. I expect that it washes out when you’re looking at the aggregate numbers; as a case in point, my whole neighborhood upgraded in this way, so home prices rose together.
Another point: the linear regression feels “funny” to me. I know that this is just a first-pass analysis, but the general implication of a linear assumption is that prices always go up. And maybe that’s reasonable, given how the government is striving to expand the economy, but I wonder if there is some easy way to compare the prices with the economic health of a specific region. (Take, for instance, the influx of Googlers to Mountain View, and how that changes home prices in the area, despite the general California economy.)
Anyway, great job!
December 22nd, 2007 at 4:19 am
Impressive. If you’re going to be blogging more often, you’ve established a high standard to maintain.
On the other hand, you could simply invest in Michigan and never have to work again, so you’ll have plenty of time for fascinating posts like this.
December 23rd, 2007 at 5:50 am
I agree with Evo’s comment. Excellent first blog post. Keep it going.
December 24th, 2007 at 4:20 am
I’m from Michigan. You won’t believe the bargains in this incredibly beautiful state.
December 24th, 2007 at 7:06 am
Do you have an interest in keeping this updated as the info continues to be released quarterly?
Wonderful data presentation. Thanks!
December 24th, 2007 at 5:53 pm
Bo– I suspect that you are right in your assumption that the OFHEO data does not take into account extensive remodeling or the supersizing of a small home into a McMansion. Depending on their sources, gov’t statisticians might have access to the reported land and home square footage, such as that provided by DataQuick, in which case they could treat data discrepancies as discontinuities. A similar technique could be employed in the absence of home size data by finding the consensus growth rate and then systematically killing off one outlying data point at a time until the desired fit quality is reached. A summary of the techniques actually employed by the OFHEO in compiling the data can be found at
http://www.ofheo.gov/Media/Archive/house/hpi_tech.pdf
Your point on non-linear local effects is, in my opinion, perfectly valid in the case of mononucleated one-company towns such as Bethlehem, Pennsylvania, whose fortunes rose and fell with those of Bethlehem Steel. However, I think that your example of the Google effect on real estate prices near Mountain View, California is more of a scarcity issue, with low turnover momentarily driving up prices.
Eric– Sure, why not?! I’ve marked my calendar for February 26th when the Q4 data is scheduled to be released.