In recent news, Zillow Inc. shut down its iBuying business, an initiative once forecast to drive the company to more than $20 billion in annual revenue by 2025. The failure to sustain the program has set the company back with significant costs, including a 25% workforce reduction. The failure of Zillow’s iBuying program was due to far more than just “bad machine learning”.
What is iBuying?
Put simply, when a homeowner wants to sell a home, they typically go to a real estate agent who lists the home for sale and matches them to a buyer. iBuying is an alternative business model in the real estate industry, most popularly used by tech companies such as Opendoor, where the seller approaches the company and receives an offer after an appraisal. If the seller accepts the offer, the company takes possession, performs upkeep/maintenance, and then sells the home (hopefully quickly). With this model, the company makes money in two ways:
- The spread between the price at which they buy the home and the price at which they sell it
- The fees associated with the sale (which can total approximately 10% of the sale price)
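As a rough illustration of the two revenue sources above, the sketch below computes the gross profit on a single flip. The function name and the dollar figures are hypothetical; only the approximate 10% fee rate comes from the text, and repair and holding costs are deliberately ignored.

```python
def ibuyer_gross_profit(purchase_price: float, sale_price: float,
                        fee_rate: float = 0.10) -> float:
    """Gross profit on one flip: spread plus fees.

    fee_rate is an assumption (fees as a fraction of the sale price);
    repair and holding costs are not modeled here.
    """
    spread = sale_price - purchase_price  # buy low, sell high
    fees = fee_rate * sale_price          # fees charged on the sale
    return spread + fees


# Hypothetical home bought at $300k and sold at $310k.
print(f"{ibuyer_gross_profit(300_000, 310_000):,.0f}")  # 41,000
# Selling at the purchase price still earns the fees.
print(f"{ibuyer_gross_profit(300_000, 300_000):,.0f}")  # 30,000
# Overpaying by more than the fees turns the flip into a loss.
print(f"{ibuyer_gross_profit(300_000, 270_000):,.0f}")  # -3,000
```

The last case shows why pricing accuracy matters: the fees cushion small pricing errors, but not large ones.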
Today, iBuying accounts for around 1% of total sales in the US residential housing market.
Background on Zillow
The company was created in 2006 by people who had previously founded companies such as Expedia and Hotwire.com together in the late 90s. Zillow is primarily an online aggregator of houses for sale and, in its traditional business model, it sells leads to local real estate agents. Prospective home buyers can browse listings online, and if they are interested in learning more they can register interest with the real estate agent that is responsible for the listing. This business model makes Zillow on the order of $1.5 billion a quarter.
Zillow’s move into iBuying
In 2018, Zillow started buying homes as part of the Zillow Offers program, an iBuying initiative. They saw the rise of companies like Opendoor, Redfin, and Offerpad and felt they could compete in this nascent and rapidly growing market. They had access to vast data on the US housing market, and believed an iBuying program could quadruple revenues to more than $20 billion per year by 2025. According to a Bloomberg article, "the key to the iBuyer business is paying the right price, then making light repairs and selling it quickly at a markup."
What happened and why?
A few weeks ago, on Zillow's Q3 earnings call, CEO Rich Barton announced that the company was shutting down the Offers program and laying off 25% of staff. The company announced quarterly losses of $330 million (vs. positive earnings of $40 million in Q3 2020). It still has about 10,000 houses on its balance sheet that it has to sell, likely at a loss.
There are a few reasons why this happened:
1. Official line: "Ultimately, in our short tenure operating Zillow Offers, we've experienced a series of extraordinary events, a global pandemic, a temporary freezing of the housing market, and then a supply demand imbalance that led to a rise in home prices at a rate that was without precedent. We have been unable to accurately forecast future home prices at different times in both directions by much more than we modeled [would be] possible. Put simply, our observed error rate has been far more volatile than we ever expected possible." - Rich Barton (Wall Street Journal)
- Basically, they are saying that the algorithms used to price homes were not properly calibrated to the housing market.
2. There is some evidence that adds nuance to the claim that the failure was "because of the algorithm". In addition to algorithmic shortcomings, it seems ML/business management pushed for aggressive buying even though the algorithm's recommendations were conservative. In other words, they were pushing the business into paying too much for too many homes.
- "By the Spring, Zillow became fixated on another issue. The forecasting models it used to generate offers had underestimated breakneck home price appreciation in the early months of the year, meaning its pricing algorithms spit out relatively weak offers, preventing it from buying as many homes as it would’ve liked. Zillow turned up the dial in the second quarter, according to a person familiar with the decision, who asked not to be named because the matter is private. The move put Zillow out of step with competitors that had begun to take a more cautious stance, including Redfin Corp., which started making more conservative offers in March" (Bloomberg)
- "Zillow tried to [...] ramp up its home-flipping business to 5,000 transactions a month, which [CEO]Mr. Barton set as a goal, in a housing market that was already low on inventory and was starting to cool off" (NY Times)
3. Paying a more uncertain price for homes (due to a rapidly cooling housing market) is not necessarily bad on its own, as Zillow can sell them for a profit or even at break-even pricing (since it makes fees of approximately 10% on the transaction). However, Zillow could not fix up homes quickly enough because of labor and supply shortages. These operational inefficiencies lengthened hold times.
- "The longer a company owns a home, the more it pays in mortgage interest, taxes, and insurance. Zillow’s loan covenants also limit the number of homes it can own for more than six months, according to documents viewed by Bloomberg. And the length of time it took Zillow to flip a home was growing. The median hold time for houses it sold in March was about 50 days, according to a Bloomberg analysis of records compiled by Attom Data Solutions. By October that number had increased to 84 days." (Bloomberg)
- To become profitable, the program needed to expand volume "10x". Barton claimed in a letter to employees that they did not have the operational capacity to turn over houses efficiently at that scale.
4. Market for Lemons / Adverse Selection
- This is more speculative, but this effect illustrates how difficult it can be to train a machine learning model to operate in an environment where other actors act in an adversarial way.
- A market for lemons describes an environment where a seller has more knowledge about the good than the buyer. The standard example is in the used car market where a seller knows the car is faulty in an obscure way but the buyer doesn't inspect it thoroughly enough and the transaction is mispriced against the buyer.
- Even if Zillow’s pricing algorithm were more accurate than sellers' own estimates, the company could still lose money in aggregate. Sellers would tend to accept only when Zillow's offer was well above what they could get on the open market, for example because the home had flaws that the price model or appraisal missed.
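The adverse selection effect in the last point can be sketched with a toy simulation. Everything here is an assumption for illustration: the function name, the 8% pricing noise, and the simplification that a seller accepts only when the offer exceeds the home's true value. It is not a model of Zillow's actual pricing.

```python
import random


def simulate_adverse_selection(n_homes: int = 50_000,
                               noise_sd: float = 0.08,
                               seed: int = 0):
    """Toy model: offers are unbiased, but sellers accept only overpriced ones.

    Returns (mean pricing error over all offers,
             share of offers accepted,
             mean overpayment on accepted offers),
    all expressed relative to the home's true value.
    """
    rng = random.Random(seed)
    all_errors = []
    accepted_errors = []
    for _ in range(n_homes):
        # Relative pricing error (offer / true value - 1), zero on average.
        error = rng.gauss(0.0, noise_sd)
        all_errors.append(error)
        # Assumption: the seller knows the true value and accepts
        # only when the offer is above it.
        if error > 0:
            accepted_errors.append(error)
    mean_error = sum(all_errors) / len(all_errors)
    accept_rate = len(accepted_errors) / n_homes
    mean_overpay = (sum(accepted_errors) / len(accepted_errors)
                    if accepted_errors else 0.0)
    return mean_error, accept_rate, mean_overpay


mean_error, accept_rate, mean_overpay = simulate_adverse_selection()
print(f"mean error over all offers: {mean_error:+.2%}")
print(f"share of offers accepted:   {accept_rate:.1%}")
print(f"mean overpayment on deals:  {mean_overpay:+.2%}")
```

Under these assumptions the offers are right on average, yet every completed purchase is an overpayment, because the seller's decision filters out the offers that were too low.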
Lessons
- In practice, machine learning models can sometimes be used as a justification for predetermined business decisions. We should sympathize with our data scientist customers who want to practice machine learning properly, and provide them with the tools to justify their decision making. Given the underlying goal that Zillow buy 5,000 houses a month, did data scientists appropriately tune models to do so safely? Did they have enough evidence to push back against management when this goal was imposed?
- Modeling is extremely difficult in the real world, where once-in-a-century pandemics and labor shortages can just... happen. Zillow has a really good data science team. That they had difficulties with this problem speaks to the complexity of machine learning in real life.
- "Zillow’s data sets and algorithms may tell it where and what Americans are looking to buy. But it [could be] less useful for predicting the direction of future home prices and whether a specific house will sell fast or not." (Financial Times)
- Zillow's competitors in the iBuying space have not really been impacted.
- This may be because they started from the ground up to be iBuyers (Opendoor, Offerpad). This comes with operational efficiency from the very beginning, which can be hard to spin up within a historically asset-light tech company.
- According to the Bloomberg article, Redfin slowed its pace of buying in the spring. Institutional overrides of machine learning models can be useful, but there should be guardrails when the decision to override can be catastrophic.
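One way to picture such a guardrail is a hard cap on how far a manual override may move an offer above the model's recommendation. The sketch below is purely hypothetical: the names, the 3% cap, and the review flag are illustrative assumptions, not a description of how Zillow, Redfin, or any other iBuyer handles overrides.

```python
from dataclasses import dataclass


@dataclass
class OverrideDecision:
    final_offer: float   # offer that would actually go out
    needs_review: bool   # True when the requested override hit the cap


def apply_override(model_offer: float, requested_offer: float,
                   max_uplift: float = 0.03) -> OverrideDecision:
    """Clamp a manual override to at most max_uplift above the model offer.

    max_uplift is a hypothetical policy parameter (3% by default).
    """
    if model_offer <= 0:
        raise ValueError("model_offer must be positive")
    cap = model_offer * (1 + max_uplift)
    if requested_offer > cap:
        # Override exceeds the guardrail: clamp it and flag for human review.
        return OverrideDecision(final_offer=cap, needs_review=True)
    return OverrideDecision(final_offer=requested_offer, needs_review=False)


# A modest override passes through unchanged.
print(apply_override(300_000, 305_000))
# An aggressive override is clamped to the cap and flagged.
print(apply_override(300_000, 330_000))
```

Clamping rather than rejecting keeps the business able to adjust offers while making large departures from the model visible and reviewable.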
In light of this discussion, the Robust Intelligence team ran RIME on a model created from a publicly available Zillow dataset. Some of RIME's core tests identified drops in the model's performance, which would have warned of the inaccuracies of Zillow's price estimation model.
To learn more about how RIME prevents similar issues, reach out here or request a demo to see how we protect against ML model failures.