Sourcing apartment listing data
In the early 1970s, if you wanted to purchase a stock, you would need to engage a broker, who would charge you a fixed commission of nearly 1%. If you wanted to purchase an airline ticket, you would need to contact a travel agent, who would earn a commission of around 7%. And if you wanted to sell a home, you would contact a real estate agent, who would earn a commission of 6%. In 2018, you can do the first two essentially for free. The last one remains as it was in the 1970s.
Why is this the case and, more importantly, what does any of this have to do with machine learning? The reality is, it all comes down to data, and who has access to that data.
You might assume that you could easily access troves of real estate listing data through APIs or by scraping real estate websites. You would be wrong. Well, wrong if you intend to follow the terms and conditions of those sites. Real estate data is tightly controlled by the National Association of Realtors (NAR) through the Multiple Listing Service (MLS) system. The MLS aggregates listing data and is available only to brokers and agents, at great expense. So, as you can imagine, they aren't too keen on letting just anyone download it en masse.
This is unfortunate, since opening up this data would undoubtedly lead to useful consumer applications. This seems especially important for a purchase decision that represents the largest portion of a family's budget.
With that said, not all hope is lost, as not every site explicitly bans scraping.