At Brokersavant, we process large quantities of real estate assets, ranging from commercial property flyers to large real estate leases, and our customers expect a lightning-fast turnaround. Learn how we leveraged open-source technologies and Python libraries to create a system that scales to millions of assets per day without missing a beat.
Processing documents isn't just about loading them using file() and extracting the text right from the document. Bad scans, images, misspellings, foreign languages, hundreds of document and image types, and other obstacles prevent us from taking the easy route to processing the document assets our software systems require. In this talk, we'll dive into some practices I've learned from solving real-world document extraction problems at scale, involving leases, flyers, and real estate comparison sheets from various global corporations and Fortune 100 companies. We will discuss the following topics, which will help take your document processing to the next level: