Middesk is a Y Combinator (W19) fintech company that helps businesses perform due diligence for B2B loans, acquisitions, and business development.
I took their prototype data pipeline from proof of concept to a robust, production-ready, CI/CD-enabled workflow. I established a testing process for jobs and business logic; automated code deployment; automated the starting, tracking, and restarting of jobs; and added logging, metrics, and error capture during job execution. I improved the code architecture and eliminated redundant and dead code. I reduced runtime and cost by optimizing joins and cutting the number of job steps. At the end of the engagement I left Middesk with a well-documented, automated, cleanly coded pipeline capable of scaling to billions of records and ready for further development.
Shortly after our engagement, Middesk raised $4M from Sequoia, Accel, and Y Combinator on the strength of their data product.
Our data processing engine was Apache Beam, used through Scio (Spotify’s Scala API for Beam) and running on Google Cloud Dataflow.