

Middesk is a Y Combinator graduate (W19) in the financial space that helps companies perform due diligence for B2B loans, acquisitions, and business development.

I upgraded their prototype data pipeline from a proof of concept to a robust, production-ready, CI/CD-enabled workflow. I established a testing process for jobs and business logic; automated code deployment; automated the starting, tracking, and restarting of jobs; and added logging, metrics, and error capture during job execution. I improved the code architecture and eliminated redundant and dead code, and I reduced runtime and cost by optimizing joins and cutting the number of job steps. At the end of the engagement I left Middesk with a well-documented, automated, cleanly coded pipeline capable of scaling to billions of records and ready for further development.

Shortly after our engagement, Middesk raised $4M from Sequoia, Accel, and Y Combinator on the strength of their data product.

Our data processing engine was Apache Beam, used through Scio (a Scala API for Beam) and running on Google Cloud's Dataflow platform.

Published 23 Oct 2017

Randle Unger on Twitter