Nearly all disease will become treatable in our lifetimes, and drug discovery is quickly becoming an engineering discipline. Mandolin is building the “last-mile” delivery infrastructure that gets cutting-edge biologics, cell, and gene therapies to patients faster. Our AI-powered knowledge-worker platform already serves leading infusion clinics, with payers and pharma next in line.
We’re backed by Greylock, SignalFire, Maverick, and founders of famous companies like Yahoo, and led by repeat and exited founders with a team hailing from some of the most technically impressive companies.
Every product decision, ML model, and customer KPI depends on clean, trustworthy data. We’ve outgrown app-database queries and stitched-together dashboards—now we need a governed warehouse that supports real-time analytics, HIPAA-grade security, and feature engineering at scale. You will design that backbone and make data a first-class product across Mandolin.
Warehouse architecture – choose the stack, model the core marts, and enforce lineage and SLA guarantees.
Resilient ELT – build Airflow/Dagster pipelines with versioned schemas, incremental loads, and automated quality checks.
Self-serve analytics – deliver a documented semantic layer and dashboards that non-technical teams trust for daily decisions.
Security & compliance – implement encryption, RBAC, and audit trails that meet HIPAA and SOC 2.
Feature delivery – surface clean, performant datasets to ML, product, and forward-deployed teams.
Cost & latency tuning – keep storage bills low and queries fast as volume and complexity grow.
5 + years building and operating data platforms at scale.
Deep expertise in SQL modeling, orchestration (Airflow, Dagster, etc.), and columnar warehouses (BigQuery, Snowflake, Redshift).
Proven track record shipping business-critical dashboards to non-technical stakeholders.
Strong Python or TypeScript for data tooling and automation.
Security mindset—encryption, RBAC, audit logging in regulated environments.
Healthcare or revenue-cycle analytics.
Change-data-capture from document stores into relational models.
Real-time feature-store or streaming data pipelines.