SYS JS.DEV
BUILD F3CRCW
DATE 2026.04.26
UTC 01:30
LOC NYC → STANFORD
STATUS OPEN TO ML/SYSTEMS ROLES

Curriculum Vitae

Backend & ML systems engineer. Founding engineer at RocketReach. Five-plus-year tenure across analytics infrastructure, contact-data systems, applied AI, and the multi-provider LLM gateway that backs every AI feature at the company.

Experience

Founding Engineer — RocketReach

OCT 2020 — PRESENT  ·  REMOTE (BELLEVUE → SF → NYC)

One of the first engineers, hired during Series A. Five-plus-year tenure spanning four architectural eras: analytics infrastructure → core data platform → contact-data systems at scale → applied AI / LLM infra. Entries below are reverse-chronological; further detail and case-study links are on Work.

§ RocketReach — 2026

  • Designed and shipped the AI-sampled rules model for the community-program matching pipeline. 30× throughput, 92% cost reduction, 2.5× more usable contact data extracted vs. the prior production ML model. Most-valued data contribution to current exit conversations.
  • Built an internal phone-data-normalization package that re-canonicalizes 250M phones across 3B+ data points into a scalable waterfall ingestion system. PM characterized it as the single largest data improvement of the past year.

§ RocketReach — 2025

Email verification + domain-success prediction
  • Rewrote in-lookup email verification to run in parallel with explicit candidate prioritization.
  • Built a full AI-driven domain-success tracking system, doubling prediction success.
  • Drove lookup success rate from ~55% (start of 2024) to ~90% (end of 2025) — direct credit/revenue multiplier.
Contact-data normalization
  • Solo-shipped migration of 4B+ contact records from nested JSON on the profile model into normalized profile_email and profile_phone tables.
  • Real-time sync via low-level Django field-type overrides + custom managers; in-memory dedup; persistence on profile.save().
  • Launched with zero downtime, zero performance regression. Unblocks GDPR removals and monolith decomposition.
5x5 data-partner integration
  • Highly efficient Airflow DAG built almost entirely on Redshift merge joins; ETL processes 150M record updates per month in 4 hours.
  • Pattern (Redshift compute → S3 → Lambda → queue → workers) adopted as the company-wide template for high-throughput data work.
  • Privacy-aware pre-filtering by hashed contact comparison in-query — no PII persisted.
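
The hashed-comparison idea in the last bullet can be sketched in a few lines. This is only an illustration, not the production query: the real filter runs in-query on Redshift, while this stand-in runs in memory. The function names, the SHA-256 choice, the lowercase/trim normalization, and the direction of the filter (dropping partner records that match an already-known contact) are all assumptions.

```python
import hashlib
from typing import Iterable, List


def normalize_contact(value: str) -> str:
    # Assumed normalization: trim whitespace and lowercase before hashing.
    return value.strip().lower()


def hash_contact(value: str) -> str:
    # Assumed hash function (SHA-256); only this digest is ever compared.
    return hashlib.sha256(normalize_contact(value).encode("utf-8")).hexdigest()


def prefilter(partner_hashes: Iterable[str], known_contacts: Iterable[str]) -> List[str]:
    """Keep partner hashes that do not match any already-known contact.

    The partner side arrives pre-hashed, so no raw partner contact value
    is seen or stored by this step.
    """
    known = {hash_contact(c) for c in known_contacts}
    return [h for h in partner_hashes if h not in known]
```

Comparing digests rather than raw values lets both sides test for overlap without exchanging the underlying contact data.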
AI data-enrichment service
  • Built the scalable backend AI enrichment service that opened a new upsellable product class (API enrichments).
  • Began as a hackathon prototype that the company subsequently adopted as its multi-quarter GenAI roadmap.
Privacy-removals system
  • Built the GDPR-compliant profile-removal system, made possible by the earlier contact-data normalization (contact-to-profile reverse lookup via hashed contact data).

§ RocketReach — 2024

RocketVerify — custom async SMTP
  • Designed and led a scalable email-verification service implementing a custom async SMTP protocol; scaled to ~500M emails/month.
  • Heavy Playwright integration for the long tail of providers that resist standard SMTP probing.
Lookup parallelization & offline verification
  • Reimplemented lookups multi-threaded; reduced P90 latency from ~12s to ~3s.
  • Built offline verification on top of RocketVerify with ML-based candidate prioritization; cached the most expensive part of lookups ~90% of the time.
Multi-provider LLM gateway (with Cam)
  • Co-built the in-house service that abstracts over Anthropic, OpenAI, xAI, and Gemini behind a single interface, with native websearch.
  • Backbone for every AI feature at the company: industry classification, AI enrichment, domain-success prediction, AI-sampled rules extraction, privacy tooling.
  • Owns routing, failover, rate limits, retries, cost accounting, secret rotation, and unified observability.
Industry classification — LLM labeling
  • Redesigned industry categorization using vectorized semantic industries + LLM-based labeling; lifted fill rate from ~10% to ~99%.
Orphan-profile trimming
  • Applied heuristic + ML scoring to trim indexed profiles; improved search latency and reduced Elasticsearch index size.

§ RocketReach — 2023

Redshift warehouse
  • Stood up the company's first data warehouse. Ran a Redshift Serverless pilot directly with the CTO and our AWS rep; killed the pilot for being a poor fit for our workload shape, launched provisioned instead.
  • Re-implemented sitemap generation on Redshift with merge-join geometry; 8 days → 4 hours (~48× speedup). Directly load-bearing for SEO, the primary top-of-funnel acquisition channel.
Microservices: SpaCy + Profile-Photo
  • Co-designed scalable FastAPI microservices. Authored most of the Terraform.
  • Solo-led the Profile-Photo microservice; debugged and resolved memory-leak issues in the facial-recognition stack via Python memory profiler.
Brightdata phone enrichment
  • Designed a scalable system for Endato data ingestion + matching; produced 32M+ new and boosted phones; authored substantial Terraform and a dashboard for ingestion progress.
Emergency SendGrid replacement
  • Replaced SendGrid with a message-queue model (SES + SNS + SQS) processing email activity in real time.
PeoplePro email enrichment
  • Overhauled the PeoplePro pipeline; added 14M+ new emails (10M more in flight). Source accounts for ~30% of phones sold.
Academic — entity resolution package
  • Built a Python package implementing entity resolution as BERT embeddings + multilayer perceptron. Advised by a JHU CS professor.
  • Read dozens of papers on deep-learning ER; benchmarked embedding families and similarity geometries.

§ RocketReach — 2022

Analytics + data quality
  • Overhauled finance/revenue ETLs into a real-time customer-events model across all payment processors (Recurly, Stripe, Adyen, Braintree); reduced ETL errors to ~0%.
  • Created the data-engineering interview assessment; led 3-month onboarding of three new engineers.
  • Migrated all analytics code out of the rr monolith; built custom query wrappers, job scheduling, Slack alerting, and dev tooling.
  • Overhauled enterprise attribution; first accurate enterprise revenue, cash, and deal-flow reporting.
Core data
  • Sourced, QA'd, and integrated 10+ datasets for UK and French people data — added 2M+ emails and phones.
  • Built DataPerson / DataPersonManager abstractions, dramatically cutting time to integrate a new dataset; substantial query-optimization and indexing work.
  • Designed a precursor to slowly-changing dimensions via snapshot + diff.
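
A minimal sketch of the snapshot + diff idea from the last bullet, assuming each snapshot is a mapping from a record key to its attributes. The `Change` type and `diff_snapshots` function are hypothetical names for illustration, not the original implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Snapshot = Dict[str, dict]  # record key -> attribute dict (assumed shape)


@dataclass(frozen=True)
class Change:
    key: str
    kind: str  # "insert", "update", or "delete"
    before: Optional[dict]
    after: Optional[dict]


def diff_snapshots(old: Snapshot, new: Snapshot) -> List[Change]:
    """Compare two snapshots and emit one change row per differing key."""
    changes: List[Change] = []
    for key in sorted(old.keys() | new.keys()):
        if key not in old:
            changes.append(Change(key, "insert", None, new[key]))
        elif key not in new:
            changes.append(Change(key, "delete", old[key], None))
        elif old[key] != new[key]:
            changes.append(Change(key, "update", old[key], new[key]))
    return changes
```

Storing only the change rows between consecutive snapshots keeps a history of each record over time, which is the property slowly-changing dimensions formalize.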

§ RocketReach — 2020 / 2021

Analytics infrastructure (zero → one)
  • Built the company's initial OLAP database / analytics warehouse. Authored the first full suite of product reporting.
  • Built attribution infrastructure linking individual transactions and subscriptions across Braintree, Stripe, Recurly, and Adyen back to users; first cash-to-user attribution at the company.
  • Optimized + scaled ETLs to handle a significant volume increase; ~50× ETL performance improvement.
Data-quality + data-partner work
  • Sourced hundreds of contact-data providers; helped negotiate and close new provider deals.
  • Drove integration of the PeoplePro 2021 dataset (~16% of all phones in the database).

Earlier roles & undergrad

Lead Junior Analyst — Resolve Growth Partners

Sep 2019 — Oct 2020

Part-time during school; full-time May–Oct 2020. Selected as 1 of 2 summer analysts on a 4-person investment team managing a $125M growth-equity fund focused on B2B SaaS.

  • Sourced and qualified 2,000+ investment leads; managed a 15,000-company CRM.
  • Built investor decks for life-sciences and field-service-management deals.
  • Performed retention modeling for 25+ software companies.
  • Increased outreach response rate by ~400% on target prospects.

Founder & President — Johns Hopkins Quantitative Finance Society

Jun 2019 — Dec 2021

Co-founded a graduate quant-finance research society, run as a long-short systematic-trading group with $30K AUM. Faculty advisor: Prof. John Miller (JHU Applied Math & Statistics). 20+ student researchers (8 PhD).

  • Built a strategy-agnostic backtesting engine — local equity-data persistence, portfolio tracking, performance evaluation.
  • Implemented and evaluated pairs-trading and momentum strategies; produced cointegration heatmaps and a recursive fractal generator for non-normal return modeling.
  • Regular technical talks to 100+ JHU students. WorldQuant Challenge Semi-Finalist.

Woodrow Wilson Research Fellow — Johns Hopkins University

Feb 2019 — 2020

Advisor: Prof. Jonathan H. Wright. $10K research fellowship.

  • Investigated the implied–realized volatility gap in equity options markets.

Partner / Associate — A-Level Capital

2019 — 2020

Selected as 1 of 7 from ~100 applicants for the JHU-affiliated VC firm (~$530K inaugural fund).

  • Sourced 40+ companies, held calls with 20 founders, closed 2 deals.
  • Sourced a wearables company that subsequently raised ~70× its initial seed.

Data Analyst — Massachusetts Land Company

Jun 2019 — Sep 2019
  • Built scrapers for statewide MLS housing data across 576 Massachusetts ZIP codes.
  • Conducted broker and developer interviews to inform price-prediction analytics.

Data Science Intern — Empower Schools

Jul 2017 — Sep 2017
  • Analyzed accountability and performance data for 150,000+ students across the Lawrence and Springfield, MA school districts.
  • Built an SVM classifier predicting student college outcomes with ~80% accuracy.
  • Presented findings directly to the Empower Schools CEO/founder and executive team.

Data Science & Bioinformatics Intern — Curoverse Inc. (acq. by Veritas Genetics)

Summer 2017
  • Built an SVM classifier predicting eye color with ~95% accuracy.
  • Presented at the Harvard I2B2 TranSMART Symposium. DOI: 10.5281/zenodo.1045265.

Executive Delegate (Internship) — U.S. Department of Education — Massachusetts State Student Advisory Council

May 2017 — Jun 2018
  • Represented the Greater Boston area on the Massachusetts State Student Advisory Council.
  • Led a project identifying districts with outdated wellness policies — 24% of MA districts non-compliant; collaborated with principals and superintendents on revisions.
  • Co-developed a campaign on the student mental-health crisis in Boston schools.

Education

Johns Hopkins University — B.S. Computer Science and Economics. Graduated 2023, finishing remaining credits while working full-time at RocketReach.

Selected publications & honors

Technical fluency

Domain | Expert | Strong | Working
Languages Python · SQL Bash · TypeScript
Backend Django · PostgreSQL FastAPI · Pydantic · Aurora · Elasticsearch · Redis pgvector · Playwright · spaCy
Data infra Redshift · query optimization · indexing · ETL perf · ER pipelines Airflow · replication · real-time sync · SCDs
ML / AI Entity resolution (applied + research) BERT / transformer embeddings · MLP · Anthropic / OpenAI / xAI / Gemini APIs · prompt eng · LLM labeling RAG · pgvector · SVM
AWS ECS · Lambda · S3 · SES · SNS · SQS · Aurora · IAM · CloudWatch EventBridge · Athena · ALB/NLB
Infra / DevOps Git Terraform · GH Actions · CircleCI · Jenkins Docker (via ECS)
Observability Datadog · Sentry · CloudWatch · custom Slack alerting

"Expert" = shipped at scale, can defend every design choice. "Strong" = production-debugged. "Working" = competent, haven't pushed limits.

Aggregate scale