Work Experience

Senior Software Engineer

Oct '20 - Present
Netflix (Remote)

Building distributed tracing infrastructure as a member of Observability Engineering.

  • Currently developing a new system for trace data storage, modeled after data lake architectures. 
  • Led the design and implementation of a new system for trace collection. Developed a sidecar/collector to support higher-volume, more reliable transport with lower overhead
  • Developed long-requested aggregation and analytics features in the platform by deriving metrics from trace data, using Druid and Kafka
  • Introduced a new form of sampling for trace data that allows non-ingress, mid-tier applications to make local decisions about sampling a request. Implemented it in platform libraries and rolled it out across the Netflix fleet (design)
  • Scaled tracing infrastructure by 10x to handle increased load from new domains
  • Migrated the core storage system for trace data from Elasticsearch to a managed time-series abstraction, which reduced cost by ~20% while improving search capabilities 

Senior Software Engineer

Sep '18 - Sep '20
Lightstep (acq. by ServiceNow) in San Francisco, CA

Built tools for robust root-cause analysis of latency and errors based on distributed traces, system metrics, and logs from spans.

  • Core contributor to an industry-first product that provided flexible analysis of large collections of traces (blog)
    • The feature allowed complex filtering and grouping of distributed trace data and identified system attributes that are correlated with high latency & errors
    • As part of this project, designed and migrated a core monolithic service into a distributed leader-worker model that horizontally scales to meet demand
  • Developed a product to visualize system architecture as a nodes & edges diagram derived from distributed trace data (blog)
  • Led a project to integrate with other open-source tracing libraries (Jaeger, Zipkin) to improve data ingestion flexibility and capability
  • Proposed and prototyped several ideas that expanded analysis capabilities and were eventually integrated into the product

Senior Software Engineer, Tech Lead

Jan '16 - Sep '18
Rally Health in San Francisco, CA
  • Led a team of six engineers building features for Rally Engage, a consumer product that promotes healthy habits through rewards & incentives
    • Built a robust recommendation system to suggest relevant activities based on a user’s health goals and habits, which led to increased engagement and participation
    • Architected and implemented a foundational system for translating strings on-demand based on a user’s locale
    • Migrated a legacy, monolithic application into several scalable, independently deployable services. These microservices had lower latency, higher test coverage and better error handling (blog)
  • Member of Eng. Tech Staff, serving as a technical leader on many large initiatives
    • Led a "Night's Watch" team of ten engineers that triaged and fixed critical performance bottlenecks and successfully scaled services to meet increased demand on Jan 1 

Software Engineer

Jun '13 - Dec '15
Opower (acq. by Oracle) in Arlington, VA
  • Core contributor to a data integration platform for ingesting 100 billion utility data points/year from over 100 unique clients (blog)
    • Migrated a legacy data import framework into a batched process, which reduced data import time from several hours to a few minutes 
    • Developed several data validation techniques that improved data quality and accuracy, and built tools to visualize the ETL process and surface data quality issues to clients
