Objective
Software Engineer and SRE with many years of experience designing, building, and maintaining large-scale distributed systems. Seeking a challenging role that puts this range of skills and experience to use in an innovative environment.
Work Experience
One.app - Jul 2024 - Present
Site Reliability Engineer, Remote from Dallas, TX
- Building an observability pipeline.
Instacart - Jan 2018 - Feb 2024
Senior Software Engineer, Remote from Dallas, TX
- Designed and built a set of Elasticsearch clusters with transparent multi-region failover and the proxy that enabled it. This saved hours of downtime each year.
- Automated safe EC2 host-level maintenance on a large cluster of PostgreSQL nodes to apply security updates and address cloud hardware problems.
- Designed and built an approach for handling imbalance that saved millions of dollars in hosting costs while maintaining system reliability.
- Chaos-tested systems including Redis, PostgreSQL, and services written in Golang and Rust to improve overall reliability when host-level problems arise.
- Patent granted for logging query structures while maintaining sizing information.
Skills used: Golang, Rust, Ruby on Rails, Python, AWS EC2, AWS ECS, PostgreSQL, Elasticsearch, Redis, Datadog, Terraform
Twilio - Apr 2017 - Dec 2017
Senior Software Engineer, San Francisco, CA
- Developed automation for integrating Elasticsearch with internal deployment systems.
- Chaos-tested and remediated Elasticsearch hosted on AWS EC2.
Skills used: Scala, Kafka, Java, Elasticsearch, AWS EC2, Datadog
Twitter - Sep 2014 - Mar 2017
Senior Site Reliability Engineer, San Francisco, CA
- Automated day-to-day operations for very large Hadoop clusters (10k+ nodes) including removing nodes for maintenance and re-adding them to the clusters.
- Worked closely with the Twitter networking team to maximize available bandwidth for Hadoop without impacting front-end traffic.
- Developed server and service problem detection and remediation tools used to manage hundreds of thousands of servers and millions of task instances.
- Performed site-wide load testing to ensure uptime during periods of high traffic.
Skills used: Scala, Hadoop, NoSQL (Cassandra-like system), Mesos/Aurora, Network QoS
LivingSocial - Jun 2013 - Aug 2014
Senior Big Data Engineer, Washington, DC
- Tripled the processing capacity of an existing Hadoop cluster through job scheduling optimizations and system tuning.
- Built a data platform with vetted data sets for analyst use.
Lotame - Dec 2010 - May 2013
Senior Software Engineer, Columbia, MD
- Provided tools for ad-hoc data analysis using Pig and Hive.
- Continually monitored and improved the batch data pipeline.
Earlier employment history is available on LinkedIn: https://www.linkedin.com/in/jpmeagher/
Education
- Johns Hopkins University, Columbia, MD — MSCS Computer Science
- Colorado School of Mines, Golden, CO — BSEE