Contact Info

Areas of Expertise

  • High Availability

  • Sub-second Services

  • Business Risk Analysis (SLA/KPI)

  • Team Training

  • Large Scale Installations

  • Network Storage Architecture and Implementation

Work Experience

Lookout - November 2014 to Present

Principal Engineer - Cloud operations architecture and performance engineering. Improved operations delivery quality through automation and infrastructure validation.

  • Improved hiring practices to attract strong candidates who would grow the organization

  • Issue escalation for critical infrastructure (Network, Cassandra, MySQL, Elasticsearch, Kafka, etc.)

  • Failure testing procedures and training

  • Migration to AWS from physical DC

  • Monitoring stack improvements (self service!)

  • Infrastructure automation advocacy, training, documentation, and implementation

Jive Software - August 2012 to November 2014

Principal Engineer - Team lead for SaaS operations group. Worked with multiple engineering teams and product groups to enable self-service engineering application deployment.

  • CI/CD pipeline design and implementation

  • Monitoring stack design and improvements

  • Automation architecture and implementation (puppet, ansible)

  • Escalation problem solving for devops team

  • Ops tool set standardization

  • Engineering Advocate inside TechOps Organization

Boltnet, Inc. - February 2010 to August 2012

Systems Architect - First employee. Designed, implemented, and scaled infrastructure to three data centers serving >15,000 hits/second across billions of landing pages.

  • Worked with engineers to resolve scale/load issues

  • Designed and implemented the architecture that runs the BO.LT application (cloud and in-house)

  • Performance monitoring (BGP visualization, OpenNMS, Keynote, log parsing)

  • Release Engineering (git integration, test automation)

  • Configuration management (revision control, deployment)

TiVo Inc. - June 2008 to February 2010

Operations Architect. Worked with multiple operations and engineering groups to help architect and improve several different applications deployed internationally to over 700,000 concurrent clients.

  • Designed and implemented numerous improvements to the tool-set used by administrators to diagnose problems.

  • Improved monitoring coverage through configuration and helping guide monitoring application development (in-house application).

  • Improved procedures used by operations and engineering to deploy new applications and test performance changes.

  • Senior Operations Escalation point for real-time application problems.

  • Liaison with multiple engineering groups for future state architecture steering, bug diagnostics and prioritization.

  • Significantly reduced on-call 'fires' by tracking recurring problems and helping focus limited resources on the most beneficial fixes.

Atomz - January 2001 to June 2008

Designed, built, scaled, upgraded, and maintained a large, highly available network serving over 50,000 customers out of 4 data centers in 2 countries.

  • Achieved application availability of 99.999% by building a fail-safe BGP/DNS load sharing system

  • Developed real-time (<60s delay) dashboard for monitoring network statistics (traffic levels, request resource utilization, CPU/IO/Mem queue wait times, etc.)

  • Built tools to improve eBGP peering, reduce overall customer latency and monitor BGP events on the internet that affect us or vendor networks we rely on

  • Disaster recovery design and implementation

  • Reduced down-time and improved incident response by designing clustering software and automated fail-over procedures

  • Set up monitoring and trend graphing (Cricket, Nagios, Smokeping, Cflowd/flowtools, centralized syslog)

  • Created many cost- and time-saving tools for network and system maintenance (e.g. RT<->IRC interface for easy ticket management)

  • Designed automatic provisioning system to reduce configuration mistakes and build-out time

  • Helped improve QA procedures for more thorough and automated testing (test case design and tool research)

  • Escalated issue resolution/troubleshooting for multiple business units 24x7

Nortel Networks - contract - November 2000 to December 2000

Responsible for QA and engineering lab machines running Solaris, HP-UX, AIX, OSF/1, IRIX, Linux, SunOS.

  • Implemented backup solution using Amanda

  • Hardware upgrades for Sun and IBM machines

  • Hardware diagnostics for Sun, IBM, SGI, and DEC equipment

  • SLA design for our group in relation to hardware reliability and network quality of service

Sanmina - contract - August 2000 to November 2000

Part of a 7-member team that maintained a production network of Oracle clusters serving 3000 concurrent users from over 75 office locations.

  • Employee and Machine information database converged into LDAP

  • Oracle disaster recovery architecture implementation

  • Sun hardware administration

  • Mail server performance tuning

Cisco - contract - April 2000 to August 2000

Part of a 30-member team that ensured the availability of the engineering infrastructure, including build, e-mail, FTP, NFS, and web servers.

  • Trained staff in UNIX diagnostics

  • Set up monitoring and display stations

  • Lab design and setup for training, using Cisco routers and Sun machines


Data Services

Cassandra, Elasticsearch, MySQL, Kafka, Zookeeper, HDFS, Redis, memcache

Programming Languages

Bourne shell, Nix, python, perl, awk, ruby

Automation Tooling

Chef, Puppet, Ansible, NixOps, Terraform

Operating Systems


NixOS, FreeBSD, Solaris, Linux (CentOS, Debian, Ubuntu), OpenBSD, OS X


JunOS, FTOS, IOS, ScreenOS


I love photography, especially sharing ephemeral street art.

I also brew beer with a focus on old beer styles that are higher gravity and age well (24% ABV is my current personal best).