Umair Khan

(+92) 312-2377428 · umairedu@gmail.com

I am a Senior DevOps Engineer with over a decade of experience. I excel at building resilient, secure, and cost-effective cloud infrastructure. I have a proven track record of optimizing cloud spend, achieving 99.9% uptime, and enhancing system security.

I led the AWS Well-Architected Framework review, earning official AWS certification for the platform. I built cost-efficient Kubernetes-based AWS infrastructure with strong observability and security. I also set up a PCI DSS compliant fintech platform on DigitalOcean using Kubernetes and Istio. Implementing the Istio service mesh with mTLS significantly cut security risks. I standardized deployments with reusable Helm charts, reducing errors and speeding up rollbacks. My monitoring solutions with Prometheus and Grafana, coupled with anomaly detection, support high uptime. I automated AWS cost optimization, cutting cloud spend by 25%.


Experience

Lead DevOps Engineer

SchoolVoice
  • Led and executed the AWS Well-Architected Framework review, enabling the platform to earn official AWS Well-Architected certification for security, reliability, performance, and cost-optimized architecture.
  • Designed Kubernetes infrastructure with Spot and On-Demand node groups that reduced cloud costs while maintaining high availability during node failures and traffic spikes.
  • Built a comprehensive observability stack using Prometheus, Grafana, and Alertmanager with custom webhook integrations for Slack notifications and automated playbook execution to reduce mean time to resolution (MTTR).
  • Configured Oathkeeper to validate JWT tokens, extract user information from token payloads, and pass authenticated user context to downstream services.
  • Built a real-time data replication pipeline from AWS RDS Aurora to OpenSearch using Kafka and Debezium.
Dec 2021 - Present

Lead DevOps Engineer

VentureDive
  • Onboarded a fintech project from scratch on DigitalOcean using Kubernetes and Istio, designing infrastructure patterns that enabled the platform to pass PCI DSS compliance.
  • Implemented Istio service mesh across microservices, reducing security vulnerabilities by 70% through mTLS encryption and centralized policy enforcement, while maintaining sub-100ms latency overhead.
  • Standardized Kubernetes deployments across 15+ projects using reusable Helm charts, reducing deployment errors by 80% and enabling consistent rollback capabilities.
Jul 2019 - Nov 2021

Sr. DevOps Engineer

Careem
  • Designed and maintained CI/CD pipelines on Jenkins that enabled safe service deployments.
  • Built a comprehensive monitoring framework using Prometheus, Grafana, and custom exporters, reducing mean time to detect (MTTD) production issues.
  • Implemented proactive alerting using Prometheus standard deviation functions, enabling early detection of anomalies before they impacted users and maintaining 99.9% service availability.
  • Automated AWS infrastructure cost optimization using Python scripts that identified and removed orphaned resources, reducing monthly cloud spend by 25% without impacting service performance.
  • Automated blue-green deployments using Python scripts that provisioned Elastic Beanstalk via Terraform, monitored CloudWatch metrics, and sent Slack notifications, enabling data-driven deployment decisions and reducing rollback time from hours to minutes.
Jan 2017 - Jul 2019

Sr. System Administrator

Right Solution
  • Designed and maintained AWS infrastructure including EC2, RDS, VPC, and auto-scaling groups, ensuring high availability and cost optimization for production workloads.
  • Deployed and maintained a 3-node Proxmox VE high-availability cluster with Ceph storage replication, achieving zero-downtime VM failover capabilities.
  • Designed and implemented high-traffic Linux web infrastructure using Nginx, Apache, and Varnish with Nginx Plus load balancing, supporting millions of requests per day.
Dec 2013 - Jan 2017

Sr. System Administrator

ARPATECH
  • Deployed and managed VMware ESX environments across local and remote data centers with automated VM backup using ghettoVCB, ensuring disaster recovery capabilities.
  • Configured Varnish with Nginx load balancing and auto-failover, improving website performance and maintaining service availability during server failures.
  • Automated infrastructure management using VMware PowerCLI scripts for VM reporting, MikroTik scripts for firewall management, and Bash scripts for remote deployments.
Nov 2011 - Dec 2013

Skills

Kubernetes Tools

  • Helm
  • Falco
  • Istio
  • Sealed Secrets
  • Prometheus/Grafana/EFK

DevOps Tools

  • Terraform
  • Terragrunt
  • Ansible
  • Jenkins, GitLab, Bitbucket, GitHub CI/CD Pipelines

GitOps Tools

  • Argo CD
  • Flux
  • Kapitan

Clouds

  • AWS
  • DigitalOcean
  • GCP

Programming

  • Python
  • PHP
  • Bash

Certifications


Education

Preston University

B.S. Computer Science