Manikranth Bheemineni

Sr. Site Reliability Engineer / DevOps Engineer

Build, Maintain, Improve ∞ Repeat

I build self-healing, secure multi-cloud platforms at global scale


About Me

Reliability engineer because even computers need therapy sometimes. I speak fluent YAML and dream in Terraform. I architect platforms that serve 50M+ users with 99.99% uptime across global regions.

With deep expertise in Azure, AWS, and GCP, I specialize in Infrastructure as Code, containerization, and resilient cloud architectures. I've reduced MTTR by 40% through automation, raised platform availability from 99.94% to 99.99%, and prevented an estimated $1.2M in annual revenue loss.

My approach combines technical excellence with strategic thinking, ensuring infrastructure is not just reliable but also secure, cost-effective, and aligned with business objectives. I'm also passionate about ML Operations and fine-tuning LLMs for infrastructure automation.

Cloud Architecture
DevOps
Infrastructure
Security
ML & LLMs

Professional Experience

Feb 2021 - Present

Senior Site Reliability Engineer

PwC

  • Led SRE guild of 15 engineers, setting roadmap and conducting quarterly performance check-ins
  • Architect and manage multi-cloud infrastructure (GCP, AWS, Azure) serving 50M+ users with 99.99% uptime across global regions
  • Improved platform availability by 0.05 percentage points (from 99.94% to 99.99%), preventing ~$1.2M in annual revenue loss
  • Designed and implemented service mesh architecture enabling secure service-to-service communication with mTLS encryption, reducing security incidents by 85% and avoiding an estimated $800K in potential breach costs
  • Built ChatGPT-powered SRE assistant, using OpenAI, that helps SREs search documentation during incidents, reducing MTTR by 40%
  • Implemented comprehensive BGP-based network architecture connecting multi-cloud environments with automated failover capabilities
  • Developed auto-remediation playbooks that resolved 95% of PagerDuty alerts without human intervention, reducing operational overhead by approximately $250K annually

Oct 2023 - Jan 2024

Security Architect / Rego Developer

Coinbase

  • Designed and implemented policy-as-code framework using Rego/OPA for multi-cloud environments, enforcing security compliance across 300+ Kubernetes clusters processing $1B+ daily transactions
  • Created security architecture for service mesh implementation, ensuring encrypted communication between microservices with mTLS
  • Developed automated cloud security monitoring solution that integrates with GuardDuty, Security Hub, and custom SIEM tooling
  • Implemented runtime scanning (Falco) and endpoint protection for container workloads, detecting 98% of threats in real-time
  • Developed VPC architecture with segmentation and least-privilege access controls, reducing attack surface by 60%
  • Created automated security testing framework for infrastructure changes, reducing security review cycles by 75%

Dec 2019 - Feb 2021

Cloud Architect / SRE Engineer

FedEx

  • Led migration from Snowflake to Databricks with focus on security and compliance, reducing data processing costs by 40%
  • Implemented fault-tolerant and high-availability solutions for critical shipping systems handling 15M+ packages daily
  • Designed serverless microservices architecture using Lambda and Docker, reducing infrastructure costs by 35%
  • Orchestrated backup processes for mission-critical systems with 99.999% recovery success rate
  • Managed AWS services including EC2, RDS, VPC, S3, and CloudWatch, reducing manual operational overhead by 65%

Jan 2019 - Nov 2019

DevOps / SRE Engineer

FedEx

  • Developed infrastructure as code using Terraform/Puppet/CloudFormation, achieving 100% automation of deployment processes
  • Built developer environments and automated provisioning tasks, reducing environment setup time from 3 days to 30 minutes
  • Set up CI/CD pipelines using Jenkins, Maven, GitHub, and Chef, enabling 20+ daily deployments with a 99.8% success rate
  • Implemented auto-scaling, load balancing, and cloud infrastructure monitoring, handling 300% traffic spikes without performance degradation

Leadership & Strategy

Team Building & Mentorship

  • Chaired company-wide Security Infrastructure Working Group, establishing cross-organizational standards adopted by 5 business units
  • Developed SRE team charter and OKRs; held monthly engineering-strategy syncs; managed hiring process for 3 new SRE roles
  • Participate in a global 24/7 on-call schedule on a 3-week rotation, handling an average of 5 Sev-1 incidents per quarter with < 15 min MTTR
  • Mentored 8 junior engineers in SRE practices, cloud security, and network architecture, improving team retention by 25%

Process Improvement & Strategic Initiatives

  • Developed on-call handbook and incident management procedures adopted across organization
  • Established SRE maturity model to benchmark and improve reliability practices across 5 engineering departments
  • Led cross-functional initiatives to reduce technical debt, resulting in 30% decrease in incident frequency
  • Delivered technical presentations at cloud security conferences (AWS re:Invent, KubeCon) on infrastructure security at scale
  • Directed $1.2M reliability and security initiative through bi-weekly steering committee meetings with senior stakeholders

Technical Skills

Cloud Platforms

AWS
Azure
GCP
Firebase

Infrastructure as Code

Terraform
CloudFormation
Ansible
Puppet
Chef

Containers & Orchestration

Docker
Kubernetes
Service Mesh
OPA/Rego

CI/CD & DevOps

Jenkins
GitLab CI/CD
Azure DevOps
Snyk

Monitoring & Observability

Datadog
Grafana
Prometheus
Nagios
CloudWatch
PagerDuty
Splunk

Messaging & Integration

RabbitMQ
Kafka
gRPC
RESTful APIs

Machine Learning & AI

LLM Fine-Tuning
Prompt Engineering
ML Operations
AI Infrastructure

Security

mTLS
BGP
IAM/RBAC
Vulnerability Management

Projects & Experiments

PromptCraft-AI

Experiments with prompt engineering and LLM optimization that occasionally produce something useful!

AI, Python, NLP

RSVP-website

A smart event RSVP system with AI-powered suggestions - because even your party planning deserves machine learning assistance.

React, Firebase, ML

Project_Eve

My digital assistant experiment that's only slightly less judgmental than HAL 9000. Explores natural language processing and conversational AI.

NLP, Python, TensorFlow

Paper Plane

A fully-featured digital invitation platform where creating invitations is simple and sharing happens at light speed!

React, Firebase, UX/UI

Certifications

AWS Solutions Architect Associate

Amazon Web Services

Azure Administrator Associate

Microsoft Azure

Implementing Microsoft Azure Infrastructure Solutions

Microsoft Azure

Associate Google Cloud Certification

Google Cloud Platform

Education

2018

Master's in Engineering Management

Eastern Michigan University, Michigan, USA

2015

Bachelor of Electrical Communication Engineering

J B Institute of Engineering and Technology, India

Volunteering

2018

Graduate Assistant

Jan 2016 - April 2018

  • Served as Teaching Assistant for the "Switching & Routing" course
  • Conducted 10 lab sessions every week to help students master networking concepts

2017

Nonprofit Leadership Alliance

Active Member, 2017 - 2018

The Certified Nonprofit Professional (CNP) credential is the only national credential that combines critical skills and knowledge, practical experience and a national perspective.
