Milad Jahandideh

Site Reliability Engineer | Tech Lead

About

Site Reliability Engineer and Tech Lead with 7+ years of experience designing, operating, and scaling high-availability cloud infrastructure serving thousands of users. At ArvanCloud, I lead SRE initiatives across a large-scale IaaS platform built on OpenStack, Ceph, and Kubernetes, where I own reliability, observability, and infrastructure automation across distributed systems.

I combine deep technical execution with engineering leadership: defining SLOs, driving incident culture, and guiding teams to build reliable systems at scale. My background spans private cloud infrastructure, container orchestration, network engineering, and full-cycle DevOps automation.

Experience

Site Reliability Engineer / Tech Lead
ArvanCloud.ir · Full-time
Nov 2020 – Present

ArvanCloud is a leading Iranian cloud provider delivering Infrastructure as a Service (IaaS) at scale, built on OpenStack, Ceph, and Kubernetes.

  • Lead the SRE chapter for ArvanCloud's IaaS platform, defining SLOs, owning incident response processes, and establishing reliability standards across core infrastructure services.
  • Architected and delivered a VPC project connecting three OpenStack clusters via VXLAN overlays and BGP EVPN routing using OVN and Open vSwitch, enabling secure and unified inter-cluster networking at scale; published an open-source OVN/OVS CLI cheatsheet on GitHub.
  • Designed and scaled the observability stack using Prometheus, Grafana, and custom alerting rules, significantly reducing MTTR and improving incident detection across distributed systems.
  • Deployed and operated multiple production Kubernetes clusters, managing dozens of microservices via Helm charts and GitOps workflows with ArgoCD.
  • Integrated Ceph RBD with OpenStack Cinder for persistent block storage and deployed Ceph CSI for Kubernetes persistent volume provisioning; published an open-source Ceph CLI cheatsheet on GitHub.
  • Implemented Load Balancer as a Service (LBaaS) using OpenStack Octavia, extending self-service networking capabilities for cloud tenants.
  • Built and maintained CI/CD pipelines with GitLab CI and standardized Infrastructure-as-Code practices with Ansible and Terraform, enabling consistent and automated deployments.
  • Participate in on-call rotations, lead incident response, and author post-mortems to drive systemic reliability improvements.
  • Maintain operational documentation including architecture diagrams, runbooks, and on-call playbooks to support team scaling and knowledge transfer.

Linux System Administrator
Mahsan.co · Full-time
Dec 2018 – Nov 2020
  • Administered and maintained a large-scale Linux server environment, supporting deployments across hundreds of servers for a defense-sector organization.
  • Deployed VMware ESXi virtualization infrastructure, enabling isolated development and testing environments for engineering teams.
  • Eliminated manual toil by automating infrastructure operations with Ansible and shell scripts.
  • Containerized and migrated monolithic applications to LXC, improving resource efficiency and deployment repeatability.
  • Built a centralized logging platform using the ELK Stack to aggregate and analyze logs from thousands of servers, enabling proactive issue detection.
  • Configured Zabbix monitoring with alerting, enabling proactive capacity tracking and server health visibility.

Embedded Systems Developer
Adeeco · Full-time
Dec 2017 – Nov 2018
  • Developed firmware for industrial embedded systems using AVR and ARM microcontrollers.
  • Designed and built hardware-software integrated solutions for automation and control applications.

Education

MS – Technology and Innovation Management
Iran University of Science and Technology
BS – Information and Communications Technology
Shamsipour Technical and Vocational College
AS – Electronic Engineering
Technical and Vocational University
Diploma – Electronics
Vocational School

Projects

Melec.ir
Founder & Webmaster · Electronics & Microcontroller education community

Skills