Available for opportunities

Venkat Namala

AI Infrastructure Architect

GPU • Kubernetes • Hybrid Cloud Platforms

Leander, TX • 512-991-3999 • vsrn09@gmail.com

About Me

AI Infrastructure Architect with 18+ years of experience designing and operating large-scale distributed platforms. Specializing in GPU-accelerated Kubernetes, HPC, hybrid cloud, and enterprise AI infrastructure across multi-cloud architectures. Trusted technical leader delivering resilient, cost-efficient, and secure AI platforms across semiconductor, financial services, government, healthcare, retail, and energy domains.

GPU & HPC

CUDA, RDMA, InfiniBand, SLURM, topology-aware scheduling for distributed AI training

Kubernetes & OpenShift

Enterprise-grade container orchestration across AKS, EKS, GKE, bare-metal, and VMware

Multi-Cloud

AWS, Azure, GCP, IBM Cloud — architecting resilient, cost-efficient AI platforms

DevSecOps

GitOps, CI/CD, IaC automation with Terraform, Helm, and Argo CD, plus security governance

Experience

18+ years building platforms at scale

  • Architected GPU-accelerated Kubernetes and OpenShift platforms supporting large-scale AI/ML training and inference workloads across on-prem bare-metal, VMware, and Azure environments
  • Integrated SLURM with Kubernetes to enable hybrid, GPU-aware scheduling across research, AI, and HPC workloads
  • Engineered RDMA and InfiniBand-enabled clusters with topology-aware scheduling for low-latency distributed training
  • Defined GPU capacity planning, bin-packing, and utilization strategies across heterogeneous AMD and NVIDIA GPU fleets
  • Built enterprise-grade observability for AI platforms using Prometheus, Grafana, and Azure Monitor, including GPU-level metrics
  • Implemented IaC automation using Terraform, MAAS, Helm, and GitOps pipelines for GPU-enabled Kubernetes infrastructure
OpenShift 4.x • Kubernetes • Canonical MAAS • Azure • AWS • GCP • GPU (AMD/NVIDIA) • SLURM • RDMA/InfiniBand • Terraform • Helm • Argo CD • Prometheus • Grafana

Technical Skills

Cloud Platforms

AWS • Azure • Google Cloud • IBM Cloud • Cloud Run • Cloud Functions • Lambda • API Gateway

Kubernetes & Containers

Kubernetes • Red Hat OpenShift • AKS • EKS • GKE • GKE Autopilot • Bare Metal K8s • Canonical MAAS • VMware vSphere • Docker • PCF

AI / HPC

GPU Scheduling • SLURM • CUDA-aware Workloads • RDMA • InfiniBand • GPU Direct Storage • Topology-Aware Scheduling

DevOps & IaC

Terraform • Ansible • Helm • Argo CD • GitHub Actions • Jenkins • Tekton • SonarQube • JFrog Artifactory

Observability

Prometheus • Grafana • Loki • ELK Stack • Splunk • Azure Monitor • CloudWatch • Google Cloud Operations

Data & Streaming

Kafka • Confluent • IBM MQ • Azure Event Hub • Google Pub/Sub • CDC Pipelines • ActiveMQ

Languages

Java • Python • Go • JavaScript • SQL • C# • C++ • Bash

Security

OAuth2 • OIDC • SAML • RBAC • IAM • Binary Authorization • Zero Trust

Storage

OpenShift Data Foundation • Ceph • Azure Storage • GCS • S3 • Parallel Storage

Certifications & Competencies

Certifications

  • Certified Kubernetes Administrator (CKA), CNCF / Linux Foundation
  • IBM Cloud Pak for Multicloud Management - Architect, IBM
  • MuleSoft Certified Developer, MuleSoft
  • PG Program - Multi-Cloud Solution Architect, UT Austin
  • PG Program - Azure GenAI for Business, UT Austin

Core Competencies

AI Infrastructure Architecture
GPU & HPC Platforms (CUDA, RDMA, InfiniBand)
Kubernetes & OpenShift Engineering
SLURM + Kubernetes Hybrid Scheduling
Hybrid & Multi-Cloud AI Platforms
DevSecOps & GitOps Automation
High-Performance Storage & Data Pipelines
Observability & Reliability Engineering
Security, IAM & Multi-Tenant Isolation
Architecture Governance & Reference Models

Education

Bachelor of Technology in Computer Science and Engineering — SV University

Leadership & Community

Telugu Cultural Association of Austin: President (2017-2018), Board of Directors member (2018-2022)

Let's Connect

Interested in discussing AI infrastructure, Kubernetes platforms, or cloud architecture? I'd love to hear from you.

Email: vsrn09@gmail.com
Phone: 512-991-3999
Location: Leander, TX