About

I'm Michał Wojdylak, an AI infrastructure engineer who builds and operates the systems that take machine learning from notebooks to reliable, scalable production services. My work sits at the intersection of machine learning, distributed systems, and cloud infrastructure.

I focus on the parts of AI that have to work at 3am: serving large language models with predictable latency, designing inference platforms that scale with demand, optimizing GPU utilization and cost, and building the MLOps tooling that lets teams ship models safely and often.

I care about clean architecture, observability, reproducibility, and systems that are simple to reason about. This blog is where I share what I learn building production AI infrastructure.

Skills

AI Infrastructure

  • GPU cluster orchestration
  • Distributed training
  • Model serving
  • Autoscaling inference

AWS

  • EKS
  • SageMaker
  • EC2 / GPU instances
  • S3
  • Lambda
  • IAM

LLM Deployment

  • vLLM
  • TGI
  • Triton Inference Server
  • Quantization
  • KV caching

MLOps

  • MLflow
  • Kubeflow
  • CI/CD for models
  • Feature stores
  • Model registry

Computer Vision

  • PyTorch
  • ONNX
  • TensorRT
  • Real-time pipelines
  • Edge deployment

Cloud Architecture

  • Kubernetes
  • Terraform
  • Docker
  • Service mesh
  • Observability

Inference Optimization

  • Batching & streaming
  • Latency tuning
  • Throughput scaling
  • Cost optimization