AI Systems Research Track

Modern AI Systems, Explained in Depth

Transformers → Training → Inference → Optimization → Agents → Evaluation → Production

8 Pillars of Production AI Systems (2025)

Model Architecture
  • Transformers
  • Mixture-of-Experts
  • State Space Models
  • Multimodal
  • Long-context
Training & Fine-tuning
  • LoRA / QLoRA
  • FSDP / DeepSpeed
  • ZeRO-Offload
  • FlashAttention-3
  • Unsloth
Inference & Serving
  • vLLM
  • TGI
  • TensorRT-LLM
  • SGLang
  • PagedAttention
  • Speculative Decoding
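The core trick behind speculative decoding, listed above, fits in a few lines: a cheap draft model proposes several tokens, and the target model accepts each with probability min(1, p_target/p_draft). Here is a toy sketch with hand-written probability tables standing in for real model outputs (the function name and data layout are illustrative, not from any library):

```python
import random

def speculative_step(draft_probs, target_probs, proposed, rng):
    """Accept/reject a run of draft-proposed tokens (toy sketch).

    draft_probs / target_probs: one dict per position mapping token -> probability
    (stand-ins for real model distributions). proposed: tokens sampled from the
    draft model. Returns the accepted prefix.
    """
    accepted = []
    for i, tok in enumerate(proposed):
        q = draft_probs[i].get(tok, 0.0)   # draft probability of the token
        p = target_probs[i].get(tok, 0.0)  # target probability of the token
        # Accept with probability min(1, p/q); on rejection, stop here and
        # (in a full implementation) resample from the residual distribution.
        if q > 0 and rng.random() < min(1.0, p / q):
            accepted.append(tok)
        else:
            break
    return accepted
```

When draft and target agree, every proposed token is accepted, which is why speculative decoding speeds up inference most when the draft model is a good approximation of the target.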
Retrieval & Memory
  • RAG
  • Vector DBs
  • HyDE
  • Re-ranking
  • Graph RAG
  • Memory-augmented LLMs
Agents & Tool Use
  • ReAct
  • Toolformer
  • LangGraph
  • AutoGen
  • CrewAI
  • OpenAI Swarm
Evaluation & Benchmarks
  • MT-Bench
  • Arena-Hard
  • LiveBench
  • GPQA
  • HumanEval
  • Big-Bench Hard
Safety & Alignment
  • RLHF / DPO
  • Constitutional AI
  • Red teaming
  • Adversarial robustness
  • Jailbreak defense
MLOps & Deployment
  • MLflow
  • Weights & Biases
  • LangSmith
  • PromptLayer
  • BentoML
  • Modal
  • RunPod

Recommended Learning Path (2025)

1. Foundations (1–2 months)

  • Understand transformers from scratch (nanoGPT, Karpathy lectures)
  • Master PyTorch fundamentals
  • Learn tokenization, attention, positional encodings
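Before reaching for PyTorch, it helps to see that scaled dot-product attention is just a weighted average. A minimal pure-Python sketch (no batching, masking, or multiple heads):

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(Q, K, V):
    """Scaled dot-product attention on plain 2-D lists of floats.

    For each query row: score every key, softmax the scores, and return
    the weighted average of the value rows.
    """
    d = len(K[0])  # key dimension, used for the 1/sqrt(d) scaling
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, V)) for j in range(len(V[0]))])
    return out
```

A query aligned with the first key puts most of its attention weight on the first value row; reimplementing this with tensors is essentially what nanoGPT's attention block does.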
2. Fine-tuning & Optimization (1–2 months)

  • QLoRA / LoRA on consumer GPUs
  • FlashAttention, Unsloth, bitsandbytes
  • Gradient checkpointing, mixed precision, ZeRO
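The LoRA idea behind this step is compact enough to sketch directly: freeze the base weight W and learn a low-rank update (alpha/r) · B@A, where B starts at zero so training begins from the pretrained model. A toy pure-Python version (helper names are ours, not from the `peft` library):

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(A, B)]

def scale(A, s):
    return [[s * x for x in row] for row in A]

def lora_weight(W, A, B, alpha, r):
    """Effective weight W + (alpha/r) * B @ A.

    W: frozen base weight (d_out x d_in).
    B: d_out x r, initialized to zeros; A: r x d_in.
    Only A and B are trained, shrinking the trainable parameter count
    from d_out*d_in to r*(d_out + d_in).
    """
    return add(W, scale(matmul(B, A), alpha / r))
```

With B all zeros the effective weight equals W exactly, which is the standard LoRA initialization; QLoRA applies the same update on top of a 4-bit-quantized W.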
3. Production Inference (1 month)

  • vLLM vs TGI vs TensorRT-LLM
  • PagedAttention, continuous batching
  • Quantization (AWQ, GPTQ, GGUF)
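All the quantization schemes named above build on the same round-trip: map floats to low-bit integers with a scale, then dequantize at compute time. A generic symmetric int8 sketch (deliberately simpler than AWQ/GPTQ, which choose scales more cleverly):

```python
def quantize_int8(ws):
    """Symmetric per-tensor int8 quantization (generic sketch, not AWQ/GPTQ).

    Maps the largest-magnitude weight to +/-127 and rounds the rest to the
    nearest integer step. Returns (int codes, scale).
    """
    s = max(abs(w) for w in ws) / 127 or 1.0  # guard against an all-zero tensor
    q = [round(w / s) for w in ws]
    return q, s

def dequantize(q, s):
    # Reconstruct approximate floats; error per weight is at most s/2.
    return [qi * s for qi in q]
```

The quantization error is bounded by half a step (s/2), which is why larger outlier weights hurt: they inflate the scale and coarsen every other weight. AWQ and GPTQ exist largely to mitigate exactly that.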
4. RAG & Agents (1–2 months)

  • Vector DBs: Chroma, Weaviate, Pinecone, Qdrant
  • Advanced RAG: HyDE, self-query, multi-query
  • Build agents with LangGraph / AutoGen
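Underneath every vector DB in this step is the same primitive: embed the query, rank stored documents by cosine similarity, return the top k. A dependency-free sketch (the `top_k` helper and (id, embedding) layout are ours; real stores use approximate nearest-neighbor indexes instead of a full sort):

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length, nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def top_k(query_vec, docs, k=2):
    """docs: list of (doc_id, embedding) pairs.

    Returns the ids of the k most similar documents, best first.
    """
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```

Advanced RAG variants change what gets embedded, not this core: HyDE embeds a hypothetical answer instead of the raw query, and re-ranking re-scores this top-k list with a stronger cross-encoder.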
5. Evaluation & Safety (ongoing)

  • LLM-as-a-judge, MT-Bench style evals
  • Red teaming & jailbreak resistance
  • Constitutional AI / RLHF basics
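MT-Bench-style pairwise judging usually gets aggregated into a leaderboard with an Elo-like rating, as Chatbot Arena does. The standard Elo update from a single judge verdict is tiny (the function name is ours; Arena itself fits a Bradley-Terry model over all battles rather than updating sequentially):

```python
def elo_update(r_a, r_b, winner_a, k=32):
    """One standard Elo update from a pairwise comparison.

    r_a, r_b: current ratings. winner_a: True if model A won the battle.
    k: update step size. Returns the two new ratings.
    """
    # Expected score of A given the current rating gap.
    e_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))
    s_a = 1.0 if winner_a else 0.0
    return r_a + k * (s_a - e_a), r_b + k * ((1 - s_a) - (1 - e_a))
```

With equal ratings, a win moves each model by k/2; the gap a rating system converges to encodes the judge's long-run win rate between the two models.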

Weekly paper discussions • Live implementation sessions • Open research projects