AI Systems Research Track

Modern AI Systems, Explained in Depth

Transformers → Training → Inference → Optimization → Agents → Evaluation → Production

8 Pillars of Production AI Systems (2025)

Model Architecture
  • Transformers
  • Mixture-of-Experts
  • State Space Models
  • Multimodal
  • Long-context
Training & Fine-tuning
  • LoRA / QLoRA
  • FSDP / DeepSpeed
  • ZeRO-Offload
  • FlashAttention-3
  • Unsloth
Inference & Serving
  • vLLM
  • TGI
  • TensorRT-LLM
  • SGLang
  • PagedAttention
  • Speculative Decoding
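The core trick behind speculative decoding, listed above, fits in a few lines: a cheap draft model proposes several tokens, and the target model accepts each with probability min(1, p_target/p_draft). Here is a toy sketch with hand-written probability tables standing in for real model outputs (the function name and data layout are illustrative, not from any library):

```python
import random

def speculative_step(draft_probs, target_probs, proposed, rng):
    """Accept/reject a run of draft-proposed tokens (toy sketch).

    draft_probs / target_probs: one dict per position mapping token -> probability
    (stand-ins for real model distributions). proposed: tokens sampled from the
    draft model. Returns the accepted prefix.
    """
    accepted = []
    for i, tok in enumerate(proposed):
        q = draft_probs[i].get(tok, 0.0)   # draft probability of the token
        p = target_probs[i].get(tok, 0.0)  # target probability of the token
        # Accept with probability min(1, p/q); on rejection, stop here and
        # (in a full implementation) resample from the residual distribution.
        if q > 0 and rng.random() < min(1.0, p / q):
            accepted.append(tok)
        else:
            break
    return accepted
```

When draft and target agree, every proposed token is accepted, which is why speculative decoding speeds up inference most when the draft model is a good approximation of the target.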
Retrieval & Memory
  • RAG
  • Vector DBs
  • HyDE
  • Re-ranking
  • Graph RAG
  • Memory-augmented LLMs
Agents & Tool Use
  • ReAct
  • Toolformer
  • LangGraph
  • AutoGen
  • CrewAI
  • OpenAI Swarm
Evaluation & Benchmarks
  • MT-Bench
  • Arena-Hard
  • LiveBench
  • GPQA
  • HumanEval
  • Big-Bench Hard
Safety & Alignment
  • RLHF / DPO
  • Constitutional AI
  • Red teaming
  • Adversarial robustness
  • Jailbreak defense
MLOps & Deployment
  • MLflow
  • Weights & Biases
  • LangSmith
  • PromptLayer
  • BentoML
  • Modal
  • RunPod

Recommended Learning Path (2025)

1. Foundations (1–2 months)

  • Understand transformers from scratch (nanoGPT, Karpathy lectures)
  • Master PyTorch fundamentals
  • Learn tokenization, attention, positional encodings
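Before reaching for PyTorch, it helps to see that scaled dot-product attention is just a weighted average. A minimal pure-Python sketch (no batching, masking, or multiple heads):

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(Q, K, V):
    """Scaled dot-product attention on plain 2-D lists of floats.

    For each query row: score every key, softmax the scores, and return
    the weighted average of the value rows.
    """
    d = len(K[0])  # key dimension, used for the 1/sqrt(d) scaling
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, V)) for j in range(len(V[0]))])
    return out
```

A query aligned with the first key puts most of its attention weight on the first value row; reimplementing this with tensors is essentially what nanoGPT's attention block does.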
2. Fine-tuning & Optimization (1–2 months)

  • QLoRA / LoRA on consumer GPUs
  • FlashAttention, Unsloth, bitsandbytes
  • Gradient checkpointing, mixed precision, ZeRO
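The LoRA idea behind this step is compact enough to sketch directly: freeze the base weight W and learn a low-rank update (alpha/r) · B@A, where B starts at zero so training begins from the pretrained model. A toy pure-Python version (helper names are ours, not from the `peft` library):

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(A, B)]

def scale(A, s):
    return [[s * x for x in row] for row in A]

def lora_weight(W, A, B, alpha, r):
    """Effective weight W + (alpha/r) * B @ A.

    W: frozen base weight (d_out x d_in).
    B: d_out x r, initialized to zeros; A: r x d_in.
    Only A and B are trained, shrinking the trainable parameter count
    from d_out*d_in to r*(d_out + d_in).
    """
    return add(W, scale(matmul(B, A), alpha / r))
```

With B all zeros the effective weight equals W exactly, which is the standard LoRA initialization; QLoRA applies the same update on top of a 4-bit-quantized W.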
3. Production Inference (1 month)

  • vLLM vs TGI vs TensorRT-LLM
  • PagedAttention, continuous batching
  • Quantization (AWQ, GPTQ, GGUF)
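All the quantization schemes named above build on the same round-trip: map floats to low-bit integers with a scale, then dequantize at compute time. A generic symmetric int8 sketch (deliberately simpler than AWQ/GPTQ, which choose scales more cleverly):

```python
def quantize_int8(ws):
    """Symmetric per-tensor int8 quantization (generic sketch, not AWQ/GPTQ).

    Maps the largest-magnitude weight to +/-127 and rounds the rest to the
    nearest integer step. Returns (int codes, scale).
    """
    s = max(abs(w) for w in ws) / 127 or 1.0  # guard against an all-zero tensor
    q = [round(w / s) for w in ws]
    return q, s

def dequantize(q, s):
    # Reconstruct approximate floats; error per weight is at most s/2.
    return [qi * s for qi in q]
```

The quantization error is bounded by half a step (s/2), which is why larger outlier weights hurt: they inflate the scale and coarsen every other weight. AWQ and GPTQ exist largely to mitigate exactly that.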
4. RAG & Agents (1–2 months)

  • Vector DBs: Chroma, Weaviate, Pinecone, Qdrant
  • Advanced RAG: HyDE, self-query, multi-query
  • Build agents with LangGraph / AutoGen
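Underneath every vector DB in this step is the same primitive: embed the query, rank stored documents by cosine similarity, return the top k. A dependency-free sketch (the `top_k` helper and (id, embedding) layout are ours; real stores use approximate nearest-neighbor indexes instead of a full sort):

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length, nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def top_k(query_vec, docs, k=2):
    """docs: list of (doc_id, embedding) pairs.

    Returns the ids of the k most similar documents, best first.
    """
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```

Advanced RAG variants change what gets embedded, not this core: HyDE embeds a hypothetical answer instead of the raw query, and re-ranking re-scores this top-k list with a stronger cross-encoder.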
5. Evaluation & Safety (ongoing)

  • LLM-as-a-judge, MT-Bench style evals
  • Red teaming & jailbreak resistance
  • Constitutional AI / RLHF basics
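MT-Bench-style pairwise judging usually gets aggregated into a leaderboard with an Elo-like rating, as Chatbot Arena does. The standard Elo update from a single judge verdict is tiny (the function name is ours; Arena itself fits a Bradley-Terry model over all battles rather than updating sequentially):

```python
def elo_update(r_a, r_b, winner_a, k=32):
    """One standard Elo update from a pairwise comparison.

    r_a, r_b: current ratings. winner_a: True if model A won the battle.
    k: update step size. Returns the two new ratings.
    """
    # Expected score of A given the current rating gap.
    e_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))
    s_a = 1.0 if winner_a else 0.0
    return r_a + k * (s_a - e_a), r_b + k * ((1 - s_a) - (1 - e_a))
```

With equal ratings, a win moves each model by k/2; the gap a rating system converges to encodes the judge's long-run win rate between the two models.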

Weekly paper discussions • Live implementation sessions • Open research projects