Complete Notes with Clear Explanations, Real-Life Examples & Key Concepts (2025 Perspective)

UNIT V — Applications of AI

1. AI Applications – Overview (2025 Landscape)

Domain | Key Applications (2025) | Leading Examples / Companies
Healthcare | Diagnosis, Drug Discovery, Personalized Medicine | Google DeepMind (AlphaFold 3), IBM Watson Health (now Merative), Tempus
Finance | Fraud Detection, Algorithmic Trading, Credit Scoring | JPMorgan LOXM, PayPal fraud system, Upstart
Transportation | Autonomous Vehicles, Traffic Optimization | Tesla FSD v13, Waymo, Aurora (acquired Uber ATG in 2020)
Education | Personalized Tutoring, Automated Grading | Duolingo, Khan Academy AI, Gradescope
Entertainment | Content Generation, Game AI, Recommendation | Netflix, Midjourney, OpenAI Sora
Manufacturing | Predictive Maintenance, Quality Control | Siemens MindSphere (now Insights Hub), GE Predix
Agriculture | Precision Farming, Crop Monitoring | John Deere See & Spray, Blue River Technology
Defense & Security | Surveillance, Cyber Defense, Drone Swarms | Palantir, Anduril, Israel’s Lavender system

2. Language Models (LLMs) – The Core of Modern AI

Evolution of Language Models

Year | Model Family | Size (parameters) | Breakthrough
2017 | Transformer | n/a | "Attention Is All You Need" paper
2018 | GPT-1 | 117M | Generative Pre-training
2019 | GPT-2 | 1.5B | Zero-shot capabilities
2020 | GPT-3 | 175B | Few-shot learning
2023 | GPT-4 / Claude 2 | Undisclosed (GPT-4 rumored at ~1.7T) | Multimodal input (text + image, GPT-4)
2024 | Llama 3.1 / Grok-2 | Up to 405B (Llama 3.1) | Open-weight models catching up
2025 | GPT-5 class models | Undisclosed | Reasoning, planning, long context (1M+ tokens)

Key Concepts (2025)

  • Pre-training → Instruction Tuning → Alignment (RLHF/RLAIF/DPO)
  • Retrieval-Augmented Generation (RAG) – LLMs + external knowledge
  • Mixture of Experts (MoE) – Only activate needed parameters (e.g., Mixtral, Grok-1)
  • Multimodal Models – Text + Image + Audio + Video (GPT-4o, Gemini 1.5, Claude 3.5)
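
The Mixture of Experts idea above ("only activate needed parameters") can be sketched in a few lines. This is a toy illustration, not how Mixtral or Grok-1 is implemented: the experts are plain Python functions, and `gate_logits` is a hard-coded, hypothetical stand-in for the scores a learned router would produce for one input.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, gate_logits, experts, k=2):
    """Route input x to the top-k experts and mix their outputs."""
    # Pick the k experts with the highest gate scores
    top = sorted(range(len(experts)), key=lambda i: gate_logits[i], reverse=True)[:k]
    # Renormalise the gate weights over the selected experts only
    weights = softmax([gate_logits[i] for i in top])
    # Only the selected experts run; the others are skipped entirely
    output = sum(w * experts[i](x) for w, i in zip(weights, top))
    return output, top

# Four toy "experts" (hypothetical; real experts are feed-forward networks)
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x * x, lambda x: -x]
gate_logits = [0.1, 2.0, 2.0, -1.0]  # hypothetical router scores for this input
y, used = moe_forward(3.0, gate_logits, experts, k=2)
print(used, y)  # experts 1 and 2 run with equal weight: [1, 2] 7.5
```

With k=2 out of 4 experts, only half of the expert parameters are used for this input, which is why MoE models can have a large total size but a much smaller per-token compute cost.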

Real-Life Impact (2025)

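  • Coding: AI assistants (GitHub Copilot, Cursor) now write a substantial share of new code; claims such as "70%+ of all code on GitHub" are estimates, not measured figures
  • Customer support: AI agents (Ada, Intercom AI) handle a large share of routine queries; figures as high as 90% are sometimes quoted but are not an industry-wide measurement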
  • Education: Personalized tutors for millions (Khanmigo, Duolingo Max)

3. Information Retrieval (IR)

Finding relevant documents from large collections.

Classic IR → Modern Neural IR (2025)

Approach | Method | Example Tools (2025)
Boolean Retrieval | AND, OR, NOT | Old search engines
Vector Space Model | TF-IDF + Cosine similarity | Elasticsearch (classic)
BM25 | Probabilistic ranking | Still used in many systems
Dense Retrieval | Embeddings (BERT, ColBERT) | Cohere, Jina AI, Voyage AI
Hybrid Retrieval | BM25 + Dense + Re-ranking | Most production systems
Learned Sparse (SPLADE) | Neural term weighting and expansion in a sparse index | Strong results on the BEIR benchmark
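
To make the BM25 row concrete, here is a minimal sketch of Okapi BM25 scoring in plain Python. The parameter values `k1=1.5` and `b=0.75` are common defaults chosen here as assumptions, the IDF uses the smoothed form found in Lucene-style implementations, and the three documents are made up for illustration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenised document against the query with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalised by document length
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "the cat sat on the mat".split(),
    "dogs chase the cat".split(),
    "stock markets fell sharply".split(),
]
scores = bm25_scores("cat mat".split(), docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)  # document 0 matches both query terms, document 2 matches none
```

Unlike dense retrieval, this needs no model or embeddings, which is one reason BM25 remains a common first stage.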

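Real-Life: Large web search engines combine these stages. Google, for example, has described using neural models such as BERT and MUM in Search, alongside passage-level ranking and re-ranking.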
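
One simple way to combine a lexical ranking and a dense ranking in a hybrid system is Reciprocal Rank Fusion (RRF). The notes do not say which fusion method production systems use, so RRF is an assumption here, as are the document ids and the constant `k=60` (a commonly used value). The re-ranking stage is left out of this sketch.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it returns
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=lambda d: fused[d], reverse=True)

bm25_ranking = ["d1", "d2", "d3"]   # hypothetical lexical results
dense_ranking = ["d3", "d1", "d4"]  # hypothetical embedding results
hybrid = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(hybrid)  # ['d1', 'd3', 'd2', 'd4']
```

RRF only needs ranks, not scores, so it avoids having to calibrate BM25 scores against cosine similarities.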

4. Information Extraction (IE)

Extracting structured data from unstructured text.

Sub-tasks

  • Named Entity Recognition (NER) → Person, Org, Location
  • Relation Extraction → (Elon Musk, CEO_of, Tesla)
  • Event Extraction → (Company X, Acquired, Company Y, $10B, 2025)
  • Template Filling

2025 State-of-the-Art

  • Fine-tuned LLMs (GPT-4, Llama-3-70B-Instruct) often outperform traditional task-specific models
  • Prompt engineering + JSON output mode is a strong, low-effort baseline for IE

Example Prompt for IE (2025 style)

Extract all company acquisitions from the text. Return as JSON:
{
  "acquisitions": [
    {"buyer": "...", "target": "...", "amount_usd": ..., "date": "..."}
  ]
}
Text: "Microsoft acquired Activision Blizzard for $69 billion in October 2023..."
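
A prompt like the one above is only half of an IE system: the model's reply still has to be parsed and checked. The sketch below shows that step under stated assumptions. No real API is called; `fake_reply` is a hard-coded, hypothetical model response, and `parse_acquisitions` and `REQUIRED_KEYS` are illustrative names, not part of any library.

```python
import json

PROMPT_TEMPLATE = (
    "Extract all company acquisitions from the text. Return as JSON:\n"
    '{"acquisitions": [{"buyer": "...", "target": "...", '
    '"amount_usd": ..., "date": "..."}]}\n'
    'Text: "%s"'
)

REQUIRED_KEYS = {"buyer", "target", "amount_usd", "date"}

def parse_acquisitions(raw_response):
    """Parse and validate the model's JSON reply; raise ValueError if malformed."""
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model did not return valid JSON: {exc}") from exc
    if not isinstance(data, dict) or not isinstance(data.get("acquisitions"), list):
        raise ValueError("missing 'acquisitions' list")
    for record in data["acquisitions"]:
        missing = REQUIRED_KEYS - set(record)
        if missing:
            raise ValueError(f"record is missing keys: {sorted(missing)}")
    return data["acquisitions"]

text = "Microsoft acquired Activision Blizzard for $69 billion in October 2023..."
prompt = PROMPT_TEMPLATE % text

# Hypothetical model reply, hard-coded here in place of a real API call
fake_reply = (
    '{"acquisitions": [{"buyer": "Microsoft", "target": "Activision Blizzard", '
    '"amount_usd": 69000000000, "date": "2023-10"}]}'
)
acquisitions = parse_acquisitions(fake_reply)
print(acquisitions[0]["buyer"], acquisitions[0]["amount_usd"])
```

Validating the structure before using it matters because even with a JSON output mode, a model can omit fields or return an unexpected shape.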

5. Natural Language Processing (NLP) Pipeline (2025)

Task | Traditional Method | 2025 Method
Tokenization | Rule-based | Byte-Pair Encoding (BPE), e.g. the tiktoken library
POS Tagging | HMM, CRF | Handled implicitly by LLMs
Parsing | PCFG | Rarely needed as a separate step (LLMs capture syntax implicitly)
Sentiment Analysis | VADER, TextBlob | Prompt GPT-4o or Claude 3.5
Text Classification | BERT fine-tuning | Few-shot with Llama 3.1 405B
Summarization | Extractive (TextRank) | Abstractive with Gemini 1.5 Flash
Question Answering | BiDAF | RAG with long-context models
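
The Byte-Pair Encoding entry in the tokenization row can be illustrated with a toy trainer. This is a character-level sketch of the merge-learning loop only; production tokenizers such as tiktoken work on bytes, use pre-tokenization rules, and are far more optimised. The function name `learn_bpe` and the word list are assumptions made for the example.

```python
from collections import Counter

def learn_bpe(words, num_merges):
    """Learn BPE merge rules from a list of words (toy, character-level)."""
    # Each word starts as a tuple of single characters
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word, fusing each occurrence of the best pair
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges, vocab

merges, vocab = learn_bpe(["low", "low", "lower", "lowest", "newest"], num_merges=2)
print(merges)  # [('l', 'o'), ('lo', 'w')]
```

After two merges the frequent word "low" has become a single token, while rarer words are still split into smaller pieces. That is the core trade-off BPE makes between vocabulary size and sequence length.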

6. Machine Translation (MT)

Evolution

  • Rule-based (1950s–1990s)
  • Statistical MT (1990s–2010s) → Google Translate (old)
  • Neural MT (2016+) → RNN-based at first (Google's GNMT), Transformer-based from 2017
  • 2025: SeamlessM4T v2, NLLB-200, Google Translate (Universal)

Zero-Shot & Multilingual Models (2025)

  • One model translates 200+ languages
  • Real-time voice-to-voice (e.g., Google Meet live translation)

7. Speech Processing

Speech Recognition (ASR) – 2025

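  • Whisper (OpenAI) – One of the strongest open models
  • Google USM, Deepgram, AssemblyAI – Real-time, high accuracy
  • Word Error Rate (WER) below about 3% on clean, read English (e.g., LibriSpeech test-clean); noisy or accented speech is harder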
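
Word Error Rate is the word-level edit distance between the reference transcript and the system output, divided by the number of reference words: WER = (S + D + I) / N, where S, D, and I are substitutions, deletions, and insertions. A minimal sketch, assuming simple whitespace tokenization and no text normalisation (real evaluations usually normalise case and punctuation first):

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # match or substitution
            )
    return dist[len(ref)][len(hyp)] / len(ref)

wer = word_error_rate("the quick brown fox jumps", "the quick brown box jumps")
print(wer)  # one substitution in five words: 0.2
```

A WER of 3% therefore means roughly 3 word errors per 100 reference words. Because insertions count as errors, WER can exceed 100%.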

Text-to-Speech (TTS)

  • ElevenLabs, PlayHT, Respeecher – Voice cloning in seconds
  • Emotion & style control

End-to-End Voice AI (2025)

  • GPT-4o voice mode: Real-time conversation with emotion detection

8. Robotics – The Physical Embodiment of AI

A. Robot Hardware (2025)

Component | 2025 Technology | Example Robots / Benefit
Actuators | High-torque brushless motors, series elastic actuators | Boston Dynamics Atlas, Tesla Optimus
Sensors | LiDAR, RGB-D cameras, tactile skins, IMUs | Figure 01, Agility Robotics Digit
Compute | NVIDIA Jetson AGX Orin (up to 275 TOPS), custom AI chips | Many modern humanoid robots
Batteries | Lithium-ion packs today; solid-state batteries (higher energy density) emerging | Longer operation time

B. Perception

  • Computer Vision: YOLOv10, Segment Anything Model 2 (SAM-2)
  • SLAM (Simultaneous Localization & Mapping): ORB-SLAM3, Kimera
  • Tactile Sensing: GelSight, DIGIT sensors

C. Planning & Decision Making

  • Task & Motion Planning (TAMP)
  • Large Language Models for high-level planning (demonstrated since 2022)
    • SayCan, Code as Policies, RT-2

Example: LLM + Robotics (2025)

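# Sketch: robot uses an LLM for planning. The three functions below are
# hypothetical stand-ins for a real LLM, vision-language model, and robot API.
def generate_plan(command):
    return ["Go to kitchen", "Find kettle", "Fill with water"]  # fixed plan for illustration

def plan_actions(step, camera_image):
    return [f"primitive for: {step}"]  # a real VLM would ground the step in the image

def execute(actions):
    print("executing:", actions)

user_command = "Make me a cup of tea"
for step in generate_plan(user_command):
    current_camera_image = None  # placeholder for a live camera frame
    execute(plan_actions(step, current_camera_image))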

D. Movement & Control

  • Reinforcement Learning (RL) for locomotion
  • Model Predictive Control (MPC)
  • Whole-body control (Boston Dynamics)
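
The receding-horizon idea behind Model Predictive Control can be shown on a toy problem: at every time step, simulate candidate action sequences a few steps ahead, pick the cheapest, apply only its first action, then re-plan. Everything here is an assumption made for illustration: the 1-D point-mass dynamics, the cost weights, and the brute-force search over three discrete accelerations. Real controllers use numerical optimisers and far richer robot models.

```python
from itertools import product

DT = 0.1
ACTIONS = (-1.0, 0.0, 1.0)  # candidate accelerations

def step(state, accel):
    """Toy 1-D point-mass dynamics: state is (position, velocity)."""
    pos, vel = state
    vel = vel + accel * DT
    pos = pos + vel * DT
    return pos, vel

def mpc_action(state, target, horizon=5):
    """Return the first action of the lowest-cost action sequence."""
    best_cost, best_first = float("inf"), 0.0
    for seq in product(ACTIONS, repeat=horizon):
        s, cost = state, 0.0
        for a in seq:
            s = step(s, a)
            # Penalise distance to the target and, more weakly, speed
            cost += (s[0] - target) ** 2 + 0.1 * s[1] ** 2
        if cost < best_cost:
            best_cost, best_first = cost, seq[0]
    return best_first

state, target = (0.0, 0.0), 1.0
for _ in range(100):
    # Re-plan at every step and apply only the first planned action
    state = step(state, mpc_action(state, target))
print(state)  # position should end up close to the target
```

Re-planning at every step is what lets MPC react to disturbances: if the robot is pushed, the next plan simply starts from the new state.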

Leading Humanoid Robots (November 2025)

Robot | Company | Status (2025)
Atlas | Boston Dynamics | Electric version, super agile
Optimus Gen 2 | Tesla | Walking in factories
Figure 01 | Figure AI | Working in BMW plant (pilot)
Apollo | Apptronik | Warehouse tasks
Ameca | Engineered Arts | Best face/expressions

Summary Table – Unit V (2025 Perspective)

Area | Dominant Technology (2025) | Killer Application
Language Models | Multimodal Transformers | AI assistants, code generation
Information Retrieval | Dense + Hybrid Retrieval | Semantic search engines
NLP | Prompting + Fine-tuning LLMs | Chatbots, content creation
Machine Translation | Multilingual seamless models | Real-time global communication
Speech | End-to-end neural (Whisper, USM) | Voice AI agents
Robotics | LLM-guided + Vision + RL control | Humanoid robots in homes/factories

Key Takeaway for 2025–2030
We are moving from “AI that talks” → “AI that sees, hears, and acts in the physical world.”
The next revolution = Embodied AI (Robots + LLMs) and AI Agents that can autonomously achieve complex goals.

You now have the complete big picture of AI applications in 2025! 🚀