LLM/SLM Engineering

Building and Scaling Production Language Models

Duration: 9 Weeks, 2 sessions per week
Time Needed: 2 hours lecture per session
Assignments: 4+ hours hands-on per week
Project: Required project presentation at the end of the course

👋  Welcome! Explore the wonders of large language models. By the end of this course, you will be able to build and deploy LLM applications with confidence. You will understand how LLMs work, from tokens and embeddings all the way to fine-tuning your own models and using agents to build production-grade applications.

✨ The course concludes with a Live Demo Day where you’ll showcase your innovative projects!

🧑‍💻  Your LLM Journey

  • Foundations: Begin with the transformer model, exploring each layer as we build GPT from scratch. Master concepts like positional encoding, self-attention, and multi-head attention. Get ready to tinker with the mechanics to uncover the “magic” behind LLMs.
  • Enabling LLMs: Learn the best practices for prompt engineering, retrieval-augmented generation, fine-tuning, and agent integration. By this stage, you’ll understand how to seamlessly incorporate LLMs into your applications and identify the best approach for your use case.
  • Deployment & Monitoring: Discover how to evaluate, deploy, and monitor LLM performance. Explore setups like on-premises deployments, Hugging Face Spaces, and monitoring tools such as Weights & Biases (WandB) and LangFuse.

🧑‍🤝‍🧑 Learning Activities and Support

  • Every week, you will go through assignments and group discussions.
  • One-on-one office hours will also be available in case you need more TLC.

🚀 Get ready for an interactive and exciting learning experience!

Part 1: The Foundations

Part 1 explores the “first principles” of large language models by thoroughly dissecting the Transformer architecture. This is a deep dive into every component and layer of the Transformer model, so that learners understand the intuition behind the learning capabilities of these advanced models.
  
Session #1: Great Expectations
  • Introduction
  • AI Today and Tomorrow
  • Overview of AI Engineering
  • What’s SOTA (State of the Art)
Session #2: The Transformer Model (1/2)
  • The Transformer Architecture
  • Encoder and Decoder Blocks
  • BART, BERT, GPT-style
Session #3: The Transformer Model (2/2)
  • Write Encoder-Decoder Transformer Model from scratch
  • Explain each component along the way
Session #4: The Building Blocks (1/2)
We will look into each layer of the transformer model with the help of a visualization tool.
  • Tokenization: Text to Tokens
  • Positional Encoding: Understanding the Order in Context
  • Normalization, Dropout and Residual Connections
  • Embedding Layers: Words to Numbers
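To preview the positional-encoding topic above, here is a minimal sketch of the sinusoidal scheme from the original Transformer paper, in plain Python (the tiny `seq_len` and `d_model` values are illustrative only):

```python
import math

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding:
    PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    """
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe

# Each position gets a unique, deterministic vector the model can add to its embeddings.
pe = positional_encoding(seq_len=4, d_model=8)
```

Because the encoding is a fixed function of position, it needs no training and extends to any sequence length.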
Session #5: The Building Blocks (2/2)
  • Attention Layers: Self-attention and Multi-head attention
  • Feedforward Network Layers: Firing the engines
  • LayerNorms: Stabilizing the network
  • Projection Layer: Shaping the output
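To make the attention topic concrete, here is a minimal sketch of scaled dot-product attention in plain Python (single head, no learned projections; the tiny 2-dimensional vectors are illustrative only):

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)  # how much each position attends to the others
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))])
    return out

# Self-attention: queries, keys, and values all come from the same sequence.
Q = K = V = [[1.0, 0.0], [0.0, 1.0]]
ctx = attention(Q, K, V)
```

Multi-head attention simply runs several such attentions in parallel over learned projections of Q, K, and V, then concatenates the results.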
Session #6: Next Token Prediction
  • Temperature – From precision to creativity
  • Decoding Methods – Top K, Top P and more
  • Guardrails – controlling the outputs
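Temperature and top-k decoding can be sketched in a few lines of plain Python (the logits below are made up for illustration):

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Lower temperature sharpens the distribution; higher flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    s = sum(exps)
    return [e / s for e in exps]

def sample_top_k(logits, k, temperature=1.0, rng=random):
    """Keep only the k highest logits, renormalize, and sample one token index."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top], temperature)
    return rng.choices(top, weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.1, -1.0]
greedy = sample_top_k(logits, k=1)  # k=1 is greedy decoding: always the argmax
```

Top-p (nucleus) sampling works the same way, except the kept set is the smallest prefix of sorted tokens whose cumulative probability exceeds p.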
Session #7: Pre-Training & Alignment
  • Loss and Cross Entropy Loss: The model’s compass
  • RLHF (Reinforcement Learning from Human Feedback)
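As a preview of the cross-entropy topic, here is a worked one-token example in plain Python (the probabilities are made up for illustration):

```python
import math

def cross_entropy(probs, target_index):
    """Cross-entropy for next-token prediction: -log p(correct token)."""
    return -math.log(probs[target_index])

# The model assigns probabilities [0.7, 0.2, 0.1] to three candidate tokens.
good = cross_entropy([0.7, 0.2, 0.1], target_index=0)  # correct token was likely
bad = cross_entropy([0.7, 0.2, 0.1], target_index=2)   # correct token was unlikely
```

The loss is near zero when the model puts high probability on the true next token and grows without bound as that probability approaches zero, which is exactly the pressure pre-training applies at every position.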

Part 2: Enabling LLMs

The second part focuses on best practices for integrating large language models (LLMs) into applications, beginning with Retrieval Augmented Generation (RAG) and advanced techniques to improve context relevance. It further explores fine-tuning models using efficient parameter adaptation, quantization, and practical implementation with small language models (SLMs). Additionally, it delves into optimizing inference through agents, highlighting function calling, LangGraph, and reasoning-action frameworks for enhanced performance and functionality.

Session #8: Enabling LLMs – Retrieval Augmented Generation (1/2)
  • Why Context Matters
  • Retrieval Augmented Generation
  • Introduction to Vector Databases
  • RAG Evaluation (RAGAS)
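The core of vector retrieval is cosine similarity between embeddings. A minimal sketch in plain Python (the 3-dimensional vectors and document IDs are made up for illustration; real embeddings have hundreds of dimensions):

```python
import math

def cosine(a, b):
    """Cosine similarity: the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, docs, k=2):
    """Rank stored documents by similarity to the query embedding."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["id"] for d in ranked[:k]]

docs = [
    {"id": "a", "vec": [1.0, 0.0, 0.0]},
    {"id": "b", "vec": [0.9, 0.1, 0.0]},
    {"id": "c", "vec": [0.0, 0.0, 1.0]},
]
hits = top_k([1.0, 0.05, 0.0], docs, k=2)
```

A vector database does exactly this at scale, using approximate nearest-neighbor indexes so it does not have to scan every document.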
Session #9: Enabling LLMs – Retrieval Augmented Generation (2/2)
  • LangChain and LlamaIndex for RAG
  • Fusion, Reciprocal Rank Fusion
  • RAPTOR and other variants of RAG
  • Your First RAG application
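Reciprocal Rank Fusion, covered above, is simple enough to sketch directly (the two ranked lists below are made up; in practice they might come from BM25 and vector search):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: score(doc) = sum over lists of 1 / (k + rank).
    k=60 is the constant from the original RRF paper."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_results = ["doc1", "doc2", "doc3"]   # e.g. from BM25
vector_results = ["doc3", "doc1", "doc4"]    # e.g. from embedding search
fused = reciprocal_rank_fusion([keyword_results, vector_results])
```

Documents that rank well in several retrievers (like doc1 and doc3 here) float to the top, without the fusion step needing to compare raw scores across retrievers.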
Session #10: Enabling LLMs – Fine-Tuning Embeddings and Models
  • Understanding parameter-efficient methods and low-rank adaptation
  • Quantization
  • Fine-tuning SLM on HuggingFace
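The low-rank adaptation (LoRA) idea above can be sketched with tiny matrices in plain Python (the 2x2 weight and rank-1 factors are illustrative only; real layers are far larger):

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lora_weight(W, A, B, alpha, r):
    """LoRA: effective weight W' = W + (alpha / r) * B @ A.
    W (d x d) stays frozen; only the small factors B (d x r) and A (r x d)
    are trained, cutting trainable parameters from d*d to 2*d*r."""
    delta = matmul(B, A)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]

W = [[1.0, 0.0], [0.0, 1.0]]  # frozen pretrained weight
B = [[1.0], [0.0]]            # trained, d x r with r=1
A = [[0.0, 2.0]]              # trained, r x d
W_adapted = lora_weight(W, A, B, alpha=1, r=1)
```

Because the update is a separate low-rank product, adapters can be stored, swapped, or merged into the base weights cheaply; in practice libraries such as Hugging Face PEFT handle this bookkeeping.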
Session #11: Enabling LLMs – Improving Inference with Agents
  • Function calling
  • LangGraph 
  • The Reasoning-Action (ReAct) Agents
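The function-calling pattern above boils down to a tool registry plus a dispatcher for the model's structured output. A minimal sketch in plain Python (`get_weather` and the JSON call format are hypothetical stand-ins; real providers each define their own schema):

```python
import json

TOOLS = {}

def tool(fn):
    """Register a Python function so it can be invoked by name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_weather(city: str) -> str:
    # Hypothetical tool; a real one would call a weather API.
    return f"Sunny in {city}"

def dispatch(model_output: str) -> str:
    """Execute a model-emitted call like {"name": ..., "arguments": {...}}
    and return the tool result to feed back into the conversation."""
    call = json.loads(model_output)
    return TOOLS[call["name"]](**call["arguments"])

result = dispatch('{"name": "get_weather", "arguments": {"city": "Manila"}}')
```

A ReAct agent wraps this in a loop: the model reasons, emits an action (a tool call), observes the result, and repeats until it can answer; frameworks like LangGraph manage that loop as a graph of states.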
Session #12: Project Preparation
  • Innovation and Ideation
  • Industry Use Cases
  • Brainstorming Techniques and Strategies

Part 3: Productionizing LLM Apps

The final part of the program focuses on evaluating and optimizing LLM applications, covering benchmarks, monitoring performance, and scaling effectively. Participants will learn practical techniques like caching prompts, managing requests, and creating efficient data pipelines to build scalable applications. It also includes deploying open-source solutions, measuring performance, and right-sizing hardware for efficiency. The program culminates with a demo day, where learners showcase their projects to industry experts and the public, followed by a certification ceremony.

Session #13: Evaluation and Monitoring LLM Applications
  • Massive Text Embedding Benchmark (MTEB)
  • Monitoring and Visibility: Efficient inference, scaling, and tracking performance
Session #14: Improving Inference and Optimization Strategies 
  • Semantic Chunking
  • Prompt and Embedding Caching
  • Request Queues
  • Data pipelines
  • Building Scalable RAG application
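Embedding caching, one of the optimizations above, can be sketched as a content-addressed store keyed by a hash of the input text (the lambda embedder below is a stand-in for a real model endpoint):

```python
import hashlib

class EmbeddingCache:
    """Cache embeddings by a hash of the text, so repeated chunks or
    re-indexed documents are only embedded once."""
    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.store = {}
        self.misses = 0  # number of actual embedding calls made

    def get(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self.store:
            self.misses += 1
            self.store[key] = self.embed_fn(text)
        return self.store[key]

# Stand-in embedder; a real one would call a model or API.
cache = EmbeddingCache(lambda t: [float(len(t))])
v1 = cache.get("hello")
v2 = cache.get("hello")  # second lookup is served from the cache
```

Prompt caching follows the same pattern with (prompt, parameters) as the key; both cut cost and latency on the repeated traffic that dominates production workloads.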
Session #15: Deploying Scalable Open-Source Endpoints
  • Measuring with LangFuse
  • Hugging Face Text Generation Inference (TGI)
  • GPT-Generated Unified Format (GGUF)
  • Rightsizing the GPU
Session #16: Demo Day Rehearsal
  • Demo Day Rehearsal and Feedback
Session #17: Demo Day and Graduation
Live LLM Demo with industry practitioners and the public. 
  • Public Presentation and Demo Day
  • Graduation and Certification Ceremony

Jumpstart your AI Career Now! Learn the state of the art in AI and join the AI bandwagon.
Price: P 50,000.00

Sign up here now!
Note: A skill assessment will be conducted prior to acceptance in this program.
Scholarships and discounts available for highly qualified applicants.
