Course Curriculum

    1. Overview

    1. Rationale for LLM and Agent Evaluation

    2. Components of LLM Evaluation

    3. Tasks and Benchmark Datasets for Evaluation

    4. Challenges in LLM Evaluation

    5. Quiz: LLM Evaluation Fundamentals

    1. Classic and Contextual Embedding Approaches

    2. BLEU, ROUGE and BERTScore

    3. Evaluating RAG-Based Applications

    4. Faithfulness

    5. Answer Relevancy

    6. Context Precision

    7. Context Recall

    8. Evaluation of RAG Applications Using RAGAS

    1. Evaluating a RAG Application Using RAGAS

    2. LLM-as-a-Judge Evaluation

    3. Classic Evaluation Metrics

    4. Semantic Similarity With BERTScore

    1. Slide Deck

    2. OpenAI API Key Setup

About this course

  • Free
  • 20 lessons
  • 1.5 hours of video content