Reading materials for class AAI520

Natural Language Processing and GenAI - AAI 520

Textbook: Speech and Language Processing - Dan Jurafsky and James H. Martin

Reading Resources and Modules

Module 1:

Textbook: Speech and Language Processing, 3rd ed. (Book website). Jurafsky, D., & Martin, J. H. (2025)

  • Read Chapter 2: Regular Expressions, Tokenization, Edit Distance, pages 13 to 29 (Sections 2.2-2.8); a minimum edit distance sketch follows this list:
    • Section 2.2: Words
    • Section 2.3: Corpora
    • Section 2.4: Simple Unix Tools for Word Tokenization
    • Section 2.5: Word and Subword Tokenization
    • Section 2.6: Word Normalization, Lemmatization and Stemming
    • Section 2.7: Sentence Segmentation
    • Section 2.8: Minimum Edit Distance
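
For a hands-on companion to Section 2.8, here is a minimal Python sketch of the minimum edit distance dynamic program. The cost convention (insert 1, delete 1, substitute 2) follows the Levenshtein variant used in the chapter's intention/execution example; the function name and layout are illustrative, not the textbook's.

```python
def min_edit_distance(source: str, target: str, sub_cost: int = 2) -> int:
    n, m = len(source), len(target)
    # D[i][j] holds the edit distance between source[:i] and target[:j]
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i                      # i deletions from source
    for j in range(1, m + 1):
        D[0][j] = j                      # j insertions into source
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if source[i - 1] == target[j - 1] else sub_cost
            D[i][j] = min(D[i - 1][j] + 1,         # deletion
                          D[i][j - 1] + 1,         # insertion
                          D[i - 1][j - 1] + sub)   # substitution or match
    return D[n][m]

print(min_edit_distance("intention", "execution"))  # 8, the chapter's worked example
```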

Articles:

Required Media

Module 2: Named Entity Recognition (NER) and Part-of-Speech (PoS) Tagging

Textbook: Speech and Language Processing, 3rd ed. (Book website). Jurafsky, D., & Martin, J. H. (2025)

Articles:

  • Read the Linguistic Features - Entity Linking
  • Read the Part of Speech Tagging for Beginners
  • Read the Understanding Named Entity Recognition: What Is It And How To Use It In Natural Language Processing?
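
Both tasks covered in these readings can be tried out in a few lines with spaCy. A minimal sketch, assuming spaCy and its small English model en_core_web_sm are installed (the example sentence is made up):

```python
import spacy

# Assumes: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is opening its first store in San Diego in May.")

# Part-of-speech tags for every token
for token in doc:
    print(token.text, token.pos_, token.tag_)

# Named entities recognized in the same parse
for ent in doc.ents:
    print(ent.text, ent.label_)
```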

Required Media

Watch the Text processing, POS tagging, and Named entity recognition - Part 2

Module 3: Transformers

Tunstall, L., Von Werra, L., & Wolf, T. (2022). Natural Language Processing with Transformers. O'Reilly Media

  • Read Chapter 1: Hello Transformers, pages 1 to 14.
  • Read Chapter 3: Transformer Anatomy, pages 57 to 75.

Textbook: Speech and Language Processing, 3rd ed. (Book website). Jurafsky, D., & Martin, J. H. (2025)

  • Read Chapter 9: The Transformer, pages 184 to 200 (Sections 9.1-9.5); an attention sketch follows this list:
    • 9.1 Attention
    • 9.2 Transformer Blocks
    • 9.3 Parallelizing computation using a single matrix
    • 9.4 The input: embeddings for token and position
    • 9.5 The Language Modeling Head
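
To make Sections 9.1 and 9.3 concrete, here is a minimal NumPy sketch of causal scaled dot-product self-attention, the operation the chapter builds Transformer blocks from. Using the token matrix itself as queries, keys, and values is a simplification; real models apply learned projections first.

```python
import numpy as np

def causal_self_attention(X):
    # X: (seq_len, d) matrix of token vectors; here Q = K = V = X.
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                # query-key similarities
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(mask, -1e9, scores)        # position i attends only to <= i
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ X                           # weighted sum of values

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))                 # 4 tokens, vector dim 8
print(causal_self_attention(tokens).shape)       # (4, 8)
```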

Articles:

Required Media

Module 4: Large Language Models (LLMs)

Textbook: Speech and Language Processing, 3rd ed. (Book website). Jurafsky, D., & Martin, J. H. (2025)

  • Read Chapter 10: Large Language Models, pages 203 to 219 (Sections 10.1-10.6); a sampling sketch follows this list:
    • 10.1 Large Language Models with Transformers
    • 10.2 Sampling for LLM Generation
    • 10.3 Pretraining Large Language Models
    • 10.4 Evaluating Large Language Models
    • 10.5 Dealing with Scale
    • 10.6 Potential Harms from Language Models
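
As a companion to Section 10.2, here is a minimal NumPy sketch of two common decoding strategies, temperature scaling and top-k truncation. The toy logits and parameter defaults are illustrative assumptions, not values from the chapter.

```python
import numpy as np

def sample_next_token(logits, temperature=0.8, top_k=50, rng=None):
    # Temperature reshapes the distribution; top-k truncates it to the
    # k most likely tokens before renormalizing.
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=float) / temperature
    if top_k is not None and top_k < logits.size:
        kth = np.sort(logits)[-top_k]            # k-th largest logit
        logits = np.where(logits < kth, -np.inf, logits)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                         # softmax over surviving tokens
    return int(rng.choice(logits.size, p=probs))

toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]         # pretend vocabulary of 5 tokens
print(sample_next_token(toy_logits, temperature=0.7, top_k=3))
```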

Articles:

Required Media

Module 5: Prompt Engineering

Textbook: Speech and Language Processing, 3rd ed. (Book website). Jurafsky, D., & Martin, J. H. (2025)

  • Read Chapter 12: Model Alignment, Prompting, and In-Context Learning, pages 242 to 258 (Sections 12.1-12.7); a chain-of-thought prompt example follows this list:
    • 12.1 Prompting
    • 12.2 Post-training and Model Alignment
    • 12.3 Model Alignment: Instruction Tuning
    • 12.4 Chain-of-Thought Prompting
    • 12.5 Automatic Prompt Optimization
    • 12.6 Evaluating Prompted Language Models
    • 12.7 Model Alignment with Human Preferences: RLHF and DPO
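
To illustrate Section 12.4, here is a small Python snippet that assembles a one-shot chain-of-thought prompt. The demonstration is the well-known tennis-ball example from the chain-of-thought literature; the model call itself is left out, since any instruction-tuned LLM API can consume the resulting string.

```python
# A chain-of-thought demonstration includes the intermediate reasoning,
# so the model imitates step-by-step solutions on the new question.
demonstration = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
    "6 tennis balls. 5 + 6 = 11. The answer is 11.\n"
)

question = ("The cafeteria had 23 apples. If they used 20 to make lunch "
            "and bought 6 more, how many apples do they have?")

prompt = demonstration + f"Q: {question}\nA:"
print(prompt)   # send this string to any instruction-tuned LLM
```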

Articles:

Required Media

Watch the Prompt Engineering Tutorial (41:36) video

Module 6: Building solutions with Hugging Face

Textbook: Speech and Language Processing, 3rd ed. (Book website). Jurafsky, D., & Martin, J. H. (2025)

Articles:

Required Media

  • Watch the What is HuggingFace?
  • Watch the What is Retrieval-Augmented Generation (RAG)?
  • Watch the RAG vs Fine-Tuning vs Prompt Engineering: Optimizing AI Models
  • Watch the LoRA explained (and a bit about precision and quantization)
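
These videos center on the Hugging Face transformers library, whose pipeline API can be exercised in a few lines. A minimal sketch, assuming the package is installed and the library's default models can be downloaded on first run (the input sentences are made up):

```python
from transformers import pipeline

# Sentiment analysis with the library's default model (downloaded on first run).
classifier = pipeline("sentiment-analysis")
print(classifier("Hugging Face pipelines make inference a one-liner."))

# The same API covers other tasks; here, named entity recognition.
ner = pipeline("ner", aggregation_strategy="simple")
print(ner("Hugging Face Inc. is based in New York City."))
```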

Module 7: Agentic AI

Required Readings

Required Media

Watch the New Hugging Face Agents - Full Tutorial (24:37)
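
For orientation before the tutorial: at its core, an agent is a loop in which a model alternates proposed tool calls with observed results until it emits a final answer. Below is a toy, dependency-free sketch of that loop; the llm() stub, the Action/Observation format, and the calculator tool are all hypothetical stand-ins, not the Hugging Face Agents API.

```python
def calculator(expression: str) -> str:
    """Arithmetic tool the agent can call (demo only; never eval untrusted input)."""
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def llm(history: str) -> str:
    # Hard-coded stub policy standing in for a real model: call the
    # calculator once, then report the observed result as the answer.
    if "Observation:" in history:
        return "Final Answer: " + history.rsplit("Observation: ", 1)[-1].strip()
    return "Action: calculator[37 * 21]"

def run_agent(question: str, max_steps: int = 3) -> str:
    history = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(history)
        if step.startswith("Final Answer:"):
            return step
        tool, arg = step[len("Action: "):].rstrip("]").split("[", 1)
        history += f"{step}\nObservation: {TOOLS[tool](arg)}\n"
    return "Final Answer: (step limit reached)"

print(run_agent("What is 37 * 21?"))   # Final Answer: 777
```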

Recommended Readings
