Textbook: Speech and Language Processing, 3rd ed. (Book website). Jurafsky, D., & Martin, J. H. (2025)
- Read Chapter 2: Regular Expressions, Tokenization, Edit Distance, pages 13 to 29 (Sections 2.2-2.8):
- Section 2.2: Words
- Section 2.3: Corpora
- Section 2.4: Simple Unix Tools for Word Tokenization
- Section 2.5: Word and Subword Tokenization
- Section 2.6: Word Normalization, Lemmatization and Stemming
- Section 2.7: Sentence Segmentation
- Section 2.8: Minimum Edit Distance
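To make Section 2.8 concrete, here is a minimal Python sketch of the dynamic-programming algorithm. It assumes the chapter's cost scheme (insertion and deletion cost 1, substitution cost 2); the test strings are the chapter's intention/execution example.

```python
def min_edit_distance(source: str, target: str) -> int:
    """Minimum edit distance by dynamic programming (Section 2.8)."""
    n, m = len(source), len(target)
    # D[i][j] = cheapest way to turn source[:i] into target[:j]
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i                      # i deletions
    for j in range(1, m + 1):
        D[0][j] = j                      # j insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if source[i - 1] == target[j - 1] else 2
            D[i][j] = min(
                D[i - 1][j] + 1,         # delete from source
                D[i][j - 1] + 1,         # insert into source
                D[i - 1][j - 1] + sub,   # substitute, or copy for free
            )
    return D[n][m]

print(min_edit_distance("intention", "execution"))  # 8 under these costs
```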
Articles:
- Read the Introduction to NLP - Part 1: Preprocessing Text in Python.
- Read the Natural Language Processing | Text Preprocessing | Spacy vs NLTK
- Read the NLP: Tokenization, Stemming, Lemmatization, and Part of Speech Tagging.
- Watch the Natural Language Processing In 10 Minutes
- Watch the Natural Language Processing In 5 Minutes
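The three preprocessing articles above all walk the same pipeline; as a rough sketch, the core steps look like this in NLTK (one of the two libraries the articles compare). It assumes `pip install nltk` plus the one-time data downloads shown; the example sentence is made up.

```python
import nltk
nltk.download("punkt")    # tokenizer data (newer NLTK releases may need "punkt_tab")
nltk.download("wordnet")  # dictionary used by the lemmatizer

from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer, WordNetLemmatizer

text = "The cats are running faster than the dogs."
tokens = word_tokenize(text.lower())              # tokenization + case folding
stemmer, lemmatizer = PorterStemmer(), WordNetLemmatizer()
print([stemmer.stem(t) for t in tokens])          # crude suffix stripping
print([lemmatizer.lemmatize(t) for t in tokens])  # dictionary lookup, noun by default
```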
Textbook: Speech and Language Processing, 3rd ed. (Book website). Jurafsky, D., & Martin, J. H. (2025)
- Read Chapter 17: Sequence Labeling for Parts of Speech and Named Entities, pages 362 to 375 (Sections 17.1-17.4):
- 17.1 (Mostly) English Word Classes
- 17.2 Part-of-Speech Tagging
- 17.3 Named Entities and Named Entity Tagging
- 17.4 HMM Part-of-Speech Tagging
Articles:
- Read the Linguistic Features - Entity Linking
- Read the Part of Speech Tagging for Beginners
- Read the Understanding Named Entity Recognition: What Is It And How To Use It In Natural Language Processing?
- Watch the Text processing, POS tagging, and Named entity recognition - Part 2
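A few lines of spaCy tie Sections 17.2 and 17.3 together. This sketch assumes `pip install spacy` and `python -m spacy download en_core_web_sm`; the input sentence is invented.

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is opening a new office in Sao Paulo next March.")

for token in doc:
    print(token.text, token.pos_, token.tag_)  # coarse POS + Penn Treebank tag

for ent in doc.ents:
    print(ent.text, ent.label_)                # named entities, e.g. ORG, GPE, DATE
```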
Textbook: Natural Language Processing with Transformers. Tunstall, L., von Werra, L., & Wolf, T. (2022). O'Reilly Media
- Read Chapter 1 Hello Transformers pages 1 to 14.
- Read Chapter 3 Transformer Anatomy pages 57 to 75.
Textbook: Speech and Language Processing, 3rd ed. (Book website). Jurafsky, D., & Martin, J. H. (2025)
- Read Chapter 9: The Transformer, pages 184 to 200 (Sections 9.1-9.5):
- 9.1 Attention
- 9.2 Transformer Blocks
- 9.3 Parallelizing computation using a single matrix
- 9.4 The input: embeddings for token and position
- 9.5 The Language Modeling Head
Articles:
- Read The Illustrated Transformer
- Watch the Attention Is All You Need
- Watch the What Are Transformers?
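The computation at the heart of Section 9.1 (and of The Illustrated Transformer) fits in a few lines of NumPy. In this sketch the weights and the 4-token input are random, so only the shapes and the softmax-weighted mixing carry meaning.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n, d = 4, 8                      # 4 tokens, model dimension 8
X = rng.normal(size=(n, d))      # token embeddings
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

Q, K, V = X @ Wq, X @ Wk, X @ Wv
scores = Q @ K.T / np.sqrt(d)        # compare every query with every key
weights = softmax(scores, axis=-1)   # each row sums to 1
output = weights @ V                 # weighted mix of the value vectors

print(weights.round(2))              # 4x4 attention matrix
print(output.shape)                  # (4, 8)
```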
Textbook: Speech and Language Processing, 3rd ed. (Book website). Jurafsky, D., & Martin, J. H. (2025)
- Read Chapter 10: Large Language Models, pages 203 to 219 (Sections 10.1-10.6):
- 10.1 Large Language Models with Transformers
- 10.2 Sampling for LLM Generation
- 10.3 Pretraining Large Language Models
- 10.4 Evaluating Large Language Models
- 10.5 Dealing with Scale
- 10.6 Potential Harms from Language Models
Articles:
- Read the What is LLM (Large Language Model)
- Read the How Large Language Models Work: From Zero to ChatGPT
- Read the Language Models are Few-Shot Learners
- Read the Build a Retrieval Augmented Generation (RAG) App: Part 1
- Read the Build a Retrieval Augmented Generation (RAG) App: Part 2
- Watch the What is Retrieval-Augmented Generation (RAG)? (06:35)
- Watch the LangChain Master Class For Beginners 2024 (3:17:15) Closed captioning available.
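Section 10.2's decoding knobs are easiest to see on a toy distribution. In this sketch the five-word vocabulary and its logits are invented, and only temperature and top-k sampling are implemented.

```python
import numpy as np

rng = np.random.default_rng(42)
vocab = ["the", "cat", "sat", "on", "mat"]
logits = np.array([2.0, 1.0, 0.5, 0.2, -1.0])     # hypothetical model output

def sample(logits, temperature=1.0, top_k=None):
    z = logits / temperature                      # <1 sharpens, >1 flattens
    if top_k is not None:
        cutoff = np.sort(z)[-top_k]
        z = np.where(z >= cutoff, z, -np.inf)     # drop all but the k best
    probs = np.exp(z - z.max())
    probs /= probs.sum()
    return vocab[rng.choice(len(vocab), p=probs)]

print([sample(logits, temperature=0.7, top_k=3) for _ in range(5)])
```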
Textbook: Speech and Language Processing, 3rd ed. (Book website). Jurafsky, D., & Martin, J. H. (2025)
- Read Chapter 12: Model Alignment, Prompting, and In-Context Learning, pages 242 to 258 (Sections 12.1-12.7):
- 12.1 Prompting
- 12.2 Post-training and Model Alignment
- 12.3 Model Alignment: Instruction Tuning
- 12.4 Chain-of-Thought Prompting
- 12.5 Automatic Prompt Optimization
- 12.6 Evaluating Prompted Language Models
- 12.7 Model Alignment with Human Preferences: RLHF and DPO
Articles:
- Read the Prompt Engineering Guide webpage.
- Read the Prompt Engineering - Write Clear Instructions
- Watch the Prompt Engineering Tutorial (41:36) video
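Read alongside Sections 12.1 and 12.4, the gap between direct prompting and chain-of-thought prompting can be shown with two strings. The arithmetic task and worked example below follow the pattern popularized in the chain-of-thought literature; send either string to whichever LLM API the course uses.

```python
# Direct prompt: ask for the answer outright.
direct_prompt = (
    "Q: A cafeteria had 23 apples. It used 20 and bought 6 more. "
    "How many apples are there now?\nA:"
)

# Chain-of-thought prompt: one worked example elicits step-by-step reasoning.
cot_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
    "How many balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
    "Q: A cafeteria had 23 apples. It used 20 and bought 6 more. "
    "How many apples are there now?\nA:"
)

print(cot_prompt)
```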
Textbook: Speech and Language Processing, 3rd ed. (Book website). Jurafsky, D., & Martin, J. H. (2025)
- Read Chapter 14: Question Answering, Information Retrieval, and Retrieval Augmented Generation, pages 289 to 304 (Sections 14.1-14.4):
- 14.1 Information Retrieval
- 14.2 Information Retrieval with Dense Vectors
- 14.3 Answering Questions with RAG
- 14.4 Evaluating Question Answering
Articles:
- Read the Hugging Face LLM Course: Two Roadmaps - The LLM Scientist and The LLM Engineer
- Read the Hugging Face LLM Course: Transformer Models
- Read the How to Fine-tune Open LLMs in 2025 with Hugging Face
- Watch the What is HuggingFace?
- Watch the What is Retrieval-Augmented Generation (RAG)?
- Watch the RAG vs Fine-Tuning vs Prompt Engineering: Optimizing AI Models
- Watch the LoRA explained (and a bit about precision and quantization)
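Before the dense-vector readings, it helps to run the sparse baseline from Section 14.1 end to end. This scikit-learn sketch scores a made-up three-document corpus against a query with tf-idf and cosine similarity; the top hit is the context a RAG system would paste into its prompt.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "The transformer uses self-attention instead of recurrence.",
    "Minimum edit distance is computed with dynamic programming.",
    "RAG retrieves passages and conditions generation on them.",
]
query = ["How does retrieval augmented generation work?"]

vectorizer = TfidfVectorizer()
doc_vecs = vectorizer.fit_transform(docs)     # sparse tf-idf document vectors
query_vec = vectorizer.transform(query)       # same vocabulary for the query

scores = cosine_similarity(query_vec, doc_vecs)[0]
best = scores.argmax()
print(scores.round(3), "->", docs[best])      # highest-scoring passage wins
```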
- Register for the Hugging Face Agents Course, and then:
- Read the Unit 0 - Onboarding
- Read the Unit 1 - Introduction to Agents
- Read the Unit 2 - Introduction to Agentic Frameworks
- Review the Microsoft AutoGen GitHub A Framework for Building AI Agents and Applications
- Read the Anthropic Building Effective Agents
- Read the Anthropic New Capabilities for Building Agents
- Watch the New Hugging Face Agents - Full Tutorial (24:37)
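Unit 1 of the Agents course frames an agent as a think/act/observe loop around an LLM. The library-free sketch below swaps the LLM for a hard-coded keyword router just to show the loop's shape; everything here, including the `calculator` tool and `fake_llm`, is a hypothetical stand-in, not any framework's API.

```python
def calculator(expression: str) -> str:
    # Toy tool: evaluate arithmetic. eval() is for trusted demo input only.
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def fake_llm(question: str) -> tuple[str, str]:
    # Stand-in policy: a real agent would ask an LLM to pick the action.
    if any(ch.isdigit() for ch in question):
        return "calculator", question.rstrip("?").split("is")[-1].strip()
    return "final_answer", "I don't know."

def run_agent(question: str) -> str:
    tool_name, tool_input = fake_llm(question)   # think: choose an action
    if tool_name == "final_answer":
        return tool_input
    observation = TOOLS[tool_name](tool_input)   # act, then observe the result
    return f"The answer is {observation}."

print(run_agent("What is 12 * 7?"))  # The answer is 84.
```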