Efficiently Serving LLMs
Exploring techniques such as vectorization, KV caching, continuous batching, and LoRA.
Generative AI with LLMs
Exploring how large language models work, along with best practices for training, tuning, and deploying them.
Calculate LLM GPU Requirements
How much VRAM do you actually need?
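As a quick back-of-the-envelope answer to the VRAM question, inference memory is roughly the parameter count times the bytes per parameter, plus some overhead for activations and the KV cache. The sketch below is an assumption-laden rule of thumb, not a substitute for the resource above; the 20% overhead factor in particular is a rough placeholder.

```python
def estimate_vram_gb(num_params_b: float,
                     bytes_per_param: int = 2,
                     overhead: float = 1.2) -> float:
    """Rough inference VRAM estimate in GB.

    num_params_b: model size in billions of parameters.
    bytes_per_param: 2 for FP16/BF16, 1 for INT8, 4 for FP32.
    overhead: assumed ~20% extra for activations and KV cache.
    """
    return num_params_b * bytes_per_param * overhead

# A 7B model in FP16 under these assumptions: 7 * 2 * 1.2 = 16.8 GB.
print(f"{estimate_vram_gb(7):.1f} GB")
```

For example, the same 7B model quantized to INT8 (`bytes_per_param=1`) would come in at roughly half that figure.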