QLoRA—How to Fine-tune an LLM on a Single GPU (w/ Python Code)
52,593 views
Published 2024-02-27
In this video, I discuss how to fine-tune an LLM using QLoRA (i.e. Quantized Low-rank Adaptation). Example code is provided for training a custom YouTube comment responder using Mistral-7b-Instruct.
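For context, the "low-rank adaptation" half of the name can be sketched in a few lines of dependency-free Python. This is an illustration only, with toy sizes and names of my own choosing — the real training code (using peft and bitsandbytes) is in the Colab linked below:

```python
# Minimal sketch of the LoRA idea behind QLoRA (illustration only).
# The frozen base weight W stays quantized in QLoRA; only the small
# low-rank factors B and A are trained.

def matmul(X, Y):
    """Multiply two matrices given as lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_effective_weight(W, B, A, alpha, r):
    """Effective weight W + (alpha / r) * B @ A, as applied at inference time."""
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]

d, k, r = 4, 4, 1                  # toy sizes; real layers use d, k in the thousands, r ~ 8-64
W = [[1.0] * k for _ in range(d)]  # frozen base weight
B = [[0.0] * r for _ in range(d)]  # B is zero-initialized, so training starts exactly at W
A = [[0.5] * k for _ in range(r)]  # A gets a small random init in practice

W_eff = lora_effective_weight(W, B, A, alpha=16, r=r)
assert W_eff == W                  # zero-init B => no change before any training

full_params = d * k                # parameters in a full update of W
lora_params = d * r + r * k        # parameters in the low-rank factors
print(full_params, lora_params)    # the low-rank update trains far fewer parameters
```

With realistic sizes (say d = k = 4096 and r = 8), the low-rank factors hold about 65K parameters versus roughly 16.8M for the full matrix — the source of LoRA's memory savings.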
More Resources:
👉 Series Playlist: • Large Language Models (LLMs)
🎥 Fine-tuning with OpenAI: • 3 Ways to Make a Custom AI Assistant ...
📰 Read more: medium.com/towards-data-science/qlora-how-to-fine-…
💻 Colab: colab.research.google.com/drive/1AErkPgDderPW0dgE2…
💻 GitHub: github.com/ShawhinT/YouTube-Blog/tree/main/LLMs/ql…
🤗 Model: huggingface.co/shawhin/shawgpt-ft
🤗 Dataset: huggingface.co/datasets/shawhin/shawgpt-youtube-co…
[1] Fine-tuning LLMs: youtu.be/eC6Hd1hFvos
[2] ZeRO paper: arxiv.org/abs/1910.02054
[3] QLoRA paper: arxiv.org/abs/2305.14314
[4] Phi-1 paper: arxiv.org/abs/2306.11644
[5] LoRA paper: arxiv.org/abs/2106.09685
--
Book a call: calendly.com/shawhintalebi
Socials
medium.com/@shawhin
www.linkedin.com/in/shawhintalebi/
twitter.com/ShawhinT
www.instagram.com/shawhintalebi/
The Data Entrepreneurs
🎥 YouTube: youtube.com/@thedataentrepreneurs
👉 Discord: discord.gg/RSqZbF9ygh
📰 Medium: medium.com/the-data-entrepreneurs
📅 Events: lu.ma/tde
🗞️ Newsletter: the-data-entrepreneurs.ck.page/profile
Support ❤️
www.buymeacoffee.com/shawhint
Intro - 0:00
Fine-tuning (recap) - 0:45
LLMs are (computationally) expensive - 1:22
What is Quantization? - 4:49
4 Ingredients of QLoRA - 7:10
Ingredient 1: 4-bit NormalFloat - 7:28
Ingredient 2: Double Quantization - 9:54
Ingredient 3: Paged Optimizer - 13:45
Ingredient 4: LoRA - 15:40
Bringing it all together - 18:24
Example code: Fine-tuning Mistral-7b-Instruct for YT Comments - 20:35
What's Next? - 35:22
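The chapters above walk through the four QLoRA ingredients. The core idea of the "What is Quantization?" chapter can be sketched in plain Python — note this is NOT the NF4 data type from the QLoRA paper, just a toy symmetric int4 absmax scheme to show the mechanics:

```python
# Toy block-wise absmax quantization (illustration only; QLoRA uses the
# 4-bit NormalFloat data type, which places levels to match normally
# distributed weights rather than spacing them uniformly).

def quantize_block(block, bits=4):
    """Scale a block of floats into signed integers of the given width."""
    levels = 2 ** (bits - 1) - 1               # 7 levels each side for 4-bit
    absmax = max(abs(x) for x in block) or 1.0
    scale = levels / absmax                    # one scale constant stored per block;
    # QLoRA's "double quantization" additionally quantizes these per-block
    # scale constants to save even more memory.
    q = [round(x * scale) for x in block]
    return q, scale

def dequantize_block(q, scale):
    """Recover approximate floats from the integers and the block's scale."""
    return [x / scale for x in q]

weights = [0.02, -0.71, 0.33, 0.05, -0.11, 0.64]
q, scale = quantize_block(weights)
approx = dequantize_block(q, scale)
error = max(abs(a - b) for a, b in zip(weights, approx))
print(q)      # small integers, each storable in 4 bits
print(error)  # reconstruction error bounded by half a quantization step
```

Storing 4-bit integers plus one scale per block is what shrinks a 7B-parameter model from ~28 GB in fp32 to a few GB, which is what makes single-GPU fine-tuning feasible.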
All Comments (21)
-
👉 More on LLMs: youtube.com/playlist?list=PLz-ep5RbHosU2hnz5ejezwa…
-
Amazing work Shaw - complex concepts broken down to 'bit-sized bytes' for humans. Appreciate your time & efforts :)
-
Your explanations are amazing and the content is great. This is the best playlist on LLMs on YouTube.
-
This is the best explanation that I've ever heard, thanks for all the work!!
-
Wow, you have a genius for explaining super hard math concepts in layman's terms with good visual representations. Keep it coming.
-
Thank you Shaw for yet another awesome video succinctly explaining complex topics!
-
Exactly what I was looking for! Thanks for the video. Keep going!
-
Thank you for this amazing video, great explanations, very clear and easy to understand!
-
So far the best explanation on YouTube about this topic
-
Great video and your slides are very well organized!
-
Learned a lot. Great video and very accessible. Well Done!
-
Amazing video ! You are the best, man ! Thank you so much.
-
Loved this, very informative and clear!
-
First I thought, omg, this video is horrible — but it's actually excellent! (I wanted a practical, fast way to get my LLM fine-tuned using my own data, but found it really isn't that easy.) After this I understood a lot better what is going on in the background.
-
Dear Shaw, I've listened to the video many times, and aside from it being extremely well done (I learned so much), you should emphasize — or even make a dedicated video on — the fact that the key to fine-tuning with "one" GPU is using the quantized Mistral model. I'm sure many users would like to know more about these models; not many know how to use the most important quantized LLMs in their own Colab or even on their own PC, as the base of their own application. :)
-
Amazing explanation!!! Thank you Shaw!
-
Great content, thank you!
-
Thank you for sharing this knowledge; we need more videos like this!
-
Amazing work! Thanks mate :)
-
Another fire video in the books!