QLoRA—How to Fine-tune an LLM on a Single GPU (w/ Python Code)

Published 2024-02-27
CXOs, VPs, & Directors, I'm offering AI services here: shawhintalebi.com/

In this video, I discuss how to fine-tune an LLM using QLoRA (Quantized Low-Rank Adaptation). Example code is provided for training a custom YouTube comment responder using Mistral-7b-Instruct.
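The core setup from the video can be sketched roughly as follows using Hugging Face `transformers` and `peft`. Note this is an illustrative configuration, not the video's exact code (see the Colab link below for that): the model ID, LoRA rank, alpha, and target modules here are assumptions chosen to match the QLoRA paper's 4-bit NF4 + double-quantization recipe.

```python
# Sketch: load a 4-bit quantized base model and attach LoRA adapters (QLoRA).
# Hyperparameters below are illustrative, not the video's exact values.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization with double quantization (QLoRA ingredients 1 & 2)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",            # 4-bit NormalFloat data type
    bnb_4bit_use_double_quant=True,       # quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16 # dequantize to bf16 for matmuls
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",  # assumed model ID
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapters (ingredient 4): small trainable low-rank matrices
lora_config = LoraConfig(
    r=8,                                  # rank of the update matrices
    lora_alpha=32,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # which layers get adapters
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction is trainable
```

The frozen base weights stay in 4-bit, while only the LoRA adapter weights train in higher precision, which is what lets a 7B model fit on a single consumer GPU.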

More Resources:
👉 Series Playlist:    • Large Language Models (LLMs)  
🎥 Fine-tuning with OpenAI:    • 3 Ways to Make a Custom AI Assistant ...  

📰 Read more: medium.com/towards-data-science/qlora-how-to-fine-…
💻 Colab: colab.research.google.com/drive/1AErkPgDderPW0dgE2…
💻 GitHub: github.com/ShawhinT/YouTube-Blog/tree/main/LLMs/ql…
🤗 Model: huggingface.co/shawhin/shawgpt-ft
🤗 Dataset: huggingface.co/datasets/shawhin/shawgpt-youtube-co…

[1] Fine-tuning LLMs:    • Fine-tuning Large Language Models (LL...  
[2] ZeRO paper: arxiv.org/abs/1910.02054
[3] QLoRA paper: arxiv.org/abs/2305.14314
[4] Phi-1 paper: arxiv.org/abs/2306.11644
[5] LoRA paper: arxiv.org/abs/2106.09685

--
Book a call: calendly.com/shawhintalebi

Socials
medium.com/@shawhin
www.linkedin.com/in/shawhintalebi/
twitter.com/ShawhinT
www.instagram.com/shawhintalebi/

The Data Entrepreneurs
🎥 YouTube:    / @thedataentrepreneurs  
👉 Discord: discord.gg/RSqZbF9ygh
📰 Medium: medium.com/the-data-entrepreneurs
📅 Events: lu.ma/tde
🗞️ Newsletter: the-data-entrepreneurs.ck.page/profile

Support ❤️
www.buymeacoffee.com/shawhint

Intro - 0:00
Fine-tuning (recap) - 0:45
LLMs are (computationally) expensive - 1:22
What is Quantization? - 4:49
4 Ingredients of QLoRA - 7:10
Ingredient 1: 4-bit NormalFloat - 7:28
Ingredient 2: Double Quantization - 9:54
Ingredient 3: Paged Optimizer - 13:45
Ingredient 4: LoRA - 15:40
Bringing it all together - 18:24
Example code: Fine-tuning Mistral-7b-Instruct for YT Comments - 20:35
What's Next? - 35:22
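The quantization idea covered at 4:49 can be illustrated with a toy absmax round-trip in plain Python. This is a simplified sketch: real QLoRA uses the NF4 data type, whose 16 levels are spaced by quantiles of a normal distribution rather than uniformly as here.

```python
# Toy absmax quantization: map floats to signed 4-bit-style integers and back.
# Simplified for illustration -- NF4 uses quantile-spaced, not uniform, levels.

def quantize_absmax(weights, bits=4):
    """Scale by the absolute max, then round to integers in [-7, 7] for 4-bit."""
    levels = 2 ** (bits - 1) - 1                 # 7 for 4-bit
    scale = max(abs(w) for w in weights) / levels
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the integers and the stored scale."""
    return [x * scale for x in q]

weights = [0.12, -0.98, 0.45, 0.03, -0.31]
q, scale = quantize_absmax(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

Each block of weights stores its own `scale` constant; double quantization (ingredient 2) then quantizes those constants themselves to save additional memory.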

All Comments (21)
  • @chris_zazzman
    Amazing work Shaw - complex concepts broken down to 'bit-sized bytes' for humans. Appreciate your time & efforts :)
  • @manyagupta6375
    Your explanations are amazing and the content is great. This is the best playlist on LLMs on YouTube.
  • @MrCancerbero1983
This is the best explanation that I've ever heard, thanks for all the work!!
  • @soonheng1577
wow, you are a genius at explaining super hard math concepts in layman-understandable terms with good visual representation. Keep it coming.
  • @africanbuffalo
    Thank you Shaw for yet another awesome video succinctly explaining complex topics!
  • @RohitJain-ls2ov
    Exactly what I was looking for! Thanks for the video. Keep going!
  • Thank you for this amazing video, great explanations, very clear and easy to understand!
  • @Ali-me4tv
    So far the best explanation on Youtube about this topic
  • @el_artmaga_
    Great video and your slides are very well organized!
  • Learned a lot. Great video and very accessible. Well Done!
  • @bim-techs
    Amazing video ! You are the best, man ! Thank you so much.
  • @operitivo4635
First I thought omg this video is horrible, but it's actually excellent! (I wanted a practical, fast way to get my LLM fine-tuned using my own data, but found it really isn't that easy.) After this I understood a lot better what is going on in the background.
  • Dear Shaw, I've listened to the video many times, and aside from it being extremely well done (I learned so much), you should emphasize (or even do a dedicated video on) the fact that the key to fine-tuning with one GPU is using the quantized Mistral model. I'm sure many users would like to know more about these models, and not many know how to use the most important quantized LLMs on their own Colab or even PC as the base of their own application... :)
  • @ai4sme
    Amazing explanation!!! Thank you Shaw!
  • @younespiro
    Thank you for sharing this knowledge, we need more videos like this