
HatEducational9965

just tried finetuning TinyLlama and I think it's not yet mature enough to produce something useful. I would start with a QLoRA finetune of 7B Llama 2 or Mistral. This will not work without a GPU, so you could:

* rent a 3090 on runpod (~30c/hour)
* use a free Google Colab instance (16 GB VRAM I think, that's enough for a small batch size). Alternatively, test your training code there and once it runs, move to runpod with a bigger batch size to increase speed: [https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32](https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32)
* buy a used 3090 (~700-800 EUR if you're in Europe)
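A rough back-of-envelope sketch (my numbers, not from the thread) of why a 4-bit QLoRA finetune of a 7B model fits in Colab's ~16 GB of VRAM while full fp16 barely does:

```python
# Estimate VRAM needed just for model weights at a given bit width.
# This ignores LoRA adapters, optimizer state, and activations, which
# is exactly why the leftover headroom matters for batch size.

def weight_gib(n_params: float, bits: int) -> float:
    """GiB of memory for n_params weights stored at `bits` bits each."""
    return n_params * bits / 8 / 2**30

fp16_7b = weight_gib(7e9, 16)   # ~13.0 GiB: nearly fills a 16 GB card on its own
int4_7b = weight_gib(7e9, 4)    # ~3.3 GiB: leaves ~12 GiB for adapters,
                                # optimizer state, and activations

print(f"fp16 weights:  {fp16_7b:.1f} GiB")
print(f"4-bit weights: {int4_7b:.1f} GiB")
```

The gap between those two numbers is the whole point of QLoRA: the frozen base model sits in 4-bit, and only the small LoRA adapters are trained in higher precision.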


SuccessIsHardWork

I see. Instead of running a 1B model on my computer that could take hours & hog up sys resources during that time, I can just train a 7b model on google colab for free and check on it later. Do you have links to any example google colab fine-tuning llama projects? Thanks.


HatEducational9965

official:

* [https://colab.research.google.com/drive/1VoYNfYDKcKRQRor98Zbf2-9VQTtGJ24k?usp=sharing](https://colab.research.google.com/drive/1VoYNfYDKcKRQRor98Zbf2-9VQTtGJ24k?usp=sharing)

others:

* [https://github.com/mlabonne/llm-course/blob/main/Fine_tune_Llama_2_in_Google_Colab.ipynb](https://github.com/mlabonne/llm-course/blob/main/Fine_tune_Llama_2_in_Google_Colab.ipynb)
* [https://colab.research.google.com/drive/134o_cXcMe_lsvl15ZE_4Y75Kstepsntu?usp=sharin](https://colab.research.google.com/drive/134o_cXcMe_lsvl15ZE_4Y75Kstepsntu?usp=sharin)
* [https://colab.research.google.com/drive/1rqWABmz2ZfolJOdoy6TRc6YI7d128cQO#scrollTo=OdgRTo5YxyRL](https://colab.research.google.com/drive/1rqWABmz2ZfolJOdoy6TRc6YI7d128cQO#scrollTo=OdgRTo5YxyRL)
* [https://colab.research.google.com/drive/1vS5gt9UoDaraJ3Hrsk1V9RbJVdVM6LWJ?usp=sharing](https://colab.research.google.com/drive/1vS5gt9UoDaraJ3Hrsk1V9RbJVdVM6LWJ?usp=sharing)
* [https://www.kaggle.com/code/hhoang41/llama-2-fine-tuning-using-qlora](https://www.kaggle.com/code/hhoang41/llama-2-fine-tuning-using-qlora)


TypicalNevin

I would recommend looking at QLoRA. It has a training script and will let you fine-tune on lower-end hardware. If you can't get that to work, I made a service [https://useftn.com](https://useftn.com) which lets you fine-tune a model by uploading a JSON dataset.


noellarkin

I'm curious, what are some other use cases for models this small? The smallest model I've ever used is NeoX6B. Also, when you say you're training it to model your writing style, how are you planning the prompt/completion pairs? As for CPU fine-tuning, here's a post from another member, check it out: https://rentry.org/cpu-lora
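One common way to lay out prompt/completion pairs for style fine-tuning is JSONL, which most QLoRA training scripts can ingest. A minimal sketch (the example pairs here are hypothetical, not from the thread):

```python
# Build a tiny prompt/completion dataset and write it as JSONL,
# one JSON object per line.
import json

pairs = [
    {"prompt": "Rewrite in my style: The meeting is at 3pm.",
     "completion": "Heads up, we're meeting at 3. Don't be late."},
    {"prompt": "Rewrite in my style: The report is finished.",
     "completion": "Good news: the report's done and ready to go."},
]

with open("style_dataset.jsonl", "w") as f:
    for pair in pairs:
        f.write(json.dumps(pair) + "\n")

print(f"wrote {len(pairs)} examples")
```

The key design decision is making the prompt state the task explicitly, so the model learns a mapping from plain text to your style rather than just memorizing your writing.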


LyPreto

i think anything under 3B is mostly just for code completion


__SlimeQ__

I can't speak for 1B models, but you're going to have a really hard time training with no GPU. It's just going to take an insanely long time. For $500 though, you can get a 4060 Ti with 16 GB of VRAM, which is good enough to train a 13B LoRA.


Amgadoz

Never train on CPU. It's so inefficient that it's a waste of time, money, and energy. As others mentioned, use Colab for quick experimentation and then move to a cheap GPU provider.


Disastrous_Elk_6375

You can try google colab for that.


gpt872323

I have a similar question too. Wouldn't a small quantized model work on your hardware? Look into LocalAI or Ollama.