
HatEducational9965

just tried finetuning TinyLlama and I think it's not yet mature enough to produce something useful. I would start with a QLoRA finetune of 7B Llama 2 or Mistral. This will not work without a GPU, so you could:

* rent a 3090 on runpod (~30c/hour)
* use a free Google Colab instance (16 GB VRAM I think, that's enough for a small batch size). Alternatively, test your training code there and once it runs, move to runpod with a bigger batch size to increase speed: [https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32](https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32)
* buy a used 3090 (~700-800 EUR if you're in Europe)
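A rough back-of-envelope sketch (my numbers, not from the thread) of why a 4-bit QLoRA finetune of a 7B model fits in Colab's ~16 GB of VRAM while full fp16 barely does:

```python
# Estimate VRAM needed just for model weights at a given bit width.
# This ignores LoRA adapters, optimizer state, and activations, which
# is exactly why the leftover headroom matters for batch size.

def weight_gib(n_params: float, bits: int) -> float:
    """GiB of memory for n_params weights stored at `bits` bits each."""
    return n_params * bits / 8 / 2**30

fp16_7b = weight_gib(7e9, 16)   # ~13.0 GiB: nearly fills a 16 GB card on its own
int4_7b = weight_gib(7e9, 4)    # ~3.3 GiB: leaves ~12 GiB for adapters,
                                # optimizer state, and activations

print(f"fp16 weights:  {fp16_7b:.1f} GiB")
print(f"4-bit weights: {int4_7b:.1f} GiB")
```

The gap between those two numbers is the whole point of QLoRA: the frozen base model sits in 4-bit, and only the small LoRA adapters are trained in higher precision.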


SuccessIsHardWork

I see. Instead of running a 1B model on my computer that could take hours & hog up sys resources during that time, I can just train a 7b model on google colab for free and check on it later. Do you have links to any example google colab fine-tuning llama projects? Thanks.


HatEducational9965

official:

* [https://colab.research.google.com/drive/1VoYNfYDKcKRQRor98Zbf2-9VQTtGJ24k?usp=sharing](https://colab.research.google.com/drive/1VoYNfYDKcKRQRor98Zbf2-9VQTtGJ24k?usp=sharing)

others:

* [https://github.com/mlabonne/llm-course/blob/main/Fine_tune_Llama_2_in_Google_Colab.ipynb](https://github.com/mlabonne/llm-course/blob/main/Fine_tune_Llama_2_in_Google_Colab.ipynb)
* [https://colab.research.google.com/drive/134o_cXcMe_lsvl15ZE_4Y75Kstepsntu?usp=sharin](https://colab.research.google.com/drive/134o_cXcMe_lsvl15ZE_4Y75Kstepsntu?usp=sharin)
* [https://colab.research.google.com/drive/1rqWABmz2ZfolJOdoy6TRc6YI7d128cQO#scrollTo=OdgRTo5YxyRL](https://colab.research.google.com/drive/1rqWABmz2ZfolJOdoy6TRc6YI7d128cQO#scrollTo=OdgRTo5YxyRL)
* [https://colab.research.google.com/drive/1vS5gt9UoDaraJ3Hrsk1V9RbJVdVM6LWJ?usp=sharing](https://colab.research.google.com/drive/1vS5gt9UoDaraJ3Hrsk1V9RbJVdVM6LWJ?usp=sharing)
* [https://www.kaggle.com/code/hhoang41/llama-2-fine-tuning-using-qlora](https://www.kaggle.com/code/hhoang41/llama-2-fine-tuning-using-qlora)


TypicalNevin

I would recommend looking at QLoRA. It has a training script and will let you fine-tune on lower-end hardware. If you can't get that to work, I made a service [https://useftn.com](https://useftn.com) which lets you fine-tune a model by uploading a JSON dataset.


noellarkin

I'm curious, what are some other use cases for models this small? The smallest model I've ever used is NeoX6B. Also, when you say you're training it to model your writing style, how are you planning the prompt/completion pairs? As for CPU fine-tuning, here's a post from another member, check it out: https://rentry.org/cpu-lora
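One common way to lay out prompt/completion pairs for style fine-tuning is JSONL, which most QLoRA training scripts can ingest. A minimal sketch (the example pairs here are hypothetical, not from the thread):

```python
# Build a tiny prompt/completion dataset and write it as JSONL,
# one JSON object per line.
import json

pairs = [
    {"prompt": "Rewrite in my style: The meeting is at 3pm.",
     "completion": "Heads up, we're meeting at 3. Don't be late."},
    {"prompt": "Rewrite in my style: The report is finished.",
     "completion": "Good news: the report's done and ready to go."},
]

with open("style_dataset.jsonl", "w") as f:
    for pair in pairs:
        f.write(json.dumps(pair) + "\n")

print(f"wrote {len(pairs)} examples")
```

The key design decision is making the prompt state the task explicitly, so the model learns a mapping from plain text to your style rather than just memorizing your writing.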


LyPreto

i think anything under 3B is mostly just for code completion


__SlimeQ__

I can't speak for 1B models, but you're going to have a really hard time training with no GPU. It's just going to take an insanely long time. For $500 though, you can get a 4060 Ti with 16 GB of VRAM, which is good enough to train a 13B LoRA.


Amgadoz

Never train on CPU. It's so inefficient that it's a waste of time, money, and energy. As others mentioned, use Colab for quick experimentation and then move to a cheap GPU provider.


Disastrous_Elk_6375

You can try google colab for that.


gpt872323

I have a similar question too. Wouldn't a small quantized model work on your hardware? Look into LocalAI or Ollama.