StarCoder fine-tuning: how to select a GPU and estimate how long fine-tuning will take


I'd like to fine-tune StarCoder (https://huggingface.co/bigcode/starcoder) on my own dataset, using a GCP VM instance.

The documentation says that the model was trained on 512 Tesla A100 GPUs and that training took 24 days.

I also looked at the model (.bin) files in the Files section on Hugging Face (https://huggingface.co/bigcode/starcoder/tree/main).

The total size of the model files is ~64 GB.
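As a rough sanity check on that figure, here is a minimal back-of-envelope sketch of the memory involved. It assumes the 15.5B parameter count given on the model card, and the 16 bytes per parameter used for full fine-tuning is a common rule of thumb for mixed-precision Adam, not a measured value; activations and framework overhead are ignored. The function names are hypothetical.

```python
# Back-of-envelope memory estimate. All constants are assumptions, not measurements.
N_PARAMS = 15.5e9  # StarCoder parameter count as stated on the model card


def weights_gb(n_params: float, bytes_per_param: int) -> float:
    """Size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


def full_finetune_gb(n_params: float, bytes_per_param: int = 16) -> float:
    """Weights + gradients + optimizer state for full fine-tuning.

    Assumed rule of thumb for mixed-precision Adam:
    2 (16-bit weights) + 2 (16-bit grads) + 12 (fp32 master weights,
    momentum, variance) = 16 bytes per parameter. Activations are excluded.
    """
    return n_params * bytes_per_param / 1e9


fp32_gb = weights_gb(N_PARAMS, 4)     # weights stored in 32-bit floats
bf16_gb = weights_gb(N_PARAMS, 2)     # weights stored in 16-bit floats
train_gb = full_finetune_gb(N_PARAMS) # full fine-tuning state, before activations

print(f"fp32 weights: {fp32_gb:.0f} GB")
print(f"bf16 weights: {bf16_gb:.0f} GB")
print(f"full fine-tuning state: {train_gb:.0f} GB")
```

Under these assumptions the 32-bit weights come to about 62 GB, which would be consistent with the ~64 GB of files if the checkpoint is stored in 32-bit precision. The full fine-tuning state would be several times larger than any single GPU's memory, so it would have to be sharded across GPUs or reduced with a parameter-efficient method.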

Based on all this information,

  1. How do I decide which GPU is best for fine-tuning on my dataset?
  2. How can I estimate the time fine-tuning will take (under assumptions such as epochs=1, for instance)?
  3. Are there any other factors to consider when choosing hardware or calculating the time?
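To make question 2 concrete, this is the kind of estimate I have in mind: a minimal sketch, assuming the common approximation that training costs about 6 FLOPs per parameter per token. The dataset size (100M tokens), GPU count (8), and 35% utilization are hypothetical placeholders, and 312 TFLOPS is the A100's peak dense BF16 throughput. The function name `estimate_hours` is my own.

```python
def estimate_hours(n_params: float, n_tokens: float, n_gpus: int,
                   peak_flops_per_gpu: float, utilization: float) -> float:
    """Rough wall-clock hours for one pass over n_tokens of training data.

    Assumes ~6 FLOPs per parameter per token (forward + backward) for full
    fine-tuning, and that the GPUs sustain `utilization` of their peak rate.
    """
    total_flops = 6 * n_params * n_tokens
    effective_flops_per_sec = n_gpus * peak_flops_per_gpu * utilization
    return total_flops / effective_flops_per_sec / 3600


# Hypothetical inputs: replace with the real dataset size and hardware.
hours = estimate_hours(
    n_params=15.5e9,             # StarCoder parameter count from the model card
    n_tokens=100e6,              # assumed: 100M tokens in the dataset, 1 epoch
    n_gpus=8,                    # assumed: one 8x A100 VM
    peak_flops_per_gpu=312e12,   # A100 peak dense BF16 throughput
    utilization=0.35,            # assumed fraction of peak actually achieved
)
print(f"estimated time: {hours:.1f} hours")
```

With these placeholder numbers the estimate is roughly 3 hours, but the result scales linearly with token count and inversely with GPU count and utilization, so it is only as good as those inputs. Is this a reasonable way to approach the estimate?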