StarCoder finetuning - How to select the GPU and how to estimate the time it will take to finetune

621 Views Asked by Aadesh Kulkarni At 01 June 2023 at 17:22

I'd like to finetune StarCoder (https://huggingface.co/bigcode/starcoder) on my own dataset, using a GCP VM instance.

The documentation says that training the model used 512 Tesla A100 GPUs and took 24 days.

I also saw the model (.bin) files in the Files section on Hugging Face (https://huggingface.co/bigcode/starcoder/tree/main).

The total size of the model files is ~64 GB.

Based on all this information,

How do I decide which GPU is best for finetuning on my dataset?
How do I estimate the time it will take to finetune (assuming, for instance, epochs=1)?
Are there any other factors to consider when choosing hardware or calculating the time?
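
For reference, here is the kind of back-of-the-envelope estimate I have in mind. This is only a rough sketch, not a definitive method. It assumes the common rule of thumb of about 6 FLOPs per parameter per training token, and about 16 bytes of GPU memory per parameter for full finetuning with Adam in mixed precision (weights, gradients, and optimizer states, excluding activations). The 15.5B parameter count is StarCoder's published size; the dataset size (100M tokens), GPU count (8), and 40% utilization are placeholders I made up, and 312 TFLOPS is the A100's dense BF16 peak from NVIDIA's datasheet.

```python
def estimate_finetune(n_params, n_tokens, n_gpus,
                      peak_flops_per_gpu, utilization=0.4,
                      bytes_per_param=16):
    """Rough wall-clock and memory estimate for full finetuning.

    All inputs are assumptions supplied by the caller; the formulas
    are rules of thumb, not measurements.
    """
    # ~6 FLOPs per parameter per token (forward + backward pass).
    total_flops = 6 * n_params * n_tokens
    # Sustained throughput is only a fraction of the advertised peak.
    sustained_flops = n_gpus * peak_flops_per_gpu * utilization
    hours = total_flops / sustained_flops / 3600
    # Weights + gradients + Adam states; activations come on top.
    memory_gb = n_params * bytes_per_param / 1e9
    return hours, memory_gb


hours, memory_gb = estimate_finetune(
    n_params=15.5e9,            # StarCoder parameter count
    n_tokens=100e6,             # hypothetical dataset size, 1 epoch
    n_gpus=8,                   # hypothetical GPU count
    peak_flops_per_gpu=312e12,  # A100 dense BF16 peak
)
print(f"~{hours:.1f} hours, ~{memory_gb:.0f} GB for weights/grads/optimizer")
```

If this is roughly right, the memory figure is well above a single 80 GB A100, so full finetuning would need the model sharded across several GPUs, or a parameter-efficient method such as LoRA, which changes the memory side of the estimate considerably.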
