I was trying to train a model with PEFT QLoRA. The LoraConfig and the PEFT TrainingArguments are shown below:
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
        "lm_head",
    ],
    bias="none",
    lora_dropout=0.05,  # Conventional
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(original_model, lora_config)
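As a side note, PEFT's print_trainable_parameters() is a quick sanity check that the adapters attached as intended; it is standard PeftModel API and not part of the failing code:

# Should report that only the LoRA adapter weights are trainable,
# a small fraction of the total parameter count.
peft_model.print_trainable_parameters()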
output_dir = f'./peft-bn-mistral-training-{str(int(time.time()))}'
peft_training_args = TrainingArguments(
    output_dir=output_dir,
    auto_find_batch_size=True,
    learning_rate=1e-3,  # Higher learning rate than full fine-tuning.
    num_train_epochs=1,
    logging_steps=1,
    max_steps=1,  # Overrides num_train_epochs; training stops after a single step.
)
device = torch.device("cuda:0")
peft_trainer = Trainer(
    model=peft_model.to(device),
    args=peft_training_args,
    train_dataset=tokenized_datasets["train"],
)
peft_trainer.train()
The code fails with the following error:
TypeError Traceback (most recent call last)
<ipython-input-46-b47531775ae7> in <cell line: 1>()
----> 1 peft_trainer.train()
2
3 peft_model_path="./peft-bn-mistral-checkpoint-local"
4
5 peft_trainer.model.save_pretrained(peft_model_path)
1326 current_device_index = current_device.index if isinstance(current_device, torch.device) else current_device
1327
-> 1328 if torch.device(current_device_index) != self.device:
1329 # if on the first device (GPU 0) we don't care
1330 if (self.device.index is not None) or (current_device_index != 0):
TypeError: Device() received an invalid combination of arguments - got (NoneType), but expected one of:
* (torch.device device)
didn't match because some of the arguments have invalid types: (!NoneType!)
* (str type, int index)
I tried tweaking different ways of passing the device argument to the model, but it consistently results in the error above.
Note that I used the BitsAndBytesConfig module from transformers when loading the model.
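For context, the quantized load looked roughly like the sketch below. The model name and the individual quantization flags here are illustrative assumptions, not my exact values:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Illustrative 4-bit quantization config; the exact flags may differ.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_name = "mistralai/Mistral-7B-v0.1"  # assumed base model
original_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)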
TIA
To fix the problem, I had to move the model to the GPU before passing it in as a parameter. Judging from the traceback, accelerate reads the model's current device index, which is None while the model still sits on the CPU, and torch.device(None) is what raises the TypeError; placing the model on cuda:0 gives it a concrete index.
Before:
peft_trainer = Trainer(
    model=peft_model,  # model still on the CPU
    args=peft_training_args,
    train_dataset=tokenized_datasets["train"],
)
After:
device = torch.device("cuda:0")
peft_trainer = Trainer(
    model=peft_model.to(device),  # move the model to the GPU first
    args=peft_training_args,
    train_dataset=tokenized_datasets["train"],
)
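To double-check that the fix took effect, the model's parameters should report a device with a concrete index before the Trainer is built:

# cuda:0 has index 0; a bare "cpu" device has index None,
# which is what tripped the accelerate check in the traceback.
print(next(peft_model.parameters()).device)  # expected: cuda:0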