For an exercise, I trained GPT-2 on a certain dataset for sequence classification (binary sentiment classification). Specifically, I trained the newly initialized classification head that comes with AutoModelForSequenceClassification.from_pretrained("gpt2"), as suggested by the warning message:
Some weights of GPT2ForSequenceClassification were not initialized from the model checkpoint at gpt2 and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
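For context, the head-training stage looked roughly like this (dataset variables and training hyperparameters are placeholders, not my exact setup):

from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model = AutoModelForSequenceClassification.from_pretrained("gpt2", num_labels=2)
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token            # GPT-2 ships without a pad token
model.config.pad_token_id = tokenizer.pad_token_id

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-sentiment"),  # plus the usual hyperparameters
    train_dataset=ds_train,                               # tokenized sentiment dataset (placeholder name)
    eval_dataset=ds_val,
)
trainer.train()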
Then, on top of the trained model, I used LoRA from the PEFT library to fine-tune it further:
from peft import LoraConfig, TaskType, get_peft_model
from transformers import Trainer

lora_config = LoraConfig(
    r=8,
    lora_alpha=8,
    task_type=TaskType.SEQ_CLS,
    fan_in_fan_out=True,  # GPT-2 uses Conv1D layers, so this is required
)
lora_model = get_peft_model(model, lora_config)  # model is the trained GPT-2 model from above
lora_trainer = Trainer(
    model=lora_model,
    ...
)
lora_trainer.train()
lora_model.save_pretrained('gpt-2_lora')
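As a sanity check after wrapping the model with get_peft_model, PEFT's built-in helper reports which parameters are actually trainable (and, in particular, whether the score head is among the modules LoRA tracks):

lora_model.print_trainable_parameters()
# prints something like: trainable params: ... || all params: ... || trainable%: ...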
All of this worked well up to this point. Then, when loading the model back:
from peft import AutoPeftModelForSequenceClassification

lora_model = AutoPeftModelForSequenceClassification.from_pretrained(
    'gpt-2_lora',
    num_labels=2,
    id2label=id2label,  # some mapping
    label2id=label2id,  # never mind the details for now
)
I get the same warning:
Some weights of GPT2ForSequenceClassification were not initialized from the model checkpoint at gpt2 and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
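To double-check, one can compare the reloaded classification head against the trained one (the attribute path below is my assumption and may vary across PEFT versions):

import torch

# Debugging sketch: does the reloaded head match the trained one, or was it
# re-initialized? (base_model.model.score is the assumed path to the head.)
trained_head = model.score.weight                        # head of the model trained above
reloaded_head = lora_model.base_model.model.score.weight
print(torch.equal(trained_head.cpu(), reloaded_head.cpu()))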
So apparently, PEFT is unaware of the trained classification head of GPT-2.
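One can also inspect the adapter checkpoint directly to see whether the head was saved at all (assuming the default safetensors file name; older PEFT versions write adapter_model.bin instead):

from safetensors.torch import load_file

# List the saved tensors and check whether the trained head made it in.
state = load_file('gpt-2_lora/adapter_model.safetensors')
print([k for k in state if 'score' in k])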
Question #1: What do I need to do to make PEFT pick up the fully trained version of GPT-2?
Ignoring this for the moment, I wanted to proceed with an evaluation of the PEFT model, i.e.,
import evaluate
from evaluate import evaluator

task_evaluator = evaluator('text-classification')
eval_results = task_evaluator.compute(
    model_or_pipeline=lora_model,
    tokenizer=tokenizer,
    data=ds_test,
    input_column=dataset_textfield_name,
    metric=evaluate.combine(['accuracy', 'f1']),
    label_mapping=label2id,
)
print(eval_results)
which earned me a lengthy error message:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
File ~/.local/lib/python3.10/site-packages/peft/peft_model.py:529, in PeftModel.__getattr__(self, name)
528 try:
--> 529 return super().__getattr__(name) # defer to nn.Module's logic
530 except AttributeError:
File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1614, in Module.__getattr__(self, name)
1613 return modules[name]
-> 1614 raise AttributeError("'{}' object has no attribute '{}'".format(
1615 type(self).__name__, name))
AttributeError: 'PeftModelForSequenceClassification' object has no attribute 'task'
During handling of the above exception, another exception occurred:
... [more messages of similar kind]
AttributeError: 'GPT2ForSequenceClassification' object has no attribute 'task'
Question #2: What is wrong here? What do I need to do?
The task evaluator works when used on plain GPT-2 (without PEFT).
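For what it's worth, a possible workaround (an untested sketch on my side) would be to merge the LoRA weights back into the base model via PEFT's merge_and_unload() and evaluate the resulting plain transformers model instead:

merged_model = lora_model.merge_and_unload()  # should return a plain GPT2ForSequenceClassification
eval_results = task_evaluator.compute(
    model_or_pipeline=merged_model,
    tokenizer=tokenizer,
    data=ds_test,
    input_column=dataset_textfield_name,
    metric=evaluate.combine(['accuracy', 'f1']),
    label_mapping=label2id,
)
print(eval_results)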