How can I feed GPT-2 with prespecified embeddings?


As the title suggests, how can I feed GPT-2 with prespecified embeddings for each word when using the Hugging Face transformers library?

For example, I have embeddings for "q1", "q2" and "q3", say a 3-by-10 matrix (obtained from some other information).
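To make the setup concrete, here is a minimal sketch of what I have and what I would like to do. The matrix below is just random placeholder numbers standing in for my real embeddings, and the `inputs_embeds` argument plus the linear projection are only my guess at how this might be wired up:

```python
import torch
from transformers import GPT2Model

# Prespecified embeddings: one 10-dim vector per word ("q1", "q2", "q3").
# Random values here stand in for my real 3-by-10 matrix.
my_embeddings = torch.randn(1, 3, 10)  # (batch, num_words, my_embedding_dim)

model = GPT2Model.from_pretrained("gpt2")

# GPT-2's hidden size is 768, not 10, so (I assume) the vectors would
# need to be projected up before they can be passed in via inputs_embeds.
project = torch.nn.Linear(10, model.config.hidden_size)

outputs = model(inputs_embeds=project(my_embeddings))
print(outputs.last_hidden_state.shape)  # torch.Size([1, 3, 768]), one vector per word
```

But I am not sure whether bypassing the token embedding layer like this is a supported or sensible way to fine-tune, which is essentially what I am asking.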

If I use the GPT-2 tokenizer, the problems are:

  1. Tokenization is not 1-to-1 with words, so I end up with 6 tokens instead of 3 (see the snippet after this list). But what I need after fine-tuning is the contextualized embedding for each word, not for each token.
  2. I lose a lot of information if I cannot use my own embeddings.
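To illustrate problem 1, this is roughly what the tokenizer gives me; the exact split in the comments is from a quick check on my side and may differ:

```python
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokens = tokenizer.tokenize("q1 q2 q3")
print(tokens)       # something like ['q', '1', 'Ġq', '2', 'Ġq', '3']
print(len(tokens))  # 6 tokens for 3 words, so no 1-to-1 mapping to my embeddings
```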