How can I feed GPT-2 with prespecified embeddings?


As the title suggests, how can I feed GPT-2 with prespecified embeddings for each word when using the Hugging Face transformers library?

For example, I have embeddings for "q1", "q2" and "q3", say a 3-by-10 matrix (obtained from some other information).
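To make the setup concrete, here is a minimal sketch of what I have and what I would like to do. The matrix below is just random placeholder numbers standing in for my real embeddings, and the `inputs_embeds` argument plus the linear projection are only my guess at how this might be wired up:

```python
import torch
from transformers import GPT2Model

# Prespecified embeddings: one 10-dim vector per word ("q1", "q2", "q3").
# Random values here stand in for my real 3-by-10 matrix.
my_embeddings = torch.randn(1, 3, 10)  # (batch, num_words, my_embedding_dim)

model = GPT2Model.from_pretrained("gpt2")

# GPT-2's hidden size is 768, not 10, so (I assume) the vectors would
# need to be projected up before they can be passed in via inputs_embeds.
project = torch.nn.Linear(10, model.config.hidden_size)

outputs = model(inputs_embeds=project(my_embeddings))
print(outputs.last_hidden_state.shape)  # torch.Size([1, 3, 768]), one vector per word
```

But I am not sure whether bypassing the token embedding layer like this is a supported or sensible way to fine-tune, which is essentially what I am asking.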

If I use the GPT-2 tokenizer, the problems are:

  1. Tokenization is not 1-to-1 with words, so I end up with 6 tokens instead of 3 (see the snippet after this list). But what I need after fine-tuning is the contextualized embedding for each word, not for each token.
  2. I lose a lot of information if I cannot use my own embeddings.
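To illustrate problem 1, this is roughly what the tokenizer gives me; the exact split in the comments is from a quick check on my side and may differ:

```python
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokens = tokenizer.tokenize("q1 q2 q3")
print(tokens)       # something like ['q', '1', 'Ġq', '2', 'Ġq', '3']
print(len(tokens))  # 6 tokens for 3 words, so no 1-to-1 mapping to my embeddings
```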