Trying to set conflicting doc.ents: '(4708, 4717, 'Companies worked at')' and '(4681, 4717, 'Degree')'



Trying to set conflicting doc.ents: '(4708, 4717, 'Companies worked at')' and '(4681, 4717, 'Degree')'. A token can only be part of one entity, so make sure the entities you're setting don't overlap. To work with overlapping entities, consider using doc.spans instead.


There is 1 best solution below

Hannibal On

The EntityRecognizer does not support overlapping entities. To solve this issue, you have a couple of choices:

  • Keep one of those entities
  • Use two EntityRecognizers in your pipeline, one for degree and another for companies_worked_at, and store each recognizer's output in a custom extension attribute (registered with "set_extension"), because doc.ents cannot hold both overlapping annotations at once.
  • Use the SpanCategorizer instead of the EntityRecognizer (https://spacy.io/api/spancategorizer)
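
For the first option (keeping one of the entities), it can help to find the conflicting annotations before they reach doc.ents. The following is a minimal sketch in plain Python, assuming the training annotations are (start, end, label) character offsets like the ones in the error message. The helper names find_overlaps and keep_longest are hypothetical, the third annotation is made up for illustration, and keeping the longest span is only one possible policy (spaCy's own spacy.util.filter_spans applies a similar longest-first rule to Span objects).

```python
def find_overlaps(annotations):
    """Return pairs of (start, end, label) annotations whose character ranges overlap."""
    ordered = sorted(annotations, key=lambda a: (a[0], a[1]))
    conflicts = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second[0] >= first[1]:
                break  # sorted by start, so nothing later can overlap `first`
            conflicts.append((first, second))
    return conflicts


def keep_longest(annotations):
    """Drop overlapping annotations, preferring longer spans, then earlier ones."""
    by_length = sorted(annotations, key=lambda a: (-(a[1] - a[0]), a[0]))
    kept = []
    for start, end, label in by_length:
        # Keep the span only if it is disjoint from everything kept so far.
        if all(end <= k_start or start >= k_end for k_start, k_end, _ in kept):
            kept.append((start, end, label))
    return sorted(kept)


annotations = [
    (4681, 4717, "Degree"),               # from the error message
    (4708, 4717, "Companies worked at"),  # from the error message
    (5000, 5010, "Degree"),               # hypothetical, non-overlapping
]

conflicts = find_overlaps(annotations)
cleaned = keep_longest(annotations)
print(conflicts)
print(cleaned)
```

Here the shorter 'Companies worked at' span is dropped because it lies inside the longer 'Degree' span. If both labels matter for that text, the doc.spans or SpanCategorizer route is the better fit.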