I am having issues with the spaCy EntityRuler component. I am trying to add new patterns to it like this:
def add_rule_patterns(input_references, model_ruler):
    ruler = model_ruler
    ref_patterns = []
    print("Adding new patterns")
    for ref in input_references.iloc[:,0]:
        ref_patterns.append({"label":"SYS", "pattern": {"TEXT": {"FUZZY": ref}}})
        pattern = [{"label":"SYS", "pattern": {"TEXT": {"FUZZY": ref}}}]
        # print(f"Adding pattern: {ref}. Stored as: {pattern}")
    ruler.add_patterns(ref_patterns)
    for pattern in ruler.patterns:
        print(f"Added pattern for: {ref}. Stored as: {pattern}")
    return ref_patterns
Then when preparing the nlp model:
import spacy
nlp = spacy.blank('fr')
nlp.add_pipe("ner")
ner = nlp.get_pipe("ner")
config = {
    "phrase_matcher_attr": None,
    "validate": True,
    "overwrite_ents": True,
    "ent_id_sep": "||",
}
nlp.add_pipe("entity_ruler", before="ner", config=config)
ruler = nlp.get_pipe("entity_ruler")
Finally, to add the patterns:
patterns = add_rule_patterns(df_sys, ruler)
ruler.add_patterns(patterns)
print(ruler.patterns)
The issue is that "ruler", as instantiated above, does not appear to have an "add_patterns()" method. If I instead create the ruler like this:
from spacy.pipeline import EntityRuler
ruler = EntityRuler(nlp, overwrite_ents=True)
The method is present, but the patterns are not added to the pipeline. I have looked into some related questions, like this or this one, but I still cannot get the new patterns added. Am I missing something, or is this a known issue with spaCy?
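
For reference, here is a minimal sketch of the pattern shape I understand the ruler's token patterns to take: each "pattern" value is a list with one dict per token, rather than a bare dict as in my function above. The helper name build_fuzzy_patterns and the sample references are hypothetical, a plain list stands in for my DataFrame column, splitting on whitespace is only an approximation of spaCy's tokenization, and the FUZZY operator assumes spaCy 3.5 or later. The sketch only builds the dicts and does not call spaCy itself.

```python
from typing import Dict, Iterable, List


def build_fuzzy_patterns(references: Iterable[str], label: str = "SYS") -> List[Dict]:
    """Build one fuzzy token pattern per reference string (hypothetical helper)."""
    patterns = []
    for ref in references:
        # Assumption: whitespace splitting roughly matches the tokenizer's output.
        words = str(ref).split()
        if not words:
            continue  # skip empty or whitespace-only references
        # A token pattern is a list with one dict per token, not a single dict.
        token_pattern = [{"TEXT": {"FUZZY": word}} for word in words]
        patterns.append({"label": label, "pattern": token_pattern})
    return patterns


if __name__ == "__main__":
    # Made-up sample values standing in for the first column of df_sys.
    for entry in build_fuzzy_patterns(["Linux", "  Red Hat ", ""]):
        print(entry)
```

With my data, I would presumably call ruler.add_patterns(build_fuzzy_patterns(df_sys.iloc[:, 0])) once, on the object returned by nlp.add_pipe("entity_ruler", ...), but I have not confirmed that this resolves the problem.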