I am trying to build a matcher in spaCy that picks up country names, including their abbreviations. For example, Kenya, KE, and KEN should all be matched as Kenya. I built a simple matcher, but it returns no matches.
Here is the code, which I ran in a Jupyter notebook:
import spacy
import pycountry
from spacy.matcher import Matcher
nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)
for country in pycountry.countries:
    name = country.name
    pattern1 = [{'LOWER': name}]
    pattern2 = [{'LOWER': country.alpha_2}]
    pattern3 = [{'LOWER': country.alpha_3}]
    patterns = [pattern1, pattern2, pattern3]
    matcher.add(name, patterns)
doc = nlp(u"Kenya is a beautiful country. It is next to Somalia. KEN is in Africa. China is making investments there. It is near the UAE and SAU")
found_matches = matcher(doc)
print(found_matches)
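
One likely cause is that the `LOWER` attribute is compared against the lowercased token text, so a pattern value containing capitals (such as "Kenya" or "KEN") can never match. The sketch below is one possible way around that, not a definitive fix. It assumes spaCy v3, uses `spacy.blank("en")` instead of `en_core_web_sm` because only the tokenizer is needed, and replaces `pycountry.countries` with a small hypothetical `COUNTRIES` list so it runs without extra data. Names are matched case-insensitively, one pattern entry per token, while the codes are matched with `ORTH` (exact text), since lowercased codes such as "is", "it", or "in" would otherwise collide with ordinary words.

```python
import spacy
from spacy.matcher import Matcher

# Blank English pipeline: the Matcher only needs the tokenizer here.
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# Hypothetical stand-in for pycountry.countries: (name, alpha_2, alpha_3).
COUNTRIES = [
    ("Kenya", "KE", "KEN"),
    ("Somalia", "SO", "SOM"),
    ("China", "CN", "CHN"),
    ("Saudi Arabia", "SA", "SAU"),
    ("United Arab Emirates", "AE", "ARE"),
]

for name, alpha_2, alpha_3 in COUNTRIES:
    # LOWER is compared with the lowercased token, so the value must be
    # lowercase too; multi-word names need one dict per token.
    name_pattern = [{"LOWER": tok.lower_} for tok in nlp.make_doc(name)]
    # ORTH matches the exact token text, so "KEN" matches but "ken" does not.
    patterns = [name_pattern, [{"ORTH": alpha_2}], [{"ORTH": alpha_3}]]
    matcher.add(name, patterns)

doc = nlp(
    "Kenya is a beautiful country. It is next to Somalia. KEN is in Africa. "
    "China is making investments there. It is near the UAE and SAU"
)

# Resolve each match ID back to the country name it was registered under.
found = [
    (nlp.vocab.strings[match_id], doc[start:end].text)
    for match_id, start, end in matcher(doc)
]
print(found)
```

With this setup, "KEN" resolves to Kenya and "SAU" to Saudi Arabia. "UAE" is not matched, because the ISO alpha-3 code for the United Arab Emirates is "ARE"; informal abbreviations like that would need their own patterns.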