spaCy custom tokenizer: split a word on underscore and also keep the whole word


After reading the linked question "How to tokenize word with hyphen in Spacy", I know how to tokenize by splitting words that contain a hyphen or underscore.
My requirement is different: I need the output to contain the separated parts and also the whole original word.
For example:
Input: bs_it
Output: ["bs", "it", "bs_it"]
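
A possible direction, offered as a sketch rather than a confirmed solution: a spaCy Doc is a non-overlapping segmentation of the text, so "bs", "it" and "bs_it" cannot all be tokens of the same Doc. One workaround is to leave the tokenizer alone and expand each token's text into a plain Python list afterwards. The helper names `expand_compound` and `expand_tokens` and the `seps` parameter below are hypothetical, introduced only for illustration; the code uses only the standard library so it runs without spaCy installed.

```python
import re


def expand_compound(text, seps="_-"):
    """Return the parts of `text` split on any separator, followed by `text` itself.

    If `text` does not split into at least two non-empty parts,
    it is returned unchanged as a single-element list.
    """
    # Build a character class from the separators, e.g. "[_\-]".
    pattern = "[" + re.escape(seps) + "]"
    # Drop empty strings produced by leading, trailing or doubled separators.
    parts = [p for p in re.split(pattern, text) if p]
    if len(parts) > 1:
        return parts + [text]
    return [text]


def expand_tokens(token_texts, seps="_-"):
    """Apply expand_compound to every token text and flatten the result."""
    expanded = []
    for text in token_texts:
        expanded.extend(expand_compound(text, seps))
    return expanded


# Assumed usage with spaCy (not executed here):
#   doc = nlp("degree bs_it")
#   result = expand_tokens(tok.text for tok in doc)
print(expand_compound("bs_it"))
```

The result is an ordinary list of strings, not a Doc, so token attributes such as part-of-speech tags are not available on the extra entries. This assumes the tokenizer in use keeps "bs_it" as a single token; if it already splits on the separator, the whole word would have to be rebuilt from the pieces instead.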
