spaCy custom tokenizer to separate word with underscore and also to include the whole word

240 Views Asked by Shakkir Moulana At 17 August 2022 at 14:50

After referring to the question "How to tokenize word with hyphen in Spacy", I learned how to tokenize by splitting words that contain a hyphen or underscore. My requirement, however, is to split such a word into its parts and also keep the whole word as a token.
For example:
Input: bs_it
Output: ["bs", "it", "bs_it"]
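
One possible way to get this output is sketched below. A spaCy Doc holds non-overlapping tokens, so "bs", "it" and "bs_it" cannot all come out of the tokenizer for the same stretch of text; the sketch therefore works on plain token strings (for example `token.text` values collected from a Doc) as a post-processing step. The helper names `expand_token` and `expand_tokens` and the separator pattern are assumptions made for illustration, not part of spaCy.

```python
import re

# Separators to split on; hyphen and underscore here (an assumption, adjust as needed).
_SEPARATORS = re.compile(r"[-_]")


def expand_token(text):
    """Return the parts of `text` split on '-' or '_', followed by `text` itself.

    Hypothetical helper: words without a separator are returned unchanged.
    """
    # Drop empty pieces produced by leading, trailing, or doubled separators.
    parts = [p for p in _SEPARATORS.split(text) if p]
    # Append the whole word only when a real split happened,
    # so ordinary words are not duplicated.
    if len(parts) > 1:
        return parts + [text]
    return [text]


def expand_tokens(texts):
    """Apply expand_token to every token string and flatten the result."""
    expanded = []
    for text in texts:
        expanded.extend(expand_token(text))
    return expanded


print(expand_token("bs_it"))  # ['bs', 'it', 'bs_it']
```

With spaCy, this could be called as `expand_tokens([t.text for t in nlp("...")])`, assuming the tokenizer in use leaves "bs_it" as a single token.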


