Multi-Class Document Classification with both known and unknown classes

94 views. Asked by Quan Nguyen Ha on 28 March 2022 at 10:37

Currently, I am building a multi-class document classifier that has to assign each document either to one of 3 known classes, namely "Financial Report", "Insurance_Sheet", and "Endorsement", or to 1 unknown class, "Random Doc". I have tried the following methods, but none of them gave good results: quite a number of random documents were classified as one of the known classes ("Financial Report", "Insurance_Sheet", "Endorsement").

Method 1: TF-IDF + Linear SVC
Method 2: Word2Vec for word embeddings, then average the word embeddings to get an embedding vector for each document, then feed it to a classification model.
Method 3: Doc2Vec to get an embedding vector for each document, then feed it to a classification model.
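
To make the averaging step in Method 2 concrete, here is a minimal sketch in plain Python. The toy vectors, the `TOY_VECTORS` name, and the `document_vector` helper are made up for illustration; they are not the actual trained Word2Vec model or code from my project, and skipping out-of-vocabulary words is an assumption about how such words are handled.

```python
from typing import Dict, List

# Hypothetical toy embeddings standing in for trained Word2Vec vectors.
TOY_VECTORS: Dict[str, List[float]] = {
    "premium": [1.0, 0.0, 0.0],
    "policy": [0.0, 1.0, 0.0],
    "revenue": [0.0, 0.0, 1.0],
}


def document_vector(tokens: List[str], vectors: Dict[str, List[float]], dim: int) -> List[float]:
    """Average the vectors of in-vocabulary tokens.

    Tokens missing from `vectors` are skipped (an assumption of this sketch).
    If no token is known, a zero vector of length `dim` is returned.
    """
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


# "renewal" is not in the toy vocabulary, so only two words contribute.
doc_tokens = "premium policy renewal".split()
doc_vec = document_vector(doc_tokens, TOY_VECTORS, dim=3)
print(doc_vec)
```

The resulting document vector is what gets passed to the classification model as the feature vector for that document.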

Can you suggest a good approach for this case? Thanks a lot.
