Can't figure out output character encoding for MeCab


I'm trying to parse some Japanese text with MeCab, and I can't figure out what encoding the output is in. The surface forms come out fine, but the feature columns are garbled.

This is the output I'm getting:

これは ̾��,����,*,*,*,*,*
本   ̾��,����,*,*,*,*,*
です  ̾��,����,*,*,*,*,*
。   ̾��,������³,*,*,*,*,*
EOS

Steps I took:

  1. git clone https://github.com/taku910/mecab
  2. cd mecab/mecab
  3. ./configure --enable-utf8-only --with-charset=utf8
  4. make
  5. sudo make install
  6. mecab -o ~/Desktop/output.txt ~/Desktop/input.txt, where input.txt contains "これは本です。"
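One way to narrow this down is to check which encodings the raw output bytes are valid in. The sketch below is only a diagnostic aid, not part of MeCab: `decodable_as` and the candidate list are hypothetical names chosen for illustration, and the sample bytes simulate the situation I suspect (a UTF-8 surface form followed by a feature string in a different encoding such as EUC-JP) rather than reading the real output.txt.

```python
from typing import List

# Encodings to try. This list is an assumption: these are the charsets
# commonly seen with Japanese text, not an exhaustive set.
CANDIDATES = ["utf-8", "euc_jp", "shift_jis"]


def decodable_as(data: bytes) -> List[str]:
    """Return the candidate encodings under which `data` decodes without error."""
    ok = []
    for enc in CANDIDATES:
        try:
            data.decode(enc)
        except UnicodeDecodeError:
            continue
        ok.append(enc)
    return ok


# Simulated output line: the surface form is UTF-8 (copied from the input),
# while the feature string is encoded as EUC-JP (assumed for this demo).
surface = "本".encode("utf-8")
feature = "名詞,一般".encode("euc_jp")

print("surface:", decodable_as(surface))
print("feature:", decodable_as(feature))

# Viewing the EUC-JP bytes as UTF-8 shows the kind of garbling a terminal
# or editor would display for the feature column.
print(feature.decode("utf-8", errors="replace"))
```

To run this against the real file, read it in binary mode (`open(path, "rb")`), split each line on the tab byte, and pass the two halves to `decodable_as`. If the feature half decodes only as EUC-JP, that would suggest the installed dictionary is not UTF-8 even though MeCab itself was configured for it; I believe `mecab -D` prints the dictionary's charset, which would confirm this.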

I'm using macOS 10.15.3.
