I have Chinese word lists, e.g. reference = ['我', '是', '好', '人'] and hypothesis = ['我', '是', '善良的', '人']. Can I use nltk.translate.bleu_score.sentence_bleu(references, hypothesis) to score Chinese translations? Does it work the same way as for English? What about Japanese? I mean the case where the Chinese or Japanese sentences are already split into word lists, as with English. Thanks!
BLEU scores: can I use nltk.translate.bleu_score.sentence_bleu to calculate BLEU scores for Chinese?
11.3k Views Asked by tktktk0711 At
TL;DR
Yes.
In Long
The BLEU score measures n-gram overlap, so it's agnostic to language, but it depends on the sentences being split into tokens. So yes, it can compare Chinese/Japanese...
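To illustrate why tokenized input is all that matters, here is a minimal sketch of the clipped (modified) n-gram precision that BLEU is built on, applied to the word lists from the question. The helper names `ngrams` and `modified_precision` are made up for this example and are not NLTK functions; the sketch also handles only a single reference and leaves out the brevity penalty.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(reference, hypothesis, n):
    """Clipped n-gram precision against one reference (illustrative helper).

    Returns (clipped_matches, total_hypothesis_ngrams).
    """
    ref_counts = Counter(ngrams(reference, n))
    hyp_counts = Counter(ngrams(hypothesis, n))
    # Each hypothesis n-gram is credited at most as often as it occurs
    # in the reference.
    clipped = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    total = sum(hyp_counts.values())
    return clipped, total


# The tokens are just hashable strings; nothing here is specific to English.
reference = ['我', '是', '好', '人']
hypothesis = ['我', '是', '善良的', '人']

unigram = modified_precision(reference, hypothesis, 1)  # (3, 4)
bigram = modified_precision(reference, hypothesis, 2)   # (1, 3)
print(unigram, bigram)
```

The counting never looks inside a token, so the quality of the score depends on how the Chinese or Japanese text was segmented, not on the language itself.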
Note the caveats of using BLEU at the sentence level. BLEU was never designed with sentence-level comparison in mind; here's a nice discussion: https://github.com/nltk/nltk/issues/1838
Most probably, you'll see the warning when you have really short sentences, since they often share no higher-order n-grams (e.g. 4-grams) with the reference.
You can use the smoothing functions in https://github.com/alvations/nltk/blob/develop/nltk/translate/bleu_score.py#L425 to overcome short sentences.
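As a rough sketch of how smoothing could be applied to the word lists from the question, the snippet below uses `SmoothingFunction().method1` from `nltk.translate.bleu_score`. Picking `method1` is an assumption made for illustration; the module offers several smoothing methods, and which one suits your data is a judgment call.

```python
from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu

# sentence_bleu expects a list of references, each one a list of tokens.
references = [['我', '是', '好', '人']]
hypothesis = ['我', '是', '善良的', '人']

# Default 4-gram BLEU: the hypothesis shares no 3-grams or 4-grams with
# the reference, so NLTK warns and the score collapses towards zero.
plain = sentence_bleu(references, hypothesis)

# method1 is chosen here only as an example; it adds a small epsilon to
# n-gram orders that have zero matches.
smoothed = sentence_bleu(
    references,
    hypothesis,
    smoothing_function=SmoothingFunction().method1,
)

print(plain, smoothed)
```

The exact unsmoothed value depends on the NLTK version, so the useful comparison is that the smoothed score is no longer dominated by the missing higher-order n-grams.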