Azure Cognitive Services Translator: are there any criteria for determining whether a translation is reliable?


We'd like to show an indication in the UI to draw the user's attention when a translation was unsuccessful. By 'unsuccessful' I don't mean internal server errors or the like, but cases where the result is poor or unreliable, for instance because of bad input text.

The problem is that the API always returns something, even if we send it dummy data such as 'ksalnfdljknrwwonfjlksnfk nasnfoaewrnnfsjklnfs 294#ffklsdfl'. In those cases the API usually returns a copy of the input. We would like to get some kind of score (e.g. a number from 0 to 1) or a similar signal.

I considered checking whether output == input but, unfortunately, some words are spelled identically in different languages, e.g. 'hamster' in English and German. I then considered running language detection on the input first, to tell whether the input is nonsense. The good thing about the Detect API is that it returns a score, but it's not clear what threshold to use for that score.
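
For illustration, here is a minimal sketch of how the detection score could be read out. It assumes the v3 Detect response shape (a JSON array whose first element carries `language` and `score`); the class and method names are hypothetical, and it uses System.Text.Json only to stay self-contained.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper, not part of any Azure SDK.
public static class DetectResponseParser
{
    // Reads the language code and confidence score from the first element
    // of a Detect response body (assumed shape: [{ "language": ..., "score": ... }]).
    public static (string Language, double Score) ParseFirst(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement first = doc.RootElement[0];

        string language = first.GetProperty("language").GetString();
        double score = first.GetProperty("score").GetDouble();

        return (language, score);
    }
}
```

The JSON passed in would be the body returned by a POST to `/detect?api-version=3.0`, sent with the same headers and body format as the translate request.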

So, hypothetically: IF the input's detection score is < 0.3 (for instance) AND input == output, THEN show an error message.
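
Written out as code, the rule above might look roughly like this. It is only a sketch: the names are hypothetical, 0.3 is the placeholder value from the rule rather than a recommended threshold, and the comparison assumes plain text (with textType=html the markup might need normalizing first).

```csharp
using System;

// Hypothetical helper illustrating the proposed rule.
public static class TranslationQualityHeuristic
{
    // Placeholder threshold taken from the example above, not a tuned value.
    public const double DefaultScoreThreshold = 0.3;

    // Flags a translation as suspicious only when BOTH conditions hold:
    // the output is an unchanged copy of the input AND the detection
    // score of the input is below the threshold.
    public static bool LooksUnreliable(
        string input,
        string output,
        double detectionScore,
        double threshold = DefaultScoreThreshold)
    {
        bool unchanged = string.Equals(input.Trim(), output.Trim(), StringComparison.Ordinal);
        return unchanged && detectionScore < threshold;
    }
}
```

With this rule, an identical translation of a well-detected word such as 'hamster' would not be flagged, because its detection score stays above the threshold.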

What do you think of this approach? How do you solve this problem? What constants could I use as thresholds? Perhaps there are other criteria, or the Azure API has other parameters that we are not aware of but that could be helpful.

Here is a distilled piece of C# code that sends a request to the API:

using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

namespace Relesys.Core.Translation
{
    public class TranslationService
    {
        private static readonly string _endpoint = "https://api.cognitive.microsofttranslator.com";
        private static readonly string _subscriptionKey = "******************************";

        private readonly HttpClient _httpClient;

        public TranslationService()
        {
            _httpClient = new HttpClient();
            _httpClient.DefaultRequestHeaders.Clear();
        }

        // Sends the text to the Translator API and returns the raw JSON response body.
        public async Task<string> SendTranslateRequestAsync(string originalText)
        {
            List<string> routeAttributes = new List<string>
            {
                "api-version=3.0", // always set the API version
                "textType=html", // the text comes from an HTML text input, so it might contain HTML
                "to=de",
                "from=en"
            };

            using (HttpRequestMessage request = new HttpRequestMessage())
            {
                // Body content to send: a JSON array with one object per text to translate
                object[] body = new object[] { new { Text = originalText } };
                string requestBody = JsonConvert.SerializeObject(body);

                request.Method = HttpMethod.Post;
                request.RequestUri = new Uri($"{_endpoint}/translate?{string.Join("&", routeAttributes)}");
                request.Content = new StringContent(requestBody, Encoding.UTF8, "application/json");
                request.Headers.Add("Ocp-Apim-Subscription-Key", _subscriptionKey);

                HttpResponseMessage response = await _httpClient.SendAsync(request).ConfigureAwait(false);

                if (response.IsSuccessStatusCode)
                {
                    string jsonResult = await response.Content.ReadAsStringAsync().ConfigureAwait(false);

                    return jsonResult;
                }

                // Non-success status code: surface the error body returned by the service
                string failedResult = await response.Content.ReadAsStringAsync().ConfigureAwait(false);

                throw new Exception(failedResult);
            }
        }
    }
}