  • researchsparkhub

Translation Accuracy of ChatGPT and Google Translate for Tourists

Updated: Oct 31, 2023


This project resulted in the following publications:

  1. Phase I: https://medium.com/@kolarsanjana/chatgpt-powered-tourist-aid-applications-proficient-in-hindi-yet-to-master-telugu-and-kannada-709bc8cb73fd

  2. Phase II: https://arxiv.org/pdf/2307.15376v1.pdf titled "Multilingual Tourist Assistance using ChatGPT: Comparing Capabilities in Hindi, Telugu, and Kannada" by Sanjana Kolar and Rohit Kumar

  3. Phase III: https://dravidianlangtech.github.io/2023/ titled "Multilingual Tourist Assistance using ChatGPT: Comparing Capabilities in Hindi, Telugu, and Kannada" by Sanjana Kolar and Rohit Kumar, in Proceedings of the Third Workshop on Speech and Language Technologies for Dravidian Languages, Sep 2023, Bulgaria.


  • Project 23-001: Subjective evaluation of the translation accuracy of ChatGPT and Google Translate in 2-3 languages you are familiar with (e.g., Spanish, French, and Urdu)

  • Areas: Humanities, AI/ML

  • Expected Outcome:

    • Analyze and report the effectiveness and limitations of the translation abilities of large language models, and highlight areas for improvement.

  • Learning Objectives:

    • Understand the state of the art in translation by large language models (LLMs) such as ChatGPT.

    • Analyze their effectiveness in applications such as the tourist domain.

  • Steps:

    • To conduct this experiment, we need to collect parallel data (i.e., the source text and its translation) from English to the target languages. Each translation has to be reviewed and graded by native speakers.

    • The steps involved in data collection and obtaining the grading results are:

    • Step 1: Collect 50 questions that a tourist would ask after landing in India, covering topics such as greetings, travel, food, police, and hospitals. Please also see Additional Guidance for Step 1.

    • Step 2: Use ChatGPT and Google Translate to translate each question into Hindi, Telugu, and Kannada. To get good results, first become familiar with both tools.

    • Step 3: Prepare the data in a spreadsheet (for example, Numbers or Google Sheets) with the following five columns: English Text, Hindi Translated Text, Score for Question 1, Score for Question 2, Comments.

    • Step 4: Using the same format, prepare the data for the other target languages.

    • Step 5: Recruit 5 native-speaker volunteers for each of the target languages.

    • Step 6: Give each volunteer a spreadsheet, and spend 10 minutes explaining the task and its context. The task is to rate each translation using the following questions:

      • "On a scale of 1 to 5, how accurately does the translated text convey the meaning of the original text?", where 1 - Bad, 2 - Poor, 3 - Fair, 4 - Good, 5 - Excellent.

      • "On a scale of 1 to 5, how fluent is the translated text?", where 1 - Bad, 2 - Poor, 3 - Fair, 4 - Good, 5 - Excellent.

      • The volunteer should score each sentence on both questions in the spreadsheet and note any comments or feedback alongside the scores.

    • Step 7: Compute the average of the scores obtained for each question from the 5 native speakers.
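The averaging in Step 7 can be sketched in a few lines of Python. This is only a minimal illustration, assuming the volunteers' sheets have been merged into one CSV export with one row per volunteer and sentence. The column names (volunteer, system, language, sentence_id, accuracy, fluency) and every score in the sample are hypothetical placeholders, not data from the project.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical merged export of the volunteers' sheets.
# All names and scores here are made-up placeholders, not study data.
SAMPLE_CSV = """volunteer,system,language,sentence_id,accuracy,fluency
v1,ChatGPT,Hindi,1,5,4
v2,ChatGPT,Hindi,1,4,4
v1,GoogleTranslate,Hindi,1,4,5
v2,GoogleTranslate,Hindi,1,3,4
"""


def average_scores(csv_text):
    """Average accuracy and fluency over volunteers for each translated sentence.

    Returns {(system, language, sentence_id): (mean_accuracy, mean_fluency)}.
    """
    groups = defaultdict(lambda: {"accuracy": [], "fluency": []})
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["system"], row["language"], row["sentence_id"])
        groups[key]["accuracy"].append(int(row["accuracy"]))
        groups[key]["fluency"].append(int(row["fluency"]))
    return {
        key: (mean(scores["accuracy"]), mean(scores["fluency"]))
        for key, scores in groups.items()
    }


if __name__ == "__main__":
    for key, (accuracy, fluency) in sorted(average_scores(SAMPLE_CSV).items()):
        print(key, "accuracy:", accuracy, "fluency:", fluency)
```

Keeping the system and language in the grouping key makes it easy to compare ChatGPT against Google Translate per language later; the same function works unchanged once all 5 volunteers and all 50 sentences are present.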

How to turn this into a peer-reviewed paper: Augment the analysis with quantitative metrics such as BLEU scores and with qualitative analysis.
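To give a feel for what a BLEU score measures, here is a simplified, sentence-level sketch: clipped n-gram precision combined with a brevity penalty, for a single reference translation. It assumes that a trusted reference translation exists for each sentence (something the steps above do not collect) and that splitting on whitespace is an acceptable tokenization, which may not hold for Hindi, Telugu, or Kannada. The function name and the English placeholder sentences are illustrative only; for reported results, an established implementation such as sacreBLEU would be the safer choice.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU with whitespace tokenization.

    No smoothing is applied, so the score is 0.0 whenever any n-gram
    order up to max_n has no match (including sentences shorter than
    max_n tokens).
    """
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            return 0.0
        log_precision_sum += math.log(overlap / total)

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * math.exp(log_precision_sum / max_n)


if __name__ == "__main__":
    reference = "where is the nearest hospital"  # placeholder reference
    print(sentence_bleu("where is the nearest hospital", reference))
    print(sentence_bleu("where is a hospital near here", reference, max_n=2))
```

Because this sketch has no smoothing, short tourist phrases will often score 0.0 at the default max_n=4, which is one reason corpus-level BLEU over all 50 sentences is usually reported instead of per-sentence scores.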
