Google Translate and the power of Plagiarism?

The efficacy of Google’s translation engine was recently put to the test by a New York Times article. After reviewing the article and the results of the various tests, I couldn’t help but notice that the power of Google’s translation efforts is rooted in plagiarism.

Most providers of translation services use a technology known as Translation Memory. This technology stores translations created and reviewed by human translators in a database. The idea is to store each sentence so that if the same sentence comes up again, the same translation is reused. This technology provides faster completion times, higher accuracy, more consistency and, of course, a lower cost.
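
To make the idea concrete, here is a minimal sketch of exact-match reuse in Python. The class and method names are hypothetical and do not come from any particular vendor's product. Real translation memory tools typically also offer fuzzy (near-match) lookups, which this sketch leaves out.

```python
from typing import Dict, Optional, Tuple


class TranslationMemory:
    """Toy exact-match translation memory (illustrative names, not a real API)."""

    def __init__(self) -> None:
        # Key: (source language, target language, normalized source sentence).
        self._segments: Dict[Tuple[str, str, str], str] = {}

    @staticmethod
    def _normalize(sentence: str) -> str:
        # Collapse whitespace and ignore case so trivial differences still match.
        return " ".join(sentence.split()).casefold()

    def store(self, source_lang: str, target_lang: str,
              source: str, translation: str) -> None:
        """Save a human-reviewed translation for later reuse."""
        key = (source_lang, target_lang, self._normalize(source))
        self._segments[key] = translation

    def lookup(self, source_lang: str, target_lang: str,
               source: str) -> Optional[str]:
        """Return the stored translation, or None if the sentence is new."""
        key = (source_lang, target_lang, self._normalize(source))
        return self._segments.get(key)


tm = TranslationMemory()
tm.store("en", "fr", "Press the power button.",
         "Appuyez sur le bouton d'alimentation.")

# The same sentence (ignoring case and spacing) reuses the stored translation.
reused = tm.lookup("en", "fr", "press the  power button.")
# A sentence that has never been translated has no match.
missing = tm.lookup("en", "fr", "Press the reset button.")
```

A miss is where the human translator steps in; once reviewed, the new sentence would be stored so the next occurrence is a hit.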

Google is basically building a Translation Memory engine. We all know that Google is incredibly good at indexing information. The Google translation effort indexes web content as well as many published works through its Google Books project. The New York Times article shows that Google does a reasonably good job with the translation of published works. This is an unsurprising result for books that have been translated into many languages. The example of the passage from “The Little Prince” is obviously quite good. The book has been translated into more than 180 languages and has sold over 80 million copies worldwide.

So it is easy to understand why the dialog from “The Little Prince” and other popular books should be easy for Google to convert into other languages. After all, the work has already been done by humans. Google is just indexing the material.

Isn’t this a form of plagiarism? If I submit a passage for translation to Google and it spits back text coming from the translation of a published work, that is plagiarism. Plagiarism is generally defined as the unauthorized use or close imitation of the language and thoughts of another author and the representation of them as one’s own original work. I wonder if Google checked with Antoine de Saint-Exupéry’s estate before indexing his book and the subsequent translations of “The Little Prince”?

The natural question is how language service providers handle the large body of content in their own translation memories. In our firm, we store the data separately for each client. This is by design. We have nondisclosure agreements with most of our clients. We translate some very sensitive information, such as market research, clinical studies, marketing materials, internal memos and legal documentation. Sharing this information without regard for our clients is not a good idea.
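
As a rough illustration of that separation, the sketch below keeps one memory per client, so a lookup made for one client can never return a sentence stored for another. This is an assumption-laden toy in Python, not a description of our production system; the class name, method names and client IDs are all hypothetical.

```python
from typing import Dict, Optional


class ClientScopedMemory:
    """Toy translation memory partitioned by client (hypothetical design)."""

    def __init__(self) -> None:
        # Outer key: client ID. Inner mapping: source sentence -> translation.
        self._by_client: Dict[str, Dict[str, str]] = {}

    def store(self, client_id: str, source: str, translation: str) -> None:
        """Save a translation in the given client's private memory."""
        self._by_client.setdefault(client_id, {})[source] = translation

    def lookup(self, client_id: str, source: str) -> Optional[str]:
        """Search only the given client's memory; never fall back to others."""
        return self._by_client.get(client_id, {}).get(source)


memory = ClientScopedMemory()
memory.store("client-a", "Recall notice for model X.",
             "Avis de rappel pour le modèle X.")

# The owning client gets its own translation back.
own_hit = memory.lookup("client-a", "Recall notice for model X.")
# A different client submitting the identical sentence gets nothing.
other_hit = memory.lookup("client-b", "Recall notice for model X.")
```

The trade-off is deliberate: a shared pool would produce more matches, but at the cost of leaking one client's wording to another.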

The other side of this discussion is what Google does with materials you translate and refine using its tools. It reuses them, of course. The idea behind the newly minted DocTranslator is that you pre-translate a document using the Google translation engine and then refine the translation to meet your needs. The refined translation then goes back into the database for reuse.

The internet has become a valuable tool for competitive intelligence. Will translation be the next valuable tool for competitive intelligence? Will applications come out that query the Google engine for terminology in an effort to discover what competitors are up to? I would assume that this could happen if companies start openly using Google as their translation engine.

4 Responses to Google Translate and the power of Plagiarism?

  1. john says:

    I don’t agree at all. How can it be plagiarism? Taking your view, seeing as every word or expression has probably already been spoken and in the future will be in Google’s databases, then translating a simple expression like “walking the dog” is plagiarism. The translate tool is not the plagiarizer, in the same way that guns don’t kill people, people do. Whoever types into the translator tool what they want translated is responsible. So if I type in the works of Shakespeare in English to get French and publish it, I am a plagiarizer. Is Google, because it analysed it word by word and sentence by sentence to get the best translation? Obviously not. I think what they are doing is excellent, and over time it will contribute to more and more accurate translations. I can envisage them building on the technology and incorporating the database with voice translations, building on millions of fine-tuned results based on different dialects, accents, expressions, etc. to have an almost perfect translation in any language. Star Trek universal translator, here we come.

  2. Peter says:

    Thanks for the note, John. I agree with most of what you are saying: progress on automated translation tools and interpretation tools (via Google Voice, etc.) is not a negative. However, we are talking about corporate material and privacy issues or issues related to intellectual property.

    So if one of my clients that handles patent applications decides to use the Google tool set, are they aware that sensitive information related to the patent application is now out in the open and available for reuse by anyone who happens to submit a similar sentence?

    How about marketing campaigns?

    How about sensitive recall letters? Would clients want this information out in the open? I think not.

    The post was intended to shed light on the fact that material submitted to Google via Google Translate becomes publicly accessible and available for reuse by others without the submitter’s knowledge. Google Voice opens up a whole different issue. Do you really want someone cataloging all your conversations for reuse in an interpretation situation?

  3. Pingback: Can a translation violate copyright laws? | Argo Translation Inc

  4. Randall Talamantez says:

    Web-based translation has come a long way since it first appeared. At the very beginning, it would just translate text word by word without regard for any other aspects, which made the translated text practically useless. Much of that has changed with the emergence of Google translation. It can now produce pretty good translations of websites. But web-based translation still has some limitations. How should we decide whether to do the translation on the web or get a human translator involved?

    Look at our very own online site as well
    http://caramoantourpackage.com/index.php

  5. Peter says:

    Thanks for the note, Randall. The question really comes down to evaluating your customers’ needs. If you think that your customer base is OK with a machine translation, which will of course have errors and some strange content, then you should be fine. If instead your customer base requires accurate information, or there is some risk in providing inaccurate content, then you really should go with human translation.
