Machine translation of the law

In this post, I would like to give you a taster of the controversial subject of machine translation (MT). Full references are given below so that you can read further if you are interested.

Potentially, MT could, inter alia, reduce costs, widen access to content, process large volumes of data in order to identify items of interest, make translators’ work more interesting by taking over repetitive tasks, and facilitate communication, for example in social networks or where very unfamiliar languages are involved. The key caveat is that users be clear about the limitations of MT with respect to the translation skopos (or purpose).

Yates (2006) examined the accuracy of an MT system by comparing its translations with professional human translations. She aimed to discover whether it could provide translations that were sufficiently accurate for use in law libraries, in particular to “provide a ‘gisting-level’ translation service”. Human assessment was used in this study, which found that the tool’s “performance (…) was poor”, and the recommendation for law librarians was to consider MT output “questionable at best”.

Somers, a highly respected MT scholar, in his 2007 article, wrote a “reply to Yates”, in particular criticizing the assessment methods used in her study. He concluded that MT systems could provide “rough translation from which sufficient information can be got to indicate whether a full translation (and of which parts) is necessary”.

Kit & Wong (2008) compared how different MT systems translated legal texts across various language pairs, using automatic evaluation tools rather than human assessors. The genre scope of the study was limited to four European treaties and a United Nations Declaration. Since such data provided much of the ‘learning material’ for MT systems, the tools would naturally perform better in this genre. The study’s conclusions addressed the reliability of automatic assessment tools and large-scale data sets, the need to choose a tool according to the language pair, and the fact that usefulness depends on how such tools are used.
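For readers curious about what “automatic evaluation” can look like in practice, here is a minimal Python sketch of one common idea: counting how many word sequences (n-grams) in the machine output also appear in a human reference translation. This is a simplified illustration in the spirit of overlap-based metrics such as BLEU, not the tools used in the study; the function names and the example sentences are invented for this sketch.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a list of n-grams (as tuples) from a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n=1):
    """Share of candidate n-grams that also occur in the reference.

    Each n-gram is credited at most as often as it appears in the
    reference ("clipping"), so repeating a correct word earns nothing extra.
    """
    cand_counts = Counter(ngrams(candidate.lower().split(), n))
    ref_counts = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return matched / total


# Hypothetical sentences, invented for illustration only.
reference = "the court shall dismiss the appeal"
machine_output = "the court will dismiss the appeal"

unigram_score = clipped_precision(machine_output, reference, n=1)
bigram_score = clipped_precision(machine_output, reference, n=2)
print(unigram_score, bigram_score)
```

In this invented example, five of the six words match, so the score is high, yet the one word that differs (“will” for “shall”) is exactly the kind of difference that can matter in a legal text. An overlap score of this sort measures surface similarity to a reference, not legal accuracy.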

Mule & Johnson (2010) examined how MT could assist “limited-English-proficient” clients in having access to legal information. They state that “machine translation alone (…) has been found to be unacceptably inaccurate”, but offer suggestions to improve the results, such as amending the way in which the text is written, and list a number of factors to be taken into account such as “potential harm”, “importance of agency reputation or trust”, “possibility of gisting” and “extent of dissemination required”.

So, what conclusions can we draw from this taster? Well, all of the authors seem to converge in saying that we need to carefully choose how we make use of MT. Somers made a clear distinction between MT for “assimilation” and for “dissemination” (2005, pp. 128-129). Let us hope that those commissioning and using translation clearly bear in mind the skopos at all times, and that the strengths of both machines and human translators will be utilized in the most appropriate ways possible in the future.


Mule, J., & Johnson, C. (2010). How effective is machine translation of legal information? Clearinghouse Review Journal of Poverty Law and Policy, 44(32). Retrieved February 10, 2011 from the Lexis Library database.

Kit, C., & Wong, T.M. (2008). Comparative evaluation of online machine translation systems with legal texts. Law Library Journal, 100(2), 299-321.

Somers, H. (2005). Round-trip translation: What is it good for? [Electronic version]. In T. Baldwin, J. Curran, & M. van Zaanen (Eds.), Proceedings of the Australasian Language Technology Workshop 2005 (pp. 127-133). Carlton: Australasian Language Technology Association.

Somers, H. (2007). The use of machine translation by law librarians – a reply to Yates. Law Library Journal, 99(3), 611-619. Retrieved February 20, 2011 from:

Yates, S. (2006). Scaling the tower of Babel Fish: an analysis of the machine translation of legal information [Electronic version]. Law Library Journal, 98(3), 481-500.

8 thoughts on “Machine translation of the law”

  1. Since the majority of studies indicate that MT is useful for sifting the available data, it seems to me that it would be much more productive to use simplified MT tools to retrieve TRANSLATED keyword combinations for each document returned by the search. I believe that this method is better because it is cheaper and faster. Moreover, since nobody has any doubts about the inaccuracy and unreliability of MT, MT and this simplified method will yield almost the same result. MT gives a translation that can’t be trusted or even understood, plus some raw data (practically the same keywords). The TRANSLATED keyword search will not require expensive software and post-MT guesswork.
    In addition, of course MT would be much better if authors used a formalized (precise) language, but where are these authors supposed to come from?

  2. Thanks very much for your input Leonid, and welcome to the blog!

    Just to be sure I’ve got this clear, are you saying that MT would be better for retrieving texts in languages the searcher doesn’t necessarily understand, through keyword translation?

  3. Yes, exactly. One would normally search using keywords (we do it every day in Google or at the EPO web site). The difference will be that the search is conducted using an intermediary (an interface that translates the search statement and then translates the search results, all in keywords). I do not have any idea why such software has not been developed. It is quite obvious that MT results cannot be used anyway in 95% of cases.

  4. Back in psychology seminars in the 70s (experimental psychology, memory, learning, psycholinguistics, cybernetics, not much Freud) we asked questions like, could a computer ever ride a bicycle, read handwriting or translate. At the time Chomsky’s famous citation of the results of computer translation summed up MT. As far as I know computers can’t ride bicycles or really read handwriting yet. Can they translate?
    They were working on machine translation at MIT in those days. To test the routines they translated a phrase into Russian and then translated it back into English again. The phrase “the spirit is strong but the flesh is weak” came famously back as “The vodka is strong but the meat is rotten”. I’ve just tried the same phrase again on Google Translate and this is what came back: “The spirit of a strong but the flesh is weak”. Much better than fifty odd years ago. The improvement lies in the fact that the vodka translation is intelligible and makes perfect sense, and as a consequence you would never know it was wrong, while the Google translation is unintelligible and makes no sense, and therefore you really know it is wrong.
    I have often pondered over whether a computer could translate, or even have a conversation (if it could have a conversation, translation would just be a step away). I come to the conclusion that, to do so, it would have to be aware of its environment (context is essential in language and translation). A computer that could translate would be impossible to distinguish from an intelligent sentient being; it would be able to pass the Turing test under any conditions.
    To come down out of the clouds, years ago I was given a legal-financial translation to do. I think it was on double taxation. It was Italian into English. On completing it, I remarked to the client (a firm of accountants specialising in tax) that it was very strange that new companies should pay tax in the first year only and then be able to operate tax free thereafter. They looked alarmed. They had copied and pasted (in the early days of PCs) several paragraphs from the law in question and when they looked they found they had missed a preceding paragraph which contained a crucial “not” which reversed the meaning of all that followed.
    In my humble opinion, it will be a long time before translators will no longer be needed.
    Now computer-aided translation is a different kettle of fish.
    Jim (James Davis)

  5. This interview with a Google Translate research scientist is quite interesting:

    I took away three things:
    1) The use of English as a ‘hub’ between other languages

    2) This quote: “I mean that from an 80-20 standpoint, where 80% of the use cases we’ll be able to address effectively. The last 20% will be incredibly hard. That speaks to the fact that machine translation won’t be a substitute for a human translator.”

    3) And this – I love it! “So, it’s Wikipedia, my grandmother and statistics.”

  6. Phew!! We still have a 20% chance of keeping our jobs. Wikipedia is powerful (I have no idea how I ever managed to translate without it) and statistics can topple governments, but who the **** is his grandmother, for ****’s sake?

  7. Pingback: Legal considerations – machine translation and copyright « From Words to Deeds: translation & the law
