Legal considerations – machine translation and copyright | From Words to Deeds: translation & the law

This guest post is published under a GNU version 2 licence, and comes from the Open Translation Tools Manual (more about that in a forthcoming post). It was written by Ed Bice in 2009, with modifications by Thom Hastings also in 2009. Although it is three years old, I think it raises some very interesting topics for discussion. I look forward to reading your comments!

American copyright law considers a translation a derivative work. As such, translators must obtain permission from the copyright or derivative right holder of the source language text. With regard to online translation, we expect that as Machine Translation (MT) and Hybrid Distributed Translation (HDT, strategies combining human and machine translation) come of age, significant changes will need to be made to the legal framework to accommodate these technologies.

In his excellent research paper, Rebuilding Babel: Copyright and the Future of Machine Translation Online (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=940041), the legal scholar Erik Ketzan posits that Machine Translation (MT) will “create massive copyright infringement on an unprecedented global scale.” Ketzan argues that “the law needs to pave the way for companies to develop online translators [MT] and forestall a chilling effect on innovation that may result from legal uncertainty; software companies may not pursue online translators because of the threat of litigation. Online MT needs to be protected because it is socially, politically, and commercially beneficial. Technology may have put man on the moon, but machine translation has the potential to take us farther, across the gulf of comprehension that lies between people from different places.” Ketzan argues the need “for protection through the creation of effective licenses and statutory clarification of online MT’s non-infringing nature.”

We believe there is an unmet need for research and legal advocacy addressing the status of both Machine Translation and social translation of webpages and other digital content.

First, though, courtesy of Ketzan, let us take a quick stroll through the history of translation copyright in the West.

Among the notable unfortunate moments in the history of translation copyright, William Tyndale (patron saint of translators) was given neither a take down notice nor a cease and desist but was instead burned at the stake for his unauthorized English translation of the Bible in 1536. [The Radical Reformation xxvii (Michael G. Baylor et al. eds., Cambridge University Press 1991); David Daniel, William Tyndale: A Biography (Yale University Press 1994)] England’s first copyright law, The 1710 Statute of Anne did offer publishing protection for a fixed number of years, but did not protect authors with regard to translation rights. Two 18th Century court cases did address the issue, concluding: “a translation might not be the same with the reprinting the original, on account that the translator has bestowed his care and pains upon it.” [Burnett v. Chetwood, 2 Mer 441, 35 Eng Rep 1008-9 (Ch 1720) via Ketzan] And, “Certain bona fide imitations, translations, and abridgments are different [from copies]; and in respect of the property, may be considered new works: but colourable and fraudulent variations will not do.” [Millar v. Taylor, 4 Burr 2303; 98 Eng Rep 201 (1769) via Ketzan] In fact it was not until 1911 that English law granted a work’s author the right to control translations.

Ketzan continues, “American law did not recognize unauthorized translations as copyright violations until the late nineteenth century. [Naomi Abe Voegtli, Rethinking Derivative Rights, 63 Brook. L. Rev. 1213, 1233 (1997)] In Stowe v. Thomas, a Pennsylvania District Court held that an unauthorized German translation of Uncle Tom’s Cabin (German being commonly spoken in Pennsylvania) did not constitute a “copy” under copyright law. [Stowe v. Thomas, 23 F. Cas. 201, 207 (C.C.E.D. Pa. 1853) (“A “copy” of a book must, therefore be a transcript of the language in which the conceptions of the author are clothed; of something printed and embodied in a tangible shape. The same conceptions clothed in another language cannot constitute the same composition, nor can it be called a transcript or “copy” of the same “book.” ”)] Congress explicitly reversed this holding in the 1870 Copyright Act, which recognized a form of derivative right in translations by providing that “authors may reserve the right to dramatize or to translate their own works.” [Act of July 8, 1870, ch. 230 § 86, 41st Cong., 2d Sess., 16 Stat. 198] The 1909 Act maintained and expanded this translation derivative right, granting authors the right to ‘translate the copyrighted work into other language or dialects.’”

What emerges from Ketzan’s analysis is a history of translation copyright in which the absence of rights followed from the linguistically defined boundaries of the publishing industry of the day. There was no incentive to lobby for translation rights, because the publishing houses did not work across markets. This offers a clear analogy to social online translation. The structural fact that content producers (with notable exceptions) are generally not targeting a global audience, and have no good way of monetizing a single piece of content for one, works in favor of online translators. Linking a Japanese speaker who has read a lengthy translation of an article from Huffington Post back to that English language site is meaningless to the user and to Huffington Post.

Back to a consideration of legal issues related to MT.

The copyright implications of MT have not been addressed by a Federal court in the United States. As mentioned above, the exclusive right to authorize derivative works belongs to the owner of the copyright. Ketzan: “It is beyond question that translations constitute derivative works, which are actionable if not authorized by the copyright holder of the original. [17 U.S.C. § 101 (definition of “derivative work” includes “translation”). See also 1-2 Nimmer on Copyright, supra note 122, at § 2.04, § 3.01.] But is MT output actually a “translation,” and therefore a derivative work, under the Copyright Act? Just as the copyright laws do not expressly require “human” authorship, [Urantia Found. v. Maaherra, 114 F.3d 955, 958 (9th Cir. 1997) (addressing the bizarre question of whether a book purportedly authored by celestial beings may be copyrighted; “The copyright laws, of course, do not expressly require “human” authorship”)] Title 17 does not explicitly require translation, or any other derivative work, to be performed by a human. Sound recordings and art reproductions, like MT, can be created at the touch of a single button and create derivative works under § 101. While an argument can be made that, theoretically, MT is not “translation,” [Le ton beau de Marot, supra note 40, at 515-518] a plain language reading suggests that machine translation performs what it says: translation. As such, machine translation of a text creates derivative work under the Copyright Act and [an MT service provider] may be liable for copyright infringement if that translation is unauthorized.”

The notion of ‘implied license’ is a well-established concept: a content owner who publishes to the web understands and agrees that web crawlers will index the content and that web browsers may change its CSS and fonts. In a Nevada court ruling [Field v. Google, Inc., 412 F. Supp. 2d at 1115-16] it was held that the burden is on the website owner to opt out of Google indexing. That is, because the web-page owner has the ability to opt out of Google indexing, the owner cannot argue that Google infringes those rights. Currently, Google's translation service advises site owners who do not want their pages translated to use: <meta name="google" value="notranslate">
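To make the opt-out mechanism concrete, here is a minimal sketch of how a translation service might check a page for that tag before translating it. It is an illustration only, not a description of how Google or any other provider actually works: the class name `NoTranslateDetector` and the function `page_opts_out_of_translation` are hypothetical, and accepting a `content` attribute alongside the `value` attribute quoted above is an assumption made here because meta tags conventionally carry their setting in `content`.

```python
from html.parser import HTMLParser


class NoTranslateDetector(HTMLParser):
    """Records whether a page carries a <meta name="google"> notranslate tag.

    Hypothetical helper for illustration; not any provider's real code.
    """

    def __init__(self):
        super().__init__()
        self.opted_out = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "google":
            return
        # `value` is the attribute quoted in the text; `content` is an
        # assumed alternative spelling of the same opt-out.
        directive = attr_map.get("value") or attr_map.get("content") or ""
        if directive.strip().lower() == "notranslate":
            self.opted_out = True


def page_opts_out_of_translation(html: str) -> bool:
    """Return True if the page's markup contains the opt-out meta tag."""
    detector = NoTranslateDetector()
    detector.feed(html)
    detector.close()
    return detector.opted_out


if __name__ == "__main__":
    page = '<html><head><meta name="google" value="notranslate"></head></html>'
    if page_opts_out_of_translation(page):
        print("Owner has opted out; skip machine translation.")
    else:
        print("No opt-out found; implied license may apply.")
```

The design mirrors the implied-license argument: translation proceeds by default, and the owner's explicit tag is what switches it off. Whether the absence of the tag amounts to legal permission is, of course, the open question this post discusses.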

The notion of ‘fair use’ cannot be invoked for translations that constitute full web-page translations, because one of the considerations for meeting the ‘fair use’ criterion is “the amount and substantiality of the portion used in relation to the copyrighted work as a whole.” In the case of a full-page machine-generated translation, MT fails to meet ‘fair use.’

Ketzan asserts that MT service providers could find protection under the Digital Millennium Copyright Act (DMCA), which offers broad immunity to Internet Service Providers (ISPs) for content that might be presented over websites they serve. The question is whether an MT company could be defined as a ‘service provider’. The DMCA defines a service provider as “a provider of online services or network access.” This is perhaps the strongest legal defense for companies that provide MT services on the web.

Social Translation projects such as Global Voices and Meedan.net perform social translation of content on the web (Meedan.net also provides Machine Translation services similar to Google). The projects assert that their translation work is a non-commercial public service, they provide links to original sources, and they encourage any content producers who do not want translation services to notify them.

If there were a global equation to describe the velocity of innovation, collaboration, knowledge creation, and knowledge sharing – a sort of global understanding index – it would be limited by the scope and rate of the transfer of knowledge and information across language communities. Limiting this flow of information limits our ability to ‘compete globally’ (to borrow a phrase with irony) against our baser tendencies toward xenophobia, ignorance, and narrow thinking. We need a worldwide agreement that the worldwide web is, indeed, worldwide, and that in publishing a piece of content to the web, we should embrace the idea that people from outside our language have a right to put on a pair of glasses (MT algorithms/HDT services) that might allow them to attempt to decipher our words and ideas. Efforts to restrict translation rights may similarly limit our ability to successfully navigate our inevitably global future.