Stop press – EU’s DGT translation memories triple in size

Following the release of new data, the translation memories made available by the European Commission's Directorate-General for Translation (DGT), via the Joint Research Centre, have tripled in size.

I have tweeted this, but as I’m sure many of you will be interested, I’m posting it on the blog too.

The translation memories are parallel texts of the entire body of European legislation, comprising all the treaties, regulations and directives adopted by the European Union (EU), in 22 languages: Bulgarian, Czech, Danish, Dutch, English, Estonian, German, Greek, Finnish, French, Hungarian, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Slovak, Slovene, Spanish and Swedish.

You can download them free of charge from this page, but beware: it will take quite a while 🙂

The instructions for extracting the memories are on the same page. Whereas the first version included documents published up to 2006, the current files go up to 2010. Just to make things really clear (!): the data goes up to 2010, the files are called DGT-TM-2011, and they were released in April 2012!!
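
If you are curious how many translation units a language pair holds once you have extracted the memories, here is a minimal Python sketch. It assumes the extracted files are TMX, i.e. XML in which each `<tu>` unit contains one `<tuv>` per language, tagged with an `xml:lang` code such as `EN-GB`. The function name `count_pair_units` and the file name in the usage note are my own, purely for illustration, and this is not part of the official extraction instructions.

```python
import xml.etree.ElementTree as ET

# xml:lang sits in the reserved XML namespace, so ElementTree exposes it under this key.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def count_pair_units(tmx_source, lang_a, lang_b):
    """Count <tu> elements that hold a segment in both languages.

    tmx_source is a file name or a binary file object (assumed to be TMX).
    Languages are compared on the primary code only, so "EN-GB" matches "en".
    """
    wanted = {code.lower().split("-")[0] for code in (lang_a, lang_b)}
    count = 0
    # iterparse streams the file instead of loading it all at once.
    for _event, elem in ET.iterparse(tmx_source, events=("end",)):
        if elem.tag != "tu":
            continue
        found = set()
        for tuv in elem.iter("tuv"):
            # Older TMX versions use a plain "lang" attribute instead of xml:lang.
            code = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            found.add(code.split("-")[0])
        if wanted <= found:
            count += 1
        elem.clear()  # drop the unit's children once it has been counted
    return count
```

You would call it as, say, `count_pair_units("my-extracted-file.tmx", "en", "nl")`, where the file name is a placeholder for whatever your extraction produced.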

9 thoughts on “Stop press – EU’s DGT translation memories triple in size”

  1. Thanks hugely for this tip, will try a download when I have a spare afternoon! But might these TMs be so vast as to be unmanageable? Would be interested in others’ experiences. S

    • No Sue, they’re fine in my experience when used with Trados, for example. It does depend on computer oomph though, as so many things do these days. But my 5-year-old Mac copes!

  2. Actually, Juliette, the data set has increased by more than a factor of three in some languages. I think the last published set had some 300,000 TUs for Dutch/English, and now there are nearly two million. I wonder what I’ll see for German.

