Copyright and privacy protection of LR for sharing

Translators who want to share their translation memories (TMs) face several legal questions, chiefly how to get the TMs (language resources) legally "ready" for sharing. The main issues are:

  1. How should TMs be prepared, given that they typically contain names, phone numbers and other personal information? The metadata also typically identifies the translator in charge (personal data). How should this be handled?
  2. How can privacy protection be implemented in such cases?
  3. What is the copyright status of a TMX file, and how can legal certainty be ensured?


  • Personal Data

    • The concepts of personal data and processing are very broad
      – ‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’) [art. 4 GDPR]
        e.g. name, address, phone number, IP address, e-mail address, salary…
      – ‘processing’ means any operation or set of operations which is performed on personal data or on sets of personal data, whether or not by automated means [art. 4 GDPR]
    • Personal data can only be processed under strictly defined conditions
      – In principle, processing is lawful when the data subject has consented
      – BUT copyright law requires that the author's name be mentioned whenever a work is used, so the names of translators shall not be anonymized!
    • If data are anonymized, they can be freely processed (they are no longer personal data)
    • Conclusion: the documents shall be anonymized before they are transferred to ELRC


    • Anonymization consists of “breaking the link” between the data and a natural person
      – The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable [recital 26 GDPR]
    • Anonymization needs to be definitive (no re-identification possible)
    • For more information about anonymization techniques see:
      – Article 29 Data Protection Working Party: Opinion 05/2014 on Anonymisation Techniques
    • ELRC can provide you with on-site assistance for data anonymization
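
As a rough illustration of a first pass over a TMX file before sharing, the sketch below masks e-mail addresses and phone-like numbers inside the segments and leaves the translator metadata (e.g. creationid) untouched, in line with the point above that translators' names shall not be anonymized. The function names (mask_text, mask_tmx), the two regular expressions and the [EMAIL] / [PHONE] placeholders are assumptions made for this example, not part of any ELRC tool. Pattern matching of this kind is not definitive anonymization: it will miss personal names and other identifiers, it may also mask harmless digit sequences such as dates, and its output still needs review.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, deliberately simple patterns; real data needs broader rules
# and a manual check of the result.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s()./-]{6,}\d")


def mask_text(text):
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    if not text:
        return text
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def mask_tmx(tmx_string):
    """Mask segment text in a TMX document given as a string.

    Only the content of <seg> elements (including inline elements and the
    text that follows them) is changed; attributes such as creationid on
    <tu> or <tuv> are left as they are.
    """
    root = ET.fromstring(tmx_string)
    for seg in root.iter("seg"):
        for node in seg.iter():
            node.text = mask_text(node.text)
            if node is not seg:
                # Text that follows an inline element still belongs to the segment.
                node.tail = mask_text(node.tail)
    return ET.tostring(root, encoding="unicode")
```

Calling mask_tmx on the content of a TMX file returns the masked document as a string, which can then be checked by hand for remaining names or other personal data before it is transferred.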

    TMX and Intellectual Property

    • Individual words (and their translations) are not protected by copyright
    • However, translation units are often longer than individual words
    • Moreover, compilations (such as a TMX) can be protected by copyright if they are original in their selection or arrangement
    • Nevertheless, the chances for a TMX to be protected by copyright seem low
    • However, a TMX can be protected as a database by the sui generis right
      – This right belongs to the entity that financed the making of the TMX (i.e. the translation)
    • For legal certainty, you should grant the recipient a license to use the TMX, covering both copyright and the sui generis right
      – Within the ELRC initiative, we recommend that donors grant an ELRC partner a license that allows the data to be used for human language technology R&D purposes, and in particular by the European Commission within its MT system.