Transparency and Copyright: considerations on the AI Act by MCA and CAE

November 25, 2024

The European AI Office has recently made a significant stride in shaping the landscape of artificial intelligence (AI) in Europe. Michael Culture Association (MCA) and Culture Action Europe (CAE), in the framework of the Action Group on Digital and AI, are publishing their Considerations regarding the implementation of the European Union’s Artificial Intelligence Act (find the document at the bottom of the article).

To understand the main features of this position paper, here is a breakdown of what the AI Office is, the draft’s key provisions, and the feedback gathered so far.

What is the AI Office?

The EU AI Office is a central body facilitating the enforcement and monitoring of the AI Act, the European regulation for AI development and use across the Union. It was established after the Act entered into force and brings together members of different Directorates-General of the European Commission.

The drafting of the General-Purpose AI Codes of Practice is one of its key initiatives in connection with civil society: these documents will guide the actions of GPAI providers. Transparency and copyright are two key issues for the cultural and creative sector – Articles 50 and 53 of the Act regulate these aspects, as explained here in detail by Culture Action Europe.

What is GPAI? The term stands for general-purpose AI: models with generative capabilities, trained on broad sets of unlabelled data, that can be used for many different tasks. For example, ChatGPT falls into this category. Find more here in the European Parliament brief.

The first draft was created by independent experts serving as Chairs and Vice-Chairs across four thematic working groups composed of more than 1000 stakeholders. MCA and CAE are part of Working Group 1: Transparency and Copyright. This initial draft serves as a starting point to refine guiding principles and objectives, aiming to establish clear measures and performance indicators. Stakeholder feedback will shape the next iterations of the Code, through a consultative process embedded in the Working Groups.

Key Provisions in the Draft Code

The first draft (the full version is available here) focuses, among other things, on transparency and copyright compliance.

Transparency Measures

Transparency means making GPAI development (from the data used for training to how the system works) clear and comprehensible, so that people in the EU can trust these tools. For AI providers, the draft proposes some guiding rules.
First of all, providers must keep detailed documentation about their AI models. This includes:

  • How they were built and trained: Information about the data used, the model’s design, and testing results.
  • Intended uses: Descriptions of what the AI can (and cannot) do and any restrictions on its use.
  • Acceptable Use Policy: A clear set of rules about how the AI can be used safely and ethically. For example, it might specify that the AI cannot be used to spread misinformation. 

Providers must give enough details to help developers or businesses using the AI understand its strengths, weaknesses, and limitations. This is like giving a user manual with a new gadget. Where possible, providers should share some of this information publicly – this builds trust and ensures accountability.
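As an illustration only, the documentation items listed above could be collected in a simple structured record. The field names below are hypothetical and not taken from the draft Code; this is just a sketch of the kind of "user manual" a downstream developer might receive and check.

```python
# Hypothetical sketch of a model documentation record covering the items
# the draft asks providers to keep; all field names are illustrative.
model_documentation = {
    "model_name": "example-gpai-model",  # hypothetical model name
    "training_data": "description of the data sources used for training",
    "model_design": "architecture and design choices",
    "testing_results": "summary of evaluation and testing outcomes",
    "intended_uses": ["what the model can and cannot do"],
    "restrictions": ["uses that are not permitted"],
    "acceptable_use_policy": "rules for safe and ethical use, "
                             "e.g. no spreading of misinformation",
}

# A downstream developer or business can verify that all the sections
# mentioned in the draft are present before relying on the model.
required = {"training_data", "model_design", "testing_results",
            "intended_uses", "restrictions", "acceptable_use_policy"}
assert required.issubset(model_documentation)
```

The check at the end mirrors the idea that documentation is only useful if it is complete enough for others to assess the model's strengths, weaknesses, and limitations.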

Copyright Compliance

Copyright rules protect creators’ rights, ensuring their work isn’t used without permission. Article 53 of the AI Act states that AI providers must comply with EU copyright legislation (referring to the 2019 Copyright Directive). The draft of the Code sets general rules for this compliance.

  1. Upstream compliance – before using data: AI providers need to check that any data they use to train their models is legally obtained. For instance, if a dataset contains songs or books, these works must have been lawfully accessible, following the rules set out by the Copyright Directive.
  2. Downstream compliance – preventing misuse: AI providers should take steps to ensure their models don’t generate outputs that break copyright laws. For example, an AI should not create a song that’s too similar to a copyrighted one without permission.
  3. Transparency in Data use: Providers must be open about where their training data comes from and how they ensure copyright rules are followed.
  4. Text and Data Mining (TDM) exception: TDM is a method where large amounts of text or data are analyzed to train AI – basically crawling for data online and taking what is available. Providers can use this method under certain conditions, like respecting tools such as “robots.txt”, a file on a website that tells web crawlers (automated programs) what they can and cannot access.
  5. Commitment to Rights Holders: Providers are encouraged to collaborate with artists, authors, and other creators to respect their rights and find fair solutions.
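The “robots.txt” mechanism mentioned in point 4 can be illustrated with a short sketch. GPTBot is the name of a real AI-training crawler; the website URL is hypothetical, and this is only a minimal example of how a compliant crawler would check the file, using Python’s standard `urllib.robotparser` module.

```python
from urllib import robotparser

# Hypothetical robots.txt content: the site owner blocks a named
# AI-training crawler while leaving other crawlers unrestricted.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# The AI-training crawler is told to stay away from the whole site...
print(rp.can_fetch("GPTBot", "https://example.org/artworks/"))     # False
# ...while other crawlers remain free to access it.
print(rp.can_fetch("SearchBot", "https://example.org/artworks/"))  # True
```

As the working-group feedback below notes, this mechanism only works if crawlers voluntarily respect the file, which is part of why rightholders consider it insufficient on its own.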

Feedback from the Working Groups

The four working groups have provided valuable insights into the draft’s provisions. Here’s a summary of their feedback in the section concerning copyright and transparency.

  • Using the “robots.txt” method to manage copyright reservations is not enough – it can be technically difficult to implement and may be insufficient. Rightholders’ organizations proposed alternative mechanisms for opting out of the TDM exception that are more technically accessible for everyone and more direct.
  • Ensuring transparency about data sources and access permissions for model development – stakeholders highlighted the need for clearer guidance on compliance and on contractual relationships with data providers.
  • Creators should be remunerated if their copyrighted works were used without permission to train AI models. Many organizations advocated for a provision in the Code of Practice that makes AI providers responsible for compensating this unauthorized use.
  • In general, the opt-out option is not seen as a sustainable solution for creators and rightholders, since it shifts power towards AI providers and developers. Licensing systems and opt-in options have been discussed instead (with opt-in, rather than declaring that you do not want your data used to train the model, training can only happen if you explicitly allow it).

The group also explored transparency obligations: making publicly funded AI models available for scrutiny and clarifying transparency requirements for synthetic and private training data. Participants stressed the need to refine transparency measures to ensure clarity and feasibility.

What’s in it for the cultural heritage sector?

The situation of the heritage sector is very specific: in this field the notion of copyright takes a different shape.

In the Copyright Directive, Article 14 ensures that digital reproductions of public domain artworks—like paintings or sculptures that are no longer copyrighted—can be freely reused across all member states. This means museums cannot claim copyright on digital copies of these public works anymore, making it easier for people to access and enjoy cultural heritage, especially online.

Additionally, the TDM exception is differentiated by purpose: Article 3 states that data mining for the “purposes of scientific research” by research organisations and cultural heritage institutions is always permitted and constitutes an exception to the legislation. These exceptions allow cultural institutions to use TDM on publicly available works for purposes like scientific research or other approved activities. This opens up new possibilities for analyzing and reusing cultural content: MCA, especially in the framework of the Common European Data Space for Cultural Heritage, is an important advocate for this.

MCA and CAE position

We are participating in this discussion to bring in the voice of our Action Group on Digital & AI. Together with the group’s members, we drafted a position statement exploring considerations around the AI Act and its impact on the cultural and creative sector.

Access the considerations here
