
Open Source and AI: What the European AI Act Brings to the Table
The use of artificial intelligence (AI) raises major legal, ethical and societal issues, to which legislators are attempting to respond through new regulatory frameworks. The European Union has adopted Regulation (EU) 2024/1689 on Artificial Intelligence (hereinafter « AI Act »), which provides a derogation regime for Open Source AI systems, but with formulations that reveal a narrow interpretation of what « Open Source » means in the context of AI.
In a series of four articles, « Open Source and AI », inno3 offers an analysis of the AI Act and of its impact on the Open Source ecosystem, leading to concrete action items.
- In this first article « Open Source and AI: What the European AI Act Brings to the Table », the aim is to provide key reference points on the place of Open Source in this Regulation around general-purpose AI models (GPAI) and AI systems.
- In the second article « Open Source and AI: Three European Regulations, Three Logics and One Ecosystem », the objective is to place the AI Act in the broader context of a set of European regulations (NPLD, CRA, etc.) and to analyse their impacts on the European Open Source ecosystem and its international dynamics.
- In the final article « Open Source and AI: What Are We Really Talking About? », the aim is, on the one hand, to revisit the different definitions at work around « open » AI and their points of divergence and, on the other hand, to propose action items to implement.
Key Takeaways
- Free and open AI systems are excluded from the scope of the AI Act in law, but this exclusion is largely formal: copyright compliance obligations (Article 53(1)(c)) and publication of a summary of training data (Article 53(1)(d)) remain, creating a hybrid regime.
- The AI Act defines « Open Source » as access to code, parameters and architecture under a free and open licence — a minimalist definition that guarantees neither the accessibility of training data nor the practical possibility of model replication.
- The definition adopted opens the door to « open washing »: publication of allegedly open models (LLaMA, Mistral 7B-Instruct) while imposing restrictive use clauses that contradict the philosophy of free and open source.
- The main obligation thus remains: to respect copyright in the construction of training data, and to document this data transparently.
- The obligations have applied since August 2025, but the Commission’s power to impose sanctions will only take effect from August 2026.
Legal Framework of the AI Act: Between Regulation and Innovation
Context, Objectives and Issues of the Regulation
Published two years after the public launch of ChatGPT, the AI Act responds to the rapid development and growing integration of new AI and generative AI technologies across all sectors. It includes a number of measures designed to address urgent needs: protecting European citizens against the risks posed by algorithmic bias, system opacity, and misuse that could infringe on fundamental rights.
This architecture is intended to be « technologically neutral » and proportionate: the European legislator aims to regulate risky uses without hindering innovation. Recital 2 of the AI Act emphasizes that the regulation aims to promote « the development and adoption of safe and trustworthy AI systems » while ensuring « a high level of protection of health, safety and fundamental rights ».
The text introduces a risk-based approach, classifying AI systems into four levels: unacceptable risk (prohibited practices, Article 5), high risk (high-risk systems, Annex III and regulated products), limited risk (transparency obligations, Article 50) and minimal risk (no specific obligations). It thus invites private actors to integrate a culture of risk and compliance (a compliance-based logic that the GDPR has already fully established).
To this is added a specific regime for general-purpose AI models (GPAI), dealt with in Chapter V of the regulation.
The AI Act thus regulates two distinct and interconnected objects:
- General-purpose AI models or GPAI (also referred to as base models or foundation models) are the underlying, powerful and versatile models on which the various AI systems are based.
- AI systems are applications built on an AI model for a specific purpose (medical chatbot, recruitment system, credit rating tool, etc.).
General-Purpose AI Models
The Regulation does not define the notion of an AI model as such; it must therefore be inferred from a reading of the AI Act that an AI model consists of the model architecture, the model parameters (including weights) and the inference code that allows the model to be run.
Focusing on the models most prone to abuse, the AI Act covers only certain AI models, called general-purpose AI models (Article 3.63):
- « An AI model, including where the AI model is trained using a large amount of data using large-scale self-supervision […] »: this clarification ensures that large AI models fall within the scope of application, even if they are trained differently from « classical » AI models (without human supervision, with massive amounts of data and colossal computing capacity);
- « which displays significant generality and is capable of competently executing a wide variety of distinct tasks, »
- « regardless of the manner in which the model is put on the market, and which can be integrated into a variety of downstream systems or applications, with the exception of AI models used for research, development or prototyping activities before their putting on the market. »
The Commission’s Guidelines of 18 July 2025 on the scope of obligations of GPAI model providers clarify this definition. For a model to be qualified as a GPAI, it must meet two cumulative conditions: significant general character (ability to competently execute a wide range of distinct tasks) and large-scale training. A model limited to narrow tasks (voice transcription, image enhancement) is not a GPAI, even if it uses large-scale training techniques.
The AI Act further distinguishes « ordinary » GPAIs from GPAIs with systemic risk (Article 51), which are subject to greater obligations. A GPAI model is presumed to present systemic risk when the cumulative computing power used for its training exceeds 10^25 FLOP (floating point operations). This presumption may be reversed or extended by the Commission. About a dozen providers worldwide currently exceed this threshold (OpenAI, Google, Meta, Anthropic, xAI, DeepSeek, among others).
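The 10^25 FLOP threshold can be roughly related to model size and training data using the common approximation FLOP ≈ 6 × parameters × tokens for dense transformer training. A minimal sketch under that assumption (the model sizes and token counts below are illustrative, not figures from the Regulation):

```python
# Rough training-compute estimate against the AI Act's systemic-risk threshold.
# Uses the common approximation FLOP ~ 6 * N_parameters * N_training_tokens
# for dense transformer models (an engineering rule of thumb, not part of the text).

SYSTEMIC_RISK_THRESHOLD = 1e25  # Article 51: 10^25 floating point operations


def training_flop(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute in FLOP."""
    return 6 * n_params * n_tokens


def presumed_systemic_risk(n_params: float, n_tokens: float) -> bool:
    """True if estimated compute meets or exceeds the 10^25 FLOP threshold."""
    return training_flop(n_params, n_tokens) >= SYSTEMIC_RISK_THRESHOLD


# Hypothetical configurations:
print(presumed_systemic_risk(7e9, 2e12))     # 7B params, 2T tokens  -> False
print(presumed_systemic_risk(400e9, 15e12))  # 400B params, 15T tokens -> True
```

This back-of-the-envelope estimate only illustrates the order of magnitude involved; the legal presumption turns on the cumulative compute actually used, which providers must track.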
AI Systems (AIS)
An AI system is defined by the Regulation as « an automated system that is designed to operate at different levels of autonomy and may display a capacity for adaptation after deployment, and which, for explicit or implicit objectives, infers from the inputs it receives how to generate outputs such as predictions, content, recommendations or decisions that may influence physical or virtual environments » (Article 3.1).
A general-purpose AI system is defined as « an AI system that is based on a general-purpose AI model and that has the capacity to serve a variety of purposes, both for direct use and for integration into other AI systems » (Article 3.66).
In accordance with the risk-level classification presented above, AI systems are subject to differentiated obligations. Systems with unacceptable risk (e.g. exploitation of persons’ vulnerabilities) are prohibited, high-risk AI systems (e.g. in medicine) are subject to the strictest regulation (sensitive areas: education, employment, public services, justice, medical devices), systems with limited risk (chatbot) must comply with certain transparency obligations, and minimal risk systems (e.g. spell-checker) are not subject to specific obligations.
Actors Concerned by the Regulation
The AI Act provides specific obligations for each type of operator involved in making an AI system or general-purpose AI model available.
| Actor | Definition | Main Obligations |
|---|---|---|
| Provider | Natural or legal person who develops an AI system or GPAI model and places it on the market or puts it into service under their own name | Technical compliance, documentation, security testing, risk management system |
| Distributor | Actor who markets a product containing an AI system already placed on the market by the provider | Verification of compliance, respect for supplied documentation |
| Importer | Actor who imports products containing an AI system from third countries onto the European market | Verification of compliance, assurance of compliance with provider’s obligations |
| Deployer | Natural or legal person using an AI system (in particular a high-risk AI system) in a professional context or for a service | Monitoring of operation, documentation of use, compliance with specific obligations |
Furthermore, Article 25 of the Regulation makes any operator intervening along the value chain of a high-risk AI system liable in the same way as a provider if (1) they place on the market under their own name a high-risk AI system already placed on the market, (2) they make a substantial modification to a high-risk AI system already placed on the market, or (3) they modify the intended purpose of an AI system, including a general-purpose one, such that it becomes a high-risk AI system.
Finally, providers and deployers of high-risk or limited-risk systems are subject to important transparency obligations (Article 50):
- Users must be informed when they interact directly with an AI system;
- Users must be informed when they are exposed to an emotion recognition system or a biometric categorization system;
- All content created or modified by AI (images, videos, texts or sounds) must indicate this;
- News articles written by AI must mention this, unless a human journalist has reviewed and validated the content.
The AI Act thus organizes AI systems according to the level of risk they present, and provides specific responsibilities for operators involved in making these systems available to the public. This legal structure creates a framework in which Open Source occupies a particular place.
Open Source Artificial Intelligence Within the AI Act
The European Commission’s initial proposal of April 2021 contained no provision on general-purpose AI models (GPAI), with the original text focusing exclusively on AI systems. The emergence of large language models (recall ChatGPT in November 2022) and the growing awareness of their impact led the European Parliament and Council to add Chapter V on GPAI models during the trilogue negotiations in late 2023 (including the addition of an AI Office responsible for supervising this chapter).
Thus, while the exclusion of Open Source AI systems appeared in the text from the earliest versions, the relief regime for GPAI models (Article 53(2)) was hastily designed by drawing on concepts from existing software licences to define openness in the context of AI. This integration is not entirely smooth, with AI challenging and reinventing the very notion of Open Source.
The AI Act thus provides two distinct mechanisms for Open Source AI, which apply to different objects (AI systems vs. GPAI models) and produce different legal effects (exclusion vs. relief). This duality is a frequent source of confusion.
Exclusion of Open Source AI Systems from the Scope of the Regulation
Article 2, paragraph 12 of the AI Act provides:
« This Regulation does not apply to AI systems released under free and open-source licences, unless they are placed on the market or put into service as high-risk AI systems or as an AI system that falls under Article 5 or 50. »
AI systems published under free and open source licences are thus not included in the scope of the Regulation, except in three cases (Article 2.12):
- High-risk AI systems (Title III, Chapters 2 and 3): an AI system classified as high-risk under Annex III (biometrics, critical infrastructure, education, employment, law enforcement, justice, etc.) remains fully subject to the AI Act, even if distributed under an open source licence.
- Prohibited practices (Article 5): an AI system falling under absolute prohibitions (subliminal manipulation, social scoring, etc.) cannot benefit from any exemption.
- Transparency obligations (Article 50): AI systems designed to interact with natural persons, generate synthetic content (text, image, audio, video — « deepfakes ») or detect emotions remain subject to transparency obligations, even under an Open Source licence.
Open Source AI systems are not defined, but mentioned in recital 102 of the AI Act as systems whose « software and data, including models, are published under a free and open source licence by means of which they can be freely shared and which allows users to freely consult, use, modify and redistribute this software and data or modified versions thereof ».
The same recital clarifies that this exception is justified by the fact that systems distributed under a free and open source licence are considered to guarantee « high levels of transparency and openness » insofar as « their parameters, including weights, information on the model architecture and information on model usage are made public », and thus Open Source AI systems contribute « substantially to research and innovation in the market ».
This exception is nonetheless limited to the absence of commercial activity: free and open AI systems that are commercialized, monetized, provided as part of a commercial activity or integrated into products and commercial uses « should not benefit from the exceptions provided for free and open source AI components » (recital 103).
In practice, the limits on this exemption make it purely formal, since only Open Source AI systems presenting no or minimal risk are covered. Yet no obligation applies to AI systems in that category anyway (Open Source or otherwise): the exclusion in Article 2(12) therefore removes no concrete obligation and reads more as a political statement.
A comparative reading of recital 102 of the AI Act and recital 14 of the NPLD suggests that these recitals have as their primary purpose to recall the European Union’s support for the development of free and open AI systems and software, a political justification for the exclusion or exemption regimes provided for by the two texts.
Derogation Regime for Open Source General-Purpose AI Models
Article 53, paragraph 2 provides that the obligations relating to technical documentation (paragraph 1, point a) and information of downstream AI system providers (paragraph 1, point b) « do not apply to providers of AI models that are made available under a free and open source licence that permits access, use, modification and distribution of the model, and whose parameters, including weights, information regarding model architecture and information regarding model usage, are made publicly accessible ».
This obligation concerns the relationship between the model provider (upstream) and the AI system providers who integrate the model into their products (downstream), compliance by the former being necessary for compliance by the latter (the exemption is thus justified by the public nature of the information in an open model).
A derogation regime is provided in Article 53.2, supported by recitals 102 and 104, for general-purpose AI models published under a free and open source licence, provided that the model does not present systemic risk. This is not an exclusion from the scope of the Regulation, but a preferential regime that takes account of the Open Source mechanism and which concretely translates to:
- Exemption from detailed technical documentation obligations;
- Reduced transparency requirements;
- Elimination of the obligation to appoint an authorized representative.
Recital 103 clarifies that « AI components provided for a price or otherwise monetized should not benefit from the exceptions provided for free and open source AI components ».
The Commission’s Guidelines of 18 July 2025 brought an important clarification: the mere fact of making a model available on an open repository (HuggingFace, GitHub) does not in itself constitute monetization. However, commercial strategies built around the model (hosting services, fine-tuning, paid API) may cause loss of the benefit of the exemption. This distinction is essential for Open Source ecosystem actors who adopt mixed business models (open core, professional services, dual licensing). The demarcation line nonetheless remains blurred and will likely be clarified through the practice of the AI Office.
This monetization criterion echoes the CRA mechanism (recital 18), which excludes open source software « developed or provided outside the scope of a commercial activity ». The analyses in the CRA x Open Source guide appear to be applicable to the AI Act as a whole.
In practice, Article 2(12) excludes from the scope of the AI Act Open Source AI components intended for minimal or limited risk uses (i.e. the vast majority of community projects). Thus, a developer who publishes an image classification model under the Apache-2.0 licence for general purposes is not subject to the AI Act. However, if that same model is integrated into a medical diagnostic device (high-risk AI system under Annex I), the AI Act’s obligations apply fully to the provider of the integrated AI system. In the Open Source value chain, it will typically be the deployer who integrates the component who will be responsible, not the original community developer.
Thus, even Open Source GPAI models benefiting from the relief in Article 53(2) remain subject to two obligations:
a) Copyright compliance policy (Article 53(1)(c)):
The provider of a GPAI model (open or closed) must implement a policy aimed at respecting Union law on copyright and related rights, and in particular to identify and respect rights reservations expressed under Article 4 of Directive (EU) 2019/790 (text and data mining exception).
For Open Source AI actors, these developments open up strategic terrain not to be neglected. Rapidly increasing TDM opt-outs reduce the corpus of freely usable data and thus the very capacity to develop competitive models in Europe.
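In practice, many rights holders express TDM reservations in machine-readable form, for example by disallowing AI crawlers in robots.txt (a widely used convention, even though its sufficiency under Article 4 of Directive (EU) 2019/790 remains debated). A minimal sketch using Python's standard library; the crawler name and rules below are illustrative assumptions:

```python
# Check a robots.txt-style TDM reservation for a given crawler user-agent.
# robots.txt is one common, machine-readable way rights holders express
# opt-outs; its legal sufficiency under Article 4 DSM is still debated.
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content published by a rights holder:
robots_txt = """\
User-agent: ExampleAIBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# The opted-out crawler may not fetch; other agents are unaffected by this rule.
print(parser.can_fetch("ExampleAIBot", "https://example.com/article"))  # False
print(parser.can_fetch("OtherBot", "https://example.com/article"))      # True
```

A provider's Article 53(1)(c) policy would typically combine such checks with other reservation signals (HTTP headers, sectoral metadata standards), since no single opt-out mechanism is prescribed by the text.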
b) Documentation and sharing of a summary of training data (Article 53(1)(d)).
The provider must draw up and make publicly available a sufficiently detailed summary of the content used to train the model, in accordance with the template provided by the AI Office. This template, published in July 2025, requires documentation of:
- The types of content used;
- Data sources;
- Collection methods;
- And (for data from web scraping) a narrative description including the crawler used and the main domains consulted.
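As an illustration only, such a summary could be maintained alongside the model as structured metadata. The field names below are hypothetical and merely mirror the bullet points above; the authoritative structure is the AI Office template of July 2025:

```python
# Illustrative, machine-readable sketch of a training-data summary.
# All field names here are assumptions for illustration, not the
# official AI Office template published in July 2025.
import json

training_data_summary = {
    "content_types": ["text", "code", "images"],
    "data_sources": [
        {"name": "public web crawl", "collection_method": "web scraping"},
        {"name": "licensed news archive", "collection_method": "licensing agreement"},
    ],
    "web_scraping": {
        "crawler": "ExampleCrawler/1.0",   # hypothetical crawler name
        "main_domains": ["example.org", "example.net"],
        "tdm_opt_outs_respected": True,    # Article 53(1)(c) policy applied
    },
}

# Publish as a human-readable JSON document alongside the model card.
print(json.dumps(training_data_summary, indent=2))
```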
A direct parallel can be seen with SBOMs (Software Bill of Materials) made mandatory by the CRA for products containing digital components: just as the SBOM documents software components and their known vulnerabilities, the training data summary aims to document the « ingredients » of the AI model. This convergence reflects a common regulatory logic of traceability of the digital supply chain.
Respect for copyright risks becoming an increasingly important point of friction in light of regulatory developments on this matter. Thus, the European Copyright Society highlighted, in its January 2025 opinion on copyright and generative AI, the practical difficulties in implementing the TDM exception in the context of large-scale model training, particularly concerning the detection and compliance with opt-outs expressed by rights holders. These questions echo the work on the articulation between text and data mining and copyright in the context of AI.
More recently, the European Parliament adopted on 10 March 2026 the report of the Committee on Legal Affairs (« Copyright and generative artificial intelligence: opportunities and challenges »), which recommends establishing a rebuttable presumption: where an AI provider does not demonstrate full transparency on its training data, any relevant protected work would be presumed to have been used. A similar mechanism is being introduced in France (requiring the provider to prove that it did not use a work where there is evidence making such use likely), the Conseil d’État having validated the constitutionality of the proposed Loi Darcos on 19 March 2026. The Loi Darcos was adopted by the Senate on 8 April 2026 in first reading.
Summary: The Dual Open Source Regime of the AI Act
| Type of Object | Mechanism | Condition | Legal Effect |
|---|---|---|---|
| AI System (AIS) | Exclusion (Article 2.12) if non-commercial use | Free and open source licence + publication of code and data | Complete exemption from AI Act (except if high-risk, prohibited, or transparency required) |
| GPAI Model | Relief (Article 53.2) | Free and open source licence + publication of weights, architecture and usage information | Exemption from technical documentation and information to downstream providers; obligation maintained on copyright and data summary |
Image credit: © 2006, Señor Codo, blotter_explosion. This file is licensed under the Creative Commons Attribution-Share Alike 2.0 Generic license.




