The EU AI Act Newsletter #59: Key Deadline Extended
The general-purpose AI consultation deadline has been extended to 18 September at the request of a coalition of eleven tech industry associations.
Welcome to the EU AI Act Newsletter, a brief biweekly newsletter by the Future of Life Institute providing you with up-to-date developments and analyses of the EU artificial intelligence law.
Legislative Process
General-purpose AI consultation deadline extended: According to Euractiv's Eliza Gkritsi, a group of eleven tech industry associations asked the European Commission to extend the deadline for submitting input on the Code of Practice for general-purpose AI. In a letter dated 8 August, the group argued that the six-week consultation period, scheduled for the middle of summer, restricted their ability to provide meaningful input, and requested an extension of at least two weeks. The signatories, representing companies from both the EU and the US, included industry groups such as Allied for Startups and the American Chamber of Commerce to the European Union, several national tech associations from France, Germany and Poland, and organisations representing big tech companies such as Google, Meta, Oracle, Amazon, Microsoft and Samsung. The European AI Office has now extended the submission deadline to 18 September.
Analyses
The practical impact on programmers: A research team led by professors Holger Hermanns and Anne Lauber-Rönsberg analysed the impact of the AI Act on programmers. Their findings, set to be published in autumn, indicate that the Act mainly affects the development of high-risk AI systems, such as those used in job applicant screening or medical software. Most programmers will notice little change, as the Act imposes strict rules only on AI that operates in such sensitive areas; less sensitive AI systems, like those in video games, remain largely unaffected. The Act requires programmers in high-risk areas to ensure that the training data they use is fit for purpose, and to keep records that make it possible to reconstruct which events occurred and when. Despite these rules, research and development can continue without restriction. Hermanns and his colleagues view the AI Act positively, seeing it as a successful attempt to establish a reasonable and fair legal framework for AI across Europe without hindering the continent's global competitiveness.
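To make the record-keeping obligation concrete, here is a minimal sketch (not taken from the study) of how a high-risk system such as applicant-screening software might log timestamped events so that their sequence can later be reconstructed; the event names and fields are purely hypothetical.

```python
# Minimal, illustrative sketch of event record-keeping for a hypothetical
# high-risk system (e.g. applicant screening). Field names are assumptions,
# not taken from the AI Act or from the study discussed above.
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("screening_audit")
logging.basicConfig(filename="screening_audit.log", level=logging.INFO,
                    format="%(message)s")

def log_event(event_type: str, **details) -> None:
    """Append a timestamped, structured record of what happened and when."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": event_type,
        **details,
    }
    logger.info(json.dumps(record))

# Hypothetical usage: record each step of an automated screening decision.
log_event("input_received", applicant_id="A-1024", source="web_form")
log_event("model_inference", model_version="screening-v3.2", score=0.71)
log_event("decision_issued", outcome="invite_to_interview", human_review=False)
```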
OpenAI hesitant to use a text watermarking tool: According to Euractiv's Jacob Wulff Wold, OpenAI has created a text watermarking tool that could help ChatGPT comply with the AI Act, which mandates that AI-generated content be marked as such by August 2026. Despite having had the tool ready for a year, OpenAI has not yet released it, fearing the loss of users. OpenAI is concerned about potential negative impacts on non-native English speakers, and about the tool's limited robustness against tampering methods such as translation or rewriting the text with another generative model. One survey indicated that nearly 30% of loyal ChatGPT users would reduce their usage if watermarks were implemented without competitors doing the same, although another survey found that 80% of respondents worldwide support AI detection tools. The EU's draft AI Pact encourages companies to mark AI-generated content voluntarily, and OpenAI has expressed its commitment to complying with the regulation.
AI-enabled biological tools not in scope of the Act: John Halstead, Research Scholar at the Centre for the Governance of AI, published an article on experts' concerns that AI models could heighten the risk of bioweapon attacks, particularly large language models like ChatGPT, which could potentially enable novices to create and release deadly viruses. However, the risks associated with AI-enabled biological tools have received less attention. While these tools advance scientific research, they could also facilitate a greater number of lethal attacks, posing significant risks of their own. Current policies, focused mainly on chatbots, may not adequately address the dangers posed by biological tools. For instance, mandatory reporting requirements based on training compute might not effectively identify high-risk AI-enabled biological tools. The EU's AI Act, while extensive, does not currently cover these tools, as they are neither classified as "high-risk" under the Act nor considered general-purpose. The Act allows for future amendments to add new risk areas to Annex III, which could come to include AI-enabled biological tools.
Limitations of foundation model evaluations: Elliot Jones and Mahi Hardalupas, researchers at the Ada Lovelace Institute, together with William Agnew, Carnegie Bosch Fellow at Carnegie Mellon University, published an article reviewing how evaluations of foundation models are conceptualised and used in the field. Global policy efforts to ensure the safety of advanced AI systems have focused on evaluating foundation models to identify and mitigate risks. These evaluations aim to understand a model's capabilities, risks, performance, behaviour and social impact. For example, the EU's AI Act requires developers to evaluate foundation models for systemic risks and has established an AI Office to oversee these evaluations. However, Jones, Hardalupas and Agnew argue that evaluations alone are insufficient for ensuring the safety of foundation models in real-world conditions. The lack of agreed terminology or methods, along with the voluntary nature of evaluations, leads to inconsistencies in quality. Finally, the authors note that current policies allow companies to conduct evaluations selectively, often without ensuring that the results lead to actions that prevent unsafe products from entering the market.
The limits of compute thresholds: Aidan Peppin, Policy and Responsible AI Lead at Cohere For AI, has written an overview of evidence on the limitations of compute-based thresholds, to support policymakers implementing risk-based governance of AI models. AI governance frameworks, such as the White House Executive Order on AI Safety and the EU AI Act, have established thresholds based on the amount of computing power (measured in FLOPs) used to train AI models. Models exceeding these thresholds, such as 10²⁶ FLOPs in the US or 10²⁵ FLOPs in the EU, are subject to greater scrutiny and additional regulatory obligations. These thresholds offer a quantifiable, externally verifiable and universally applicable metric, making them appealing to policymakers aiming to manage AI risks. However, the effectiveness of compute-based thresholds is debated: increased training compute does not necessarily equate to higher risk, as model capabilities and performance are also shaped by factors such as data quality, optimisation techniques and algorithmic architecture. Peppin encourages policymakers to explore alternative or complementary assessment methods, consider dynamic rather than static thresholds, and clarify how FLOPs should be calculated.
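As a rough illustration of how such thresholds are applied, the sketch below uses the widely cited scaling-law approximation of about 6 FLOPs per parameter per training token to estimate training compute and compare it with the two thresholds; the model sizes and token counts are hypothetical, and this is not a calculation method prescribed by the Act or proposed by Peppin.

```python
# Illustrative sketch: estimating training compute with the common
# "6 x parameters x training tokens" approximation for dense transformers,
# then comparing against the EU (1e25 FLOPs) and US (1e26 FLOPs) thresholds.
# The model sizes and token counts below are hypothetical.

EU_THRESHOLD_FLOPS = 1e25
US_THRESHOLD_FLOPS = 1e26

def estimate_training_flops(n_parameters: float, n_training_tokens: float) -> float:
    """Rough estimate: ~6 floating-point operations per parameter per token."""
    return 6 * n_parameters * n_training_tokens

for name, params, tokens in [
    ("mid-size model", 70e9, 15e12),        # 70B parameters, 15T tokens
    ("frontier-scale model", 400e9, 20e12),  # 400B parameters, 20T tokens
]:
    flops = estimate_training_flops(params, tokens)
    print(f"{name}: ~{flops:.1e} FLOPs | "
          f"above EU threshold: {flops > EU_THRESHOLD_FLOPS} | "
          f"above US threshold: {flops > US_THRESHOLD_FLOPS}")
```

In this hypothetical, the larger model would cross the EU threshold but not the US one, which illustrates why the choice of threshold and the method of calculating FLOPs matter in practice.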