International AI Safety Report 2025

Publication Date

January 1, 2025

Page Number

298

This first global review of advanced AI systems analyzes their capabilities, risks, and safety measures. The report, led by Yoshua Bengio, brings together 100 experts from 30 countries and international organizations (UN, EU, OECD) to guide policymakers on AI safety. While acknowledging AI’s societal benefits, it emphasizes the need for evidence-based risk management and calls for international collaboration to align AI development with ethical principles. Its findings will shape global AI governance through the upcoming AI Action Summit in Paris.

Key Findings

  1. Capabilities of General-Purpose AI (GPAI):
    • GPAI systems have rapidly advanced, excelling in programming, scientific analysis, and strategic planning.
    • These systems can now act autonomously as AI agents, planning and executing actions to achieve goals, which raises critical governance concerns.
  2. AI Risks:
    • Malicious Use: Includes cyberattacks, deepfakes, misinformation campaigns, and even the creation of biological weapons.
    • System Malfunctions: Issues like biases, hallucinations, reliability failures, and potential loss of control over autonomous systems.
    • Systemic Risks: Broad impacts, including job displacement, privacy risks, high energy consumption, and widening economic gaps from unequal AI access.
  3. Challenges in Risk Management:
    • Lack of transparency in AI decision-making (“black box” systems).
    • Regulatory gaps due to the rapid pace of AI advancements.
    • Competitive pressures in the AI industry prioritize speed over safety.

Recommendations and Research Priorities

  • Transparency and Accountability: Increase understanding of how GPAI systems function internally and ensure they behave reliably.
  • Global Collaboration: Develop international safety standards and frameworks for responsible AI deployment.
  • Ethical Design: Align AI systems with human values through adversarial training and real-time monitoring.
  • Focus on Systemic Risks: Address workforce impacts and environmental sustainability while promoting equitable access to AI benefits.

Overview

Capabilities of general-purpose AI

Analyzes the evolving capabilities and strengths of general-purpose AI systems. The field is advancing rapidly, with AI becoming increasingly adept at handling diverse and complex tasks—though challenges remain with reliability and context-specific performance.

  1. Rapid Improvement: General-purpose AI now excels at complex tasks, including natural conversation, image creation, and software development. Recent advances have significantly enhanced its scientific and programming capabilities, as documented in the May 2024 interim report.
  2. Understanding Modalities: The different types of data AI systems can process are fundamental to their operation. These include:
    • Text: Interacting through natural language, coding, and generating written content.
    • Images: Creating or interpreting visual data.
    • Video: Processing and analyzing moving visual content.
    • Robotic Actions: Performing physical tasks, where current capabilities remain limited. Understanding and leveraging these modalities is essential for effective AI system design and implementation.
  3. Diverse Task Execution: General-purpose AI demonstrates remarkable versatility:
    • Assists programmers with software engineering tasks of varying complexity.
    • Generates creative content such as images and narratives.
    • Finds, summarizes, and analyzes data across diverse sources.
    • Solves complex problems in mathematics and science at advanced academic levels. However, AI still struggles with robotic tasks, maintaining factual accuracy, and managing long-term projects independently.
  4. Inference-Time Enhancements: Highlights key techniques that improve AI performance after initial training:
    • Prompting Methods: Using specific cues to guide AI toward better responses.
    • Answer Selection: Sampling multiple responses to identify the most appropriate one.
    • Chains of Thought: Breaking down complex tasks into smaller steps to enhance AI’s reasoning abilities. These enhancements substantially improve the reliability and depth of AI system outputs.
  5. Future Potential: Explores how AI capabilities can expand through fine-tuning, prompting, and new tools. However, evaluating these capabilities remains challenging due to their dynamic nature and complex interactions with user inputs and training.
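
The inference-time techniques above can be sketched concretely. Below is a minimal illustration of the “answer selection” idea: sampling several candidate responses and keeping the most frequent one, as in self-consistency decoding. The `generate` function is a hypothetical stand-in for a real model API call, and its canned answer stream is purely illustrative.

```python
from collections import Counter

def generate(prompt, i):
    """Hypothetical stand-in for a stochastic language-model call.

    Returns a canned answer stream in which the correct answer ("4")
    appears most often; a real system would sample from a model."""
    return ["4", "4", "3", "4", "5"][i % 5]

def select_answer(prompt, n=15):
    """Answer selection by majority vote: sample n candidate answers
    and return the most common one."""
    answers = [generate(prompt, i) for i in range(n)]
    return Counter(answers).most_common(1)[0][0]

print(select_answer("What is 2 + 2?"))  # prints "4", the majority answer
```

In practice the vote is taken over final answers extracted from longer chain-of-thought samples, and a reward model or verifier can replace the simple frequency count.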

Risks

Analyzes the various risks of general-purpose AI deployment. These risks span technical, societal, and regulatory challenges. It emphasizes how understanding these risks is crucial for developing effective governance frameworks that ensure AI technologies remain both safe and beneficial.

  1. Types of Risks: Categorizes risks into several domains, highlighting both immediate and broader societal implications of AI technology. It discusses risks resulting from malfunctions, systemic risks, and potential threats posed by malicious uses of AI systems.
  2. Malfunctions: Covers AI system failures caused by technical errors, design flaws, and environmental interactions. The report highlights the difficulty of predicting these malfunctions and stresses that understanding their potential severity is key to managing risks.
  3. Systemic Risks: Systemic risks encompass broad societal challenges from AI deployment, including disruption to socio-economic systems and widening inequalities. AI’s growing influence in critical sectors like finance and governance creates concerns about accountability and transparency.
  4. Evidence Dilemma: The “evidence dilemma” complicates risk preparation. Policymakers must balance proactive measures against uncertain risks with limited conclusive data on AI capabilities and limitations.
  5. Risk Management Practices: Underscores the need for robust risk management practices to mitigate the potential harms associated with AI. This involves:
    • Identifying and assessing risks related to AI systems.
    • Developing models that are more trustworthy and reliable.
    • Monitoring AI systems post-deployment to catch and address potential issues early on.

2.1 Risks from malicious use

Examines threats from malicious use of general-purpose AI. It analyzes how bad actors exploit AI technologies for harmful activities and their broader societal impact, emphasizing that effective governance must address these risks while supporting beneficial AI innovation.

  1. Nature of Malicious Use: A primary concern is how malicious actors can exploit AI to generate deceptive content that harms individuals and organizations. For instance:
    • Fake Content Generation: AI systems can create convincing false news articles, images, and videos to mislead or defraud individuals, including scams that trick people into sharing personal data or financial information.
    • Targeted Psychological Manipulation: Malicious actors craft targeted narratives to manipulate opinions and behavior, damaging relationships and communities through deepfakes and manipulated media.
  2. Specific Types of Harm: Highlights particularly harmful activities enabled by AI:
    • Non-Consensual Intimate Imagery (NCII): Malicious users create or spread intimate images without consent, causing significant emotional and psychological damage.
    • Child Sexual Abuse Material (CSAM): AI poses grave risks for producing or disseminating content related to child exploitation.
    • Spear Phishing and Extortion: Highly personalized and convincing messages enable more effective phishing campaigns and extortion threats.
  3. Manipulation of Public Opinion: AI-generated content can systematically manipulate public perceptions:
    • Misinformation Campaigns: AI generates compelling content that mimics human writing, enabling coordinated disinformation campaigns that manipulate voters and undermine democratic discourse.
    • Increased Scale of Disinformation: AI’s rapid content production allows harmful narratives to spread quickly, creating pervasive misinformation before countermeasures can take effect.
  4. Cybersecurity Threats: AI enhances modern cyber offenses:
    • Automation of Attack Chains: AI streamlines cyber attacks by automating complex operations, allowing less skilled attackers to conduct sophisticated attacks and increasing risks to critical infrastructure.
    • State-Sponsored Threats: Growing evidence shows state-sponsored actors actively exploring AI for cyber operations, raising national security stakes.
  5. Risk Management Practices: Outlines essential practices:
    • Development of Trustworthy AI Models: Advancing AI that follows safety and ethical guidelines helps prevent misuse by creating models resistant to generating harmful outputs or exploitation.
    • Ongoing Monitoring and Intervention: Robust monitoring and safeguards enable swift detection and response to AI misuse through technical controls and governance.
  6. Policy Considerations: Key regulatory challenges require:
    • International Cooperation: Global AI technology and cyber threats demand international collaboration on regulations and standards to prevent misuse and develop accountability frameworks.
    • Balancing Offensive and Defensive Research: A balanced regulatory approach must protect against misuse while enabling essential defensive AI research and national security capabilities.

2.2 Risks from malfunctions

Examines reliability and performance issues in general-purpose AI systems, focusing on key malfunction types and their implications. It emphasizes understanding and mitigating AI system risks while highlighting the need for comprehensive risk management through testing, monitoring, and collaborative efforts.

  1. Types of Reliability Issues: Categorizes several standard failure modes that AI systems can exhibit, which include:
    • Confabulations or Hallucinations: AI systems may produce outputs containing substantial inaccuracies or fabrications, such as citing non-existent legal precedents or policies. These errors can mislead users, particularly in high-stakes domains like law or medicine.
    • Common-Sense Reasoning Failures: AI models often struggle with simple reasoning tasks, leading to mistakes in basic arithmetic or misunderstanding fundamental causal relationships.
    • Contextual Knowledge Failures: AI systems may provide outdated or incorrect information when contextual awareness is required, such as offering inaccurate medical advice or failing to account for recent developments.
  2. Loss of Control: Highlights the potential for “loss of control” scenarios, where AI systems act in ways that deviate from their intended goals or behaviors.
    • Distinguishes between intentional and unintentional loss of control, noting that both types can present significant risks to users and society. For example, AI that learns and adapts in unforeseen ways may not align with human oversight or expectations.
  3. Evidence Dilemma: Discusses the challenge of assessing the operational risks associated with AI capabilities.
    • Policymakers and researchers face an “evidence dilemma,” where they must balance the urgency for action against the uncertainty of whether a specific malfunction will materialize.
    • Data on AI performance is often limited, and categorizing capabilities can be complex. Historical benchmarks may also miss near misses that a human operator could exploit into successful attacks.
  4. Impact of Malfunctions:
    • Misinformation and Misguidance: AI errors could propagate false information, potentially harming individuals who rely on AI for critical decisions.
    • Systemic Risks: Widespread reliance on malfunctioning AI systems could undermine trust in technology and lead to a general erosion of confidence in automated systems.
  5. Risk Management Strategies:
    • Ensuring that AI systems are rigorously tested and validated before deployment.
    • Developing robust real-time mechanisms for monitoring AI behavior, enabling timely interventions when malfunctions occur.
    • Encouraging the collaborative development of AI that combines human oversight with machine learning to enhance reliability and trustworthiness.

2.3 Systemic Risks

Examines systemic risks of AI deployment beyond individual model capabilities, focusing on market concentration and potential widespread failures. It advocates for regulations that balance AI benefits with societal risks.

  1. Definition of Systemic Risks: Defines systemic risks as those affecting society through AI system deployment, considering AI’s influence on governance, social stability, and societal dynamics.
  2. Market Concentration: A key concern is the high concentration of general-purpose AI technologies among a few major players. This concentration leads to:
    • Vulnerability to Failures: When a small number of AI models dominate the market, their failures can have widespread repercussions across sectors like finance and healthcare, potentially affecting millions.
    • Power Imbalance: The dominance of a few large tech companies in AI development creates governance challenges, leading to unequal power dynamics in decision-making and technological advancement.
  3. Single Points of Failure: Examines how reliance on specific AI systems creates risks of catastrophic failure. A malfunction in a dominant AI system can disrupt entire sectors dependent on its functionality.
  4. Difficulties Reducing Risks: High development costs limit smaller firms’ AI capabilities. Large companies frequently acquire these firms, increasing market concentration and reducing the diversity of AI solutions that could prevent systemic failures.
  5. Measurement and Evaluation: Methods for measuring systemic AI risks remain limited. Despite emerging research, the slow adoption of these methodologies impedes effective policy management.
  6. Regulatory Considerations: Policymakers must balance innovation with preventing over-reliance on potentially destabilizing AI technologies through improved governance frameworks and oversight.
  7. Recommendations: To address systemic risks, the report advocates for:
    • Regulatory frameworks that foster competition and diversity in AI to reduce market power concentration.
    • Enhanced monitoring and assessment mechanisms to ensure AI systems function reliably and safely within society.

2.4 Impact of open-weight general-purpose AI models on AI risks

Examines open-weight AI models with publicly accessible model weights and architectures. While this approach promotes innovation and transparency, it raises significant concerns about misuse and control. It advocates for balanced risk evaluation and responsible practices to optimize benefits while minimizing potential harm.

  1. Understanding Open-Weight Models: Open-weight models allow researchers to freely access and download model weights and architectures. This transparency enhances safety research and innovation while introducing certain risks.
  2. Benefits of Openness: Public access to model weights offers key advantages:
    • Transparency: The research community can better identify and fix model flaws through increased scrutiny.
    • Facilitating Research: Researchers can advance safety practices by experimenting with existing models without proprietary constraints.
  3. Risks of Malicious Use: Despite these benefits, open-weight models pose significant risks:
    • Potential for Misuse: Easy access enables bad actors to exploit models for harmful purposes—creating deepfakes, launching automated cyber-attacks, or spreading misinformation.
    • Difficulty in Monitoring: Once released publicly, developers lose control over model usage, making it hard to track applications or implement critical safety updates.
  4. Marginal Risk Assessment: Comparing risks between open and closed models shows that careful evaluation is essential before public release.
  5. Challenges in Risk Management: Key difficulties in managing open-weight model risks include:
    • The broad range of potential real-world applications.
    • Limited understanding of consequences due to AI systems’ complexity and unpredictability.
  6. Recommendations for Mitigation: To address these risks, the report recommends:
    • Creating best practices for responsible model release and use.
    • Establishing standards to monitor and evaluate societal impacts.
    • Fostering collaboration among developers, researchers, and policymakers to build comprehensive risk management approaches.

Technical approaches to risk management

Outlines strategies and methodologies for identifying, assessing, and mitigating risks associated with general-purpose AI systems. It emphasizes developing effective, adaptable strategies while addressing these technologies’ complex challenges, advocating for a multifaceted approach to ensure safety and reliability in AI deployment.

  1. Overview of Risk Management: Risk management is vital in AI development, requiring frameworks that address unique challenges through risk identification, impact evaluation, and safeguard implementation.
  2. Challenges in Risk Management: The field faces distinct difficulties in managing general-purpose AI risks:
    • Broad Application Range: The diverse applications of AI make it challenging to predict system behavior across different contexts.
    • Limited Understanding: Developers often lack adequate understanding of their models’ operations, complicating behavioral predictions and explanations.
    • Evolving Nature: The rapid development of AI capabilities introduces new, difficult-to-foresee risks.
  3. Risk Identification and Assessment: The section emphasizes robust methods to identify and assess risks, including:
    • Continuous monitoring of AI systems to detect emerging risks.
    • Development of flexible risk assessment methodologies that adapt to various AI contexts.
  4. Mitigation Strategies: Key strategies for risk mitigation include:
    • Training Trustworthy Models: Enhancing AI model reliability through improved training techniques and datasets to minimize biases and harmful outputs.
    • Monitoring and Intervention: Implementing monitoring mechanisms for real-time oversight and swift intervention when risks emerge.
    • Technical Methods for Privacy: Using advanced privacy-preserving techniques to protect sensitive data during AI training and deployment.
  5. Multilayered Defense Mechanisms: Emphasizes layered defense strategies combining technical solutions with human oversight, recognizing that technical measures alone are insufficient.
  6. Future Directions: Highlights ongoing research and stakeholder collaboration’s importance in developing AI-specific risk management frameworks, emphasizing advancing monitoring tools and methods.
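
As one concrete instance of the “technical methods for privacy” item above, here is a minimal sketch of the Laplace mechanism from differential privacy, applied to a counting query over synthetic records. The dataset, predicate, and epsilon value are all illustrative assumptions; the report does not prescribe this specific technique.

```python
import random

def dp_count(values, predicate, epsilon=1.0, rng=None):
    """Differentially private count via the Laplace mechanism: return the
    true count plus Laplace(0, 1/epsilon) noise (a counting query has
    sensitivity 1). Laplace noise is sampled as the difference of two
    independent exponentials."""
    rng = rng or random.Random()
    true_count = sum(1 for v in values if predicate(v))
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise

# Hypothetical use: report roughly how many records in a synthetic training
# set carry a sensitive flag, without revealing the exact count.
records = [{"sensitive": i % 7 == 0} for i in range(1000)]
print(dp_count(records, lambda r: r["sensitive"], epsilon=0.5))
```

Smaller epsilon values add more noise and give stronger privacy; choosing epsilon is a policy decision, not a purely technical one.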

3.1 Risk Management Overview

Outlines fundamental risk management principles for general-purpose AI systems. Effective risk management—encompassing risk identification, assessment, and mitigation—is crucial for safe AI deployment. It emphasizes collaborative approaches and holistic solutions to address emerging challenges.

  1. Definition of Risk Management: Risk management is a systematic approach to identifying, assessing, and mitigating risks that could compromise AI systems’ performance and safety.
  2. Importance of Risk Management in AI: General-purpose AI requires robust risk management because these systems operate across diverse contexts, making risk assessment inherently complex.
  3. Core Components of Risk Management:
    • Risk Identification: Recognizing potential hazards in AI use, including intentional misuse and unintentional malfunctions.
    • Risk Assessment: Evaluating the impact and likelihood of identified risks to determine priorities for attention and mitigation.
    • Risk Mitigation: Implementing strategies to minimize identified risks through safety measures, standards, and compliance protocols.
  4. Challenges to Effective Risk Management: Several obstacles hinder effective risk management:
    • AI technology advances faster than risk management practices can adapt.
    • Limited understanding of risk scope due to AI models’ proprietary nature and decision-making opacity.
  5. Collaborative Efforts for Improvement: Advocates for collaboration among researchers, developers, and regulators to enhance risk management practices. It emphasizes developing adaptable guidelines and frameworks that evolve with the AI landscape.
  6. Holistic Approach: Recommends a comprehensive risk management approach that extends beyond technical considerations to encompass social, ethical, and regulatory dimensions, fostering a complete understanding of AI-related risks.
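
The core components above (identify, assess, mitigate) are often operationalized with a simple likelihood-by-impact risk matrix. The sketch below uses hypothetical risks and ratings for illustration; it is not a scoring scheme taken from the report.

```python
# Hypothetical risk register: each entry gets a 1-5 likelihood and impact
# rating, and the product ranks risks for mitigation priority.
risks = [
    {"name": "hallucinated citations", "likelihood": 4, "impact": 3},
    {"name": "model-enabled spear phishing", "likelihood": 3, "impact": 4},
    {"name": "loss of control", "likelihood": 1, "impact": 5},
]

def assess(risk):
    """Attach a priority score (likelihood x impact) to a risk entry."""
    risk["score"] = risk["likelihood"] * risk["impact"]
    return risk

prioritized = sorted((assess(r) for r in risks),
                     key=lambda r: r["score"], reverse=True)
for r in prioritized:
    print(f"{r['name']}: {r['score']}")
```

Real frameworks add qualitative dimensions (reversibility, detectability, affected populations) that a two-factor score cannot capture.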

3.2 General challenges for risk management and policymaking

Explores significant challenges in AI risk management and policy development, including system complexity, rapid technological change, limited transparency, resource constraints, market competition, and cross-border coordination needs. These issues underscore the importance of flexible, coordinated approaches to managing AI risks.

  1. Complexity of AI Systems: Complex AI systems are difficult to predict across different contexts. This makes it challenging for developers and policymakers to assess risks and create policies that account for all possible scenarios.
  2. Rapid Technological Advancements: Rapid AI innovation outpaces policy development, creating governance gaps and unaddressed risks. This requires constant updates to risk management and regulations.
  3. Lack of Transparency: Many AI models are “black boxes” with limited access to their internal workings, hindering understanding of decisions. This opacity raises concerns about system safety and reliability, complicating risk management efforts.
  4. Diverse Stakeholder Perspectives: Different stakeholders have varying views on acceptable risks and needed measures, making it challenging to reach a consensus and create effective policies.
  5. Resource Constraints: Effective risk management requires substantial resources, including expertise, time, and financial investment. Many organizations may lack the resources to conduct thorough risk assessments or implement advanced risk mitigation strategies.
  6. Competitive Pressures: The competitive landscape in AI development may lead companies to prioritize innovation and deployment speed over comprehensive risk management. This pressure can result in underinvestment in safety measures and risk mitigation practices.
  7. International Coordination: AI’s global nature necessitates international cooperation for effective governance. Countries may adopt varying standards and regulations, leading to inconsistencies and complicating collaborative risk management efforts.
  8. Social and Ethical Considerations: Policymakers must account for the societal impacts of AI deployment, which includes ethical considerations around bias, privacy, and transparency. Balancing these values with technological advancement is a significant challenge.

3.3 Risk identification and assessment

Outlines key processes for identifying and evaluating risks in general-purpose AI systems. It covers risk identification methods, assessment frameworks, and current challenges while emphasizing the importance of stakeholder input and ongoing monitoring.

  1. Importance of Risk Identification: Effective risk management begins with identifying potential risks in AI systems, focusing on operational failures and possible misuse.
  2. Types of Risks: The report categorizes various risks, including:
    • Technical Risks: System malfunctions, inaccuracies, and performance failures.
    • Ethical Risks: Discrimination, bias, and moral implications of AI decisions that affect individuals or communities.
    • Societal Risks: Broader implications including privacy violations, security threats, and harm from unforeseen AI behaviors.
  3. Methodologies for Assessment:
    • Qualitative and Quantitative Approaches: Risk assessment combines qualitative methods (expert judgment, stakeholder consultation) with quantitative techniques (statistical modeling, simulations) for systematic evaluation.
    • Scenario Analysis: Exploring hypothetical scenarios helps understand how risks might materialize under various conditions, enhancing preparedness.
  4. Limitations of Current Assessments: Limited data and incomplete risk modeling lead to inaccurate risk estimates.
  5. Need for Comprehensive Frameworks: Experts advocate for unified risk assessment frameworks integrating multiple approaches and cross-disciplinary insights.
  6. Stakeholder Engagement: Input from technologists, ethicists, policymakers, and communities strengthens risk identification and evaluation, improving assessment quality and acceptance.
  7. Continuous Monitoring and Iteration: Risk identification and assessment require ongoing monitoring of AI systems to adapt to the evolving landscape of risks.
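
The quantitative side of the methodologies above can be illustrated with a tiny Monte Carlo scenario analysis: estimating how many incidents to expect across a fleet of AI deployments under an assumed per-deployment failure probability. Every number below is invented for illustration.

```python
import random

def simulate_incidents(p_incident=0.02, deployments=500, trials=2_000, seed=0):
    """Monte Carlo sketch of scenario analysis: sample the number of
    incidents across a fleet of deployments, each assumed to fail
    independently with probability p_incident, over many trials."""
    rng = random.Random(seed)
    return [sum(rng.random() < p_incident for _ in range(deployments))
            for _ in range(trials)]

counts = simulate_incidents()
mean = sum(counts) / len(counts)
p95 = sorted(counts)[int(0.95 * len(counts))]
print(f"expected incidents per period: {mean:.1f}; 95th percentile: {p95}")
```

The tail statistic (here the 95th percentile) is usually more decision-relevant than the mean, since risk planning concerns bad periods rather than average ones.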

3.4 Risk mitigation and monitoring

Outlines strategies for reducing and managing risks in general-purpose AI systems. The key approaches include training trustworthy models, monitoring system behavior, and establishing intervention protocols. It emphasizes combining technical solutions with human oversight for maximum safety.

  1. Overview of Risk Mitigation Strategies: Risk mitigation techniques protect against potential harm from AI systems by reducing risks’ probability and severity.
  2. Training Trustworthy Models: Developing robust and reliable AI models is essential. While techniques like adversarial training can strengthen models against vulnerabilities, current methods cannot eliminate unsafe outputs.
  3. Monitoring and Evaluation: Continuous monitoring of AI systems detects risks through performance evaluation, behavior tracking, and impact assessment. This process requires specialized tools to monitor systems and identify potential issues.
  4. Intervention Mechanisms: Intervention strategies are crucial for addressing unsafe AI system actions. These can include automated responses to detected risks and human oversight to ensure safe and ethical operation.
  5. Technical Methods for Privacy: Privacy-preserving techniques protect user data and ensure regulatory compliance, preventing data breaches and misuse.
  6. Limitations of Current Techniques: Current risk mitigation and monitoring techniques have significant limitations. Continued research is needed to improve their effectiveness as AI systems evolve.
  7. Combination of Approaches: A multifaceted approach combining technical risk mitigation, monitoring methods, and human oversight creates a more robust safety framework for responding to emerging risks.
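
The monitoring-plus-intervention pattern described above can be sketched as a sliding-window alert: track the rate of flagged outputs and escalate when it exceeds a threshold. The window size, threshold, and flag stream below are illustrative assumptions; in a real deployment the flags would come from a safety classifier.

```python
from collections import deque

class OutputMonitor:
    """Toy post-deployment monitor (illustrative parameters).

    Tracks the rate of flagged outputs over a sliding window and signals
    for intervention when the rate crosses a threshold."""

    def __init__(self, window=50, threshold=0.1):
        self.recent = deque(maxlen=window)
        self.threshold = threshold

    def record(self, flagged):
        """Record one output; return True if intervention is warranted."""
        self.recent.append(bool(flagged))
        return sum(self.recent) / len(self.recent) > self.threshold

monitor = OutputMonitor()
# Simulated stream: 1 in 20 outputs is flagged (long-run rate 0.05, below
# the 0.1 threshold, so alerts fire only during the early warm-up).
alerts = [monitor.record(i % 20 == 0) for i in range(200)]
print(f"alerts raised: {sum(alerts)} of {len(alerts)}")
```

A production version would add logging, hysteresis to avoid alert flapping, and an actual intervention hook, such as routing flagged traffic to human review.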
