Future Proof The Authority Stack
Operator Edition · AI Agent Liability Guide Part of the Agent Liability Network
Published by Future Proof Intelligence

What questions will an insurance underwriter ask about my AI agent?

Most SME founders who have deployed an AI agent have never had to explain it to an insurance underwriter. That conversation is becoming unavoidable. Carriers that write professional indemnity, errors and omissions, cyber, or general liability coverage are now asking specific questions about AI deployments at renewal. Armilla, Counterpart, and Lloyd's syndicates writing under the AIUC-1 standard all require a structured submission before quoting standalone AI coverage. This article walks through every category of question an underwriter will ask, explains the reasoning behind each category, and tells you how to prepare your answers so you do not walk into a coverage gap you did not know existed.

Key takeaways

  • The first question every AI underwriter asks is about autonomy: what does your agent do without a human review step before it happens? The width of that autonomy envelope is the primary driver of your premium and your exclusions.
  • Underwriters need the full model supply chain: the foundation model, any fine-tuning, retrieval databases, and external tool integrations. A failure in any layer is your liability toward users, regardless of where in the stack it originated.
  • Incident history is not automatically bad. A documented incident that was logged, investigated, and corrected often results in better coverage terms than a deployment with no monitoring records, because documented oversight demonstrates operational control.
  • The governance documentation question is now standard. Carriers writing under AIUC-1 and the Munich Re aiSure framework require evidence that you assessed the system before deploying it. No documentation typically means either a declination or a quote with an AI-specific exclusion endorsement.
  • Your API contract with your model provider limits what the provider owes you. It does not limit what you owe users and third parties. That gap is exactly what underwriters are pricing when they assess your AI agent risk.

Why underwriting AI agents is different from underwriting standard software

Underwriters have been pricing software risk for decades. They know how to assess a SaaS product, a custom application, or a platform that executes deterministic code in response to user inputs. AI agents do not fit that model. Their outputs are probabilistic, not deterministic. They can produce results that were not anticipated by the developers, not constrained by explicit rules, and not reproducible under identical input conditions. This makes standard software underwriting questions inadequate for AI agents.

The core challenge for an underwriter is that an AI agent's failure mode is a distribution, not a specification. A bug in traditional software either triggers or it does not. An AI agent can produce a harmful output on the tenth interaction, the thousandth, or never, depending on user inputs and context the operator cannot fully anticipate in advance. This means the underwriter cannot simply review the code and assess whether it performs as designed. They need to understand how the agent is bounded, supervised, and tested, and how incidents are caught and handled when the distribution produces a tail event.

The Moffatt v. Air Canada decision (BC Civil Resolution Tribunal, 2024), which held the airline liable for a misstatement made by its chatbot, and the sanctions in Mata v. Avianca (SDNY, 2023), imposed on lawyers who filed fictitious AI-generated citations, point to the same principle from different directions: whoever deploys or relies on the agent is responsible for what it produces, whether or not a human reviewed it, and regardless of any limits the model provider's terms of service place on the provider's liability to you. Underwriters are pricing the exposure that sits in that gap between what your provider owes you and what you owe the world.

Category one: the autonomy envelope

The first question in almost every AI agent underwriting submission is a version of: what does the agent do without human approval before it happens?

Underwriters call this the autonomy envelope. It describes the range of actions the agent can take, the categories of decisions it can produce, and the contexts in which a human reviews those outputs before they take effect. A customer service agent that answers questions and drafts proposed responses for a human to approve has a narrow autonomy envelope. An agent that sends responses, creates calendar invitations, places orders in third-party systems, or updates customer records has a much wider one.

The practical questions you should be prepared to answer in this category are:

What categories of action can your agent initiate? List every type of consequential output: communications sent, records modified, transactions initiated, appointments scheduled, documents filed. Underwriters will interpret an incomplete list as an incomplete governance picture, not a narrow risk profile.

At what points does a human review the agent's output before it takes effect? Be specific about which actions require human confirmation and which execute automatically. Many deployments have partial oversight, where high-value transactions require approval but routine communications do not. That structure matters for pricing.

What limits have you placed on the agent's authority? Examples include: the agent cannot commit to refunds above a certain amount, cannot modify subscription plans, cannot contact users outside business hours. These limits are relevant evidence that you have thought systematically about the agent's authority boundaries.
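One way to make the autonomy envelope concrete for a submission is to write it down as data rather than prose. The sketch below is illustrative only: the `ActionRule` structure, the action names, and the refund ceiling of 100 are hypothetical and are not taken from any carrier's form.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActionRule:
    action: str
    auto_execute: bool                      # True: runs with no human review step
    approval_above: Optional[float] = None  # amount above which a human must approve


# Hypothetical envelope for a customer service agent.
ENVELOPE = [
    ActionRule("draft_reply", auto_execute=False),
    ActionRule("send_reply", auto_execute=True),
    ActionRule("issue_refund", auto_execute=True, approval_above=100.0),
    ActionRule("update_customer_record", auto_execute=True),
]


def requires_human(rule: ActionRule, amount: float = 0.0) -> bool:
    """Return True if a human must approve this action before it takes effect."""
    if not rule.auto_execute:
        return True
    return rule.approval_above is not None and amount > rule.approval_above


def fully_autonomous(envelope):
    """Actions that can execute with no review at any amount."""
    return [r.action for r in envelope if r.auto_execute and r.approval_above is None]


rules = {r.action: r for r in ENVELOPE}
print(fully_autonomous(ENVELOPE))                    # ['send_reply', 'update_customer_record']
print(requires_human(rules["issue_refund"], 250.0))  # True
print(requires_human(rules["issue_refund"], 40.0))   # False
```

A table like this answers all three questions in this category at once: what the agent can initiate, where a human reviews, and which authority limits apply.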

Category two: the model and supply chain

Once an underwriter understands what the agent does, they want to understand how it is built. The supply chain question covers every third-party component in your system.

Which foundation model or models does the agent use? Underwriters want to know who supplies the core model, under what commercial terms, and what the provider's liability limits are in the API agreement. Most major providers, including OpenAI, Anthropic, Google, and Mistral, limit their liability to the fees paid under the API agreement in a given period. Those limits are often small relative to the potential harm a deployment could cause to users.
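The size of that gap is simple arithmetic. The following is a rough sketch with made-up figures; it assumes the provider's cap equals the fees paid in the cap period, which you should confirm against your own agreement.

```python
def liability_gap(estimated_claim: float, fees_paid_in_cap_period: float) -> float:
    """Portion of a claim left with the operator, assuming the provider's
    liability is capped at the fees paid in the relevant period."""
    provider_cap = fees_paid_in_cap_period
    return max(0.0, estimated_claim - provider_cap)


# Hypothetical figures: 12 months of API fees at 400 per month, one 50,000 claim.
fees = 12 * 400.0
print(liability_gap(50_000.0, fees))  # 45200.0
```

Even the capped amount is a best case, since it is recoverable only if the provider is actually liable under the contract.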

Has the model been fine-tuned or otherwise modified? Fine-tuning a foundation model for a specific task can reduce hallucination rates on domain-specific queries, but it can also weaken some of the safety alignment work applied by the original provider. An underwriter will want to know whether a fine-tuned model was evaluated post-modification and whether the evaluation covered the specific failure modes relevant to your deployment context.

What retrieval databases, knowledge bases, or external tools does the agent call? Retrieval-augmented generation (RAG) architectures introduce a new failure mode: the agent retrieves outdated, incorrect, or out-of-context information and presents it as authoritative. Underwriters will ask about the provenance of your knowledge base, how frequently it is updated, and how you monitor for retrieval errors. External tool integrations, such as calendar APIs, CRM systems, or payment processors, introduce a different risk: the agent takes an irreversible real-world action based on an AI output that may have been wrong.
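A knowledge-base freshness check is one simple piece of evidence for the update-frequency question. The sketch below assumes you keep a last-reviewed date per document; the file names, dates, and the 90-day threshold are hypothetical.

```python
from datetime import date, timedelta


def stale_documents(last_reviewed: dict, today: date, max_age_days: int = 90):
    """Return the names of documents not reviewed within max_age_days, sorted."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, reviewed in last_reviewed.items() if reviewed < cutoff)


# Hypothetical knowledge base with a last-reviewed date per document.
knowledge_base = {
    "refund_policy.md": date(2025, 1, 10),
    "pricing.md": date(2025, 5, 1),
}
print(stale_documents(knowledge_base, today=date(2025, 6, 1)))  # ['refund_policy.md']
```

Running a check like this on a schedule and keeping the output gives you a dated record of how stale content was caught, which is the kind of monitoring evidence this category asks for.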

Category three: governance and pre-deployment assessment

The governance category is where most SME operators are least prepared. Larger enterprises that have been working through EU AI Act compliance already have the documentation infrastructure that underwriters want to see. Smaller operators often deployed their agent quickly and have no formal record of how they assessed it before going live.

The questions underwriters ask in this category include:

How was the agent tested before deployment? Underwriters look for evidence that you ran the agent against a realistic range of user inputs before exposing it to real users. This does not have to be a formal process, but it does have to be documented. A testing record that shows you deliberately probed failure modes, harmful output scenarios, and edge cases carries much more weight than a description of informal checking.

Was the agent assessed by a third party? Carriers writing under the AIUC-1 standard, developed by the Artificial Intelligence Underwriting Company (AIUC), require evidence of third-party assessment or a self-assessment against a recognised framework before they will quote coverage. The Munich Re aiSure product uses parametric coverage structures that are explicitly tied to certification and documentation standards. Operators who can demonstrate assessment against a recognised framework, such as the Agent Certified methodology or ISO/IEC 42001:2023, receive better terms because the documentation reduces the underwriter's uncertainty about exposure.

Who approved the deployment decision? For SMEs, this often means the founder. Underwriters are not looking for a formal committee process. They want to know that someone with decision-making authority consciously approved the deployment, understood the risk profile, and documented that approval. An undocumented deployment by a developer without business sign-off is a red flag because it suggests the organisation does not treat AI agent risk as a business-level decision.

Category four: oversight in production

Governance covers the period before deployment. Oversight covers the period after it. These are distinct categories and underwriters treat them separately.

How do you monitor the agent's outputs in production? This includes: whether you log interactions, at what level of detail, for how long, and whether anyone reviews those logs systematically. Underwriters are not expecting real-time human review of every agent output. They are looking for a system that would detect a pattern of harmful or incorrect outputs before the pattern produced a material claim.

What triggers a human intervention? Good oversight systems have defined thresholds: if the agent produces a response that includes certain categories of content, if a user escalates, if an error rate exceeds a defined level, a human reviews the case. Underwriters want to know those thresholds exist and are monitored, not just defined in policy documents.
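The three triggers named above can be expressed as a single check. This is a minimal sketch under stated assumptions: the category labels, the 2% error-rate ceiling, and the function name are hypothetical, and how a response gets tagged with content categories is left out.

```python
from typing import List, Set

# Hypothetical content categories that always go to a human.
SENSITIVE_CATEGORIES = {"medical", "legal", "refund_commitment"}


def review_reasons(content_flags: Set[str], user_escalated: bool,
                   recent_errors: int, recent_total: int,
                   max_error_rate: float = 0.02) -> List[str]:
    """Return the reasons a human should review this case (empty list if none)."""
    reasons = []
    if content_flags & SENSITIVE_CATEGORIES:
        reasons.append("sensitive_content")
    if user_escalated:
        reasons.append("user_escalation")
    # Guard against division by zero when no interactions have been recorded yet.
    if recent_total > 0 and recent_errors / recent_total > max_error_rate:
        reasons.append("error_rate")
    return reasons


print(review_reasons({"medical"}, False, 1, 100))  # ['sensitive_content']
print(review_reasons(set(), True, 5, 100))         # ['user_escalation', 'error_rate']
```

Logging the returned reasons alongside each escalation is what turns a threshold written in a policy document into one an underwriter can see being monitored.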

What is your incident response process? When something goes wrong, what happens in the first 48 hours? Most AI liability policies require prompt notification to the carrier as a condition of coverage. Operators who cannot describe their incident response process, or who acknowledge they have no formal process, are communicating that the first 48 hours after an incident would be improvised. That is a material underwriting concern. The AI agent incident response guide on this site covers the basic process in detail.

Category five: scope of deployment

Underwriters need to understand scale to price exposure. The same agent architecture poses different risks depending on who it interacts with, how many interactions it handles, and what the consequences of a failure look like at each touchpoint.

Who are the end users and what decisions does the agent influence? An agent advising consumers on regulated financial products, medical decisions, or legal matters is a fundamentally different risk from an agent managing internal scheduling or drafting marketing content. Consumer-facing agents operating in regulated sectors attract specific exclusions or require specific endorsements in most professional indemnity policies.

How many interactions does the agent handle per day or month? Volume drives exposure. An agent handling one thousand customer service interactions per day has a different claims profile from one handling twenty. Underwriters use volume as a proxy for the frequency at which the agent's error distribution will produce tail events.
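The volume effect can be illustrated with a simple model. The sketch below assumes each interaction independently has the same small probability of producing a harmful output, which is a simplification, and the rate of 1 in 10,000 is a hypothetical figure rather than a measured one.

```python
def expected_tail_events(interactions: int, p_harmful: float) -> float:
    """Expected number of harmful outputs over the given number of interactions."""
    return interactions * p_harmful


def prob_at_least_one(interactions: int, p_harmful: float) -> float:
    """Probability of at least one harmful output, assuming independence."""
    return 1.0 - (1.0 - p_harmful) ** interactions


P_HARMFUL = 1e-4  # hypothetical per-interaction rate

for per_day in (20, 1000):
    monthly = per_day * 30
    print(per_day,
          round(expected_tail_events(monthly, P_HARMFUL), 2),
          round(prob_at_least_one(monthly, P_HARMFUL), 3))
```

Under these assumptions the same agent has roughly a 6% chance of at least one tail event in a month at twenty interactions per day and roughly a 95% chance at one thousand per day.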

In what jurisdictions do you deploy the agent? EU-based operators whose agents fall into one of the AI Act's high-risk categories face specific regulatory obligations under Regulation (EU) 2024/1689. Operators serving users in several jurisdictions are subject to several legal frameworks at once. Underwriters need to understand jurisdictional scope to assess whether your governance documentation meets the requirements of the most demanding framework that applies to you. For a fuller picture of how different jurisdictions treat AI liability, see the US, EU and UK comparison on the Agent Liability Global Desk.

What to prepare before your first underwriting conversation

Most SME operators can prepare a workable underwriting file in a few hours if they approach it systematically. The five documents that most significantly improve coverage terms and reduce the likelihood of exclusions are:

Agent scope documentation. A one-to-two page description of what the agent does, who it interacts with, what actions it can take autonomously, and where human oversight checkpoints are located. This document does not have to be long. It has to be specific and accurate.

Model and supply chain record. A list of every third-party component: the foundation model and provider, any fine-tuning and who performed it, retrieval databases and their update frequency, and external tool integrations. Include a note on what the commercial agreements with providers say about their liability limits.

Pre-deployment testing log. A record of the testing you ran before going live. Include the categories of inputs you tested, the failure modes you specifically probed, any issues you identified, and what you did to address them. A spreadsheet is adequate. What matters is that it exists and is timestamped.

Production monitoring record. A description of what you log, for how long, what your escalation thresholds are, and who receives alerts. If you have logs from the last three to six months showing normal operation or capturing and resolving any incidents, include a summary.

Incident log. A chronological record of any errors, unexpected outputs, user complaints, or near-misses since the agent was deployed. A log that shows zero incidents is fine and not suspicious. What is suspicious is having no record at all, because that suggests no monitoring exists.
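The incident log in particular is easy to keep in a structured form. The following is a minimal sketch, not a prescribed format: the `Incident` fields and the sample entries are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Incident:
    occurred_at: datetime                    # timestamp, ideally in UTC
    description: str
    corrective_action: Optional[str] = None  # None until the issue is resolved


def summarise(incidents: List[Incident]) -> dict:
    """Counts suitable for the summary line of an underwriting file."""
    resolved = sum(1 for i in incidents if i.corrective_action)
    return {"total": len(incidents), "resolved": resolved,
            "open": len(incidents) - resolved}


# Hypothetical entries.
log = [
    Incident(datetime(2025, 3, 4, 9, 30, tzinfo=timezone.utc),
             "Agent quoted an outdated refund window",
             corrective_action="Knowledge base entry updated and response re-tested"),
    Incident(datetime(2025, 4, 18, 14, 5, tzinfo=timezone.utc),
             "User complaint about duplicate calendar invitations"),
]
print(summarise(log))  # {'total': 2, 'resolved': 1, 'open': 1}
print(summarise([]))   # {'total': 0, 'resolved': 0, 'open': 0}
```

An empty log still produces a summary, which is the difference between a record of zero incidents and no record at all.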

Operators who complete a formal self-assessment against a recognised framework before approaching underwriters are consistently in a stronger negotiating position on coverage terms. The Agent Certified methodology at agentcertified.eu maps to the seven governance dimensions that underwriters assess: data governance, model transparency, autonomy controls, human oversight, performance monitoring, security resilience, and deployment governance. Completing that assessment produces documentation that directly answers the questions in every category above.

Frequently asked questions

What is the first thing an underwriter wants to know about my AI agent?

The first question is almost always about autonomy: what does your agent do without human review before it happens? The width of that autonomy envelope is the primary driver of your premium and your exclusions. An agent that drafts outputs for human approval carries very different risk from an agent that sends messages, places orders, or modifies records without a review step.

Do underwriters require documentation before quoting AI agent coverage?

Yes. Most underwriters who are actively writing AI agent coverage require at minimum: a description of what the agent does and who it interacts with, the model or models it is built on, a summary of human oversight controls, and a record of any prior incidents or near-misses. Carriers writing under the AIUC-1 standard also ask for governance documentation showing how the agent was assessed before deployment. Operators who cannot produce basic documentation typically receive either a declination or a coverage quote with significant exclusions.

What does an underwriter mean by the model supply chain?

The model supply chain refers to every third-party component in your AI agent: the foundation model, any fine-tuning layers, retrieval or knowledge bases, tool integrations, and any external APIs called during inference. Underwriters ask about this because a failure in any layer can produce harmful outputs for which you, as the deployer, bear responsibility toward users, regardless of which party in the chain was the proximate cause of the failure.

Will my AI agent's prior incident history affect my premium?

Yes, but not automatically in the way you might expect. A documented incident that was logged, investigated, and corrected often produces better coverage terms than no documentation at all, because it demonstrates a functioning monitoring system. What concerns underwriters most is discovering that an agent has been running in production for an extended period with no systematic logging of errors or near-misses. That absence suggests untracked exposure rather than a clean record.

How should I prepare my underwriting file before approaching an insurer?

Prepare five documents: agent scope documentation, a model and supply chain record, a pre-deployment testing log, a production monitoring record, and an incident log. Underwriters will ask for some version of all five. Having them ready demonstrates operational discipline and typically results in more favourable pricing and broader coverage terms.

References

  1. Moffatt v. Air Canada, 2024 BCCRT 149. BC Civil Resolution Tribunal, February 2024.
  2. Mata v. Avianca, Inc., No. 22-cv-1461 (S.D.N.Y. 2023). United States District Court, Southern District of New York.
  3. Regulation (EU) 2024/1689 of the European Parliament and of the Council on artificial intelligence (EU AI Act). OJ L, 12 July 2024.
  4. Artificial Intelligence Underwriting Company (AIUC). AIUC-1 AI Agent Underwriting Standard. Version 1.0.
  5. Munich Re. aiSure product framework and parametric AI performance coverage. Munich Re Special Enterprise Risks, 2025.
  6. ISO/IEC 42001:2023. Artificial intelligence management system requirements. International Organization for Standardization, December 2023.
  7. Armilla AI. Coverage framework documentation. Armilla, 2026. Lloyd's coverholder, Chaucer and Axis Capital syndicate.
  8. Counterpart. AI-specific affirmative coverage endorsements. Counterpart, 2025.