TAQ

Release control for AI agents.

Turn failed agent runs into replayable tests before deployment.

When prompts, models, tools, or retrieval change, TAQ replays known failures before a new agent version ships.

Regression case: Failed run → Regression test → Replay → Release gate

Scenario

A refund agent approved a refund 45 days after purchase even though policy only allows refunds within 30 days.

Failing: v1.2.0
Fixed: v1.3.0

Problem

Refund Agent v1.2.0 approved an ineligible refund.

Fix

TAQ saved the failure as a regression test.

Result

Refund Agent v1.3.0 passed replay and was approved for release.
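
The replay-then-gate flow in this scenario can be sketched roughly as follows. This is a minimal illustration, not TAQ's actual API: `RegressionCase`, `release_gate`, the two stand-in agents, and the `daysElapsed`/`windowDays` input fields are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RegressionCase:
    """A saved failure: the input that went wrong and the decision expected instead."""
    case_id: str
    agent_input: dict
    expected_decision: str  # e.g. "deny"


def release_gate(agent: Callable[[dict], str],
                 cases: list[RegressionCase]) -> tuple[bool, list[str]]:
    """Replay every saved case; the gate opens only if no case regresses."""
    failed = [c.case_id for c in cases
              if agent(c.agent_input) != c.expected_decision]
    return (not failed, failed)


def agent_v1_2_0(agent_input: dict) -> str:
    # Hypothetical stand-in for the failing behavior: approves regardless of elapsed time.
    return "approve"


def agent_v1_3_0(agent_input: dict) -> str:
    # Hypothetical stand-in for the fixed behavior: enforces the window strictly.
    within = agent_input["daysElapsed"] <= agent_input["windowDays"]
    return "approve" if within else "deny"


CASES = [RegressionCase("refund-45-days", {"daysElapsed": 45, "windowDays": 30}, "deny")]

print(release_gate(agent_v1_2_0, CASES))  # (False, ['refund-45-days'])
print(release_gate(agent_v1_3_0, CASES))  # (True, [])
```

Returning the list of failed case IDs alongside the boolean keeps the gate decision explainable: a blocked release points directly at the runs that regressed.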

Agent scenario

01

Failed run detected

The agent made a policy-violating decision. TAQ captured the full run.

run_v1_20260518_0941 (failed, 2 steps failed)
2026-05-18T09:41:22Z

Agent

Refund Agent v1.2.0

Model

gpt-4o (2025-01-preview)

Prompt

refund-agent-prompt@1.2

Customer input
{
  "customerId": "C-9182",
  "orderId": "ORD-44821",
  "orderDate": "2026-04-03",
  "requestDate": "2026-05-18",
  "amount": 129,
  "currency": "USD",
  "reason": "Product not as described"
}
Retrieved policy

Refund Policy v3.1: Standard refunds are accepted within 30 calendar days of the purchase date. Requests submitted after 30 days are not eligible unless a product defect is confirmed by the quality team.

Tool calls

01 lookup_order (pass, 312ms)

Retrieve order record by order ID

Input

{
  "orderId": "ORD-44821"
}

Output

{
  "orderId": "ORD-44821",
  "customerId": "C-9182",
  "orderDate": "2026-04-03",
  "amount": 129,
  "currency": "USD",
  "status": "delivered",
  "deliveredDate": "2026-04-07"
}
02 check_refund_policy (pass, 287ms)

Retrieve the current refund policy from the knowledge base

Input

{
  "policyVersion": "latest"
}

Output

{
  "policyVersion": "3.1",
  "windowDays": 30,
  "rule": "Refunds accepted within 30 calendar days of purchase date. Requests after 30 days are not eligible unless product defect is confirmed.",
  "exceptions": [
    "confirmed_defect",
    "fraud_investigation"
  ]
}
03 evaluate_eligibility (fail, 1840ms)

Determine whether the request qualifies for a refund

Input

{
  "orderDate": "2026-04-03",
  "requestDate": "2026-05-18",
  "policyWindowDays": 30
}

Output

{
  "daysElapsed": 45,
  "eligible": true,
  "reasoning": "Request seems reasonable. Customer noted product issue. Approved."
}

The 45 days elapsed exceed the 30-day policy window, so eligible should be false.
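
The arithmetic behind this check is simple enough to do deterministically rather than leaving it to the model. The sketch below recomputes the step's output from its input. It assumes that "within 30 calendar days" means an elapsed count of 30 or fewer, and it adds a hypothetical `defect_confirmed` flag for the confirmed-defect exception in the policy; neither detail is specified by the captured run.

```python
from datetime import date


def days_elapsed(order_date: str, request_date: str) -> int:
    """Whole calendar days between two ISO 8601 dates (YYYY-MM-DD)."""
    return (date.fromisoformat(request_date) - date.fromisoformat(order_date)).days


def evaluate_eligibility(order_date: str, request_date: str,
                         policy_window_days: int,
                         defect_confirmed: bool = False) -> dict:
    """Deterministic eligibility check (assumption: the window is inclusive)."""
    elapsed = days_elapsed(order_date, request_date)
    eligible = elapsed <= policy_window_days or defect_confirmed
    return {"daysElapsed": elapsed, "eligible": eligible}


result = evaluate_eligibility("2026-04-03", "2026-05-18", 30)
print(result)  # {'daysElapsed': 45, 'eligible': False}
```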

04 issue_refund (fail, 523ms)

Execute the refund transaction

Input

{
  "orderId": "ORD-44821",
  "amount": 129,
  "currency": "USD",
  "reason": "Product not as described"
}

Output

{
  "transactionId": "TXN-88271",
  "status": "processed",
  "amount": 129,
  "creditedAt": "2026-05-18T09:41:26Z"
}

Refund executed on an ineligible order. This tool call should not have been reached.
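
One way to keep this tool call unreachable is a guard in front of it. The sketch below is illustrative only: `guarded_issue_refund` and `RefundBlocked` are hypothetical names, the `issue_refund` callable is passed in as a stub, and the policy exceptions (confirmed defect, fraud investigation) are ignored for brevity. Note that it rechecks `daysElapsed` against the window instead of trusting the `eligible` flag, since in the failed run that flag was true despite 45 days having elapsed.

```python
from typing import Callable


class RefundBlocked(Exception):
    """Raised when a refund is attempted on a request that fails the policy check."""


def guarded_issue_refund(eligibility: dict,
                         window_days: int,
                         issue_refund: Callable[[dict], dict],
                         refund_request: dict) -> dict:
    # The model-produced flag is necessary but not sufficient.
    if not eligibility.get("eligible", False):
        raise RefundBlocked("eligibility step did not approve the request")
    # Recheck the numbers directly so a wrong flag cannot reach the payment tool.
    if eligibility["daysElapsed"] > window_days:
        raise RefundBlocked(
            f"{eligibility['daysElapsed']} days elapsed exceeds the {window_days}-day window"
        )
    return issue_refund(refund_request)
```

Called with the step 03 output from this run (`{"daysElapsed": 45, "eligible": True}`) and a 30-day window, the guard raises `RefundBlocked` before the refund tool is invoked.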

Agent response

After reviewing the order details and refund policy, I am approving this refund request. The customer purchased the item approximately 45 days ago and the request appears reasonable.

Decision summary

Refund approved. Agent determined request falls within policy window. $129.00 credited to customer account.

Failure reason

Order date: 2026-04-03. Request date: 2026-05-18. Elapsed: 45 days. Policy window: 30 days. Agent approved a refund that should have been denied.
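
A failure like this one could be captured as a regression case along the following lines. The record layout and the `check_replay` helper are assumptions made for illustration; the source does not show how TAQ stores or evaluates cases. The case pins two expectations taken from the run above: the decision must be a denial, and `issue_refund` must never be called.

```python
# Hypothetical regression case built from the captured run; field names are illustrative.
REGRESSION_CASE = {
    "sourceRun": "run_v1_20260518_0941",
    "input": {
        "customerId": "C-9182",
        "orderId": "ORD-44821",
        "orderDate": "2026-04-03",
        "requestDate": "2026-05-18",
        "amount": 129,
        "currency": "USD",
        "reason": "Product not as described",
    },
    "expect": {"decision": "deny", "forbiddenTools": ["issue_refund"]},
}


def check_replay(case: dict, replay: dict) -> list[str]:
    """Compare a replayed run against the case; an empty list means it passed."""
    violations = []
    expected = case["expect"]["decision"]
    if replay["decision"] != expected:
        violations.append(f"decision was {replay['decision']!r}, expected {expected!r}")
    for tool in case["expect"]["forbiddenTools"]:
        if tool in replay["toolCalls"]:
            violations.append(f"forbidden tool called: {tool}")
    return violations


# Replaying the original behavior reproduces both failures.
failing = {"decision": "approve",
           "toolCalls": ["lookup_order", "check_refund_policy",
                         "evaluate_eligibility", "issue_refund"]}
# A fixed agent denies and stops before the refund tool.
fixed = {"decision": "deny",
         "toolCalls": ["lookup_order", "check_refund_policy", "evaluate_eligibility"]}

print(len(check_replay(REGRESSION_CASE, failing)))  # 2
print(check_replay(REGRESSION_CASE, fixed))         # []
```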

Prompt snapshot — refund-agent-prompt@1.2

You are a refund processing agent. Evaluate whether the customer's refund request falls within the approximate policy window. If the request seems reasonable, approve the refund.