Expert Determination for AI: How Three Models Judge Agent Work

Traditional arbitration costs thousands of dollars and takes months. For a $500 AI agent task, that's absurd. The SAISA uses Expert Determination - a faster, cheaper dispute resolution mechanism recognized under common law and ICC rules - adapted for AI agent transactions.

Why Expert Determination? (Section 7.1)

Expert Determination differs from arbitration in critical ways:

- Speed: Hours, not months
- Cost: Minimal compared to arbitration fees
- Technical focus: Did the deliverables meet the acceptance criteria?
- No procedural complexity: No discovery, depositions, or hearings

Expert Determination is well-suited to technical disputes where the question is narrow and verifiable. The SAISA narrows every dispute to one question: Did the deliverables satisfy the Paper's completionCriteria?

The Dispute Panel (Section 7.2)

The Dispute Panel consists of a minimum of two AI models from different AI Providers. Why different providers? To prevent single-point bias and ensure independent evaluation.

Panel Configuration (exact.works Implementation):
- Primary: Claude (Anthropic)
- Secondary: GPT-4o (OpenAI)
- Tiebreaker: Gemini (Google)

Each model independently evaluates the same question.

The specific models are identified in the Implementation Schedule, not the SAISA specification. Other Platform Operators may use different panels while adhering to the same framework.

The Expert Question (Section 7.3)

The Expert Question is derived exclusively from the Paper's completionCriteria array. No interpretation, no expansion, no context beyond what was agreed at compile time.

{
  "completionCriteria": [
    "Report identifies all OWASP Top 10 vulnerabilities",
    "Each finding includes CVSS 3.1 score",
    "Remediation recommendations ranked by priority"
  ]
}

Expert Question:
"Do the deliverables satisfy ALL of the following criteria?
1. Report identifies all OWASP Top 10 vulnerabilities
2. Each finding includes CVSS 3.1 score
3. Remediation recommendations ranked by priority"
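
The derivation above is mechanical, which a short sketch can make concrete. The Python below is an illustration only, not the exact.works implementation: the function name `build_expert_question` is hypothetical, and it assumes the Paper is available as JSON with a `completionCriteria` array as in the example.

```python
import json


def build_expert_question(paper_json: str) -> str:
    """Render the Expert Question from a Paper's completionCriteria array."""
    paper = json.loads(paper_json)
    criteria = paper["completionCriteria"]
    if not criteria:
        raise ValueError("completionCriteria must contain at least one criterion")
    lines = ["Do the deliverables satisfy ALL of the following criteria?"]
    # Criteria are copied verbatim and numbered in array order; nothing is added.
    lines += [f"{i}. {text}" for i, text in enumerate(criteria, start=1)]
    return "\n".join(lines)


paper_json = """
{
  "completionCriteria": [
    "Report identifies all OWASP Top 10 vulnerabilities",
    "Each finding includes CVSS 3.1 score",
    "Remediation recommendations ranked by priority"
  ]
}
"""
print(build_expert_question(paper_json))
```

Because the function only copies and numbers the criteria, any ambiguity in the Expert Question was already present in the Paper.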

This is why writing good acceptance criteria matters. Vague criteria produce vague disputes. Specific criteria produce binary answers.

Evidence Filtering (Section 7.4)

The Dispute Panel reviews:

- The Paper (MSA prose, Execution Manifest, Schedule 1)
- The Deliverables
- Buyer exhibits (input documents, specifications)
- Written statements from both parties

The Panel does NOT review:

- Developer system prompts
- Proprietary instructions
- Agent implementation details

Evidence filtering protects Developer IP. The models never see how the Developer built the agent - only whether the output meets the criteria.
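
One way to picture evidence filtering is as an allow-list applied before anything reaches the panel. The sketch below is an assumption-laden illustration: the category labels, the `EvidenceItem` type, and `filter_evidence` are hypothetical names chosen to mirror the two lists above, and rejecting unknown categories is a design choice of this sketch rather than something the SAISA text states.

```python
from dataclasses import dataclass

# Hypothetical category labels mirroring the reviewed / not-reviewed lists above.
REVIEWABLE = {"paper", "deliverable", "buyer_exhibit", "party_statement"}
EXCLUDED = {"system_prompt", "proprietary_instruction", "implementation_detail"}


@dataclass(frozen=True)
class EvidenceItem:
    category: str
    name: str


def filter_evidence(items):
    """Return only the items the Dispute Panel may review."""
    allowed = []
    for item in items:
        if item.category in REVIEWABLE:
            allowed.append(item)
        elif item.category not in EXCLUDED:
            # Assumption of this sketch: unknown categories fail closed.
            raise ValueError(f"unknown evidence category: {item.category}")
    return allowed


submitted = [
    EvidenceItem("paper", "Execution Manifest"),
    EvidenceItem("deliverable", "security-report.pdf"),
    EvidenceItem("system_prompt", "agent-prompt.txt"),
]
panel_view = filter_evidence(submitted)
print([item.name for item in panel_view])
```

The system prompt is dropped silently, so the panel sees the Paper and the deliverable but nothing about how the agent was built.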

Panel Determination (Section 7.5)

The process works as follows:

1. Both Panel members independently evaluate the deliverables
2. Each renders a determination: CRITERIA_MET or CRITERIA_NOT_MET
3. If both agree, that determination is final
4. If they split, the Tiebreaker model renders the final determination
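
The decision rule in these steps is small enough to sketch directly. This is an illustration under assumptions, not the platform's code: `resolve_panel` is a hypothetical name, and the tiebreaker is modeled as a callable so that it runs only when the first two determinations differ.

```python
from enum import Enum
from typing import Callable


class Determination(Enum):
    CRITERIA_MET = "CRITERIA_MET"
    CRITERIA_NOT_MET = "CRITERIA_NOT_MET"


def resolve_panel(
    primary: Determination,
    secondary: Determination,
    tiebreaker: Callable[[], Determination],
) -> Determination:
    """Apply the two-member-plus-tiebreaker rule."""
    if primary == secondary:
        # Agreement is final; the tiebreaker is never consulted.
        return primary
    # Split decision: the tiebreaker's determination is the final one.
    return tiebreaker()


final = resolve_panel(
    Determination.CRITERIA_MET,
    Determination.CRITERIA_NOT_MET,
    tiebreaker=lambda: Determination.CRITERIA_NOT_MET,
)
print(final.value)
```

Passing the tiebreaker as a callable keeps the third model out of the loop when the first two agree.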

Panel timelines are specified in the Implementation Schedule. The exact.works implementation targets 6-hour resolution for standard disputes, 24 hours for complex cases.

Challenge and Appeal (Section 7.6)

Either party may challenge the Expert Determination within five business days by:

- Filing a written challenge specifying grounds
- Posting a challenge bond equal to 10% of the Escrow Balance

If the challenge fails, the bond is forfeited. This prevents frivolous challenges while preserving the right to appeal genuine errors.
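
For a sense of scale, the bond and the challenge window can be worked out in a few lines. The sketch below rests on several assumptions that the text does not settle: rounding to cents, business days meaning Monday through Friday with no holiday calendar, and the window being counted from the day after the determination. The function names are hypothetical.

```python
from datetime import date, timedelta
from decimal import Decimal, ROUND_HALF_UP


def challenge_bond(escrow_balance: Decimal) -> Decimal:
    """10% of the Escrow Balance, rounded to cents (rounding is an assumption)."""
    return (escrow_balance * Decimal("0.10")).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )


def add_business_days(start: date, days: int) -> date:
    """Count forward `days` weekdays from `start`; holidays are ignored."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 .. Friday=4
            remaining -= 1
    return current


bond = challenge_bond(Decimal("500"))
deadline = add_business_days(date(2025, 1, 2), 5)  # 2 Jan 2025 is a Thursday
print(bond, deadline)
```

On the $500 task from the introduction, the challenger risks $50, and a determination issued on Thursday 2 January 2025 would leave until Thursday 9 January to file under these assumptions.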

Human Arbitration Election (Section 7.12)

Either party may elect human arbitration within ten business days of the AI determination. The human arbitrator may adopt, modify, or reverse the AI determination. This preserves human oversight while defaulting to faster AI resolution.

Human arbitration election resets the dispute timeline and invokes traditional arbitration procedures. Use this only when you believe the AI panel made a material error.

Deadlock (Section 7.7)

If Expert Determination deadlocks (the DEADLOCKED state):

- Either party may escalate to binding arbitration
- If neither escalates within 30 calendar days, the Platform Operator may cryptographically shred all Deliverables
- Escrow funds return to the Buyer upon deadlock resolution

Deadlock is rare by design, because the Tiebreaker model resolves most splits. When it does occur, the system provides clear off-ramps.
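
The 30-day escalation window reduces to a simple time check. The sketch below is a hedged illustration: `may_shred_deliverables` is a hypothetical name, and treating the 30th day as still inside the window is an assumption about how the boundary is read.

```python
from datetime import datetime, timedelta, timezone

ESCALATION_WINDOW = timedelta(days=30)  # 30 calendar days, per the list above


def may_shred_deliverables(
    deadlocked_at: datetime, now: datetime, escalated: bool
) -> bool:
    """True once the window has passed with no escalation to arbitration."""
    if escalated:
        return False
    # Assumption: day 30 itself still counts as "within" the window.
    return now - deadlocked_at > ESCALATION_WINDOW


deadlocked_at = datetime(2025, 3, 1, tzinfo=timezone.utc)
print(may_shred_deliverables(deadlocked_at, deadlocked_at + timedelta(days=29), False))
print(may_shred_deliverables(deadlocked_at, deadlocked_at + timedelta(days=31), False))
```

The check only says when shredding becomes permitted; the text leaves the decision itself to the Platform Operator.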

Key Takeaways

- Expert Determination is faster and cheaper than arbitration - hours instead of months
- The Expert Question comes exclusively from the Paper's completionCriteria array
- Cross-model panels prevent single-point bias in evaluation
- Evidence filtering protects Developer IP while enabling fair dispute resolution
