What is Evidence Gate?

Evidence Gate is a fail-closed quality gate for AI-driven CI/CD pipelines. It integrates with GitHub Actions to validate test coverage, SBOM, provenance, and NemoClaw sandbox configurations before merge. It supports 25 gate types.

What are Blind Gates?

Blind Gates hide evaluation criteria (thresholds and rules) from AI agents. The agent submits evidence without knowing the criteria, structurally preventing fabrication of results to match expected values.
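A minimal sketch of the idea, not Evidence Gate's actual API: the class name `BlindGate`, the `coverage` evidence field, and the threshold value are all hypothetical. The threshold lives only on the evaluator side, and the submitter gets back a bare verdict with no hint of what the cutoff was.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    """What the agent is allowed to see: pass/fail and nothing else."""
    passed: bool


class BlindGate:
    """Hypothetical evaluator that keeps its threshold private."""

    def __init__(self, min_coverage: float) -> None:
        self._min_coverage = min_coverage  # never returned to the caller

    def evaluate(self, evidence: dict) -> Verdict:
        coverage = evidence.get("coverage")
        # Missing or non-numeric evidence fails; the reason is not disclosed.
        if isinstance(coverage, bool) or not isinstance(coverage, (int, float)):
            return Verdict(passed=False)
        return Verdict(passed=coverage >= self._min_coverage)


gate = BlindGate(min_coverage=80.0)  # illustrative threshold, hidden from the agent
print(gate.evaluate({"coverage": 91.5}))
print(gate.evaluate({"coverage": 62.0}))
```

Because the verdict carries no threshold or margin, an agent cannot tune its submitted numbers to sit just above the line.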

How does NemoClaw integration work?

NemoClaw is NVIDIA's sandbox platform for AI agent execution. Evidence Gate validates NemoClaw blueprints (blueprint.yaml), policies (policy.yaml), and inference routing configurations during CI, detecting dangerous configurations and blocking them before deployment.

Is Evidence Gate free?

Yes. The Free plan includes 100 evaluations/month, 1,000 API calls/month, all 25 gate types, SARIF output, and GitHub Check Runs integration. Open-source projects can use it at no cost.

Evidence Gate — AI Agent Quality Gates + NemoClaw Governance

How It Works

Three steps to enforced quality in every pull request

1 Define

Add Evidence Gate to your workflow YAML. Specify gate types, evidence files, and thresholds.

2 Evaluate

Gates automatically verify your evidence files: existence, schema, thresholds, and integrity.

3 Enforce

Fail-closed: pipelines stop on quality violations. Results appear in the PR summary and workflow annotations.

Blind Gates: Why AI Agents Need Hidden Criteria

When an LLM writes your code AND your tests, every visible threshold becomes a target to optimize against — not a quality standard to meet

Designed for AI Governance

Evidence Gate's design aligns with Japan's AI Business Operator Guidelines

Fail-Closed Safety

All gates default to FAIL. Only explicitly verified evidence earns a PASS. This supports the guidelines' emphasis on safety and risk prevention.
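The fail-closed default can be sketched as follows. This is an illustration under assumed names (`Result`, `evaluate_fail_closed`), not the product's implementation: the verdict starts as FAIL and is upgraded only when a check completes and returns exactly True.

```python
from enum import Enum
from typing import Callable


class Result(Enum):
    PASS = "pass"
    FAIL = "fail"


def evaluate_fail_closed(check: Callable[[], bool]) -> Result:
    """Return PASS only when the check runs to completion and returns True."""
    result = Result.FAIL  # the default verdict
    try:
        if check() is True:
            result = Result.PASS
    except Exception:
        # Any error (missing file, bad schema, timeout) keeps the default FAIL.
        pass
    return result


def broken_check() -> bool:
    """Stands in for a check whose evidence file is missing."""
    raise FileNotFoundError("coverage.json")


print(evaluate_fail_closed(lambda: True))
print(evaluate_fail_closed(broken_check))
```

The contrast is with a fail-open gate, where an exception or an unexpected return value would let the pipeline continue.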

Transparency & Trust Levels

Genchi Genbutsu Trust Levels (L1–L4) make evidence reliability explicit. SHA-256 Evidence Chain enables integrity verification of all judgment data.
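One common way to build a hash chain of this kind is shown below. The record layout (`prev`, `payload`, `hash`) and the all-zero genesis value are assumptions for illustration; the page does not document the actual format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first record


def record_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous record's digest together with a canonical payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def build_chain(payloads: list[dict]) -> list[dict]:
    chain, prev = [], GENESIS
    for payload in payloads:
        digest = record_hash(prev, payload)
        chain.append({"prev": prev, "payload": payload, "hash": digest})
        prev = digest
    return chain


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every link; any edited or reordered record breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != record_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Because each digest covers the one before it, changing an early judgment record invalidates every later record as well.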

Security & Accountability

AWS KMS encryption (FIPS 140-2 validated), HMAC-signed cursors, and a maturity-level-based Quality State Model provide auditable governance at every step.
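As an illustration of what an HMAC-signed cursor can look like, here is a sketch that assumes the cursor is an opaque pagination token. The token layout, the function names, and the hard-coded demo key are hypothetical; a real deployment would obtain the key from a key management service.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET = b"demo-key-not-for-production"  # hypothetical key for the example only


def sign_cursor(state: dict, key: bytes = SECRET) -> str:
    """Encode the cursor state and append an HMAC-SHA256 tag."""
    body = base64.urlsafe_b64encode(json.dumps(state, sort_keys=True).encode())
    tag = hmac.new(key, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{tag}"


def read_cursor(token: str, key: bytes = SECRET) -> Optional[dict]:
    """Return the cursor state, or None if the token was altered."""
    body, sep, tag = token.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(tag.encode(), expected.encode()):
        return None
    return json.loads(base64.urlsafe_b64decode(body))
```

The tag is checked before the body is decoded, so a client cannot rewrite the cursor to skip or replay records without being detected.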

Evidence Gate supports practices aligned with key principles including transparency, safety, and accountability.

This product is not endorsed by or affiliated with any government body. Feature descriptions are for informational purposes only and do not constitute compliance certification.

What Evidence Gate Protects

CI gates alone aren’t enough — AI agents also operate at runtime. Evidence Gate validates the NemoClaw infrastructure that runs those agents.

Once your pipeline clears Evidence Gate, it deploys NemoClaw sandboxes, the isolated environments where AI agents execute. Understanding this runtime layer explains why Evidence Gate validates blueprints, policies, and inference configuration: a misconfigured sandbox can let an agent escape isolation or consume unbounded resources. The architecture below shows what Evidence Gate is guarding.

Plugin (TypeScript) → Blueprint (Python) → OpenShell Sandbox

Sandbox Lifecycle

Five stages from blueprint resolution to running sandbox — the Plugin handles stages 1–2, the Blueprint handles stages 3–5

1 Resolve

Plugin resolves blueprint version and downloads the versioned Python artifact

2 Verify

Plugin checks blueprint signature and integrity before execution

3 Plan

Blueprint determines the OpenShell resources needed for the sandbox

4 Apply

Blueprint invokes OpenShell CLI to create and configure sandbox resources

5 Status

Blueprint reports sandbox readiness and connection endpoints

Inference Routing

Three provider profiles switchable at runtime — no sandbox restart required

NVIDIA Cloud (Production)

Nemotron 3 Super 120B. Production inference via build.nvidia.com. Highest capability for demanding workloads.

Local NIM (Testing / Air-gapped)

NIM container on local network. On-premises inference for testing and air-gapped environments where cloud access is restricted.

Local vLLM (Offline Dev)

vLLM server on localhost. Offline development with fast iteration cycles. No network dependency required.

Providers can be switched at runtime without restarting the sandbox. Configuration is managed through the Blueprint's inference settings.

Security Guarantees

Evidence Gate validates that every sandbox deploys with four mandatory isolation layers — if any layer is misconfigured, the gate fails the pipeline

Landlock LSM

Linux Security Module restricting filesystem access at the kernel level. The sandbox process cannot access paths outside its granted set — even if the agent finds a code execution vulnerability. Evidence Gate validates that Landlock rules are correctly configured before deployment.

seccomp Filtering

System call filter that limits which kernel operations the sandbox process can invoke. Blocks dangerous syscalls like ptrace, mount, and reboot before they reach the kernel.

Network Namespace Isolation

Each sandbox runs in its own network namespace with deny-by-default egress policy. Only endpoints explicitly approved in the blueprint can be reached. Unapproved requests are blocked and surfaced for operator approval.

Inference Control

LLM inference requests route through the OpenShell gateway — never directly from the agent process. The gateway enforces model allowlists, rate limits, and cost caps before forwarding to the provider.
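The gateway checks described above can be sketched as a single admission function. The `Gateway` class, its field names, and the fixed request counter (with no time window) are simplifying assumptions, not the OpenShell gateway's real interface.

```python
from dataclasses import dataclass


@dataclass
class Gateway:
    """Hypothetical admission control applied before forwarding to a provider."""
    allowed_models: frozenset
    max_requests: int          # per window; window reset is omitted for brevity
    cost_cap_usd: float
    requests_seen: int = 0
    spent_usd: float = 0.0

    def admit(self, model: str, est_cost_usd: float) -> tuple[bool, str]:
        if model not in self.allowed_models:
            return False, "model not on allowlist"
        if self.requests_seen >= self.max_requests:
            return False, "rate limit reached"
        if self.spent_usd + est_cost_usd > self.cost_cap_usd:
            return False, "cost cap exceeded"
        # Only admitted requests count against the limits.
        self.requests_seen += 1
        self.spent_usd += est_cost_usd
        return True, "forwarded"
```

Since the agent process has no direct route to the provider, these checks cannot be bypassed from inside the sandbox.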

Filesystem

/sandbox — agent working directory (read + write)
/tmp — temporary files (read + write)
All other paths — read-only or inaccessible
System binaries, configs, and host mounts are never writable
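The write rules above amount to a small allowlist. The sketch below is a lexical check only, written to show the shape of the rule; the helper name `is_writable` is hypothetical, and real enforcement happens in the kernel via Landlock, which also accounts for symlinks and mounts that a string comparison cannot see.

```python
import posixpath

WRITABLE_ROOTS = ("/sandbox", "/tmp")  # taken from the list above


def is_writable(path: str) -> bool:
    """True only for absolute paths that normalize to a writable root."""
    if not path.startswith("/"):
        return False  # relative paths are rejected rather than guessed at
    normalized = posixpath.normpath(path)  # collapses "..", "." and repeated "/"
    return any(
        normalized == root or normalized.startswith(root + "/")
        for root in WRITABLE_ROOTS
    )


print(is_writable("/sandbox/out/report.json"))
print(is_writable("/sandbox/../etc/passwd"))
```

Matching on `root + "/"` rather than a bare prefix keeps a sibling directory such as `/sandboxed` from being treated as part of `/sandbox`.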

Network

Deny-by-default — no egress until explicitly allowed
Approved endpoints listed in blueprint.yaml
Unapproved requests blocked and queued for operator review
Inference traffic routed through gateway, not direct from agent
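A deny-by-default egress rule with an operator review queue might look like the following sketch. The `EgressPolicy` class and the exact host and port matching are assumptions; the blueprint.yaml endpoint format is not shown on this page, and the hostnames are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class EgressPolicy:
    """Hypothetical allowlist: anything not listed is blocked and queued."""
    allowed: frozenset                       # exact (host, port) pairs
    pending_review: list = field(default_factory=list)

    def check(self, host: str, port: int) -> bool:
        key = (host.lower(), port)
        if key in self.allowed:
            return True
        # Blocked requests are recorded so an operator can approve them later.
        self.pending_review.append(key)
        return False


policy = EgressPolicy(allowed=frozenset({("api.example.com", 443)}))
print(policy.check("api.example.com", 443))
print(policy.check("evil.example.net", 443))
print(policy.pending_review)
```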

Evidence Gate’s blueprint and policy gates validate that all four isolation layers are correctly configured before any sandbox is deployed to production.

Agent Governance Ecosystem

Three layers of protection — from CI validation to runtime controls

3-Layer Governance Architecture

CI Layer (before deploy)

Blueprint Validation: structure, version, profiles
Policy Audit: TLS, wildcards, filesystem
SBOM & Provenance: CycloneDX, SLSA checks

▼ deploy ▼

Infra Layer (runtime isolation)

Filesystem Isolation: Landlock LSM
Network Control: deny-by-default, agentgov-only
Process Sandboxing: seccomp, no privilege escalation

▼ inference requests ▼

Runtime Layer (per-request controls)

Budget Gate: hold / settle
HITL Approval: Slack / webhook
Loop Detection: auto-halt
Audit Log: SHA-256 chain

▼ governed LLM call ▼

LLM Provider: OpenAI / Anthropic / Gemini

evidence-gate-action

25 gate types including NemoClaw blueprint, policy, and sandbox lifecycle validation. Fail-closed CI gates with SARIF output and AI agent repair contracts.

nemoclaw-governance

Validates Plugin+Blueprint configurations for NVIDIA OpenShell sandboxes. Checks blueprint.yaml, policy.yaml, and inference profiles. Install with: pip install nemoclaw-governance

agentgov

Runtime governance proxy for NemoClaw sandboxes. Provides budget enforcement with hold/settle billing, support for three inference profiles (NVIDIA Cloud, Local NIM, Local vLLM), and operator-controlled network approval.

Why three layers? NemoClaw provides sandbox isolation with Landlock+seccomp+netns but has no cost controls. agentgov adds runtime budget enforcement and inference routing governance. Evidence Gate validates all configurations at CI time — before blueprints reach production sandboxes.
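Hold/settle budgeting generally works like a card authorization: reserve an estimate before the call, then replace it with the actual cost afterward. The sketch below assumes that model; the `Budget` class and its methods are hypothetical and do not describe agentgov's real interface.

```python
class Budget:
    """Hypothetical hold/settle ledger; amounts are integer cents."""

    def __init__(self, limit_cents: int) -> None:
        self.limit = limit_cents
        self.settled = 0
        self.holds: dict[str, int] = {}

    def available(self) -> int:
        # Open holds count against the budget until they are settled.
        return self.limit - self.settled - sum(self.holds.values())

    def hold(self, request_id: str, estimate_cents: int) -> bool:
        """Reserve the estimated cost; refuse if it would exceed the budget."""
        if estimate_cents > self.available():
            return False
        self.holds[request_id] = estimate_cents
        return True

    def settle(self, request_id: str, actual_cents: int) -> None:
        """Release the hold and record what the call actually cost."""
        self.holds.pop(request_id)  # raises KeyError if there was no hold
        self.settled += actual_cents
```

Reserving before the call means concurrent requests cannot collectively overspend, and settling at the actual cost returns any unused estimate to the budget.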

Simple, Transparent Pricing

Start free, upgrade when your team needs advanced features

Feature	Free ($0/mo)	Pro (Contact)	Enterprise (Contact us)
Evaluations/month	100	5,000	Unlimited
API calls/month	1,000	50,000	Unlimited
All 25 gate types	✓	✓	✓
SARIF output	✓	✓	✓
GitHub Check Runs	✓	✓	✓
SHA-256 integrity hashing	✓	✓	✓
Fail-closed error handling	✓	✓	✓
Three enforcement modes (warn / observe / enforce)	✓	✓	✓
Config file (.evidencegate.yml) — zero required inputs	✓	✓	✓
SBOM gate (CycloneDX/SPDX structural validation)	✓	✓	✓
Provenance gate (SLSA build attestation)	✓	✓	✓
NemoClaw gates (blueprint + policy + sandbox lifecycle)	✓	✓	✓
Inference routing validation (NVIDIA Cloud, NIM, vLLM)	✓	✓	✓
Sandbox security posture checks (Landlock, seccomp, netns)	—	✓	✓
Signal-sorted Job Summary (Critical > Warning > Info)	✓	✓	✓
AI agent repair contract (retry_prompt output)	✓	✓	✓
Gate presets	✓	✓	✓
Sticky PR comments	✓	✓	✓
Blind Gate evaluation	—	✓	✓
Evidence chain verification (L4)	—	✓	✓
Quality State tracking	—	✓	✓
Remediation workflows	—	✓	✓
Missing evidence + suggested actions	—	✓	✓
Self-hosted deployment	—	—	✓
Custom API base URL	—	—	✓
Dedicated support	—	—	✓

Up and Running in 5 Minutes

Add quality gates to your GitHub Actions workflow in three simple steps

1 Install from Marketplace

Visit the Evidence Gate Marketplace page and click "Use latest version" to add the action to your repository.

2 Add to your workflow

Add the Evidence Gate step to your GitHub Actions workflow file:

name: Quality Gate
on: [pull_request]

permissions:
  contents: read
  checks: write

jobs:
  evaluate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Your build & test steps here...

      - name: Evidence Gate
        uses: evidence-gate/evidence-gate-action@v1
        with:
          # Or use .evidencegate.yml config file for zero required inputs
          gate_type: "test_coverage"
          phase_id: "testing"
          evidence_files: "coverage.json"

3 See results in your PR

Evidence Gate writes a detailed summary to GITHUB_STEP_SUMMARY, visible directly in your pull request's workflow run. Gate pass/fail results, evidence hashes, and threshold evaluations appear automatically — no configuration needed.
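GITHUB_STEP_SUMMARY is an environment variable that GitHub Actions sets to the path of a Markdown file rendered on the workflow run page. If one of your own build steps should add a section next to the gate results, a sketch like this works; the helper name `append_summary` and the table contents are illustrative, not output produced by Evidence Gate.

```python
import os


def append_summary(lines: list[str]) -> None:
    """Append Markdown to the job summary file, if the runner provides one."""
    path = os.environ.get("GITHUB_STEP_SUMMARY")
    if not path:
        return  # not running under GitHub Actions; nothing to write
    with open(path, "a", encoding="utf-8") as fh:
        fh.write("\n".join(lines) + "\n")


EXAMPLE_ROWS = [
    "### Build notes",
    "| Item | Value |",
    "| --- | --- |",
    "| Coverage report | coverage.json |",
]
append_summary(EXAMPLE_ROWS)
```

Opening the file in append mode matters: every step in a job shares the same summary file, so overwriting it would discard earlier steps' output.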

NemoClaw Integration Quick Start

Validate NemoClaw configs, enforce runtime budgets, and gate everything in CI — one workflow

name: NemoClaw Governance
on: [pull_request]

permissions:
  contents: read
  checks: write

jobs:
  validate-blueprint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Validate NemoClaw blueprint.yaml
      - name: Blueprint Gate
        uses: evidence-gate/evidence-gate-action@v1
        with:
          gate_type: "nemoclaw_blueprint"
          evidence_files: "blueprint.yaml"

  validate-policy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Validate sandbox policy constraints
      - name: Policy Gate
        uses: evidence-gate/evidence-gate-action@v1
        with:
          gate_type: "nemoclaw_policy"
          evidence_files: "policy.yaml"

  enforce-budget:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Validate agentgov budget configuration
      - name: Budget Gate
        uses: evidence-gate/evidence-gate-action@v1
        with:
          gate_type: "custom"
          phase_id: "budget"
          evidence_files: "agentgov.config.json"

Three parallel jobs — blueprint structure, sandbox policy, and runtime budget — all validated before merge. Each gate is fail-closed: if any config is invalid, the PR is blocked.
