GitHub Actions · 25 Gate Types · NemoClaw Governance · Fail-Closed
Quality Gates for Full-Stack AI Governance
AI agents write your code, your tests, and now operate your infrastructure through NemoClaw sandboxes. Evidence Gate enforces quality at every layer — Blind Gates that hide CI criteria from gaming, plus blueprint and policy validation that ensures NemoClaw sandboxes deploy with correct isolation, budgets, and inference routing. Full-stack governance, fail-closed by default.
```yaml
# Validate NemoClaw blueprint before deploy
- uses: evidence-gate/evidence-gate-action@v1
  with:
    gate_type: "nemoclaw_blueprint"
    phase_id: "deploy"
    evidence_files: "blueprint.yaml"
```
How It Works
Three steps to enforced quality in every pull request
Define
Add Evidence Gate to your workflow YAML. Specify gate types, evidence files, and thresholds.
Evaluate
Gates automatically verify your evidence files — existence, schema, thresholds, and integrity.
Enforce
Fail-closed: pipelines stop on quality violations. Results appear in PR summary and workflow annotations.
Blind Gates: Why AI Agents Need Hidden Criteria
When an LLM writes your code AND your tests, every visible threshold becomes a target to optimize against — not a quality standard to meet
The problem: Traditional CI gates publish their thresholds in workflow YAML. An AI coding agent (Copilot, Cursor, Devin, etc.) instructed to "pass CI" can read these thresholds and generate minimal tests that hit exactly 80.1% coverage — satisfying the metric while proving nothing about quality.
The solution: Blind Gates evaluate evidence server-side against criteria that are never exposed to the pipeline, the repository, or the AI agent. The LLM that generated the code cannot see, reverse-engineer, or optimize against the pass/fail threshold. Quality must be genuine.
How it works: Your pipeline submits evidence files. The Evidence Gate API evaluates them against private criteria configured by your team. The pipeline — and the AI agent driving it — only receives pass or fail. Never the criteria themselves.
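In workflow terms, a blind gate step looks like any other step: evidence goes up, only a verdict comes back. A minimal sketch, assuming a hypothetical `blind_coverage` gate type (the name is illustrative, not taken from the action's documented inputs):

```yaml
# Hypothetical blind gate step: thresholds live server-side,
# never in this file. The workflow (and any AI agent reading it)
# receives only pass or fail.
- name: Blind Coverage Gate
  uses: evidence-gate/evidence-gate-action@v1
  with:
    gate_type: "blind_coverage"      # illustrative gate type
    phase_id: "testing"
    evidence_files: "coverage.json"  # raw evidence; no threshold appears here
```

Note what is absent: there is no coverage percentage anywhere in the YAML for an agent to optimize against.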
Designed for AI Governance
Evidence Gate's design aligns with Japan's AI Business Operator Guidelines
Fail-Closed Safety
All gates default to FAIL. Only explicitly verified evidence earns a PASS. Supports the guideline's emphasis on safety and risk prevention.
Transparency & Trust Levels
Genchi Genbutsu Trust Levels (L1–L4) make evidence reliability explicit. SHA-256 Evidence Chain enables integrity verification of all judgment data.
Security & Accountability
AWS KMS encryption (FIPS 140-2 validated), HMAC-signed cursors, and a maturity-level-based Quality State Model provide auditable governance at every step.
Evidence Gate supports practices aligned with key principles including transparency, safety, and accountability. Learn more about our approach →
This product is not endorsed by or affiliated with any government body. Feature descriptions are for informational purposes only and do not constitute compliance certification.
What Evidence Gate Protects
CI gates alone aren’t enough — AI agents also operate at runtime. Evidence Gate validates the NemoClaw infrastructure that runs those agents.
When your pipeline clears the Evidence Gate, it deploys NemoClaw sandboxes — isolated environments where AI agents execute. Understanding this runtime layer explains why Evidence Gate validates blueprints, policies, and inference configuration: a misconfigured sandbox can escape isolation or consume unbounded resources. The architecture below shows what Evidence Gate is guarding.
Sandbox Lifecycle
Five stages from blueprint resolution to running sandbox — the Plugin handles stages 1–2, the Blueprint handles stages 3–5
Resolve
Plugin resolves blueprint version and downloads the versioned Python artifact
Verify
Plugin checks blueprint signature and integrity before execution
Plan
Blueprint determines the OpenShell resources needed for the sandbox
Apply
Blueprint invokes OpenShell CLI to create and configure sandbox resources
Status
Blueprint reports sandbox readiness and connection endpoints
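To make the five stages concrete, here is an illustrative blueprint.yaml sketch. Every field name below is an assumption for illustration; the real NemoClaw blueprint schema may differ:

```yaml
# Hypothetical blueprint.yaml sketch (field names illustrative)
blueprint:
  name: agent-sandbox
  version: "1.4.2"         # resolved and pinned by the Plugin (stage 1: Resolve)
  signature: "sha256:..."  # checked before execution (stage 2: Verify)
  resources:               # determined by the Blueprint (stage 3: Plan)
    cpu: "2"
    memory: "4Gi"
  openshell:               # created via the OpenShell CLI (stage 4: Apply)
    profile: default
  status_endpoint: /ready  # readiness reporting (stage 5: Status)
```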
Inference Routing
Three provider profiles switchable at runtime — no sandbox restart required
NVIDIA Cloud
Nemotron 3 Super 120B
Production inference via build.nvidia.com. Highest capability for demanding workloads.
Production
Local NIM
NIM container on local network
On-premises inference for testing and air-gapped environments where cloud access is restricted.
Testing / Air-gapped
Local vLLM
vLLM server on localhost
Offline development with fast iteration cycles. No network dependency required.
Offline Dev
Providers can be switched at runtime without restarting the sandbox. Configuration is managed through the Blueprint's inference settings.
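A sketch of how the three profiles might be expressed in the Blueprint's inference settings (keys and endpoints are assumptions, not the documented schema):

```yaml
# Illustrative inference profile config; actual schema may differ
inference:
  active_profile: nvidia_cloud     # switchable at runtime, no sandbox restart
  profiles:
    nvidia_cloud:                  # production via build.nvidia.com
      endpoint: https://build.nvidia.com
      model: nemotron-3-super-120b
    local_nim:                     # NIM container for testing / air-gapped
      endpoint: http://nim.internal:8000
    local_vllm:                    # localhost vLLM for offline dev
      endpoint: http://localhost:8000
```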
Security Guarantees
Evidence Gate validates that every sandbox deploys with four mandatory isolation layers — if any layer is misconfigured, the gate fails the pipeline
Landlock LSM
Linux Security Module restricting filesystem access at the kernel level. The sandbox process cannot access paths outside its granted set — even if the agent finds a code execution vulnerability. Evidence Gate validates that Landlock rules are correctly configured before deployment.
seccomp Filtering
System call filter that limits which kernel operations the sandbox process can invoke. Blocks dangerous syscalls like ptrace, mount, and reboot before they reach the kernel.
Network Namespace Isolation
Each sandbox runs in its own network namespace with deny-by-default egress policy. Only endpoints explicitly approved in the blueprint can be reached. Unapproved requests are blocked and surfaced for operator approval.
Inference Control
LLM inference requests route through the OpenShell gateway — never directly from the agent process. The gateway enforces model allowlists, rate limits, and cost caps before forwarding to the provider.
Filesystem
- /sandbox — agent working directory (read + write)
- /tmp — temporary files (read + write)
- All other paths — read-only or inaccessible
- System binaries, configs, and host mounts are never writable
Network
- Deny-by-default — no egress until explicitly allowed
- Approved endpoints listed in blueprint.yaml
- Unapproved requests blocked and queued for operator review
- Inference traffic routed through gateway, not direct from agent
Evidence Gate’s blueprint and policy gates validate that all four isolation layers are correctly configured before any sandbox is deployed to production.
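An illustrative policy.yaml shape covering the four layers the gate checks (all keys are hypothetical; the real NemoClaw policy schema may differ):

```yaml
# Hypothetical policy.yaml sketch; field names invented for illustration
isolation:
  landlock:
    writable_paths: [/sandbox, /tmp]  # everything else read-only or hidden
  seccomp:
    denied_syscalls: [ptrace, mount, reboot]
  network:
    default: deny                     # deny-by-default egress
    allowed_endpoints:
      - https://build.nvidia.com
  inference:
    route_via_gateway: true           # never direct from the agent process
    cost_cap_usd: 50
```

A blueprint missing any of these four sections would fail the gate rather than deploy with a silent gap, consistent with the fail-closed default.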
Agent Governance Ecosystem
Three layers of protection — from CI validation to runtime controls
3-Layer Governance Architecture
evidence-gate-action
25 gate types, including NemoClaw blueprint, policy, and sandbox lifecycle validation. Fail-closed CI gates with SARIF output and AI agent repair contracts.
View on GitHub →
nemoclaw-governance
Validates Plugin+Blueprint configurations for NVIDIA OpenShell sandboxes. Checks blueprint.yaml, policy.yaml, and inference profiles. pip install nemoclaw-governance
agentgov
Runtime governance proxy for NemoClaw sandboxes. Budget enforcement with hold/settle billing, 3 inference profile support (NVIDIA Cloud, Local NIM, Local vLLM), and operator-controlled network approval.
View on GitHub →
Why three layers? NemoClaw provides sandbox isolation with Landlock+seccomp+netns but has no cost controls. agentgov adds runtime budget enforcement and inference routing governance. Evidence Gate validates all configurations at CI time — before blueprints reach production sandboxes.
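As a sketch of what hold/settle budget enforcement could look like in configuration (keys invented for illustration; the real agentgov.config.json is a JSON file and its schema may differ):

```yaml
# Illustrative budget config (shown as YAML; the actual file is JSON)
budget:
  monthly_cap_usd: 200
  hold_settle:
    hold_on_request: true     # reserve estimated cost before inference runs
    settle_on_response: true  # reconcile the hold against actual usage after
  inference_profiles: [nvidia_cloud, local_nim, local_vllm]
```

The hold/settle pattern mirrors card authorization: funds are reserved up front, so an agent cannot overrun the budget between a request and its bill.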
Simple, Transparent Pricing
Start free, upgrade when your team needs advanced features
| Feature | Free ($0/mo) | Pro (Contact) | Enterprise (Contact us) |
|---|---|---|---|
| Evaluations/month | 100 | 5,000 | Unlimited |
| API calls/month | 1,000 | 50,000 | Unlimited |
| All 25 gate types | ✓ | ✓ | ✓ |
| SARIF output | ✓ | ✓ | ✓ |
| GitHub Check Runs | ✓ | ✓ | ✓ |
| SHA-256 integrity hashing | ✓ | ✓ | ✓ |
| Fail-closed error handling | ✓ | ✓ | ✓ |
| Three enforcement modes (warn / observe / enforce) | ✓ | ✓ | ✓ |
| Config file (.evidencegate.yml) — zero required inputs | ✓ | ✓ | ✓ |
| SBOM gate (CycloneDX/SPDX structural validation) | ✓ | ✓ | ✓ |
| Provenance gate (SLSA build attestation) | ✓ | ✓ | ✓ |
| NemoClaw gates (blueprint + policy + sandbox lifecycle) | ✓ | ✓ | ✓ |
| Inference routing validation (NVIDIA Cloud, NIM, vLLM) | ✓ | ✓ | ✓ |
| Sandbox security posture checks (Landlock, seccomp, netns) | — | ✓ | ✓ |
| Signal-sorted Job Summary (Critical > Warning > Info) | ✓ | ✓ | ✓ |
| AI agent repair contract (retry_prompt output) | ✓ | ✓ | ✓ |
| Gate presets | ✓ | ✓ | ✓ |
| Sticky PR comments | ✓ | ✓ | ✓ |
| Blind Gate evaluation | — | ✓ | ✓ |
| Evidence chain verification (L4) | — | ✓ | ✓ |
| Quality State tracking | — | ✓ | ✓ |
| Remediation workflows | — | ✓ | ✓ |
| Missing evidence + suggested actions | — | ✓ | ✓ |
| Self-hosted deployment | — | — | ✓ |
| Custom API base URL | — | — | ✓ |
| Dedicated support | — | — | ✓ |
| | Get Started Free | Start Pro Trial | Contact Sales |
Up and Running in 5 Minutes
Add quality gates to your GitHub Actions workflow in three simple steps
1 Install from Marketplace
Visit the Evidence Gate Marketplace page and click "Use latest version" to add the action to your repository.
2 Add to your workflow
Add the Evidence Gate step to your GitHub Actions workflow file:
```yaml
name: Quality Gate
on: [pull_request]
permissions:
  contents: read
  checks: write
jobs:
  evaluate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Your build & test steps here...
      - name: Evidence Gate
        uses: evidence-gate/evidence-gate-action@v1
        with:
          # Or use .evidencegate.yml config file for zero required inputs
          gate_type: "test_coverage"
          phase_id: "testing"
          evidence_files: "coverage.json"
```
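Evidence Gate also supports a .evidencegate.yml config file so the workflow step needs no inputs at all. A hedged sketch of what such a file might contain (structure assumed, not taken from the action's documentation):

```yaml
# Hypothetical .evidencegate.yml sketch; actual schema may differ
phase_id: testing
mode: enforce              # one of: warn / observe / enforce
gates:
  - gate_type: test_coverage
    evidence_files: coverage.json
  - gate_type: sbom
    evidence_files: sbom.cdx.json
```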
3 See results in your PR
Evidence Gate writes a detailed summary to GITHUB_STEP_SUMMARY, visible directly in your pull request's workflow run. Gate pass/fail results, evidence hashes, and threshold evaluations appear automatically — no configuration needed.
NemoClaw Integration Quick Start
Validate NemoClaw configs, enforce runtime budgets, and gate everything in CI — one workflow
```yaml
name: NemoClaw Governance
on: [pull_request]
permissions:
  contents: read
  checks: write
jobs:
  validate-blueprint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Validate NemoClaw blueprint.yaml
      - name: Blueprint Gate
        uses: evidence-gate/evidence-gate-action@v1
        with:
          gate_type: "nemoclaw_blueprint"
          evidence_files: "blueprint.yaml"
  validate-policy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Validate sandbox policy constraints
      - name: Policy Gate
        uses: evidence-gate/evidence-gate-action@v1
        with:
          gate_type: "nemoclaw_policy"
          evidence_files: "policy.yaml"
  enforce-budget:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Validate agentgov budget configuration
      - name: Budget Gate
        uses: evidence-gate/evidence-gate-action@v1
        with:
          gate_type: "custom"
          phase_id: "budget"
          evidence_files: "agentgov.config.json"
```
Three parallel jobs — blueprint structure, sandbox policy, and runtime budget — all validated before merge. Each gate is fail-closed: if any config is invalid, the PR is blocked.