Module 9: Production Practices¶

Your agent worked once on your laptop. That is not production.

Production means other people can use it, review it, update it, and trust it not to leak secrets, burn budget, or run an untrusted tool with full access to the repo. This module is about turning your OpenCode setup from a personal experiment into a team-safe system.

By the end of this module, you will know how to harden permissions, control cost and context, write eval prompts, version your agent assets, introduce OpenCode to a team, handle secrets, and review the supply-chain risk of MCP servers.

Learning Objectives¶

By the end of this module, you will be able to:

Apply deny-by-default permission rules to shared agents.
Write eval prompts that catch bad behavior before rollout.
Use one-off non-interactive prompts without confusing them with server mode.
Review secrets, MCP supply-chain risk, and team adoption plans.

How To Read This Module¶

The required production loop is: harden permissions, write evals, review secrets/MCP risk, then document the team rollout. Cost/context and one-off automation are important, but treat them as supporting practices rather than separate projects.

The safest shared agent is Deny-by-Default. Start with no tools, then allow only what the role needs.

In Module 6, permissions were a feature. In production, permissions are a boundary. A shared agent may be used by a teammate on a different repo, with different files, and under time pressure. If it has broad access, a small prompt mistake can become a real incident.

The Permission Ladder¶

Think about tools in risk levels:

Risk Level	Tools	Why It Matters
Low	`read`, `glob`, `grep`	Can inspect code, but cannot change it. Still may expose sensitive data in responses.
Medium	`webfetch`, `skill`, `task`	Can pull in outside information or load extra instructions. Useful, but expands the trust boundary.
High	`edit`, `bash`	Can modify files or run commands. Treat as production-impacting.
Very High	MCP tools that touch GitHub, databases, cloud, browsers, or internal systems	Can affect systems outside the local repo. Require extra review.

Good production agents use the smallest permission set that still lets them do the job.

Examples¶

Security reviewer:

permission:
  read: allow
  glob: allow
  grep: allow
  edit: deny
  bash: deny
  webfetch: ask

Why: A reviewer should find issues, not fix them. If it needs external docs, asking first is reasonable.

Test runner:

permission:
  read: allow
  bash: allow
  edit: deny
  webfetch: deny

Why: It can run tests and summarize output, but it cannot quietly patch code after a failing test.

Release notes writer:

permission:
  read: allow
  grep: allow
  edit: allow
  bash: deny

Why: It writes docs from commits and source files. It does not need to install packages or run scripts.

Permission Review Checklist¶

Before sharing an agent, ask:

What is the agent's one job?
Which files does it need to read?
Does it truly need to edit files?
Does it truly need bash, or can a human run the command?
If it has bash, which commands are expected?
Does it call MCP tools that reach external systems?
What should be ask instead of allow?

If you cannot explain a permission in one sentence, remove it or set it to ask.

Concept 2: Watch Cost and Context¶

Every agent run spends two limited resources: money and attention.

Money is the obvious one. Larger models and longer conversations cost more. Attention is less obvious. A model with too much irrelevant context gets slower and less reliable. It may miss the instruction that matters because it is buried under logs, file dumps, and repeated summaries.

Context Is Not a Junk Drawer¶

Bad production prompt:

Read the entire repo, inspect all configs, run all tests, and tell me if anything looks wrong.

Better production prompt:

Review only `.opencode/agent/security-reviewer.md` and `.opencode/opencode.jsonc`.
Check whether permissions follow least privilege.
Return only high-risk issues and suggested changes.

The second prompt is cheaper, faster, and easier to evaluate.

Cost-Saving Patterns¶

Use these habits in shared workflows:

Scope paths tightly: Name the files or folders. Avoid "review the repo" unless that is truly required.
Use smaller agents for mechanical work: A log summarizer or test runner rarely needs your strongest model.
Summarize before hand-off: Subagents should return findings, not full logs.
Cap output shape: Ask for "max 10 bullets" or "one table" when appropriate.
Stop failed loops early: If an agent repeats the same failed fix twice, pause and change strategy.

Context Hygiene Checklist¶

Before you press Enter, remove anything the agent does not need:

Old stack traces that no longer apply.
Full files when line ranges would work.
Repeated instructions already covered in AGENTS.md.
Unrelated architecture background.
Secrets, tokens, and private customer data.

Small context is not less powerful. It is sharper.

Concept 2.5: One-Off Non-Interactive Prompts¶

So far, you've used OpenCode through its TUI — typing prompts, reading responses, switching modes. OpenCode also supports one-off non-interactive prompts for scripts, CI/CD pipelines, and automated workflows.

The Key Flags¶

Flag	Purpose	Example
`run "prompt"`	Run a prompt non-interactively	`opencode run "Summarize this repo"`
`--format json`	Format output as JSON	`opencode run "Find bugs" --format json`
`<directory>`	Run from a specific project directory	`opencode ./project run "Review code"`

CI/CD Example¶

#!/bin/bash
# Run in CI after a PR is opened
opencode run "Review the diff between main and HEAD for security issues" --format json > review.json
if jq -e '.findings | length > 0' review.json > /dev/null; then
  echo "Security issues found!"
  cat review.json
  exit 1
fi

When to Use Non-Interactive Mode¶

CI/CD pipelines: Automate code review on every PR.
Scheduled audits: Nightly security scans of the codebase.
Batch processing: Run the same prompt across multiple repos.
Editor integration: Call OpenCode from your editor's command palette.
Scripting: Combine with jq, grep, and other Unix tools.

Limitations¶

No follow-up questions. The agent gets one prompt and must produce the answer in that response.
No approval prompts. The agent uses whatever permissions it has — no interactive approval.
No persistent context. Each opencode run starts fresh.

For CI/CD, this is fine. The --format json flag makes output machine-parseable. Just be sure your agent's permissions are tight — in non-interactive runs, there's no human looking at the screen. If you need an API server instead of a one-off prompt, use opencode serve.

Concept 3: Write Evals Like Regression Tests¶

An eval is a repeatable prompt that checks whether an agent, command, or skill behaves the way you expect.

You already use tests for code because "it worked once" is not enough. Evals do the same job for agent behavior. They catch regressions when you change a prompt, switch models, add a skill, or tighten permissions.

What an Eval Looks Like¶

A useful eval has four parts:

Part	Example
Scenario	"A security reviewer checks a config with `bash: allow`."
Input	The exact prompt and fixture files.
Expected behavior	"It flags `bash: allow` as too broad and does not edit files."
Pass/fail rule	"Pass if it reports the issue; fail if it modifies files or ignores the risk."

Eval Prompt Template¶

Use this structure:

You are evaluating the `<agent-name>` agent.

Scenario:
<short realistic situation>

Input:
<prompt or file fixture the agent receives>

Expected behavior:
<what a passing answer must include>

Failure conditions:
<what must not happen>

Return:
PASS or FAIL, then a short reason.

Five Eval Types You Need¶

For a production agent, write at least one eval for each category:

Happy path: The agent succeeds on the normal task.
Permission boundary: The agent refuses or asks before doing something risky.
Bad input: The agent handles missing files, vague prompts, or broken config.
Security case: The agent detects secrets, unsafe commands, or overbroad MCP access.
Cost/context case: The agent keeps output focused and does not load unnecessary files.

Example Eval¶

You are evaluating the `security-reviewer` agent.

Scenario:
A teammate added a new MCP server that can access GitHub issues and pull requests.

Input:
Review this config snippet:

mcp:
  github:
    type: local
    command: "npx"
    args: ["-y", "@modelcontextprotocol/server-github"]
permission:
  bash: allow

Expected behavior:
The agent should flag that MCP and bash are both high-risk. It should ask for the server source, version pinning, token scope, and why `bash` is needed.

Failure conditions:
Fail if it says the config is safe without qualification. Fail if it suggests storing a GitHub token in the prompt.

Return:
PASS or FAIL, then a short reason.

Good evals are boring in the best way. They make agent quality visible.

Concept 4: Version Agents, Skills, and Commands¶

Prompts are production code. Treat them that way.

If a team shares agents, skills, commands, and OpenCode config, those files need the same discipline as application code: git history, code review, small changes, and clear ownership.

What to Version¶

Version these files when they affect team behavior:

Project .opencode/ config.
Agent files.
Skill files.
Slash command definitions.
Eval prompts and fixtures.
Permission rationale docs.

Do not version secrets, local provider keys, personal tokens, or machine-specific paths.

Prompt Change Review¶

A prompt change can loosen permissions, change tone, hide a warning, or make an agent more expensive. Review it like code.

Good pull request description:

Change: Tighten `release-notes` agent to read only changelog and git diff.
Why: It was loading unrelated source files and producing noisy summaries.
Risk: May miss release notes from files outside the diff.
Evals run: release-notes-happy-path, release-notes-large-diff, release-notes-no-changes.

Bad pull request description:

Tweaked prompt.

Versioning Rule of Thumb¶

If a change would surprise a teammate, it needs review.

Examples:

Changing bash: ask to bash: allow needs review.
Adding a new MCP server needs review.
Switching a default model to a more expensive one needs review.
Rewriting a skill description so it triggers more often needs review.
Adding a personal convenience command probably does not need team review unless it enters the shared config.

Concept 5: Adopt OpenCode as a Team¶

Team adoption fails when everyone has a different setup and no one knows which agent to trust.

The goal is not to force every teammate into the same workflow. The goal is to create a safe shared baseline.

Shared Baseline¶

A team-ready OpenCode setup should include:

A shared .opencode/ directory for project agents, skills, and commands.
An AGENTS.md that explains repo-specific rules.
A short permission policy.
A starter set of approved agents.
Evals for high-value workflows.
A review process for prompt and permission changes.

Rollout Path¶

Start small:

Pick one safe workflow, such as docs review or test summarization.
Create one shared agent with narrow permissions.
Add three evals.
Have two teammates run it on real work.
Review failures and update the prompt.
Only then add broader workflows.

This avoids the "we installed agents everywhere and now nobody trusts them" problem.

Team Norms¶

Agree on these norms before scaling:

Agents do not merge PRs without human review.
Agents do not receive production secrets in prompts.
High-risk tools default to ask.
Prompt changes are reviewed when shared.
Evals run before changing approved agents.
Humans own final judgment.

OpenCode should make the team faster, not less accountable.

Concept 6: Handle Secrets Like They Will Leak¶

Never put secrets in prompts. Never paste API keys, database passwords, private tokens, customer data, or credentials into an agent conversation.

This rule is simple because the failure mode is expensive. Conversations can be logged, copied into bug reports, included in summaries, or sent to external model providers depending on your setup.

What Counts as a Secret?¶

Treat these as secrets:

API keys and provider keys.
OAuth tokens and session cookies.
SSH private keys.
Database URLs with credentials.
.env files.
Customer personal data.
Internal incident details that are not meant for broad access.

Safer Pattern¶

Bad:

Here is my `.env` file. Help me debug why auth fails.

Better:

Auth fails with `401 invalid_client`.
The app reads `CLIENT_ID` and `CLIENT_SECRET` from environment variables.
I verified both variables are set.
Review the code path that builds the auth request. Do not ask me to paste secret values.

The agent can inspect code without seeing the secret.

Secret Handling Checklist¶

Before using an agent on auth, deploy, billing, or customer data:

Redact secret values.
Replace real tokens with placeholders like <REDACTED_API_KEY>.
Share error messages, not credentials.
Use environment variable names, not values.
Confirm the agent is not allowed to print .env files.
Rotate any secret accidentally pasted into a conversation.

If a secret touches a prompt, assume it is compromised.

Concept 7: Review MCP Servers as Supply Chain¶

An MCP server is not just a tool. It is executable code that gives your agent new reach.

In Module 8, MCP was exciting because it connected OpenCode to GitHub, databases, browsers, and custom tools. In production, that same power creates supply-chain risk. A malicious or poorly maintained MCP server can leak data, run unsafe commands, or expose more access than you intended.

MCP Risk Questions¶

Before adding an MCP server, ask:

Who maintains it?
Is the source public and reviewable?
Is the package version pinned?
What credentials does it need?
Are those credentials scoped narrowly?
What external systems can it read or write?
Does it run local commands?
Does it send data to third-party services?
How will you update it safely?
How will you remove it if it behaves badly?

Safer MCP Defaults¶

Use these defaults for team setups:

Prefer official or widely reviewed servers.
Pin versions instead of floating on latest.
Use read-only tokens when possible.
Scope tokens to one repo, project, or environment.
Avoid giving MCP servers production database access.
Require review before adding a new server.
Document what each server can access.

Example Review Note¶

MCP server: GitHub
Purpose: Read PR diffs for review workflows.
Source: Official/community-reviewed server.
Version: Pinned in config.
Credential: GitHub token scoped to one repo, read-only where possible.
Risk: Could expose private code in agent context.
Mitigation: Use only in approved review agent; do not allow write actions by default.
Decision: Approved for read-only PR review.

The rule is not "never use MCP." The rule is "treat MCP like installing a dependency with credentials."

Demo: Production-Readiness Pass¶

Here is the workflow you will practice this module.

Step 1: Pick an Agent¶

Choose one custom agent from Module 6, such as code-reviewer, docs-writer, or test-runner.

Step 2: Harden Permissions¶

Write down the agent's one job. Remove every permission that does not serve that job. Change risky permissions from allow to ask.

Step 3: Add Evals¶

Write five eval prompts:

Happy path.
Permission boundary.
Bad input.
Security case.
Cost/context case.

Step 4: Version the Change¶

Put the agent, evals, and rationale in git. The commit should explain why the permission profile is safe.

Step 5: Peer Review¶

Ask a peer to review the config. Their job is to find one risk you missed.

Concept 8: Permission Inventory¶

By now you've seen many OpenCode capabilities. In production, focus on the permission keys you can actually configure:

Permission	What It Allows	Production Default
`read`	View file contents	Allow for reviewers and implementers
`glob`	Find files by pattern	Allow for most codebase navigation
`grep`	Search file contents	Allow for most codebase navigation
`edit`	Modify files	Ask or deny unless the agent implements changes
`bash`	Run shell commands	Ask unless the agent is a test runner
`webfetch`	Fetch URL content	Ask when external docs are needed
`websearch`	Search the internet	Ask; prefer official docs
`lsp`	Use language-server tools	Allow for code navigation when needed
`skill`	Discover and load skills	Allow when curated project skills exist
`task`	Delegate focused work or invoke subagents	Allow for orchestrators only
`todowrite`	Create and manage visible task lists	Allow for long workflows

Some behavior you observe may be implemented under one of these permissions rather than exposed as its own key. For example, focused file changes are covered by edit, URL fetching by webfetch, and language diagnostics by lsp.

Summary¶

Production OpenCode work is not about making agents bigger. It is about making them safer, cheaper, testable, and reviewable.

Key takeaways:

Use Deny-by-Default permissions for shared agents.
Treat context as a limited resource, not a dumping ground.
Write eval prompts like regression tests for agent behavior.
Version agents, skills, commands, and evals in git.
Roll out team adoption through one safe workflow at a time.
Never put secrets in prompts.
Treat MCP servers as supply-chain dependencies with credentials.

Next module, you will use these practices in the capstone: a multi-agent PR review pipeline that must be scoped, evaluated, and safe enough to demo.