Agent skill
sf-ai-agentforce-testing
Agentforce agent testing with dual-track workflow and 100-point scoring. TRIGGER when: user tests Agentforce agents, runs sf agent test commands, creates test specs, validates topic routing, or analyzes agent test coverage. DO NOT TRIGGER when: Apex unit tests (use sf-testing), building agents (use sf-ai-agentforce), or Agent Script DSL (use sf-ai-agentscript).
Install this agent skill to your Project
npx add-skill https://github.com/Jaganpro/sf-skills/tree/main/skills/sf-ai-agentforce-testing
Metadata
Additional technical details for this skill
- author: Jag Valaiyapathy
- scoring: 100 points across 7 categories
- version: 2.1.0
SKILL.md
sf-ai-agentforce-testing: Agentforce Test Execution & Coverage Analysis
Use this skill when the user needs formal Agentforce testing: multi-turn conversation validation, CLI Testing Center specs, topic/action coverage analysis, preview checks, or a structured test-fix loop after publish.
When This Skill Owns the Task
Use sf-ai-agentforce-testing when the work involves:
- sf agent test workflows
- multi-turn Agent Runtime API testing
- topic routing, action invocation, context preservation, guardrail, or escalation validation
- test-spec generation and coverage analysis
- post-publish / post-activate test-fix loops
Delegate elsewhere when the user is:
- building or editing the agent itself → sf-ai-agentforce or sf-ai-agentscript
- running Apex unit tests → sf-testing
- creating seed data for actions → sf-data
- analyzing session telemetry / STDM traces → sf-ai-agentforce-observability
Core Operating Rules
- Testing comes after deploy / publish / activate.
- Use multi-turn API testing as the primary path when conversation continuity matters.
- Use CLI Testing Center as the secondary path for single-utterance and org-supported test-center workflows.
- Fixes to the agent should be delegated to sf-ai-agentscript when Agent Script changes are needed.
- Do not use raw curl for OAuth token validation in the ECA flow; use the provided credential tooling.
Script path rule
Use the existing scripts under:
~/.claude/skills/sf-ai-agentforce-testing/hooks/scripts/
These scripts are pre-approved. Do not recreate them.
Required Context to Gather First
Ask for or infer:
- agent API name / developer name
- target org alias
- testing goal: smoke test, regression, coverage expansion, or bug reproduction
- whether the agent is already published and activated
- whether the org has Agent Testing Center available
- whether ECA credentials are available for Agent Runtime API testing
Preflight checks:
- discover the agent
- confirm publish / activation state
- verify dependencies (Flows, Apex, data)
- choose testing track
Dual-Track Workflow
Track A — Multi-turn API testing (primary)
Use when you need:
- multi-turn conversation testing
- topic re-matching validation
- context preservation checks
- escalation or action-chain analysis across turns
Requires:
- ECA / auth setup
- agent runtime access
Track B — CLI Testing Center (secondary)
Use when you need:
- org-native sf agent test workflows
- test spec YAML execution
- quick single-utterance validation
- CLI-centered CI/CD usage where Testing Center is available
Quick manual path
For manual validation without full formal testing, use preview workflows first, then escalate to Track A or B as needed.
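The track choice above can be sketched as a small decision function. This is only an illustration: `PreflightContext`, its field names, and `choose_track` are hypothetical and are not part of the skill's scripts. The fallback order when Testing Center is unavailable is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class PreflightContext:
    """Hypothetical summary of the answers gathered during preflight."""
    needs_multi_turn: bool        # conversation continuity matters
    has_eca_credentials: bool     # ECA / auth setup is available
    has_testing_center: bool      # org has Agent Testing Center
    formal_testing: bool = True   # False for a quick manual check


def choose_track(ctx: PreflightContext) -> str:
    """Return 'A', 'B', or 'preview' following the dual-track rules."""
    if not ctx.formal_testing:
        return "preview"
    if ctx.needs_multi_turn:
        # Track A is the primary path, but it requires ECA / auth setup.
        if ctx.has_eca_credentials:
            return "A"
        raise ValueError("Track A needs ECA credentials; set them up first")
    if ctx.has_testing_center:
        return "B"
    # Assumption: without Testing Center, fall back to Track A when
    # credentials exist, otherwise to a preview check.
    return "A" if ctx.has_eca_credentials else "preview"
```

For example, `choose_track(PreflightContext(True, True, False))` selects Track A, while a single-utterance check in an org with Testing Center selects Track B.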
Recommended Workflow
1. Discover and verify
- locate the agent in the target org
- confirm it is published and activated
- confirm required actions / Flows / Apex exist
- decide whether Track A or Track B fits the request
2. Plan tests
Cover at least:
- main topics
- expected actions
- guardrails / off-topic handling
- escalation behavior
- phrasing variation
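One way to check a draft plan against the list above is a simple set difference. The category labels and the `category` key below are hypothetical names chosen for this sketch, not a format the skill defines.

```python
# Hypothetical labels for the five areas a plan should cover.
REQUIRED_CATEGORIES = {
    "topic",
    "action",
    "guardrail",
    "escalation",
    "phrasing_variation",
}


def missing_categories(planned_tests):
    """Return the required categories that no planned test covers.

    planned_tests is a list of dicts, each with a 'category' key.
    """
    covered = {test["category"] for test in planned_tests}
    return sorted(REQUIRED_CATEGORIES - covered)


plan = [
    {"utterance": "Where is my order?", "category": "topic"},
    {"utterance": "Tell me a joke about politics", "category": "guardrail"},
]
gaps = missing_categories(plan)
```

Here `gaps` lists the areas the plan still needs before it meets the minimum above.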
3. Execute the right track
Track A
- validate ECA credentials with the provided tooling
- retrieve metadata needed for scenario generation
- run multi-turn scenarios with the provided Python scripts
- analyze per-turn failures and coverage
Track B
- generate or refine a flat YAML test spec
- run sf agent test commands
- inspect structured results and verbose action output
4. Classify failures
Typical failure buckets:
- topic not matched
- wrong topic matched
- action not invoked
- wrong action selected
- action invocation failed
- context preservation failure
- guardrail failure
- escalation failure
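The buckets above can be modeled as an enumeration, with a helper that sorts a single turn into one of the topic and action buckets. This is a minimal sketch: the function name, its parameters, and the order of checks are assumptions, and the context, guardrail, and escalation buckets would need their own checks.

```python
from enum import Enum
from typing import Optional


class FailureBucket(Enum):
    TOPIC_NOT_MATCHED = "topic not matched"
    WRONG_TOPIC = "wrong topic matched"
    ACTION_NOT_INVOKED = "action not invoked"
    WRONG_ACTION = "wrong action selected"
    ACTION_FAILED = "action invocation failed"
    CONTEXT_LOST = "context preservation failure"
    GUARDRAIL = "guardrail failure"
    ESCALATION = "escalation failure"


def classify_turn(
    expected_topic: str,
    actual_topic: Optional[str],
    expected_action: Optional[str] = None,
    actual_action: Optional[str] = None,
    action_succeeded: bool = True,
) -> Optional[FailureBucket]:
    """Return the first topic/action bucket that applies, or None on a pass."""
    if actual_topic is None:
        return FailureBucket.TOPIC_NOT_MATCHED
    if actual_topic != expected_topic:
        return FailureBucket.WRONG_TOPIC
    if expected_action is None:
        return None  # no action expected for this turn
    if actual_action is None:
        return FailureBucket.ACTION_NOT_INVOKED
    if actual_action != expected_action:
        return FailureBucket.WRONG_ACTION
    if not action_succeeded:
        return FailureBucket.ACTION_FAILED
    return None
```

Checking the topic before the action reflects that a wrong topic usually explains a missing action, so the earlier bucket is the more useful root cause.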
5. Run fix loop
When failures imply agent-authoring issues:
- delegate fixes to sf-ai-agentscript
- re-publish / re-activate if needed
- re-run focused tests before full regression
Testing Guardrails
Never skip these:
- test only after publish/activate
- include harmful / off-topic / refusal scenarios
- use multiple phrasings per important topic
- clean up sessions after API tests
- keep swarm execution small and controlled
Avoid these anti-patterns:
- testing unpublished agents
- treating one happy-path utterance as coverage
- storing ECA secrets in repo files
- debugging auth with brittle shell-expanded curl commands
- changing both tests and agent simultaneously without isolating the cause
Output Format
When finishing a run, report in this order:
- Test track used
- What was executed
- Pass/fail summary
- Coverage gaps
- Root-cause themes
- Recommended fix loop / next test step
Suggested shape:
Agent: <name>
Track: Multi-turn API | CLI Testing Center | Preview
Executed: <specs / scenarios / turns>
Result: <passed / partial / failed>
Coverage: <topics, actions, guardrails, context>
Issues: <highest-signal failures>
Next step: <fix, republish, rerun, or expand coverage>
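If the summary is produced by a script, the suggested shape can be rendered by a small formatter. The function name and its parameters are hypothetical; only the seven labels come from the shape above.

```python
def format_report(agent, track, executed, result, coverage, issues, next_step):
    """Render a run summary in the suggested seven-line shape."""
    lines = [
        f"Agent: {agent}",
        f"Track: {track}",
        f"Executed: {executed}",
        f"Result: {result}",
        f"Coverage: {coverage}",
        f"Issues: {issues}",
        f"Next step: {next_step}",
    ]
    return "\n".join(lines)


report = format_report(
    agent="Example_Agent",  # placeholder name, not a real agent
    track="Multi-turn API",
    executed="3 scenarios / 11 turns",
    result="partial",
    coverage="2 of 3 topics, 1 guardrail",
    issues="wrong topic matched on turn 4",
    next_step="fix, republish, rerun",
)
```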
Cross-Skill Integration
| Need | Delegate to | Reason |
|---|---|---|
| fix Agent Script logic | sf-ai-agentscript | authoring and deterministic fix loops |
| create test data | sf-data | action-ready data setup |
| fix Flow-backed actions | sf-flow | Flow repair |
| fix Apex-backed actions | sf-apex | Apex repair |
| set up ECA / OAuth | sf-connected-apps | auth and app configuration |
| analyze session telemetry | sf-ai-agentforce-observability | STDM / trace analysis |
Reference Map
Start here
- references/interview-wizard.md
- references/multi-turn-testing.md
- references/cli-commands.md
- references/test-spec-reference.md
Execution / auth
- references/execution-protocol.md
- references/multi-turn-execution.md
- references/eca-setup-guide.md
- references/credential-convention.md
- references/connected-app-setup.md
Coverage / fix loops
- references/coverage-analysis.md
- references/agentic-fix-loops.md
- references/results-scoring.md
- references/known-issues.md
Advanced / specialized
- references/agentscript-agents.md
- references/agentscript-testing-patterns.md
- references/cli-testing-details.md
- references/deep-conversation-history-patterns.md
- references/swarm-execution.md
- references/trace-analysis.md
- references/agent-api-reference.md
Templates / assets
- references/test-templates.md
- references/test-plan-format.md
- assets/
Score Guide
| Score | Meaning |
|---|---|
| 90+ | production-ready test confidence |
| 80–89 | strong coverage with minor gaps |
| 70–79 | acceptable but coverage expansion recommended |
| 60–69 | partial validation only |
| < 60 | insufficient confidence; block release |
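The bands in the table translate directly into a lookup. This is a sketch: the function names are hypothetical, and treating non-integer scores by the same lower bounds is an assumption.

```python
def score_meaning(score: float) -> str:
    """Map a 0-100 test score to the meaning given in the Score Guide."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "production-ready test confidence"
    if score >= 80:
        return "strong coverage with minor gaps"
    if score >= 70:
        return "acceptable but coverage expansion recommended"
    if score >= 60:
        return "partial validation only"
    return "insufficient confidence; block release"


def blocks_release(score: float) -> bool:
    """Scores below 60 block release according to the guide."""
    return score < 60
```

For example, a score of 84 falls in the 80–89 band, and a score of 59 blocks release.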