Get started

Quickstart

Run your first portable evaluation against an agent in a few minutes.

1. Install

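The commands below are for Windows PowerShell. On macOS or Linux, create the environment with python3 -m venv .venv and activate it with source .venv/bin/activate.

powershell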

py -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install "ecp-runtime==0.5.0" "ecp-sdk==0.5.0"

For framework demos, install the matching SDK extra:

bash

pip install "ecp-sdk[langchain]==0.5.0" langchain-openai
pip install "ecp-sdk[crewai]==0.5.0" crewai
pip install "ecp-sdk[pydanticai]==0.5.0" pydantic-ai
pip install "ecp-sdk[llamaindex]==0.5.0" llama-index llama-index-llms-openai llama-index-tools-yahoo-finance

2. Create a starter eval

bash

ecp init
ecp validate ecp_eval/manifest.yaml
ecp run --manifest ecp_eval/manifest.yaml --json

3. Run the flagship demo

The customer support demo checks final output, required tool calls, and evaluator-safe audit context.

bash

ecp validate examples/customer_support_demo/manifest.yaml
ecp run --manifest examples/customer_support_demo/manifest.yaml --report report.html

4. Run a framework demo

bash

ecp run --manifest examples/langchain_demo/manifest.yaml

Other manifests live in:

examples/plain_python_demo/manifest.yaml
examples/two_agent_demo/manifest.yaml
examples/crewai_demo/manifest.yaml
examples/pydantic_ai_demo/manifest.yaml
examples/llamaindex_demo/manifest.yaml

5. Native pytest integration

Instead of using ecp run, you can write ordinary Python assertions against the built-in pytest fixture, ecp_agent:

python

# test_agent.py
def test_customer_support(ecp_agent):
    result = ecp_agent.step("I need a refund")
    assert "refund" in result.get("public_output", "").lower()

Run it directly with your agent target:

bash

pytest test_agent.py --ecp-target="python agent.py"

6. Large datasets (CSV/JSONL)

For large evaluations, load a dataset from your manifest.yaml instead of hardcoding steps:

yaml

scenarios:
  - name: "Bulk refund tests"
    dataset:
      type: "csv"
      source: "data/refund_queries.csv"
      input_column: "user_query"
      output_column: "expected_response"
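To try this manifest fragment you need a CSV whose header matches input_column and output_column. The sketch below writes such a file with Python's standard csv module. The two rows are made-up sample data, and the assumption that ECP expects a header row with these column names is inferred from the fragment rather than confirmed by the source.

```python
import csv
from pathlib import Path

# Made-up sample rows for illustration only.
ROWS = [
    {"user_query": "I need a refund", "expected_response": "refund"},
    {"user_query": "Where is my order?", "expected_response": "order status"},
]


def write_dataset(path, rows):
    """Write rows to a CSV whose header matches the manifest's column names."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # newline="" lets the csv module control line endings.
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["user_query", "expected_response"])
        writer.writeheader()
        writer.writerows(rows)
    return path


if __name__ == "__main__":
    write_dataset("data/refund_queries.csv", ROWS)
```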

7. Exporting to LangSmith

Export every evaluation run, including its inputs and outputs, to LangSmith:

bash

ecp run --manifest examples/customer_support_demo/manifest.yaml --export langsmith

8. JSON output for CI

Print a JSON report:

bash

ecp run --manifest examples/customer_support_demo/manifest.yaml --json

Save a JSON report:

bash

ecp run --manifest examples/customer_support_demo/manifest.yaml --json-out report.json

By default, ecp run exits non-zero when checks fail. Use --no-fail-on-error when you want a report without failing the process.
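In a CI job, this exit-code behavior can be wrapped in a small script. The sketch below uses only flags shown on this page (--manifest, --json-out, --no-fail-on-error). The function names are hypothetical, and nothing is assumed about the structure of report.json.

```python
import subprocess
import sys


def build_command(manifest, json_out, fail_on_error=True):
    """Build the ecp run argument list from flags documented on this page."""
    command = ["ecp", "run", "--manifest", manifest, "--json-out", json_out]
    if not fail_on_error:
        # Write the report without failing the process when checks fail.
        command.append("--no-fail-on-error")
    return command


def run_eval(manifest, json_out, fail_on_error=True):
    """Run ecp and return its exit code (non-zero on failed checks by default)."""
    completed = subprocess.run(build_command(manifest, json_out, fail_on_error))
    return completed.returncode


if __name__ == "__main__":
    # Propagate ecp's exit code so the CI step fails when checks fail.
    sys.exit(run_eval("examples/customer_support_demo/manifest.yaml", "report.json"))
```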

9. Optional LLM judge

If your manifest uses llm_judge, set these environment variables (PowerShell syntax shown):

powershell

$env:OPENAI_API_KEY="your_key_here"
$env:ECP_LLM_JUDGE_MODEL="gpt-4o-mini"
$env:ECP_LLM_JUDGE_TEMPERATURE="0"
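A missing key is easier to diagnose before the run starts. As a minimal sketch, the check below reports which of these variables are unset. Treating all three as required is an assumption made for illustration; ECP may apply its own defaults for the model and temperature.

```python
import os

# Names come from this step; treating all three as required is an assumption.
JUDGE_VARIABLES = ("OPENAI_API_KEY", "ECP_LLM_JUDGE_MODEL", "ECP_LLM_JUDGE_TEMPERATURE")


def missing_judge_variables(environ=None):
    """Return the judge-related variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in JUDGE_VARIABLES if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_judge_variables()
    if missing:
        raise SystemExit("Set before running llm_judge checks: " + ", ".join(missing))
```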

10. Streamable HTTP

Start the HTTP agent:

bash

python examples/streamable_http_demo/agent.py

Run the HTTP-target manifest:

bash

ecp run --manifest examples/streamable_http_demo/manifest.yaml --json

11. Inspector

bash

npm run inspector

Open http://127.0.0.1:6274.

12. Conformance smoke test

For protocol implementers:

bash

ecp conformance --target "python examples/customer_support_demo/agent.py"

Notes

The current release line is 0.5.0.
New agents should use evaluation_context; private_thought remains a deprecated compatibility alias.
Use ECP_RPC_TIMEOUT to control step timeouts. The default is 30 seconds.
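
For a slow agent, the timeout can be raised for a single run without changing the shell environment. The sketch below assumes ECP_RPC_TIMEOUT takes a number of seconds, which matches the 30-second default but is not spelled out in the source. The helper name env_with_timeout is hypothetical.

```python
import os
import subprocess


def env_with_timeout(seconds, base=None):
    """Return a copy of the environment with ECP_RPC_TIMEOUT set (assumed unit: seconds)."""
    if seconds <= 0:
        raise ValueError("timeout must be positive")
    env = dict(os.environ if base is None else base)
    env["ECP_RPC_TIMEOUT"] = str(seconds)
    return env


if __name__ == "__main__":
    # Allow each step up to 60 seconds for this run only.
    subprocess.run(
        ["ecp", "run", "--manifest", "ecp_eval/manifest.yaml", "--json"],
        env=env_with_timeout(60),
    )
```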