Feb 7, 2026

# Evaluate Agents

Your agent is now live, helping students and scheduling meetings with professors. But here's the thing - how do you know it's actually working correctly?

As with the tests in earlier chapters, the same question can get a different answer every time. That's usually fine, but it makes evaluation a bit tricky.

## Three layers of agent testing

When evaluating agents, we will focus on three areas:

1. **Unit tests for individual tools** - Test each tool in isolation. Does the calendar API actually create events? Does the search return relevant results?
2. **Text-to-JSON validation** - Can the LLM format tool calls correctly, and does it choose the right tools? (Spoiler: malformed JSON is where most agents break)
3. **End-to-end evaluation** - Does the complete workflow help users?

We've already set up an eval framework earlier, so let's put it to work testing our agent!
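
To make the second layer concrete, here is a minimal sketch of a text-to-JSON check. It assumes a tool call arrives as a JSON object with `name` and `arguments` keys; that shape, the `TOOL_SCHEMAS` table and the argument names are illustrative assumptions, not the format the canopy backend actually uses.

```python
import json

# Hypothetical schemas: tool name -> required argument names.
# The tool names appear in this chapter; the argument names are assumptions.
TOOL_SCHEMAS = {
    "search_knowledge_base": {"query"},
    "find_professors_by_expertise": {"expertise"},
}


def validate_tool_call(raw: str) -> list[str]:
    """Return a list of problems with a raw tool call; an empty list means valid."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"malformed JSON: {exc.msg}"]
    if not isinstance(call, dict):
        return ["tool call must be a JSON object"]

    name = call.get("name")
    if name not in TOOL_SCHEMAS:
        return [f"unknown tool: {name!r}"]

    args = call.get("arguments")
    if not isinstance(args, dict):
        return ["'arguments' must be a JSON object"]

    # Report every required argument the model left out.
    missing = TOOL_SCHEMAS[name] - set(args)
    return [f"missing argument: {arg}" for arg in sorted(missing)]


good = '{"name": "search_knowledge_base", "arguments": {"query": "forest canopy"}}'
truncated = '{"name": "search_knowledge_base", "arguments": {"query": "forest canopy"'

print(validate_tool_call(good))       # no problems reported
print(validate_tool_call(truncated))  # reports malformed JSON
```

Returning a list of problems rather than raising makes it easy to count failure types across many model outputs.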

## 1. Unit Testing Agent Tools

Before we test the whole agent, let's make sure each individual tool works correctly. Think of it like testing the ingredients before baking the cake.

The canopy backend already has unit tests set up for the student assistant tools. Let's run them!

1. We first need to install some dependencies:

    ```bash
    cd /opt/app-root/src/backend
    pip install -r app/requirements.txt
    pip install -r tests/requirements-test.txt
    ```

2. And then we can run the unit tests:

    ```bash
    pytest tests/test_tools.py -v
    ```

You should see output like this:

```
tests/test_tools.py::test_search_knowledge_base PASSED                    [ 25%]
tests/test_tools.py::test_find_professors_by_expertise PASSED            [ 50%]
tests/test_tools.py::test_mcp_calendar_list_tools PASSED                 [ 75%]
tests/test_tools.py::test_mcp_calendar_list_events PASSED                [100%]

======================== 4 passed in 1.22s ========================
```

**What did we just test?**

- **search_knowledge_base** - Verified the tool can retrieve relevant content from the vector store
- **find_professors_by_expertise** - Checked that professor matching works correctly
- **MCP calendar tools** - Confirmed the MCP server is reachable and exposes the right tools
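
If you are curious what a tool unit test can look like, here is a minimal sketch in pytest style. The `find_professors_by_expertise` function below is a hypothetical in-memory stand-in, not the backend's real implementation, and the second professor entry is made-up sample data.

```python
# Hypothetical sample data; only Dr. Sarah Chen is mentioned in this chapter.
PROFESSORS = [
    {"name": "Dr. Sarah Chen", "expertise": ["Machine Learning", "Neural Networks"]},
    {"name": "Dr. Placeholder", "expertise": ["Forest Ecology"]},  # made up
]


def find_professors_by_expertise(topic: str) -> list[dict]:
    """Return professors with an expertise entry containing the topic (case-insensitive)."""
    needle = topic.strip().lower()
    if not needle:
        return []
    return [
        prof for prof in PROFESSORS
        if any(needle in area.lower() for area in prof["expertise"])
    ]


def test_find_professors_by_expertise_matches_topic():
    results = find_professors_by_expertise("machine learning")
    assert [prof["name"] for prof in results] == ["Dr. Sarah Chen"]


def test_find_professors_by_expertise_unknown_topic_returns_empty():
    assert find_professors_by_expertise("underwater basket weaving") == []
```

Each test pins down one behavior, so a failure points directly at the tool and not at the LLM calling it.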

**Pro tip:** Want to see what the tools are returning? Run with the `-s` flag:

```bash
pytest tests/test_tools.py -v -s
```

This shows the actual search results and helps you understand what data your tools are working with.

## 2. Add Unit Tests to CI/CD Pipeline

Now that we've verified the unit tests work locally, let's automate them in our CI/CD pipeline! This ensures every code change is tested before being promoted to production.

### Enable Unit Tests in the Tekton Pipeline

The evaluation pipeline can run unit tests alongside the other evaluations. Let's enable this step:

1. Go to `genaiops-gitops/toolings/evaluation-pipeline/config.yaml` in your workbench and update the config file to enable a unit test step:

    ```yaml
    chart_path: charts/canopy-evals-pipeline
    USER_NAME: <USER_NAME>
    CLUSTER_DOMAIN: <CLUSTER_DOMAIN>
    kfp:
      lls_url: http://llama-stack-service.<USER_NAME>-test.svc.cluster.local:8321
      backend_url: http://canopy-backend.<USER_NAME>-test.svc.cluster.local:8000
    testing:                    # 👈 Add this
      enableUnitTests: true     # 👈 Add this
    ```

2. Push it to git:

    ```bash
    cd /opt/app-root/src/genaiops-gitops
    git pull
    git add .
    git commit -m "1️⃣ Enabled unit tests 1️⃣"
    git push
    ```

3. To make sure it was added, go to OpenShift Console -> Pipelines -> canopy-evals-pipeline and see that `tool-unit-tests` is in there.

    ![unit-test-step.png](images/unit-test-step.png)

We will see it in action soon, but first, let's make sure that our end-to-end tests work for our agent as well.
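
Conceptually, a unit test step in a pipeline is a gate: run the tests and stop the run if they fail. The sketch below illustrates that idea in plain Python. It is not the actual `tool-unit-tests` Tekton task, and `run_unit_test_step` is a hypothetical helper whose `enabled` argument plays the role of `testing.enableUnitTests`.

```python
import subprocess
import sys

# The same pytest invocation used locally earlier in this chapter.
PYTEST_COMMAND = [sys.executable, "-m", "pytest", "tests/test_tools.py", "-v"]


def run_unit_test_step(enabled: bool, command: list[str]) -> bool:
    """Run the test command when enabled; return True if the step passes or is skipped."""
    if not enabled:
        print("Unit tests disabled, skipping step")
        return True
    completed = subprocess.run(command, check=False)
    # pytest exits with code 0 only when all collected tests passed.
    return completed.returncode == 0
```

A pipeline wrapper would call `run_unit_test_step(True, PYTEST_COMMAND)` and exit non-zero when it returns `False`, which keeps later steps from running on a broken build.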

### Adding Agent E2E Tests

1. Go to your workbench and navigate to the `evals` repository:

    ```bash
    cd /opt/app-root/src/evals
    ```

2. Create a new folder for the student assistant tests:

    ```bash
    mkdir student-assistant
    ```

3. Create the test configuration file. Open a new file `student-assistant/student_assistant_tests.yaml` and paste this:

```yaml
name: student_assistant_tests
description: End-to-end tests for the student assistant agent with tool choice validation
model: llama32
endpoint: /student-assistant
scoring_params:
    "llm-as-judge::base":
        "judge_model": llama32
        "prompt_template": e2e_judge_prompt.txt
        "type": "llm_as_judge"
        "judge_score_regexes": ["Answer: (A|B|C|D|E)"]
    "basic::tool_choice": null
tests:
  - prompt: "What is a forest canopy?"
    expected_result: "A forest canopy is the upper layer of a forest, formed by the crowns of trees. It's an important ecosystem component that provides habitat for many species and plays a crucial role in photosynthesis and the forest's overall health."
    expected_tools: ["search_knowledge_base"]
  - prompt: "Who can help me with machine learning?"
    expected_result: "Dr. Sarah Chen from the Computer Science department can help you with machine learning. She specializes in Machine Learning, Neural Networks, AI Ethics, and Agentic Workflows. You can reach her at [email protected]."
    expected_tools: ["find_professors_by_expertise"]
```
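
To make the `judge_score_regexes` entry more concrete, here is a minimal Python sketch of how a pattern like this can pull a letter grade out of a judge model's reply. The pattern is copied from the config above; the `extract_grade` helper and the sample replies are hypothetical and only illustrate the regex, not the evaluator's actual code.

```python
import re
from typing import Optional

# Pattern copied from the `judge_score_regexes` entry in the config.
JUDGE_SCORE_REGEX = r"Answer: (A|B|C|D|E)"


def extract_grade(judge_output: str) -> Optional[str]:
    """Return the letter captured by the regex, or None if it is absent."""
    match = re.search(JUDGE_SCORE_REGEX, judge_output)
    return match.group(1) if match else None


# Hypothetical judge replies, made up for illustration.
print(extract_grade("Both answers agree on the key facts. Answer: B"))  # → B
print(extract_grade("I could not decide."))  # → None
```

If the judge prompt does not instruct the model to end with a line in the `Answer: <letter>` form, the regex finds nothing and no grade can be extracted, so the prompt file and the regex need to agree.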

4. Notice the `expected_tools` field in the tests - this tells the evaluator which tools the agent should call. The eval pipeline will check:
- Did the agent call `search_knowledge_base` for the canopy question?
- Did it call `find_professors_by_expertise` for the professor question?
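
Conceptually, a tool-choice check of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `missing_tools` helper and the recorded call lists are hypothetical, and the real pipeline may compare tools differently (for example, requiring an exact match or a specific order).

```python
from typing import List


def missing_tools(expected_tools: List[str], called_tools: List[str]) -> List[str]:
    """Return the expected tools that never appear in the agent's calls."""
    called = set(called_tools)
    return [tool for tool in expected_tools if tool not in called]


# Hypothetical tool-call traces for the two tests above.
canopy_calls = ["search_knowledge_base"]
professor_calls = ["search_knowledge_base"]  # the agent picked the wrong tool

print(missing_tools(["search_knowledge_base"], canopy_calls))
# → []
print(missing_tools(["find_professors_by_expertise"], professor_calls))
# → ['find_professors_by_expertise']
```

An empty list means every expected tool was called; anything else names the tools the agent skipped for that prompt.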

6. Commit and push your changes:

    ```bash
    cd /opt/app-root/src/evals/student-assistant
    git add .
    git commit -m "🤖 Agent E2E tests added 🤖"
    git push
    ```

7. The eval pipeline should trigger automatically. Go to **OpenShift Pipelines** to watch it run!


After it has completed, you can see the evaluation results in MinIO or through the prompt tracker 🎉