# Feedback, Model Retraining, and Benchmarking

### Feedback and Counterfactual Example Generation

When you submit feedback through the **"Teach Your AI"** feature, a specialized reasoning large language model processes it. This model creates hypothetical examples based on your feedback, a method known as **counterfactual generation**. Counterfactual generation builds "what-if" scenarios by altering small parts of a given example, which helps test and improve the model's understanding.

Importantly, **your exact feedback is not directly used** to train your model. Instead, the AI generates new hypothetical examples inspired by your input. Both your original feedback and the hypothetical examples **are kept private** and **are not shared** with Protege's general **ShieldLlama** model.
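To make the idea of counterfactual generation concrete, here is a toy sketch. It is not Protege's implementation: the real process uses a reasoning language model, whereas this example only swaps hand-picked phrases. The function name `generate_counterfactuals`, the `Counterfactual` type, and the sample sentence are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counterfactual:
    text: str          # the altered example
    changed_from: str  # phrase that was replaced
    changed_to: str    # phrase that replaced it


def generate_counterfactuals(example: str, swaps: dict[str, list[str]]) -> list[Counterfactual]:
    """Return one variant per (phrase, alternative) pair, changing only that phrase."""
    variants = []
    for phrase, alternatives in swaps.items():
        if phrase not in example:
            continue  # nothing to alter for this phrase
        for alt in alternatives:
            # Replace only the first occurrence so each variant differs in one place.
            variants.append(Counterfactual(example.replace(phrase, alt, 1), phrase, alt))
    return variants


# Hypothetical input: a marketing claim and small "what-if" alterations.
example = "This account guarantees 5% returns."
swaps = {"guarantees": ["may offer", "targets"], "5%": ["up to 5%"]}
variants = generate_counterfactuals(example, swaps)
for v in variants:
    print(v.text)
```

Each variant differs from the original in exactly one phrase, which is the "what-if" property described above.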

### Model Retraining Process

* Models are **retrained nightly at midnight Pacific Time** on working days.
* If a newly trained model **exceeds the benchmark** set by the prior model, it will **be automatically promoted** to active use.
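The promotion rule above can be sketched as a simple comparison. This is a minimal illustration that assumes each model's benchmark result is summarized as a single score where higher is better; `ModelResult`, `benchmark_score`, and `choose_active` are hypothetical names, not part of Protege's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelResult:
    version: str
    benchmark_score: float  # assumed aggregate score; higher is better


def choose_active(active: ModelResult, candidate: ModelResult) -> ModelResult:
    """Promote the candidate only if it strictly exceeds the active model's benchmark."""
    if candidate.benchmark_score > active.benchmark_score:
        return candidate
    return active  # a tie or a regression keeps the current model


current = ModelResult("10.0.0", 0.91)
retrained = ModelResult("10.0.1", 0.93)
print(choose_active(current, retrained).version)
```

Using a strict comparison means a retrained model that merely matches the prior benchmark is not promoted, which mirrors the word "exceeds" in the rule.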

### Benchmarking and Performance Measurement

To ensure quality improvements, we use **multiple types of benchmarks**:

#### 1. Policy Annotation Set

These are real-world examples where users have labeled AI feedback as incorrect. The model is tested against these annotations to ensure it better captures such cases over time.

#### 2. Test Set

A portion (10%) of the synthetically generated dataset, based on user examples, is held out of the training process. This "test set" is used to objectively measure model performance on unseen data.
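As a rough illustration of a 10% hold-out, the sketch below shuffles a dataset and sets aside one tenth of it. The function name `split_holdout`, the fixed seed, and the placeholder examples are assumptions made for this example; the actual splitting procedure is not documented here.

```python
import random


def split_holdout(examples, test_fraction=0.10, seed=0):
    """Shuffle a copy of the examples and hold out a fraction as the test set."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (training set, test set)


# Placeholder stand-ins for synthetically generated examples.
synthetic = [f"example-{i}" for i in range(200)]
train_set, test_set = split_holdout(synthetic)
print(len(train_set), len(test_set))  # 180 20
```

Because the two sets never overlap, scores on the test set reflect performance on data the model did not see during training.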

#### 3. Confusion Matrix

We also analyze model results using a **confusion matrix**, which breaks down predictions into four categories:

* **True Positive (Green Box):** Correctly identified errors.
* **True Negative (Green Box):** Correctly ignored non-errors.
* **False Positive (Red Box):** Incorrectly flagged something as an error.
* **False Negative (Red Box):** Missed identifying a true error.

The goal is to maximize values in the **green boxes** (true positives and true negatives) while minimizing values in the **red boxes** (false positives and false negatives).

<figure><img src="/files/hGar8NtuuJgkwNYPFxU0" alt=""><figcaption></figcaption></figure>
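The four categories can be counted from paired labels, as in this minimal sketch. It assumes each item has a ground-truth flag (is it really an error?) and a model prediction (did the model flag it?); the function name `confusion_counts` and the sample data are illustrative only.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, and FN from parallel lists of booleans."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for is_error, flagged in zip(actual, predicted):
        if is_error and flagged:
            counts["TP"] += 1  # correctly identified error
        elif not is_error and not flagged:
            counts["TN"] += 1  # correctly ignored non-error
        elif not is_error and flagged:
            counts["FP"] += 1  # flagged something that was not an error
        else:
            counts["FN"] += 1  # missed a true error
    return counts


actual = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]
counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 2, 'TN': 2, 'FP': 1, 'FN': 1}
```

Here the green boxes total 4 (TP + TN) and the red boxes total 2 (FP + FN).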

### Model Versions and Update Numbering

#### Minor Version Updates

Updates that are specific to your environment are reflected as minor version increments: the last digit increases with each update that comes from **Teach Your AI** fixes, for example from version **10.0.0 to 10.0.1**.

#### Major Version Upgrades (Model Rebasing)

On a regular cadence, Protege performs **model rebasing**, updating your model with the latest ShieldLlama model, which includes policies, precedents, and compliance standards observed across industries (e.g., FTC, NAD, and FDIC). These substantial updates are reflected in **major version upgrades**, which increment the first digit (e.g., moving from version 10.0 to 11.0).
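The numbering scheme can be sketched with two small helpers. This assumes a three-part version string such as `10.0.0`, and it assumes the trailing digits reset to zero after a rebase, which this page does not state explicitly. The names `bump_minor` and `bump_major` are hypothetical.

```python
def bump_minor(version: str) -> str:
    """Increment the last digit, as for a Teach Your AI update."""
    first, middle, last = (int(part) for part in version.split("."))
    return f"{first}.{middle}.{last + 1}"


def bump_major(version: str) -> str:
    """Increment the first digit, as for a model rebase.

    Resetting the remaining digits to zero is an assumption of this sketch.
    """
    first, _, _ = (int(part) for part in version.split("."))
    return f"{first + 1}.0.0"


print(bump_minor("10.0.0"))  # 10.0.1
print(bump_major("10.0.7"))  # 11.0.0
```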

To set up a fine-tuned model for your organization, reach out to <founders@tryprotege.com>.

