Copybara import of the project:

--
16994cb2d5d646341f5285ca71d72697d81d18fe by Nilanjan De <nilanjan.de@gmail.com>:

chore: fix typos
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/adk-python/pull/272 from n1lanjan:fix-typos a1ab655b08ec08c5dd2da71aab9a2386e3610e84
PiperOrigin-RevId: 749690489
Nilanjan De
2025-04-20 22:52:42 -07:00
committed by Copybara-Service
parent 23f0383284
commit 1664b45562
15 changed files with 23 additions and 24 deletions

@@ -55,7 +55,7 @@ def load_json(file_path: str) -> Union[Dict, List]:
class AgentEvaluator:
"""An evaluator for Agents, mainly intented for helping with test cases."""
"""An evaluator for Agents, mainly intended for helping with test cases."""
@staticmethod
def find_config_for_test_file(test_file: str):
@@ -91,7 +91,7 @@ class AgentEvaluator:
look for 'root_agent' in the loaded module.
eval_dataset: The eval data set. This can be either a string representing
full path to the file containing eval dataset, or a directory that is
recusively explored for all files that have a `.test.json` suffix.
recursively explored for all files that have a `.test.json` suffix.
num_runs: Number of times all entries in the eval dataset should be
assessed.
agent_name: The name of the agent.

@@ -35,7 +35,7 @@ class ResponseEvaluator:
Args:
raw_eval_dataset: The dataset that will be evaluated.
evaluation_criteria: The evaluation criteria to be used. This method
support two criterias, `response_evaluation_score` and
support two criteria, `response_evaluation_score` and
`response_match_score`.
print_detailed_results: Prints detailed results on the console. This is
usually helpful during debugging.
@@ -56,7 +56,7 @@ class ResponseEvaluator:
Value range: [0, 5], where 0 means that the agent's response is not
coherent, while 5 means it is . High values are good.
A note on raw_eval_dataset:
The dataset should be a list session, where each sesssion is represented
The dataset should be a list session, where each session is represented
as a list of interaction that need evaluation. Each evaluation is
represented as a dictionary that is expected to have values for the
following keys:

@@ -31,10 +31,9 @@ class TrajectoryEvaluator:
):
r"""Returns the mean tool use accuracy of the eval dataset.
Tool use accuracy is calculated by comparing the expected and actuall tool
use trajectories. An exact match scores a 1, 0 otherwise. The final number
is an
average of these individual scores.
Tool use accuracy is calculated by comparing the expected and the actual
tool use trajectories. An exact match scores a 1, 0 otherwise. The final
number is an average of these individual scores.
Value range: [0, 1], where 0 is means none of the too use entries aligned,
and 1 would mean all of them aligned. Higher value is good.
@@ -45,7 +44,7 @@ class TrajectoryEvaluator:
usually helpful during debugging.
A note on eval_dataset:
The dataset should be a list session, where each sesssion is represented
The dataset should be a list session, where each session is represented
as a list of interaction that need evaluation. Each evaluation is
represented as a dictionary that is expected to have values for the
following keys:
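
Note on the TrajectoryEvaluator docstring in the last hunk: the scoring rule it describes (1 for an exact trajectory match, 0 otherwise, averaged over all entries) can be sketched as below. This is a minimal illustration and not the project's actual implementation. The function name `mean_tool_use_accuracy` and the dictionary keys `tool_name` and `tool_input` are hypothetical, chosen only for this example.

```python
from typing import Any, Dict, List

# A trajectory is modeled here as an ordered list of tool calls.
# The keys used in the sample data below are hypothetical.
Trajectory = List[Dict[str, Any]]


def mean_tool_use_accuracy(
    expected: List[Trajectory], actual: List[Trajectory]
) -> float:
  """Returns the fraction of entries whose trajectories match exactly."""
  if len(expected) != len(actual):
    raise ValueError("expected and actual must have the same length")
  if not expected:
    raise ValueError("at least one entry is required")
  # Exact match scores 1, anything else scores 0.
  scores = [1.0 if e == a else 0.0 for e, a in zip(expected, actual)]
  # The final number is the average of the individual scores.
  return sum(scores) / len(scores)


expected_runs = [
    [{"tool_name": "search", "tool_input": {"query": "weather"}}],
    [{"tool_name": "lookup", "tool_input": {"id": 7}}],
]
actual_runs = [
    [{"tool_name": "search", "tool_input": {"query": "weather"}}],
    [{"tool_name": "lookup", "tool_input": {"id": 8}}],
]
accuracy = mean_tool_use_accuracy(expected_runs, actual_runs)
```

With the sample data above, the first entry matches and the second does not, so the result falls in the middle of the [0, 1] range stated in the docstring.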