-
Notifications
You must be signed in to change notification settings - Fork 652
FEAT: Adding audio scoring #1337
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
|
||
| # Update rationale to indicate this was from audio transcription | ||
| for score in scores: | ||
| score.score_rationale = f"Audio transcript scored: {score.score_rationale}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we shouldn't overwrite the rationale. We may want to add to it, but we definitely want to keep the original rationale here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rlundeen2 I changed this to append but the issue here is the score rationale then looks like this for the audio scoring component whereas in my mind I thought it should only include the 2nd part:
- AudioTrueFalseScorer true: The message starts with someone introducing
themselves, evident from 'my name is OK.' Moreover, they express enjoyment of a
croissant by stating, 'this is the best croissant ever.' Both criteria for the
description of 'True' are fulfilled.
Audio transcript scored: The message starts with someone introducing themselves,
evident from 'my name is OK.' Moreover, they express enjoyment of a croissant by
stating, 'this is the best croissant ever.' Both criteria for the description of
'True' are fulfilled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or should I not add extra rationale here at all?
| from pyrit.score.true_false.true_false_scorer import TrueFalseScorer | ||
|
|
||
|
|
||
| class AudioTrueFalseScorer(TrueFalseScorer, _BaseAudioTranscriptScorer): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few thoughts here:
Why use multiple inheritance here? You could simplify this with composition. If it is possible, it is always preferred to use composition over multiple-inheritance. There is actually a coined term for this.
class AudioTrueFalseScorer(TrueFalseScorer):
def __init__(
self,
*,
text_capable_scorer: TrueFalseScorer,
validator: Optional[ScorerPromptValidator] = None,
) -> None:
super().__init__(validator=validator or self._default_validator)
self._audio_helper = AudioTranscriptHelper(text_capable_scorer=text_capable_scorer)
async def _score_piece_async(
self,
message_piece: MessagePiece,
*,
objective: Optional[str] = None,
) -> list[Score]:
return await self._audio_helper.score_audio_async(
message_piece=message_piece,
objective=objective,
)This keeps the scorer hierarchy clean and avoids MRO complexity. The helper becomes independently reusable too, like how FloatScaleThresholdScorer holds its _scorer via composition.
_BaseAudioTranscriptScorer doesn't need to be a private ABC. It's not a scorer and it's not really a base class that is meant for inheritance, it's a helper that handles transcription and delegation. Maybe rename it to AudioTranscriptHelper?
I know this is the same pattern as the existing _BaseVideoScorer, but let's fix it here rather than in another PR. We can refactor the video scorer to match in a follow up PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good idea! @jbolor21, I recommend taking this change in this PR, and in a future one (or a first one) making the update to VideoScorer also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracked this in a new follow up story for the videoscorer, but implemented this for the audio one!
| Score( | ||
| score_value="false", | ||
| score_value_description="No audio content to score (empty or no transcript)", | ||
| score_type="true_false", | ||
| score_category=["audio"], | ||
| score_rationale="Audio file had no transcribable content", | ||
| scorer_class_identifier=self.get_identifier(), | ||
| message_piece_id=piece_id, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't there be an objective=objective parameter passed to the Score?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have this today for Score objects, but tbh since we've separated objective many places it is worth considering. I recommend punting for this PR though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a follow up story for this as well!
| Score( | ||
| score_value="0.0", | ||
| score_value_description="No audio content to score (empty or no transcript)", | ||
| score_type="float_scale", | ||
| score_category=["audio"], | ||
| score_rationale="Audio file had no transcribable content", | ||
| scorer_class_identifier=self.get_identifier(), | ||
| message_piece_id=piece_id, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need to pass the objective here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ same as above? Made a follow up story for this
|
I think it would be great reading through this @jbolor21 https://junkangworld.com/blog/inheritance-vs-composition-5-key-rules-for-clean-code-2025 or if you have more time, you can explore |
Description
This PR is adding scorers to evaluate audio files by using AzureTTS to transcribe an audio file. I have also added this scorer to our video files so we can now score the audio in generated videos as well. This works by extracting the audio from the generated video and then using this new base audio scorer.
Tests and Documentation
Added examples in our notebooks for a demo and added unit tests