@sam-s10s
Member

When an utterance is short and the VAD triggers a ForceEndOfUtterance message, the STT engine can emit the utterance directly as finals, with no partial payloads before them. Turn detection relies on partials to detect speaker start / stop events and to emit end of turn messages, so when the partials were missing, none of these messages were emitted.

The fix checks each final payload to see whether a valid partial payload preceded it. If there was none, it decides whether to trigger a start of turn, so that the remaining end of turn logic follows as expected.
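
As a rough illustration of the logic described above, here is a minimal sketch. It is not the Voice SDK's actual code: `TurnTracker`, `on_partial`, `on_final`, and the event names are hypothetical, and only the `is_eou` flag is borrowed from the commit notes. It assumes finals carry an end-of-utterance flag and that a turn should start at most once.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TurnTracker:
    """Hypothetical turn tracker (illustrative only, not the SDK's class)."""

    is_speaking: bool = False
    seen_partial: bool = False
    events: List[str] = field(default_factory=list)

    def on_partial(self, text: str) -> None:
        # A non-empty partial is the normal signal that a turn has started.
        if not text.strip():
            return
        self.seen_partial = True
        self._start_turn_if_needed()

    def on_final(self, text: str, is_eou: bool = False) -> None:
        if not text.strip():
            return
        # Short utterance with a forced end of utterance: the final may
        # arrive with no preceding partial, so start the turn here instead.
        if not self.seen_partial:
            self._start_turn_if_needed()
        if is_eou:
            self._end_turn()

    def _start_turn_if_needed(self) -> None:
        # Guard against emitting a second start while already speaking.
        if not self.is_speaking:
            self.is_speaking = True
            self.events.append("start_of_turn")

    def _end_turn(self) -> None:
        if self.is_speaking:
            self.is_speaking = False
            self.events.append("end_of_turn")
        # Reset so the next utterance is judged on its own partials.
        self.seen_partial = False


tracker = TurnTracker()
tracker.on_final("Yes", is_eou=True)  # final arrives with no partial first
print(tracker.events)
```

With this shape, a final-only utterance and a partial-then-final utterance produce the same start / end pair, which is the behaviour the fix is aiming for.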

@LArmstrongDev
Contributor

What testing did you do for this fix?

@LArmstrongDev left a comment

Thanks for clarifying re the testing. LGTM.

@sam-s10s merged commit 34f2cba into main on Jan 26, 2026
25 checks passed
@sam-s10s deleted the fix/single-word-eou branch on January 26, 2026 at 16:21
@sam-s10s added a commit that referenced this pull request on Jan 26, 2026
commit 34f2cba
Author: Sam Sykes <sams@speechmatics.com>
Date:   Mon Jan 26 16:21:42 2026 +0000

    Fix for short utterances when using ForceEndOfUtterance (#78)

    * track previous partials when checking new finals

    * check we are not already speaking!

    * EOU / FEOU testing

    * permit no punctuation

    * added test for feou

    * update existing FEOU test

    * updated test.

    * expanded samples

    * fix test set

    * refining the values

    * updated tests for FEOU

    * extra tests and split out FIXED and ADAPTIVE tests

    * support other endpoints

    * Adjust VAD timeout default from 0.18 to 0.22 for FEOU.

    * Support `is_eou` for final segment in an utterance.

    * remove FEOU tests

    * retain 0.18 as the VAD timeout

commit fecea0e
Author: Lorna Armstrong <lorna.armstrong@speechmatics.com>
Date:   Wed Jan 14 15:35:49 2026 +0000

    Fix Scribe preset configuration (#77)

commit 8825c42
Author: Sam Sykes <sams@speechmatics.com>
Date:   Mon Jan 12 14:19:02 2026 +0000

    Voice SDK url parameter handling (#76)

    ## What's Changed?
    - better handling for `sm-app` and other URL parameters provided by the client.
    - ensure that URL parameters are parsed correctly.

commit 81f093f
Author: Sam Sykes <sams@speechmatics.com>
Date:   Thu Jan 8 01:51:53 2026 +0100

    Fix to max delay mode and filter for final changes (#74)

    ## What's Changed?
    - Updated to max delay mode and filter for final updates.

commit 7c88c25
Author: Sam Sykes <sams@speechmatics.com>
Date:   Tue Dec 30 14:44:44 2025 +0100

    Updated integration examples. (#73)

    * Updated integration examples. Includes linting of the README.

    * TIP fix

    * Prettier override.

commit 624f014
Author: Zultran <edgar.adamovics@speechmatics.com>
Date:   Tue Dec 30 12:20:58 2025 +0000

    Adds comprehensive README documentation (#70)

    * Adds comprehensive README documentation

    Introduces a detailed README file to provide users with a comprehensive guide to the Speechmatics Python SDK.

    The README includes:
    - Quick start instructions for installation and basic usage
    - Information on key features, use cases, and integration examples
    - Documentation links and migration guides
    - Information about Speechmatics technology
    - Links to resources and community support

    * Removes bold formatting from migration guide links

    Updates the README to remove bold formatting from the "Full Migration Guides" section.
    This improves the visual consistency of the document and avoids unnecessary emphasis on the links.

    * Updates examples and adds env variable

    Refactors the examples in the README to use environment variables
    for the API key and includes an async close on the client in the
    batch example. Also adds prefer_current_speaker to the speaker
    diarization config example.

    * Updates README with usage examples and features

    Enhances the README with detailed examples for batch,
    realtime, TTS, and voice agent functionalities.

    Also, includes installation instructions, key features,
    and use cases for the Speechmatics Python SDK.

    * Fixed broken status page link to README

    * Enhances README with examples and details

    Updates the README to include more detailed examples for batch transcription, realtime streaming, text-to-speech, and voice agent functionalities.

    Adds sections on key features like speaker diarization, custom dictionaries, audio intelligence, and translation with corresponding code snippets.

    Provides information on framework integrations, focusing on LiveKit Agents and Pipecat AI, improving user understanding and adoption.

commit cb48e21
Author: Sam Sykes <sams@speechmatics.com>
Date:   Mon Dec 22 10:45:11 2025 +0100

    Reduce RT logging in Voice SDK (#72)

    ## What's Changed
    - Lowered logging of the RT AsyncClient to reduce debug noise
    - Bumped ORT / ONNX runtime dependency requirement

commit 3a247b0
Author: Sam Sykes <sams@speechmatics.com>
Date:   Mon Dec 22 10:39:02 2025 +0100

    Fix for when diarization is not enabled (#71)

    ## What's Changed
    - When diarization is not enabled, all speakers are identified as `UU`.

commit 95ca9b6
Author: Sam Sykes <sams@speechmatics.com>
Date:   Wed Dec 17 09:48:32 2025 +0100

    fix to use rt 0.5.3 (#69)

commit cecb235
Author: Sam Sykes <sams@speechmatics.com>
Date:   Tue Dec 16 20:18:01 2025 +0100

    fix to SSL for AsyncClient WebSocket (#68)

    Fix so `ws://` connections do not fail.
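
For the last commit above (#68), one plausible shape of the fix is to build a TLS context only for `wss://` URLs and pass none for plain `ws://`. This is an assumption about the approach, not the SDK's code; `ssl_context_for` is a hypothetical helper and the URLs are placeholders.

```python
import ssl
from typing import Optional
from urllib.parse import urlparse


def ssl_context_for(url: str) -> Optional[ssl.SSLContext]:
    """Return a TLS context for wss:// URLs and None for plain ws:// URLs."""
    scheme = urlparse(url).scheme.lower()
    if scheme == "wss":
        # Encrypted WebSocket: use the platform's default trust settings.
        return ssl.create_default_context()
    if scheme == "ws":
        # Unencrypted WebSocket: supplying a TLS context here would make
        # the connection fail, so none is returned.
        return None
    raise ValueError(f"Unsupported WebSocket scheme: {scheme!r}")


print(ssl_context_for("ws://localhost:9000/v2"))
```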