Native Swift and CoreML SDK for local speaker diarization, VAD, and speech-to-text for real-time workloads. Works on iOS and macOS.

ane asr audio automatic-speech-recognition avfoundation coreml ios macos nvidia parakeet real-time speaker-diarization speaker-embedding speaker-identification speaker-recognition speech-to-text swift vad voice-activity-detection
4 Open Issues Need Help
Last updated: Sep 13, 2025

Open Issues Need Help

enhancement good first issue speaker-diarization


AI Summary: The task is to fix a bug in the `AsrModels.load` function of the FluidAudio Swift framework. The function currently ignores the user-specified compute units (`configuration.computeUnits`) and always uses the optimized compute units (`optimizedConfig.computeUnits`) instead. The fix requires modifying `load` so that it honors the user-provided compute unit configuration.

Complexity: 3/5
bug good first issue
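
A minimal sketch of the intended behavior, assuming the fix amounts to carrying the caller's `computeUnits` into the configuration used for loading. The helper `resolvedConfiguration(from:fallbackUnits:)` is hypothetical and is not FluidAudio's actual code; only `MLModelConfiguration` and `MLComputeUnits` are real CoreML types.

```swift
import CoreML

// Hypothetical helper for illustration; not FluidAudio's actual code.
// It builds the configuration used to load a model and keeps the caller's
// computeUnits instead of overwriting them with an internal default.
func resolvedConfiguration(
    from userConfiguration: MLModelConfiguration?,
    fallbackUnits: MLComputeUnits = .all
) -> MLModelConfiguration {
    let resolved = MLModelConfiguration()
    // Respect the caller's choice when a configuration was supplied;
    // fall back to the default only when none was given.
    resolved.computeUnits = userConfiguration?.computeUnits ?? fallbackUnits
    return resolved
}

// The caller asks for CPU-only execution.
let userConfiguration = MLModelConfiguration()
userConfiguration.computeUnits = .cpuOnly
let loadConfiguration = resolvedConfiguration(from: userConfiguration)
```

The resulting configuration could then be passed to `MLModel(contentsOf:configuration:)`, so the model runs on the compute units the caller requested.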

iOS live ASR (3 months ago)

AI Summary: The task is to provide a Swift code snippet demonstrating live Automatic Speech Recognition (ASR) from a microphone on iOS using the FluidAudio library. The snippet should handle audio input from the microphone, process it using the library's (currently missing) `transcribeChunk` function (or an equivalent), and output the transcription to the console. The solution needs to account for the user's inexperience with iOS development.

Complexity: 3/5
documentation good first issue
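
A rough sketch of the microphone side of such a snippet, under stated assumptions: `AVAudioEngine` and `AVAudioSession` are real AVFoundation APIs, while `ChunkAccumulator`, `MicrophoneChunkSource`, and the `onChunk` callback are hypothetical names introduced here. The callback stands in for whatever transcription call the library exposes, since the summary notes that `transcribeChunk` is currently missing. Resampling is omitted; the microphone usually delivers 44.1 or 48 kHz audio, and many ASR models expect 16 kHz mono, so a real app would likely convert with `AVAudioConverter` first. An iOS app also needs an `NSMicrophoneUsageDescription` entry in its Info.plist.

```swift
import AVFoundation

/// Hypothetical helper: collects samples and hands back fixed-size chunks.
struct ChunkAccumulator {
    let chunkSize: Int
    private var pending: [Float] = []

    init(chunkSize: Int) { self.chunkSize = chunkSize }

    /// Appends samples and returns every complete chunk now available.
    mutating func append(_ samples: [Float]) -> [[Float]] {
        pending.append(contentsOf: samples)
        var chunks: [[Float]] = []
        while pending.count >= chunkSize {
            chunks.append(Array(pending.prefix(chunkSize)))
            pending.removeFirst(chunkSize)
        }
        return chunks
    }
}

/// Hypothetical wrapper: taps the microphone and forwards chunks to a callback.
final class MicrophoneChunkSource {
    private let engine = AVAudioEngine()
    private var accumulator: ChunkAccumulator
    private let onChunk: ([Float]) -> Void

    init(chunkSize: Int, onChunk: @escaping ([Float]) -> Void) {
        self.accumulator = ChunkAccumulator(chunkSize: chunkSize)
        self.onChunk = onChunk
    }

    func start() throws {
        #if os(iOS)
        // iOS requires an active recording session before the engine starts.
        let session = AVAudioSession.sharedInstance()
        try session.setCategory(.record, mode: .measurement)
        try session.setActive(true)
        #endif
        let input = engine.inputNode
        let format = input.outputFormat(forBus: 0)
        input.installTap(onBus: 0, bufferSize: 4096, format: format) { [weak self] buffer, _ in
            guard let self = self, let channel = buffer.floatChannelData?[0] else { return }
            // Copy the first channel into a Swift array.
            let samples = Array(UnsafeBufferPointer(start: channel, count: Int(buffer.frameLength)))
            for chunk in self.accumulator.append(samples) {
                self.onChunk(chunk) // placeholder for the library's transcription call
            }
        }
        engine.prepare()
        try engine.start()
    }

    func stop() {
        engine.inputNode.removeTap(onBus: 0)
        engine.stop()
    }
}
```

A caller would create `MicrophoneChunkSource(chunkSize:onChunk:)`, print or transcribe inside the callback, and call `start()` once microphone permission has been granted.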


AI Summary: The task is to modify the FluidAudio Swift framework to prevent the repeated download of VAD models from HuggingFace upon each app launch. This involves integrating the pre-downloaded CoreML models directly into the Xcode project, eliminating the need for online retrieval.

Complexity: 3/5
bug good first issue
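
One way to picture the lookup order this implies, as a sketch only: prefer a compiled model shipped in the app bundle, then a copy already on disk, and download only when neither exists. The `ModelLocation` enum, the `locateModel` function, and the model name used below are hypothetical and not part of FluidAudio; `Bundle.url(forResource:withExtension:)` and `FileManager.fileExists(atPath:)` are standard Foundation calls.

```swift
import Foundation

/// Hypothetical result type describing where a compiled model was found.
enum ModelLocation: Equatable {
    case bundled(URL)
    case cached(URL)
    case needsDownload
}

/// Hypothetical lookup: bundle first, then a local cache directory.
func locateModel(
    named name: String,
    in bundle: Bundle = .main,
    cacheDirectory: URL,
    fileManager: FileManager = .default
) -> ModelLocation {
    // 1. A compiled model added to the Xcode project ships inside the bundle.
    if let url = bundle.url(forResource: name, withExtension: "mlmodelc") {
        return .bundled(url)
    }
    // 2. Otherwise reuse a copy saved by an earlier launch.
    let cached = cacheDirectory.appendingPathComponent("\(name).mlmodelc")
    if fileManager.fileExists(atPath: cached.path) {
        return .cached(cached)
    }
    // 3. Only now would a network download be necessary.
    return .needsDownload
}
```

For the `.bundled` and `.cached` cases the URL could be handed to `MLModel(contentsOf:)`, so an app launch with the models already present would not need to contact HuggingFace.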
