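An open-source tool for extracting speech transcripts from video. It fuses OCR (hard-subtitle recognition) and ASR (automatic speech recognition) to produce high-quality transcripts.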

1 open issue needs help. Last updated: Jul 2, 2025

AI Summary: Implement ASR functionality to transcribe audio segments detected by VAD, and integrate a large language model to fuse ASR and OCR results for improved accuracy in video transcription. Currently, only OCR-based transcription from hard subtitles is implemented.

Complexity: 4/5
help wanted
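
One step the requested fusion would likely need is lining up ASR output with OCR subtitle lines in time before handing both to a language model. The sketch below is a minimal illustration of that alignment only, not the project's implementation: the `Segment` type, the `pair_segments` helper, and the `min_overlap` threshold are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A timed piece of text (hypothetical type); start and end are in seconds."""
    start: float
    end: float
    text: str


def overlap(a: Segment, b: Segment) -> float:
    """Return how many seconds the two segments share (0.0 if disjoint)."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))


def pair_segments(asr, ocr, min_overlap=0.2):
    """Pair each ASR segment with the OCR line that overlaps it most.

    Returns a list of (asr_segment, ocr_segment_or_None) tuples. An ASR
    segment is paired with None when no OCR line overlaps it by at least
    min_overlap seconds (an assumed threshold, not a tuned value).
    """
    pairs = []
    for a in asr:
        best = max(ocr, key=lambda o: overlap(a, o), default=None)
        if best is not None and overlap(a, best) < min_overlap:
            best = None
        pairs.append((a, best))
    return pairs


# Made-up sample data: ASR text from VAD-detected speech, OCR text from hard subtitles.
asr = [Segment(0.0, 2.0, "hello every one"), Segment(5.0, 6.0, "thanks")]
ocr = [Segment(0.1, 1.9, "Hello everyone"), Segment(2.5, 4.0, "Welcome back")]
pairs = pair_segments(asr, ocr)
```

Each resulting pair could then be formatted into a prompt so the language model sees both candidate texts for the same time span; segments paired with `None` would fall back to the ASR text alone.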

Python