Playing audio and transcribing it as a stream with Python speech recognition

Date: 2021-08-04 21:20:25

I am new to Python and am trying to work out how to transcribe speech from an audio file in real time while the sound plays in the background.

Update: @petezurich Sorry for the bad question. Currently, I can hear the audio playing in the background. However, I am having trouble getting Sphinx to transcribe the audio. Is there something wrong with the way I am passing the audio to Sphinx?

It constantly outputs a "Sphinx error" message.

I am using PocketSphinx and the Uberi/speech_recognition library.

This is what I have put together so far:

#!/usr/bin/env python
# recognitions.py : Transcribe Test from an Audio File

import os
import sys
import time
import wave
import pyaudio
import speech_recognition as sr
import threading

try:
    import pocketsphinx
except:
    print("PocketSphinx is not installed.")

# import audio file within script folder
from os import path
audio_file = path.join(os.path.abspath(os.path.dirname(sys.argv[0])), "samples/OSR_us_000_0061_8k.wav")
print("Transcribing... " + audio_file)
wf = wave.open(audio_file, 'rb')

# set PyAudio instance
pa = pyaudio.PyAudio()

# set recognizer instance (unmodified)
r = sr.Recognizer()

stream_buffer = bytes()
stream_counter = 0
audio_sampling_rate = 48000

def main_recognize(stream):
    global audio_sampling_rate
    # Create a new AudioData instance, which represents "mono" audio data
    audio_data = sr.AudioData(stream, audio_sampling_rate, 2)
    # recognize using CMU Sphinx (en-US only)
    try:
        print("Sphinx: " + r.recognize_sphinx(audio_data, language="en-US"))
    except sr.UnknownValueError:
        print("Sphinx error")
    except sr.RequestError as e:
        print("Sphinx error; {0}".format(e))

def stream_audio(data):
    global stream_buffer
    global stream_counter
    buffer_set_size = 200
    if stream_counter < buffer_set_size:
        # force 'data' to BYTES to allow concat
        data = bytes()
        stream_buffer += data
        stream_counter += 1
    else:
        threading.Thread(target=main_recognize, args=(stream_buffer,)).start()
        # reset
        stream_buffer = bytes()
        stream_counter = 0

# define callback
def callback(in_data, frame_count, time_info, status):
    data = wf.readframes(frame_count)
    stream_audio(in_data)
    return (data, pyaudio.paContinue)

# open audio stream
stream = pa.open(format=pa.get_format_from_width(wf.getsampwidth()),
                 channels=wf.getnchannels(),
                 rate=wf.getframerate(),
                 output=True,
                 stream_callback=callback)

# start the stream
stream.start_stream()

# wait for stream to finish
while stream.is_active():
    time.sleep(0.1)

# stop stream
stream.stop_stream()
stream.close()
wf.close()

# close PyAudio
pa.terminate()
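
A possible explanation for the constant "Sphinx error", offered as a guess rather than a confirmed diagnosis: the callback passes in_data to stream_audio(), and for an output-only PyAudio stream that argument is expected to be None. The line data = bytes() then replaces whatever was passed with an empty byte string, so the buffer handed to Sphinx never contains any audio. The hard-coded rate of 48000 may also not match the file, whose name suggests 8 kHz. The sketch below shows only the buffering idea, using the standard library so it runs without PyAudio or PocketSphinx. The names ChunkBuffer, make_silent_wav and batch_wav are hypothetical and not part of any library.

```python
import io
import wave


class ChunkBuffer:
    """Collect raw PCM chunks and hand them off in batches (hypothetical helper)."""

    def __init__(self, chunks_per_batch, on_batch):
        self.chunks_per_batch = chunks_per_batch
        self.on_batch = on_batch  # called with the joined bytes of each batch
        self._chunks = []

    def add(self, data):
        # Keep the bytes that were actually read, instead of replacing them.
        self._chunks.append(bytes(data))
        if len(self._chunks) >= self.chunks_per_batch:
            self.flush()

    def flush(self):
        # Hand off whatever is buffered, including a short final batch.
        if self._chunks:
            batch = b"".join(self._chunks)
            self._chunks = []
            self.on_batch(batch)


def make_silent_wav(rate=8000, width=2, seconds=1):
    """Build a small in-memory mono WAV so the sketch needs no sample file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(width)
        out.setframerate(rate)
        out.writeframes(b"\x00" * (rate * width * seconds))
    buf.seek(0)
    return buf


def batch_wav(fileobj, frames_per_chunk=1024, chunks_per_batch=4):
    """Read a WAV in chunks and return (rate, sample_width, batches)."""
    batches = []
    with wave.open(fileobj, "rb") as wf:
        # Take the rate and sample width from the file, not from a constant.
        rate, width = wf.getframerate(), wf.getsampwidth()
        buffer = ChunkBuffer(chunks_per_batch, batches.append)
        while True:
            data = wf.readframes(frames_per_chunk)
            if not data:
                break
            buffer.add(data)
        buffer.flush()
    return rate, width, batches


if __name__ == "__main__":
    rate, width, batches = batch_wav(make_silent_wav())
    print(rate, width, [len(b) for b in batches])
```

Applied to the script above, this would mean passing data (the result of wf.readframes) to the buffer inside callback, and building each recognition request as sr.AudioData(batch, wf.getframerate(), wf.getsampwidth()) in the on_batch function, assuming the file is mono. Whether Sphinx then produces useful text for short batches is a separate question that this sketch does not address.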
