I am working on creating a virtual mic that mirrors the speakers, essentially a loopback of the speaker output into a mic. I was able to create the mic, and when I look at pavucontrol it shows the mic producing the same waveform as the speakers. BUT when I connect it to a Python script and try every available input, it doesn't read or hear anything.
I'm not 100% sure I set up this virtual mic correctly.
Here is what I did.
I create the new virtual mic:
pactl load-module module-pipe-source source_name=virtual_mic file=/tmp/virtual_mic format=s16le rate=44100 channels=2
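As far as I understand, module-pipe-source only produces audio while something writes raw samples into the FIFO at /tmp/virtual_mic. Here is a rough sketch of pushing a test tone into it to check that the source actually carries sound (this assumes the s16le / 44100 Hz / stereo format from the command above):

import numpy as np

# One second of a 440 Hz test tone in the format the pipe source expects
# (s16le, 44100 Hz, 2 channels, interleaved).
rate = 44100
t = np.arange(rate) / rate
tone = (0.3 * np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16)
stereo = np.column_stack([tone, tone]).ravel()

# Writing the raw bytes into the FIFO should make the virtual mic "hear" the tone.
with open('/tmp/virtual_mic', 'wb') as fifo:
    fifo.write(stereo.tobytes())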
I get the index of the new mic:
pactl list sources
index: 2
name: <virtual_mic>
driver: <module-pipe-source.c>
module: 22
properties:
device.string = "/tmp/virtual_mic"
device.description = "Unix FIFO source /tmp/virtual_mic"
device.icon_name = "audio-input-microphone"
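(In case it helps, the same index can also be pulled out programmatically; I assume the tab-separated output of pactl list short sources is stable enough for this, where virtual_mic is the source_name I set above:)

import subprocess

# "pactl list short sources" prints one tab-separated line per source:
# index, name, driver, sample spec, state.
out = subprocess.run(['pactl', 'list', 'short', 'sources'], capture_output=True, text=True).stdout
for line in out.splitlines():
    fields = line.split('\t')
    if fields[1] == 'virtual_mic':
        print('virtual_mic has index', fields[0])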
I get the index of my speakers:
Sink #2
State: IDLE
Name: alsa_output.pci-0000_02_02.0.analog-stereo
Description: ES1371/ES1373 / Creative Labs CT2518 (Audio PCI 64V/128/5200 / Creative CT4810/CT5803/CT5806 [Sound Blaster PCI]) Analog Stereo
Driver: module-alsa-card.c
I create a loopback between the mic and speakers:
pactl load-module module-loopback source=virtual_mic sink=alsa_output.pci-0000_02_02.0.analog-stereo
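As a sanity check, I assume recording a couple of seconds directly from the virtual mic (with parec, if it is installed) should show non-zero samples whenever pavucontrol shows the source carrying a signal:

import subprocess
import numpy as np

# Record about two seconds of raw audio from the virtual mic and check it isn't silence.
proc = subprocess.Popen(
    ['parec', '-d', 'virtual_mic', '--format=s16le', '--rate=44100', '--channels=2'],
    stdout=subprocess.PIPE)
raw = proc.stdout.read(44100 * 2 * 2 * 2)  # 2 s * 2 channels * 2 bytes per sample
proc.kill()
samples = np.frombuffer(raw, dtype=np.int16)
print('max sample value:', int(np.abs(samples).max()) if samples.size else 0)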
I run this Python script to get all of my sources:
import pyaudio
import subprocess

def get_pulseaudio_sources():
    result = subprocess.run(['pacmd', 'list-sources'], capture_output=True, text=True)
    # Strip indentation and the '*' that marks the default source before matching
    lines = [line.strip().lstrip('* ') for line in result.stdout.split('\n')]
    # Extract source indexes and their device.description values
    source_indexes = [int(line.split(':', 1)[1]) for line in lines if line.startswith('index:')]
    source_names = [line.split('=', 1)[1].strip().strip('"') for line in lines if line.startswith('device.description')]
    return dict(zip(source_indexes, source_names))
audio = pyaudio.PyAudio()
inputdevice = 0
pulseaudio_sources = get_pulseaudio_sources()
print("\nStarting Audio Devices \n")
# Get all Audio Devices
for i in range(audio.get_device_count()):
    device_info = audio.get_device_info_by_index(i)
    device_name = device_info['name']
    device_index = device_info['index']
    print(f"Device {i}: {device_name}")
    print(f"  Index: {device_index}")
    for pulseaudio_index, pulseaudio_name in pulseaudio_sources.items():
        if pulseaudio_name in device_name:
            print(f"  Matches PulseAudio Source Index: {pulseaudio_index}")
            break
    print("-----")
This is what I get:
Starting Audio Devices
Device 0: Ensoniq AudioPCI: ES1371 DAC1 (hw:0,1)
Index: 0
-----
Device 1: pulse
Index: 1
-----
Device 2: default
Index: 2
-----
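From that list it looks like PortAudio only exposes the 'pulse' and 'default' wrappers rather than the individual PulseAudio sources, so my assumption (I'm not sure about this) is that the specific source has to be chosen on the PulseAudio side, e.g. by setting PULSE_SOURCE before the stream is opened:

import os

# Assumption: PulseAudio clients honor PULSE_SOURCE, so the 'pulse'/'default'
# device should then capture from virtual_mic instead of the hardware mic.
os.environ['PULSE_SOURCE'] = 'virtual_mic'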
When I try different indexes in the script below, it doesn't pick up any sound (I tried them all).
import sounddevice as sd
import numpy as np
from transformers import pipeline

# Load the model once up front; building the pipeline inside the callback would reload it for every audio block
recognizer = pipeline("automatic-speech-recognition", model="openai/whisper-medium")

def record_audio_callback(indata, frames, time, status):
    if status:
        print(status)
        # Process the audio data if needed
    else:
        print("inside record")
        try:
            # Convert the recorded audio block to text; passing the sampling rate lets the pipeline resample it
            result = recognizer({"raw": np.squeeze(indata), "sampling_rate": sample_rate})
            # Print the entire 'result' for debugging
            print("Full result:", result)
            # The pipeline returns a dict with a 'text' key
            if result and isinstance(result, dict) and result.get('text'):
                transcription = result['text']
                print("Transcription:", transcription)
            else:
                print("No valid transcription found in result")
        except Exception as e:
            print("Error during transcription: ", e)
# Set the audio parameters
channels = 1  # Mono audio
sample_rate = 44100
input_device = 1

# Start recording
with sd.InputStream(callback=record_audio_callback, channels=channels, samplerate=sample_rate, device=input_device):
    print("Inside Live Audio Listen")
    try:
        sd.sleep(20000)  # Record for 20 seconds (adjust as needed)
    except KeyboardInterrupt:
        print("Keyboard Stopped Recording")