PulseAudio Create Virtual Mic and use it in Python


I am working on creating a virtual mic that mirrors the speakers, essentially a loopback from the speakers to a mic. I was able to create this mic, and pavucontrol shows it carrying the same waveform as the speakers. But when I connect to it from a Python script, none of the available inputs reads or hears anything; I tried them all.

I'm not 100% sure I set up this virtual mic correctly.

Here is what I did.

I create the new virtual mic:

pactl load-module module-pipe-source source_name=virtual_mic file=/tmp/virtual_mic format=s16le rate=44100 channels=2
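As far as I understand it, module-pipe-source only produces sound when some process writes raw PCM into the FIFO, in the same format given on the command line. A minimal sketch for feeding it a test tone, assuming the s16le / 44100 Hz / 2-channel settings above; the function names `tone_s16le` and `feed_fifo` are made up for illustration:

```python
import math
import struct

RATE = 44100      # must match rate= passed to module-pipe-source
CHANNELS = 2      # must match channels=


def tone_s16le(freq_hz: float, seconds: float) -> bytes:
    """Return interleaved signed 16-bit little-endian PCM for a sine tone."""
    n_frames = int(RATE * seconds)
    out = bytearray()
    for n in range(n_frames):
        # Scale to 30% of full range to keep the tone quiet.
        sample = int(0.3 * 32767 * math.sin(2 * math.pi * freq_hz * n / RATE))
        # Write the same sample once per channel (interleaved frames).
        out += struct.pack("<h", sample) * CHANNELS
    return bytes(out)


def feed_fifo(path: str = "/tmp/virtual_mic", seconds: float = 2.0) -> None:
    """Write a 440 Hz test tone into the FIFO.

    Opening a FIFO for writing may block if nothing has it open for reading.
    """
    with open(path, "wb") as fifo:
        fifo.write(tone_s16le(440.0, seconds))
```

Calling `feed_fifo()` while a recorder is attached to virtual_mic should make the tone show up there, if the source is wired up the way I think it is.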

I get the index of the new mic:

pacmd list-sources

index: 2
    name: <virtual_mic>
    driver: <module-pipe-source.c>
    module: 22
    properties:
        device.string = "/tmp/virtual_mic"
        device.description = "Unix FIFO source /tmp/virtual_mic"
        device.icon_name = "audio-input-microphone"

I get the index of my speakers:

Sink #2
    State: IDLE
    Name: alsa_output.pci-0000_02_02.0.analog-stereo
    Description: ES1371/ES1373 / Creative Labs CT2518 (Audio PCI 64V/128/5200 / Creative CT4810/CT5803/CT5806 [Sound Blaster PCI]) Analog Stereo
    Driver: module-alsa-card.c

I create a loopback between the mic and speakers:

pactl load-module module-loopback source=virtual_mic sink=alsa_output.pci-0000_02_02.0.analog-stereo
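One thing worth checking: as far as I can tell, module-loopback sends audio from the source to the sink, so this command plays virtual_mic out of the speakers rather than copying the speakers into the mic. PulseAudio normally exposes what a sink is playing as a source named after the sink with ".monitor" appended. A minimal sketch for finding that source in the tab-separated output of `pactl list short sources`; the helper name and the sample listing are made up for illustration, not captured from my machine:

```python
from typing import Optional


def find_monitor_source(short_listing: str, sink_name: str) -> Optional[str]:
    """Return the monitor source name for sink_name, or None if not listed.

    short_listing is expected to be tab-separated text with the source
    name in the second column, as printed by `pactl list short sources`.
    """
    wanted = sink_name + ".monitor"
    for line in short_listing.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2 and fields[1] == wanted:
            return fields[1]
    return None


# Illustrative listing only (hypothetical indexes and states).
SAMPLE = (
    "1\talsa_output.pci-0000_02_02.0.analog-stereo.monitor"
    "\tmodule-alsa-card.c\ts16le 2ch 44100Hz\tIDLE\n"
    "2\tvirtual_mic\tmodule-pipe-source.c\ts16le 2ch 44100Hz\tIDLE\n"
)

print(find_monitor_source(SAMPLE, "alsa_output.pci-0000_02_02.0.analog-stereo"))
```

If the monitor source exists, recording from it (for example after making it the default source with `pactl set-default-source`) may be closer to "a mic that mirrors the speakers" than the pipe source is.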

I run this Python script to get all of my sources:

import pyaudio
import subprocess

def get_pulseaudio_sources():
    result = subprocess.run(['pacmd', 'list-sources'], capture_output=True, text=True)

    # pacmd indents its output, marks the default source with "*", and writes
    # properties as key = "value", so strip each line before matching.
    sources = {}
    current_index = None
    for line in result.stdout.splitlines():
        line = line.strip().lstrip('* ')
        if line.startswith('index:'):
            current_index = int(line.split(':', 1)[1])
        elif line.startswith('device.description') and current_index is not None:
            sources[current_index] = line.split('=', 1)[1].strip().strip('"')

    return sources

audio = pyaudio.PyAudio()
inputdevice = 0

pulseaudio_sources = get_pulseaudio_sources()

print("\nStarting Audio Devices \n")
# Get all Audio Devices
for i in range(audio.get_device_count()):
    device_info = audio.get_device_info_by_index(i)
    device_name = device_info['name']
    device_index = device_info['index']
    
    print(f"Device {i}: {device_name}" )
    print(f" Index: {device_index}")
    
    for pulseaudio_index, pulseaudio_name in pulseaudio_sources.items():
        if pulseaudio_name in device_name:
            print(f" Matches PulseAudio Source Index: {pulseaudio_index}")
            break
    print("-----")

This is what I get:

Starting Audio Devices

Device 0: Ensoniq AudioPCI: ES1371 DAC1 (hw:0,1)
 Index: 0
-----
Device 1: pulse
 Index: 1
-----
Device 2: default
 Index: 2
-----
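My understanding is that PyAudio lists ALSA devices, not PulseAudio sources, so virtual_mic never appears by name; the "pulse" and "default" entries go through PulseAudio and record from whatever its default source is. A minimal sketch for picking those entries out of the device list, assuming the dictionaries returned by `get_device_info_by_index`; the helper name and the `maxInputChannels` values in the sample are hypothetical:

```python
def pulse_backed_inputs(devices):
    """Return indexes of input-capable devices that route through PulseAudio."""
    return [
        d["index"]
        for d in devices
        if d.get("maxInputChannels", 0) > 0 and d["name"] in ("pulse", "default")
    ]


# Shaped like the listing above; channel counts are made up for illustration.
SAMPLE_DEVICES = [
    {"index": 0, "name": "Ensoniq AudioPCI: ES1371 DAC1 (hw:0,1)", "maxInputChannels": 0},
    {"index": 1, "name": "pulse", "maxInputChannels": 32},
    {"index": 2, "name": "default", "maxInputChannels": 32},
]

print(pulse_backed_inputs(SAMPLE_DEVICES))
```

If that is right, the device index matters less than which source PulseAudio treats as the default at the moment the stream is opened.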

When I try different device indexes in my code, it doesn't pick up any sound (I tried all of them).

import sounddevice as sd
import numpy as np
from transformers import pipeline

# Load the model once; building the pipeline inside the callback reloads it for every audio block
recognizer = pipeline("automatic-speech-recognition", model="openai/whisper-medium")

def record_audio_callback(indata, frames, time, status):
    if status:
        print(status)
    # Process the audio data if needed
    else:
        print("inside record")
        try:
            # Convert the recorded audio to text
            result = recognizer(np.squeeze(indata))

            # Print the entire 'result' for debugging
            print("Full result:", result)

            # Check if 'transcription' is present in the result
            #if 'transcription' in result[0]:
            #if result and isinstance(result, list) and result[0].get('transcription'):
            if result and isinstance(result, dict) and result.get('text'):
                #transcription = result[0]['transcription']
                transcription = result['text']
                print("Transcription:", transcription)
            else:
                print("No valid Transcription found in result")
        except Exception as e:
            print("Error during transcription: ", e)


# Set the audio parameters
channels = 1  # Mono audio
sample_rate = 44100
input_device = 1

# Start recording
with sd.InputStream(callback=record_audio_callback, channels=channels, samplerate=sample_rate, device=input_device):
    print("Inside Live Audio Listen")
    try:
        sd.sleep(20000)  # Record for 20 seconds (adjust as needed)
    except KeyboardInterrupt:
        print("Keyboard Stopped Recording")
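To separate "the stream delivers silence" from "the transcription step fails", it may help to print the signal level of each block before running any recognition. A minimal sketch using only the standard library; `rms`, `is_silent`, and the threshold value are made-up names and numbers for illustration, and the samples are assumed to be floats in the range -1.0 to 1.0:

```python
import math


def rms(samples) -> float:
    """Root-mean-square level of an iterable of float samples."""
    values = [float(s) for s in samples]
    if not values:
        return 0.0
    return math.sqrt(sum(v * v for v in values) / len(values))


def is_silent(samples, threshold: float = 1e-4) -> bool:
    """True when the block's RMS level is below an arbitrary threshold."""
    return rms(samples) < threshold
```

Inside the callback, something like `print(rms(indata[:, 0]))` should stay near zero if the selected device really is receiving nothing.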