Getting robotic and chopped audio when using EZOutput

I am using EZOutput from EZAudio to play audio that I receive as a stream (self.myAudioBufferList). Think LiveListen. I also have the stream's AudioStreamBasicDescription (self.asbd!). However, when I implement the shouldFill method of the EZOutput data source, I get a robotic version of the sound that is choppy, with some of it missing.

func output(_ output: EZOutput!, shouldFill audioBufferList: UnsafeMutablePointer<AudioBufferList>!, withNumberOfFrames frames: UInt32, timestamp: UnsafePointer<AudioTimeStamp>!) -> OSStatus {
    
    if self.asbd != nil {
        output.inputFormat = self.asbd!
    }
    
    if self.myAudioBufferList != nil {
        audioBufferList.pointee.mNumberBuffers = self.myAudioBufferList!.mNumberBuffers
        audioBufferList.pointee.mBuffers.mNumberChannels = self.myAudioBufferList!.mBuffers.mNumberChannels
        audioBufferList.pointee.mBuffers.mData = self.myAudioBufferList!.mBuffers.mData
    } else {
        print("it is nil, and that is why it is making low noise")
        audioBufferList.pointee.mBuffers.mData = nil
    }
    
    return noErr;
}

Also, when I set the audioBufferList's mDataByteSize, the quality gets worse. In its current configuration, the mDataByteSize is 4096, whereas that of myAudioBufferList is not constant and fluctuates, never going over 512.

audioBufferList.pointee.mBuffers.mDataByteSize = self.myAudioBufferList!.mBuffers.mDataByteSize

The frames value that the datasource method gives has a value of 1024.

Note: I have also tried providing the inputFormat when creating the EZOutput instance; however, this does nothing to improve the sound.

// Assign a delegate and datasource to the shared instance of the output to provide the output audio data
self.ezOutput = EZOutput.init(dataSource: self, inputFormat: asbd)
self.ezOutput?.delegate = self
There is 1 best solution below

Rob Napier

I'm not particularly familiar with EZAudio, but the callback seems to be requesting a specific number of frames, and you seem to be ignoring that. If it asks for 1024 frames, you must fill precisely 1024 frames before returning (unless you're at the end of data, in which case you'd pad and post EZAudioPlayerDidReachEndOfFileNotification). If you don't have 1024 frames, then you'll either need to block until you do, signal an underrun (not sure how EZAudio does this), or fabricate "appropriate" data to fill (for example, Packet Loss Concealment). If you're getting data in blocks of irregular sizes, then you may need to buffer a bit.
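To make the "buffer a bit" suggestion concrete, here is a minimal sketch of a FIFO that accepts incoming blocks of any size and always hands the callback exactly the number of frames it asks for, padding with silence on an underrun. SampleFIFO and its methods are hypothetical names, not part of EZAudio. The sketch assumes mono Float32 samples (one sample per frame); the 4096-byte buffer for 1024 frames reported in the question works out to 4 bytes per frame, which is consistent with that but not confirmed. By the same arithmetic, an incoming block of at most 512 bytes holds at most 128 frames, far fewer than one callback needs.

```swift
import Foundation

/// Hypothetical FIFO of mono Float32 samples (one sample per frame).
/// Not part of EZAudio; the sample format is an assumption.
final class SampleFIFO {
    private var samples: [Float] = []
    private let lock = NSLock()

    /// Called from the stream side with a block of any size.
    func write(_ block: [Float]) {
        lock.lock(); defer { lock.unlock() }
        samples.append(contentsOf: block)
    }

    /// Called from the output callback. Always writes exactly `frames`
    /// samples to `destination` and returns how many were real data;
    /// any shortfall is padded with zeros (silence).
    @discardableResult
    func read(into destination: UnsafeMutablePointer<Float>, frames: Int) -> Int {
        lock.lock(); defer { lock.unlock() }
        let available = min(frames, samples.count)
        // Copy the oldest buffered samples first so playback order is preserved.
        for i in 0..<available {
            destination[i] = samples[i]
        }
        samples.removeFirst(available)
        // Underrun: fill the remainder with silence.
        for i in available..<frames {
            destination[i] = 0
        }
        return available
    }
}
```

Under these assumptions, the shouldFill callback would copy into the memory that audioBufferList.pointee.mBuffers.mData already points to (rather than replacing that pointer), call read(into:frames:) with the requested frame count, and leave mDataByteSize at frames times 4 bytes. Locking and array resizing are not ideal on a real-time audio thread, so production code would more likely use a preallocated lock-free ring buffer. Waiting until a few callbacks' worth of data is buffered before starting playback should also make underruns less frequent.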

"Robotic" or other types of noise are very often due to mismatched waveforms. (This translates into high-frequency noise.) In this code, that can be due to dropping good data or due to inserting zeros for missing data.