I am trying to feed a TFLite model with real-time audio data in Android.

This is the logic of my program:

  1. create an AudioRecord instance
  2. create a TensorAudio object from the Android AudioFormat
  3. load the audio samples from the AudioRecord (inside a Timer(); see the sizing sketch after this list)
  4. create a TensorBuffer object for the input tensor
  5. create a TensorBuffer object for the output tensor
  6. call interpreter.run(inputTensorBuffer, outputTensorBuffer) for inference
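
Steps 2 and 3 only work if the TensorAudio buffer is big enough to hold one full model input; here is a quick sizing sketch using the numbers from my model's [256, 20] input:

    // One inference consumes 256 * 20 = 5120 float samples.
    // At the 16 kHz sample rate that is 5120 / 16000.0 = 0.32 s of audio,
    // so polling every 500 ms leaves enough time to refill the buffer.
    val samplesPerInference = 256 * 20                       // 5120
    val secondsPerInference = samplesPerInference / 16000.0  // 0.32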

The problem is step 4. The official TensorFlow website instructs that I should call tensorAudio.getTensorBuffer(). My TensorFlow Lite model needs input of shape [256, 20] (256 frames of 20 samples each), but the data coming from AudioRecord is flat (for example, shape [5120], which is exactly 256 × 20).
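
To double-check what the model actually expects, the input tensor can be inspected at runtime. A minimal sketch (index 0 assumes a single-input model; interpreter and tensorAudio are the objects created in the code below):

    val inputTensor = interpreter.getInputTensor(0)
    Log.d("TFLite", "model expects: ${inputTensor.shape().contentToString()}")            // [256, 20]
    Log.d("TFLite", "audio buffer:  ${tensorAudio.tensorBuffer.shape.contentToString()}") // flat, e.g. [5120]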

Is there any way to reshape the [5120] data into [256, 20] in TensorAudio or TensorBuffer?

override fun onCreate(savedInstanceState: Bundle?) {

        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)

        val REQUEST_RECORD_AUDIO = 1337
        // Note: the result arrives asynchronously; the AudioRecord below
        // assumes RECORD_AUDIO has already been granted
        requestPermissions(arrayOf(Manifest.permission.RECORD_AUDIO), REQUEST_RECORD_AUDIO)

        textView = findViewById<TextView>(R.id.output)
        val recorderSpecsTextView = findViewById<TextView>(R.id.textViewAudioRecorderSpecs)

        val SAMPLE_RATE = 16000
        //AUDIO_SOURCE = 1
        //CHANNEL_CONFIG = 1
        val MODEL_INPUT_LENGTH = 20 //L: Length of the filters (in samples)
        val MODEL_PATH = "conv_tasnet_39nx_7_float16.tflite"

        val bufferSizeInBytes = AudioRecord.getMinBufferSize(
            SAMPLE_RATE,
            AudioFormat.CHANNEL_IN_MONO,
            AudioFormat.ENCODING_PCM_FLOAT)

        // 1. make AudioRecord instance
        val audioRecord = AudioRecord(
            MediaRecorder.AudioSource.MIC,
            SAMPLE_RATE,
            AudioFormat.CHANNEL_IN_MONO,
            AudioFormat.ENCODING_PCM_FLOAT,
            bufferSizeInBytes)

        //2. make TensorAudio object as Android AudioFormat
        // The buffer must hold one full model input (256 * 20 = 5120 samples),
        // not just MODEL_INPUT_LENGTH samples
        val tensorAudio = TensorAudio.create(audioRecord.format, 256 * MODEL_INPUT_LENGTH)

        // A model bundled in assets/ cannot be opened with File(); map it instead
        val interpreter = Interpreter(FileUtil.loadMappedFile(this, MODEL_PATH))

        audioRecord.startRecording()

        //tensor spec for model input/output
        val inputShape = intArrayOf(256, 20)
        val outputShape = intArrayOf(256, 2, 20)

        Timer().scheduleAtFixedRate(1L, 500L) {
            //3. load all audio samples in audioRecord
            tensorAudio.load(audioRecord)

            //4. make TensorBuffer object for input tensor,
            //   reshaped from the flat audio samples to [256, 20]
            val inputTensorBuffer = convertToTensorBuffer(tensorAudio.tensorBuffer)

            //5. make TensorBuffer object for output tensor
            val outputTensorBuffer = TensorBuffer.createFixedSize(outputShape, DataType.FLOAT32)

            //6. run inference - Interpreter.run() takes the raw ByteBuffers,
            //   not the TensorBuffer wrappers
            interpreter.run(inputTensorBuffer.buffer, outputTensorBuffer.buffer)
        }
    }

    private fun convertToTensorBuffer(samples: TensorBuffer): TensorBuffer {
        // Reshape the flat [5120] audio buffer into [256, 20]
        val numFilters = 256
        val numSamplesPerFilter = 20

        val tensorShape = intArrayOf(numFilters, numSamplesPerFilter)
        val tensorBuffer = TensorBuffer.createFixedSize(tensorShape, DataType.FLOAT32)

        // TensorBuffer has no sliceArray, but the samples are already laid out
        // contiguously in row-major order, so loading the flat FloatArray with
        // the target shape is the whole reshape - no per-row copying needed
        tensorBuffer.loadArray(samples.floatArray, tensorShape)
        return tensorBuffer
    }

I originally tried to reshape the TensorBuffer by slicing the array (in the convertToTensorBuffer method), but TensorBuffer doesn't support the sliceArray method, so the method above loads the flat floatArray together with the target [256, 20] shape instead. Is there a built-in reshape in TensorAudio or TensorBuffer that would avoid this?
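
For cases where individual 20-sample frames are actually needed (rather than a pure reshape), sliceArray does work on the plain FloatArray pulled out of the TensorBuffer. A small sketch under the same [256, 20] assumption (toFrames is a hypothetical helper, not part of the support library):

    // FloatArray supports sliceArray even though TensorBuffer does not
    fun toFrames(samples: TensorBuffer): List<FloatArray> {
        val flat = samples.floatArray        // 5120 samples here
        return (0 until 256).map { i ->
            flat.sliceArray(i * 20 until (i + 1) * 20)
        }
    }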
