How to convert model.tflite to model.cc and model.h on Windows 10

3.2k Views Asked by At

I have created a TensorFlow Lite .tflite model which I plan to use on a microcontroller. However, this file must be converted to a C source file, i.e, a TensorFlow Lite for Microcontrollers model. TensorFlow documentation provides a simple way to convert to a C array with the unix command xxd. I am using Windows 10 and do not have access to the unix command and there are no alternative Windows methods documented. After searching superuser, I saw that xxd for Windows now exists. I downloaded the command and ran it on my .tflite model. The results were different than the hello world example.

First, the hello world example model.h file has a comment that say it was "Automatically created from a TensorFlow Lite flatbuffer using the command: xxd -i model.tflite > model.cc" When I ran the command, model.h was not "automatically created".

Second, comparing the model.cc file from the hello world example, with the model.cc file that I generated, they are quite different and I'm not sure how to interpret this (I'm not referring to the differences in the actual array). Again, in the example model.cc file, it states that it was "automatically created" using the xxd command. Line 28 in the example is alignas(8) const unsigned char g_model[] = { and line 237 is const int g_model_len = 2488;. In comparison, the equivalent lines in the file I generated are unsigned char _________g_model[] = { and unsigned int _________g_model_len = 4009981;

While I am not a C expert, I am not sure how to interpret the differences in the files and if I have generated the model.cc file incorrectly. I would greatly appreciate any insight or guidance here on how to properly generate both the model.h and model.cc files from the original model.tflite file.

2

There are 2 best solutions below

8
the busybee On

After doing some experiments, I think this is why you are getting differences:

  1. xxd replaces any non-letter/non-digit character of the path to the input file by an underscore ('_'). Apparently you called xxd with a path for the input file that has 9 such leading characters, perhaps something like "../../../g.model". The syntax of C allows only letters (a to z, A to Z), digits (0 to 9) and underscore as characters of objects' names, and the names need to start with a non-digit. This is the only "manipulation" xxd does to the name of an input file.

  2. Since xxd knows nothing about TensorFlow, it could not had generated the copyright notice. Using this as indication, any other difference had been inserted by other means by the TensorFlow authors, despite the statement "Automatically created from a TensorFlow Lite flatbuffer ...". This could be done manually or by a script, unfortunately I did not find any hint in some quick research on their repository. Apparently the statement means just the data values.

So you need to edit your result:

  1. Add any comment you see fit.

  2. Add the compiler-specific alignas(8) to the array, if your compiler supports it.

  3. Add the keywords const to the array and the length variable. This will tell the compiler to prohibit any write access. And probably this will place the data in read-only memory.

  4. Rename array and length variables to g_model and g_model_len, respectively. Most probably TensorFlow expects these names.

  5. Copy "model.cc" into "model.h", and then apply more editions, as the example demonstrated.

Don't be bothered by different values. Different contents of the model's file are the reason. It's especially simple to check the length variable, it has to have exactly the same value as the size of the input file.

EDIT:

On line 28 which is this text alignas(8) const unsigned char as shown in the example converted model. When I attempt to convert a model (whether it's my custom model or the "hello_world.tflite" example model) the text that would be on line 28 is unsigned char (any other text on that line is not in question). How is line 28 edited & explained?

Concerning the "how": I firmly believe that the authors of TensorFlow literally used an editor (an IDE or a stand-alone program like Notepad++ or Geany) and edited the line, or used some script to automate this.

The reason for alignas(8) is most probably that TensorFlow expects the data with an alignment of 8 bytes, for example because it casts the byte array to a structure that contains values of 8 bytes width.

The insertion of const will also commonly locate the model in read-only memory, which is preferable on most microcontrollers. If it were left out, the model's data were not only writable, but would be located in precious RAM.

On line 237, the text specifically is const int. When I attempt to convert a model (whether it's my custom model or the "hello_world.tflite" example model) the text that would be on line 237 is unsigned int (any other text on that line is not in question). Why are these two lines different in these specific places? It makes me believe that xxd on Windows is not functioning the same?

Again, I firmly believe this was edited manually or by a script. TensorFlow might expect this variable to be of data type int, but any xxd I tried (Windows and Linux) generates unsigned int. I don't think that your specific version of xxd functions differently on Windows.

For const the same thoughts apply as above.

Finally, when I attempt to convert the example model "hello_world.tflite" file using the xxd for windows utility, my resulting array doesn't match the example "hello_world.cc" file. I would expect the array values to be identical if the xxd worked. The last question is how to generate the "model.h" and "model.cc" files on Windows.

Did you note that the model you link is in another branch of the repository?

If I use the branch on GitHub as in your link to "hello_world.cc", I find in "../train/README.md" this archive hello_world_2020_12_28.zip. I unpacked it and ran xxd on the included "model.tflite". The result's data match the included "model.cc" in the archive. But it does not match the data of "hello_world.cc" in the same branch that you linked. The difference is already there.

My conclusion is, that the example result was not generated from the example model. This happens, since developers sometimes don't pay enough attention on what they commit. Yes, it's unfortunate, as it irritates and frustrates beginners like you.

But, as I wrote, don't let this make you headaches. Try the simple example, use the documentation as instructions on the process. Look at the differences in specific data as a quirk. You will encounter such things time after time when working with other's projects. It is quite normal.

2
Alphin Thomas On

I was also having this same requirement to convert .tflite model file to model.h file. In Linux, xxd -i model.tflite > model.h is available to convert the file. For Windows, xxd command was available to build but it doesn't give an expected output for creating the model. So I've written a python code to make model.h file which the input is given as .tflite file

import numpy as np
import os

def convert_tflite_to_header(tflite_path, output_header_path):

with open(tflite_path, 'rb') as tflite_file:
    tflite_content = tflite_file.read()


hex_lines = [', '.join([f'0x{byte:02x}' for byte in tflite_content[i:i+12]]) for i in range(0, len(tflite_content), 12)]


hex_array = ',\n  '.join(hex_lines)


with open(output_header_path, 'w') as header_file:
    
    header_file.write('const unsigned char model[] = {\n  ')
    header_file.write(f'{hex_array}\n')
    header_file.write('};\n\n')
    
if __name__ == "__main__":
    tflite_path = 'model.tflite'
    output_header_path = 'model.h'

    convert_tflite_to_header(tflite_path, output_header_path)
    convert_tflite_to_header(tflite_path, output_header_path)

The input is given as the model.tflite file

You can run the above given python script to generate the model.h file

Ensure that you have installed all the dependencies to run the above script

  • Python(3.8 or higher)
  • Numpy
  • os