How to parse the value of tfrecords, convert it to something else, and put it back


I have a TFRecords file that unfortunately doesn't contain the label. Each record has two fields: Image and Id.

So, to get the label, I need to look up the Id in a pandas DataFrame to derive its value, and then create the label based on that value, e.g.:

if df.loc[df['id'] == Id, 'value'].iloc[0] > threshold_value:
    label = 1
else:
    label = 0

However, I don't know how to convert a Tensor("ParseSingleExample/ParseExample/ParseExampleV2:1", shape=(), dtype=string) into a Python string.

Here is the code I use to parse the TFRecords:

def parse_tf_records(example_input):
    feature_description_dict = {
        IMAGE_FIELD: tf.io.FixedLenFeature(IMAGE_SIZE, tf.float32),
        ID_FIELD: tf.io.FixedLenFeature([], tf.string)
    }
    parsed_example = tf.io.parse_single_example(example_input, feature_description_dict)
    return parsed_example

and

def read_tfrecord(example_input):
    parsed_example = parse_tf_records(example_input)
    image_data = parsed_example[IMAGE_FIELD]
    id_data = parsed_example[ID_FIELD]
    
    # label = Look for the value of id_data in a Pandas Dataframe and compare the value to threshold_value
    label_data = tf.cast(label, tf.int32)
    return image_data, label_data

I'm using TensorFlow 2.4.1. I'd really appreciate it if someone could help me with this. Thanks.


Nejla On

OK, tf.py_function is the answer. Here is my code, and it works beautifully:

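def get_label(tf_id):
    # Inside tf.py_function the argument is an eager tensor, so .numpy() is available.
    _id = tf_id.numpy().decode('utf-8')
    # Take the first matching row as a scalar; using a whole Series in `if` raises ValueError.
    value = df.loc[df['id'] == _id, 'value'].iloc[0]
    label = 1 if value > threshold_value else 0
    return tf.cast(label, tf.int32)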

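def read_tfrecord(example_input):
    parsed_example = parse_tf_records(example_input)
    image_data = parsed_example[IMAGE_FIELD]
    id_data = parsed_example[ID_FIELD]
    # tf.py_function runs get_label eagerly, so the id can be turned into a Python string there.
    label = tf.py_function(get_label, [id_data], tf.int32)
    # py_function drops static shape information; the label is a scalar.
    label.set_shape([])
    return image_data, label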
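
One possible refinement, offered as a sketch rather than as part of the original solution: filtering the DataFrame inside get_label scans the id column once per record. If the ids are unique, the labels can be computed once up front and kept in a plain dict, so the function passed to tf.py_function only has to decode the bytes and do a dictionary lookup. The names build_label_lookup and label_for_id, the sample rows, and the default of 0 for unknown ids are all hypothetical. In the real pipeline the pairs would come from something like zip(df['id'], df['value']), and raw_id would be tf_id.numpy().

```python
def build_label_lookup(id_value_pairs, threshold_value):
    """Map each id to 1 if its value exceeds the threshold, else 0."""
    return {_id: int(value > threshold_value) for _id, value in id_value_pairs}


def label_for_id(raw_id, lookup, default=0):
    """Decode the UTF-8 bytes that tf_id.numpy() yields and look up the label.

    Returning `default` for unknown ids is an assumption; raising an error
    may be preferable if every record is expected to have a match.
    """
    _id = raw_id.decode("utf-8")
    return lookup.get(_id, default)


# Hypothetical stand-in for zip(df['id'], df['value']).
rows = [("img_001", 0.9), ("img_002", 0.2)]
lookup = build_label_lookup(rows, threshold_value=0.5)

print(label_for_id(b"img_001", lookup))
print(label_for_id(b"img_002", lookup))
```

With this in place, get_label would reduce to calling label_for_id(tf_id.numpy(), lookup) and casting the result to tf.int32, as in the code above.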