Java Deeplearning4j image in wrong shape


I have a small problem with Java and machine learning. I trained a model with Keras, and it works as expected when I run predictions on images from Python.

The model was trained on inputs of shape [width, height, RGB].

But when I load an image in Java, I get [RGB, width, height]. I tried to use .reshape() to change the shape, but I am clearly messing something up, because all predictions are wrong afterwards:

ResizeImageTransform rit = new ResizeImageTransform(128, 128);
NativeImageLoader loader = new NativeImageLoader(128, 128, 3, rit);

INDArray features = loader.asMatrix(f); // GIVES ME A SHAPE OF 1, 3, 128, 128
features = features.reshape(1, 128, 128, 3); // GIVES ME THE SHAPE 1, 128, 128, 3 AS NEEDED

INDArray[] prediction = model.output(features); // all predictions wrong

I am not a Java developer and I am trying to get along with the documentation, but I am clearly overlooking something here. Maybe someone can give me a tip about what I am doing wrong.
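A likely cause, sketched in plain Java rather than ND4J: a reshape only reinterprets the existing buffer under a new shape and leaves the element order untouched, whereas moving the channel axis to the end needs the values themselves to be reordered (a permutation of the axes). The class and method names below are made up for illustration, and the tiny 2x2 "image" stands in for the 128x128 one.

```java
// Plain-Java illustration (no ND4J): reshape vs. permute on a flat buffer.
final class LayoutDemo {

    // A reshape keeps the buffer as it is; only the shape metadata changes.
    static float[] reshapeOnly(float[] chw) {
        return chw.clone();
    }

    // A permute from [C, H, W] to [H, W, C] moves every value to its new slot.
    static float[] chwToHwc(float[] chw, int c, int h, int w) {
        float[] hwc = new float[chw.length];
        for (int ch = 0; ch < c; ch++) {
            for (int y = 0; y < h; y++) {
                for (int x = 0; x < w; x++) {
                    hwc[(y * w + x) * c + ch] = chw[(ch * h + y) * w + x];
                }
            }
        }
        return hwc;
    }

    public static void main(String[] args) {
        // Channel-first storage: all of channel 0, then channel 1, then channel 2.
        float[] chw = {
            10, 11, 12, 13,   // channel 0
            20, 21, 22, 23,   // channel 1
            30, 31, 32, 33    // channel 2
        };
        System.out.println(java.util.Arrays.toString(reshapeOnly(chw)));
        System.out.println(java.util.Arrays.toString(chwToHwc(chw, 3, 2, 2)));
    }
}
```

Read as [height, width, channel], the first "pixel" of the reshaped buffer is (10, 11, 12), i.e. three neighbouring values from channel 0, while the permuted buffer gives (10, 20, 30), one value from each channel. That would explain why the shape looks right after .reshape() but the predictions do not.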

Markus Bauer On

So now at least 136 images of my test set get flagged. The Python version flags 195 images.

So I guess normalisation is the problem. I train the model with:

train = ImageDataGenerator(rotation_range=5, horizontal_flip=True, vertical_flip=True, rescale=1/255)

And I use

X *= 1/255

before the prediction in the test script.

In Java I use

features = features.permute(0, 2, 3, 1);
DataNormalization scalar = new ImagePreProcessingScaler(0, 1);
scalar.transform(features);

but I am not sure whether the normalisation is the issue or whether I have screwed up the parameters for .permute().

Any suggestions?
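On the normalisation question: assuming ImagePreProcessingScaler(0, 1) applies ordinary min-max scaling to 8-bit pixel values, i.e. pixel / 255 * (max - min) + min, it should give the same numbers as rescale=1/255 in Keras. A plain-Java sketch of that assumed formula (the class and method names are hypothetical and not part of DL4J):

```java
// Plain-Java sketch of min-max pixel scaling; not DL4J code.
final class PixelScaling {

    // Assumed behaviour of a (min, max) scaler on 8-bit pixel values.
    static double scale(double pixel, double min, double max) {
        return pixel / 255.0 * (max - min) + min;
    }

    // Keras-style rescale=1/255.
    static double rescale(double pixel) {
        return pixel * (1.0 / 255.0);
    }

    public static void main(String[] args) {
        for (double pixel : new double[] {0, 128, 236, 255}) {
            System.out.println(pixel + " -> " + scale(pixel, 0, 1) + " vs " + rescale(pixel));
        }
    }
}
```

If that assumption holds, both pipelines scale into [0, 1] in the same way, so the remaining difference is more likely to come from the axis or channel order than from the scaler.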

Markus Bauer On

This is how the model is trained:

import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator 
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications import ResNet50V2
from tensorflow.keras.applications import MobileNetV2

from tensorflow.keras.applications import ResNet152V2

# GENERAL WIDTH AND HEIGHT FOR THE IMAGES
WIDTH = 128
HEIGHT = 128

train = ImageDataGenerator(rotation_range=5, horizontal_flip=True, vertical_flip=True, rescale=1/255) 
valid = ImageDataGenerator(rescale=1/255)

train_set = train.flow_from_directory('images_train/', target_size=(WIDTH,HEIGHT), batch_size=64, class_mode='categorical')
test_set  = valid.flow_from_directory('images_test/', target_size=(WIDTH,HEIGHT), batch_size=64, class_mode='categorical')

resnet = ResNet50V2(include_top=False, weights='imagenet', input_shape=(WIDTH,HEIGHT,3))  
    
for layer in resnet.layers:
    layer.trainable = False

x = tf.keras.layers.Flatten()(resnet.output)
x = tf.keras.layers.Dense(512, activation='relu')(x)
n_classes = len(train_set.class_indices)
predictions = tf.keras.layers.Dense(n_classes, activation='softmax')(x)

model = tf.keras.Model(inputs=resnet.input, outputs=predictions)
model.compile(loss='categorical_crossentropy', optimizer="adam", metrics=['accuracy'])

hist = model.fit(train_set, epochs=20, validation_data=test_set)

model.save('resnet50v2.h5')

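This is the code I use to test images in Python:

import os
import numpy as np
from PIL import Image

# base_path (root folder of the test images) and model (the loaded Keras model) are defined elsewhere
ctr = 0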
for root, dirs, files in os.walk(base_path):
    for name in files:
        image_path = os.path.join(root, name)
        
        tmp = image_path.lower().split(".")
        if tmp[-1] in ["jpg", "jpeg", "png", "bmp"]:
            orig_image = Image.open(image_path)
            if orig_image.mode != "RGB":
                orig_image = orig_image.convert("RGB")
                
            image = orig_image.resize((128, 128))
            
            X = []
            X.append(np.array(image.getdata()).reshape((128,128,3)))
            X = np.array(X).astype('float64')
            X *= 1/255

            # PREDICT AND WRITE REPORT
            pred = model.predict(X)
            pred = np.rint(pred).astype("int32")

            if(pred[0][1] != 1):
                ctr += 1
                print(f"{ctr} :: {image_path} == {pred[0]}")

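And this is the code I use to test images in Java:

import java.io.File;
import java.io.IOException;
import org.datavec.image.loader.NativeImageLoader;
import org.datavec.image.transform.ResizeImageTransform;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.api.preprocessor.DataNormalization;
import org.nd4j.linalg.dataset.api.preprocessor.ImagePreProcessingScaler;

// model and listOfFiles are defined elsewhere
int ctr = 0;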
for (File f : listOfFiles) {
    if (f.isFile()) {
        ResizeImageTransform rit = new ResizeImageTransform(128, 128);
        NativeImageLoader loader = new NativeImageLoader(128, 128, 3, rit);
        INDArray features = null;
        try{
            features = loader.asMatrix(f); // GIVES ME A SHAPE OF 1, 3, 128, 128
        } 
        catch(IOException ex){
            continue;
        }
        features = features.permute(0, 2, 3, 1);
        DataNormalization scalar = new ImagePreProcessingScaler(0, 1);
        scalar.transform(features);

        INDArray[] prediction = model.output(features);

        // Get Class
        double pred[] = prediction[0].toDoubleVector();
        int predClass = 0;
        for(int i = 0; i < pred.length; i++){
            predClass = pred[i] > pred[predClass] ? i : predClass;
        }

        if(predClass != 1){
            ctr++;
            System.out.println(f.getName());
            System.out.println(ctr + ") PORN FOUND :: " + predClass);
        }
    }
}
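One difference between the two test loops that has nothing to do with the image layout: the Python loop rounds the softmax output and flags an image unless class 1 rounds to 1, while the Java loop flags it unless class 1 is the argmax. With two classes these rules agree (apart from exact ties), but if the model has three or more classes they can diverge, because the winning class may have a probability below 0.5. The number of classes is not stated here, so this is only a possible contributor to the different counts. A plain-Java sketch of both rules (names are hypothetical):

```java
// Plain-Java comparison of the two decision rules used in the test loops.
final class DecisionRules {

    // Rule of the Python loop: flag unless the rounded probability of class 1 is 1.
    static boolean flaggedByRounding(double[] probs) {
        return Math.rint(probs[1]) != 1.0;
    }

    // Rule of the Java loop: flag unless class 1 has the highest probability.
    static boolean flaggedByArgmax(double[] probs) {
        int best = 0;
        for (int i = 1; i < probs.length; i++) {
            if (probs[i] > probs[best]) {
                best = i;
            }
        }
        return best != 1;
    }

    public static void main(String[] args) {
        double[] threeClasses = {0.3, 0.4, 0.3};
        System.out.println(flaggedByRounding(threeClasses)); // class 1 rounds to 0
        System.out.println(flaggedByArgmax(threeClasses));   // class 1 is still the argmax
    }
}
```

Using the same rule on both sides (for example argmax in Python as well) would make the two counts directly comparable.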
Markus Bauer On

Python

DSC_3767.jpg
    A          B          C
[[[[0.9254902  0.88627451 0.87843137]
   [0.9254902  0.88627451 0.87843137]
   [0.9254902  0.88627451 0.87843137]
   ...

Java

DSC_3767.jpg
        C          B          A
[[[[    0.8784,    0.8863,    0.9255], 
   [    0.8784,    0.8863,    0.9255], 
   [    0.8784,    0.8863,    0.9255],
   ...

All I have to do is swap C and A, and the model will work fine. I just don't know how to do that.

Markus Bauer On

I mean I can fix the issue with

for (int y = 0; y < 128; y++) {
    for (int x = 0; x < 128; x++) {
        double a = features.getDouble(0,y,x,0);
        double b = features.getDouble(0,y,x,2);
        features.putScalar(new int[] {0,y,x,0}, b);
        features.putScalar(new int[] {0,y,x,2}, a);
    }
}

... but there must be a nicer / better solution.
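The printouts suggest that the loader delivers the channels in BGR order while the Keras model was trained on RGB. As far as I know, NativeImageLoader is backed by OpenCV, which decodes images to BGR, so that would fit. The loop above can be generalised to reverse the channel axis of any interleaved [height, width, channel] buffer in a single pass. This is a plain-Java sketch, not ND4J code, and the class and method names are made up for illustration:

```java
// Plain-Java sketch: reverse the channel order (e.g. BGR -> RGB) in place.
final class ChannelOrder {

    // hwc holds pixels back to back, each with `channels` consecutive values.
    static void reverseChannels(float[] hwc, int channels) {
        for (int start = 0; start + channels <= hwc.length; start += channels) {
            for (int lo = start, hi = start + channels - 1; lo < hi; lo++, hi--) {
                float tmp = hwc[lo];
                hwc[lo] = hwc[hi];
                hwc[hi] = tmp;
            }
        }
    }

    public static void main(String[] args) {
        // Two pixels in BGR order, the first one taken from the Java printout above.
        float[] pixels = {0.8784f, 0.8863f, 0.9255f, 0.1f, 0.2f, 0.3f};
        reverseChannels(pixels, 3);
        System.out.println(java.util.Arrays.toString(pixels));
    }
}
```

After the call, the first pixel reads 0.9255, 0.8863, 0.8784, which is the order the Python side prints. The same idea applies to the last axis of the permuted INDArray; whether the library offers a ready-made colour conversion for this is worth checking in its documentation.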