I am wondering how to find the mean and standard deviation of, let's say, 3 images. This is to be used as input to the Normalize function in PyTorch (from torchvision.transforms import Normalize).
In the particular dataset I work with, the 3 color channels are in separate tif files. As it is just a repetition, I will show the calculations for the red band only.
Approach 1
I load each image as a 1x120x120 tensor, find the mean of its red channel, and append it to a list that keeps track of the per-image means (means across pixels) for the 3 images. Then, to find the mean of the dataset for the red channel, I take the mean of that list (the mean across images). Computing the standard deviation would follow the same process.
import os
import numpy as np
from PIL import Image
from torchvision.transforms import ToTensor

def get_mean_std(root: str):
    """
    Finds the mean and standard deviation of channels in a dataset
    Inputs
    - root : Path to root directory of dataset
    """
    rb_list = []
    gb_list = []
    bb_list = []
    for data_folder in os.listdir(root)[:3]:
        # Path to the folder containing 12 tif files and a json file
        data_folder_pth = os.path.join(root, data_folder)
        # Paths to RGB channels | rb refers to red band ...
        rb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B04.tif")][0])
        gb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B03.tif")][0])
        bb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B02.tif")][0])
        # Open each image and convert to tensor
        rb = ToTensor()(Image.open(rb_pth)).float()  # (1, 120, 120)
        gb = ToTensor()(Image.open(gb_pth)).float()  # (1, 120, 120)
        bb = ToTensor()(Image.open(bb_pth)).float()  # (1, 120, 120)
        # Mean over all pixels of this image's red band
        rb_list.append(rb.mean().item())
    mean_of_3_images = np.array(rb_list).mean()
    print(f"rb_list : {rb_list}")
    print(f"mean of red channel : {mean_of_3_images}")
# output
>>> rb_list : [281.01361083984375, 266.2029113769531, 1977.7083740234375]
>>> mean of red channel : 841.6416320800781
Approach 2
Following this post (https://saturncloud.io/blog/how-to-normalize-image-dataset-using-pytorch/#step-2-calculate-the-mean-and-standard-deviation-of-the-dataset), but amended to work with this dataset. Here the author keeps a running count of all the pixels and a running mean, and then divides the mean by the number of pixels.
But the results I get from the two methods are different.
def get_mean_std(root: str):
    """
    Finds the mean and standard deviation of channels in a dataset
    Inputs
    - root : Path to root directory of dataset
    """
    mean = 0
    num_pixels = 0
    for data_folder in os.listdir(root)[:3]:
        # Path to the folder containing 12 tif files and a json file
        data_folder_pth = os.path.join(root, data_folder)
        # Paths to RGB channels | rb refers to red band ...
        rb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B04.tif")][0])
        gb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B03.tif")][0])
        bb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B02.tif")][0])
        # Open each image and convert to tensor
        rb = ToTensor()(Image.open(rb_pth)).float()  # (1, 120, 120)
        gb = ToTensor()(Image.open(gb_pth)).float()  # (1, 120, 120)
        bb = ToTensor()(Image.open(bb_pth)).float()  # (1, 120, 120)
        batch, height, width = rb.shape  # (1, 120, 120)
        num_pixels += batch * height * width
        mean += rb.mean().sum()
    print(mean)
    print(mean / num_pixels)
# Output
>>> tensor(2524.9248)
>>> tensor(0.0584)
I am wondering why the values are so different. Any idea which of my methods is incorrect?
Just to give some idea of the values of the red band inside the 3 images ...
tensor([[[322., 275., 262., ..., 260., 225., 268.],
[283., 271., 259., ..., 277., 269., 278.],
[302., 303., 276., ..., 305., 279., 283.],
...,
[398., 341., 374., ..., 246., 273., 227.],
[383., 351., 375., ..., 266., 277., 260.],
[353., 347., 359., ..., 280., 260., 227.]]])
tensor([[[153., 214., 242., ..., 825., 575., 399.],
[206., 223., 198., ..., 766., 507., 477.],
[219., 256., 189., ..., 593., 365., 384.],
...,
[138., 255., 329., ..., 227., 289., 334.],
[174., 215., 276., ..., 402., 395., 350.],
[216., 212., 214., ..., 354., 362., 312.]]])
tensor([[[1727., 1852., 1184., ..., 3494., 3539., 3374.],
[1882., 1868., 1307., ..., 3523., 3443., 3278.],
[1716., 1975., 1919., ..., 3280., 3319., 3121.],
...,
[2199., 2214., 2269., ..., 2563., 2284., 2147.],
[2181., 2213., 2312., ..., 2686., 2668., 2737.],
[2208., 2297., 2351., ..., 2647., 2904., 3008.]]])
The link you mentioned implements the mean and std calculation incorrectly; your first implementation is the right way. In your second version, mean accumulates the per-image means (rb.mean().sum() is just rb.mean() for a single image), so after 3 images it holds 281.0 + 266.2 + 1977.7 ≈ 2524.9. Dividing that by the total pixel count (3 × 120 × 120 = 43,200) divides by the pixel count a second time, which is why you get 0.0584. If you want the running-count style, accumulate the pixel sum rb.sum() instead, and divide by num_pixels once at the end. Since all your images have the same size (120 × 120), averaging the per-image means, as in approach 1, also gives the correct dataset mean.
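A minimal sketch of the corrected running-sum computation, using synthetic tensors in place of the tif files (the shapes match your data, but the values and seed are made up for illustration). It shows that summing raw pixel values and dividing by the pixel count agrees with the mean of per-image means when every image has the same size, and it extends the same running sums to the standard deviation via E[x²] − E[x]²:

```python
import math
import torch

# Synthetic stand-ins for the three red-band tensors, same (1, 120, 120)
# shape as the real tif files; float64 keeps the comparison numerically tight
torch.manual_seed(0)
images = [torch.rand(1, 120, 120, dtype=torch.float64) * 3000 for _ in range(3)]

# Approach 1: mean of the per-image means
mean_of_means = sum(img.mean().item() for img in images) / len(images)

# Corrected running-sum version of approach 2: accumulate the pixel *sum*
# (not the per-image mean) and the sum of squares, then divide once
total, sq_total, num_pixels = 0.0, 0.0, 0
for img in images:
    total += img.sum().item()
    sq_total += (img ** 2).sum().item()
    num_pixels += img.numel()

pixelwise_mean = total / num_pixels
pixelwise_std = math.sqrt(sq_total / num_pixels - pixelwise_mean ** 2)

# Because every image has the same pixel count, the two means agree
print(pixelwise_mean, mean_of_means)
print(pixelwise_std)
```

Note the caveat: averaging per-image means only equals the true pixel mean because every image here is 120×120. With mixed image sizes, only the running-sum version is correct, since larger images must contribute proportionally more pixels.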