I have a big array data_neighbors with shape=(1, 3, 1000000, 112) containing a lot of nan values:
array([[[[ 88.769226, 80.62714 , 75.95856 ]],
[[ 88.749695, 79.52362 , 76.456604]],
[[ 89.07196 , 82.84393 , 77.12067 ]],
...,
[[ nan, nan, nan]],
[[ nan, nan, nan]],
[[ nan, nan, nan]]],
[[[ 88.769226, 80.62714 , 75.95856 ]],
[[ 88.749695, 79.52362 , 76.456604]],
[[ 89.07196 , 82.84393 , 77.12067 ]],
...,
[[ nan, nan, nan]],
[[ nan, nan, nan]],
[[ nan, nan, nan]]],
[[[ 88.769226, 80.62714 , 75.95856 ]],
[[ 88.749695, 79.52362 , 76.456604]],
[[ 89.07196 , 82.84393 , 77.12067 ]],
...,
[[ nan, nan, nan]],
[[ nan, nan, nan]],
[[ nan, nan, nan]]],
...,
[[[116.88446 , 119.25018 , 125.77301 ]],
[[117.02118 , 118.58612 , 124.601135]],
[[116.82587 , 118.84979 , 125.46051 ]],
...,
[[ nan, nan, nan]],
[[ nan, nan, nan]],
[[ nan, nan, nan]]],
[[[117.02118 , 118.58612 , 124.601135]],
[[116.98212 , 119.34784 , 125.89996 ]],
[[116.91376 , 118.957214, 125.606995]],
...,
[[ nan, nan, nan]],
[[ nan, nan, nan]],
[[ nan, nan, nan]]],
[[[117.099304, 119.45526 , 126.03668 ]],
[[117.10907 , 118.81073 , 125.2359 ]],
[[117.030945, 119.09393 , 125.79254 ]],
...,
[[ nan, nan, nan]],
[[ nan, nan, nan]],
[[ nan, nan, nan]]]], dtype=float32)
How can I remove all the nan values from this array to reduce memory use?
It's important to note that the number of nan values varies along the last dimension. For example, once the nan values are dropped, data_neighbors[0,0,0] would have 3 values and data_neighbors[0,0,1] would have 112. So the result cannot be a regular (rectangular) array. Maybe an array of lists?
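One possible way to hold such ragged data is a flat array of the valid values plus per-point counts and offsets, instead of an array of lists. The following is only a minimal sketch on a toy array. It assumes that the nan pattern is identical for every leading index and that nan appears only as padding; `padded` and `neighbors_of` are hypothetical names used for illustration.

```python
import numpy as np

# Toy stand-in for data_neighbors: 2 leading entries, 3 points, up to 4 neighbours.
padded = np.array(
    [[[1.0, 2.0, np.nan, np.nan],
      [3.0, 4.0, 5.0, 6.0],
      [7.0, np.nan, np.nan, np.nan]],
     [[10.0, 20.0, np.nan, np.nan],
      [30.0, 40.0, 50.0, 60.0],
      [70.0, np.nan, np.nan, np.nan]]],
    dtype=np.float32,
)

# Assumption: the nan pattern is the same for every leading index,
# so it can be read from the first slice.
valid = ~np.isnan(padded[0])                 # shape (n_points, max_neighbors)
counts = valid.sum(axis=1)                   # number of valid neighbours per point
offsets = np.concatenate(([0], np.cumsum(counts)))

# Boolean indexing over the last two axes keeps row-major order, so the
# values belonging to one point stay contiguous in the flat result.
flat = padded[..., valid]                    # shape (n_leading, n_valid_total)


def neighbors_of(i):
    """Hypothetical helper: valid values of point i for all leading entries."""
    return flat[..., offsets[i]:offsets[i + 1]]


# If a list of variable-length arrays is preferred instead:
per_point = np.split(flat, offsets[1:-1], axis=-1)
```

Here `flat` stores only the non-nan values, and `offsets` tells where each point's block starts and ends. How much memory this saves depends on the fraction of nan entries in the real array.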
EDIT: The main objective of the script is to perform multi-point regridding. For each point of a grid A, I assign the values of another grid B that lie within a radius of x kilometers around the point. These values are determined by ind_regrid (1000000*112), a variable that contains, for each index of A, the points of B to include. Depending on the index of A, ind_regrid may contain nan values among the 112 potential indices to regrid.
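import numpy as np
from netCDF4 import Dataset

# fic_regril, data and data_neighbors_list are assumed to be defined earlier in the script
nc_conf = Dataset(fic_regril, 'r')
print('-> Read regrid file '+str(fic_regril))
ind_regrid = nc_conf.variables['inds_regrid'][:]
nc_conf.close()
# True where ind_regrid has no neighbour index
masked_indices = np.ma.getmaskarray(ind_regrid)
# Masked entries are temporarily replaced by index 0 so that the gather succeeds
data_neighbors = data[:,:,:,np.where(~masked_indices,ind_regrid,0)]
# ... and the corresponding values are then overwritten with nan
data_neighbors[masked_indices] = np.nan
data_neighbors_list.append(data_neighbors) #pt, regrid, param, time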
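The nan-padded array could perhaps be avoided altogether by gathering only the unmasked indices. This is a rough sketch on toy data, not a drop-in replacement. It assumes the last axis of `data` is grid B, and the toy sizes and the names `valid`, `flat_idx` and `compact` are made up for illustration.

```python
import numpy as np

# Toy stand-ins (hypothetical sizes): 2 parameters on a grid B with 6 points.
data = np.arange(12, dtype=np.float32).reshape(2, 6)

# Toy ind_regrid: 3 points of A, up to 4 neighbours each, masked where unused.
ind_regrid = np.ma.masked_array(
    [[0, 1, 0, 0],
     [2, 3, 4, 5],
     [5, 0, 0, 0]],
    mask=[[False, False, True, True],
          [False, False, False, False],
          [False, True, True, True]],
)

valid = ~np.ma.getmaskarray(ind_regrid)      # True where a neighbour exists
counts = valid.sum(axis=1)                   # neighbours per point of A
offsets = np.concatenate(([0], np.cumsum(counts)))

# Indices into B for the valid neighbours only, in row-major order.
flat_idx = np.asarray(ind_regrid.data)[valid].astype(np.intp)

# Gather only what is needed: shape (n_param, n_valid_total), no nan padding.
compact = data[..., flat_idx]

# Values of B assigned to point 1 of A, for every parameter.
point_1 = compact[..., offsets[1]:offsets[2]]
```

With this layout the full (..., 1000000, 112) array is never allocated. The neighbours of point i are the slice `offsets[i]:offsets[i + 1]` along the last axis of `compact`.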