I keep getting blank images while trying to Georeference an image with GDAL

492 Views Asked by At
#!/usr/bin/env python3
import numpy as np
from osgeo import gdal
from osgeo import osr

# Load an array with shape (197, 250, 3)
# Data with dim of 3 contain (value, longitude, latitude)
data = np.load("data.npy")

# Copy the data and coordinates
array = data[:,:,0]
lon = data[:,:,1]
lat = data[:,:,2]

nLons = array.shape[1]
nLats = array.shape[0]

# Calculate the geotransform parameters
maxLon, minLon, maxLat, minLat = [lon.max(), lon.min(), lat.max(), lat.min()]
resLon = (maxLon - minLon) / nLons
resLat = (maxLat - minLat) / nLats

# Get the transform
geotransform = (minLon, resLon, 0, maxLat, 0, -resLat)

# Create the ouptut raster
output_raster = gdal.GetDriverByName('GTiff').Create('myRaster.tif', nLons, nLats, 1, 
gdal.GDT_Int32)

# Set the geotransform
output_raster.SetGeoTransform(geotransform)
srs = osr.SpatialReference()

# Set to world projection 4326
srs.ImportFromEPSG(4326)
output_raster.SetProjection(srs.ExportToWkt())
output_raster.GetRasterBand(1).WriteArray(array)
output_raster.FlushCache()

The code above is meant to georeference a raster using GDAL but returns blank tiff files. I have vetted the data and variables, I, however, suspect the problem could be from geotransform variables. The documentation demands the variable to be:

top-left-x, w-e-pixel-resolution, 0,
top-left-y, 0, n-s-pixel-resolution (negative value)

I have used lats and lons not sure I'm getting which one corresponds to x and which to y. It could be something else but I'm not quite sure.

1

There are 1 best solutions below

1
Rutger Kassies On

Overall your approach looks correct to me, but it's hard to tell without seeing the data you're using, but here are some points to consider:

First, there's a difference between the output file being empty, and/or being in the wrong location, georeferencing relates only to the latter.

When working interactive, you should also make sure to properly close the Dataset using output_raster = None, that will also trigger flushing for you.

You could start by testing if GDAL reads the same data that you intended to write. Using something like:

ds = gdal.Open('myRaster.tif')
data_from_disk = ds.ReadAsArray()
ds = None

np.testing.assert_array_equal(data_from_disk, array)

If those are not identical, it could be an issue with the datatype. Like writing floats close to 0 as integers, causing them to clip to 0 giving the appearance of an "empty" file.

Regarding the georeferencing, the projection you use has the coordinates in degrees. If yours are in radians your output ends up close to null-island.

Your approach also assumes that the data and lat/lon arrays are on a regular grid (having a constant resolution). That might not be the case (especially if the data comes with a 2D grid of coordinates).

Often when coordinate arrays are given, they are defined as valid for the center of the pixel. Compared to GDAL's geotransform which is defined for the (outer) edge of the pixel. So you might need to account for that by subtracting half the resolution. And this also impacts your calculation of the resolution, which in the case for the center-definition should probably use / (nLons-1) & / (nLats-1). Or alternatively verify with:

# for a regular grid
resLon = lon[0,1] - lon[0,0]
resLat = lat[1,0] - lat[0,0]

When I run your snippet with some dummy data, it gives me a correct output (ignoring the center/edge issue mentioned above).

lat, lon = np.mgrid[89:-90:-2, -179:180:2]
array = np.sqrt(lon**2 + lat**2).astype(np.int32)