How to read the GeoTiff NODATA value without GDAL in C#?

In C# I am writing a GeoTiff reader/writer that does not use GDAL. Not using GDAL is a requirement. I am able to read some things like the number of columns with code like:

using BitMiracle.LibTiff.Classic;

// ... elided code ...

        using (Tiff image = Tiff.Open(fileToOpen, "r"))
        {
            if (image == null) throw new Exception("Could not open file.");

            returnRaster = new Raster();

            FieldValue[] value = image.GetField(TiffTag.IMAGEWIDTH);
            int width = value[0].ToInt();
            returnRaster.numColumns = width;

But when I try to get the NODATA value with something like

            value = image.GetField(TiffTag.NODATA);
            int nodata = value[0].ToInt();
            returnRaster.No_Data = nodata;

this does not compile, because there is no TiffTag.NODATA or equivalent.

How do I read (and write) the No Data value of a GeoTiff without using GDAL and only using the BitMiracle API?

There are 1 best solutions below

Jacob Frazer On

I empathize with your challenge here. I have run into the same issue from the Python side, and I can tell you that extracting NODATA values from a GeoTIFF is a nontrivial problem in any language.

Especially if not using GDAL is a requirement! I also have the requirement of not using GDAL in my particular circumstance due to dependency limitations. So we are going through this at the same time. :)

Synopsis:

  1. The geospatial community itself struggles to define NODATA consistently across GeoTIFF readers and writers. Neither baseline TIFF nor the GeoTIFF standard defines a NODATA tag, which is why there is no TiffTag.NODATA for you to use. The closest thing to a convention is GDAL's private GDAL_NODATA tag (number 42113), but many in the community store NODATA information in other ways, so in my opinion simply reading that one tag is not enough. In some files the value will be there, but I would not depend on that approach alone.

  2. To follow best practices for reading TIFF tag information (such as NODATA), please see https://docs.ogc.org/is/19-008r4/19-008r4.html, specifically https://docs.ogc.org/is/19-008r4/19-008r4.html#_conformance_class_tiff (which you have probably already seen, since you are using LibTIFF: https://gitlab.com/libtiff/libtiff).

  3. IMPORTANT: as of Feb. 2022, this comment is the most pertinent to our NODATA situation: https://github.com/opengeospatial/ogcapi-coverages/issues/85#issuecomment-1041885772. It amounts to a "yes, and" to point 2 above. The OGC (Open Geospatial Consortium) community is going to update the GeoTIFF format, since many struggle with this issue and the best practices are not well defined. Please review the comment and the surrounding discussion if curious. For now, the interface control document for the GeoTIFF format gives very little help on how to extract NODATA from TIFF tags.

So then, let's break down the GitHub comment above: https://github.com/opengeospatial/ogcapi-coverages/issues/85#issuecomment-1041885772

"SWG 2022-02-16: Reviewing this issue, we can recommand three potential approaches to handle NODATA:

  1. Use the GDAL_NODATA tag, which applies to all bands
  2. Use an alpha channel where 0 transparency means NO DATA
  3. Specify a per-band NODATA value as part of a suggested encodingInfo extension to the RangeType DataRecord fields (which also addresses the scale factor and offset)

Note that the photometric interpretation TIFF tag should be set correctly to prevent the alpha channel from being interpreted incorrectly. These recommendations could be included as part of a GeoTIFF conformance class in OGC API - Coverages.

Another possibility is a transparency mask band, but in practice it is difficult to implement and does not work with a wide range of software.

Some data products (e.g. LANDSAT-8) have a dedicated quality assessment band with more detailed information as a set of bits (see https://www.usgs.gov/landsat-missions/landsat-collection-2-quality-assessment-bands)."

The comment lists three ways NODATA information can be stored in a GeoTIFF file, and unfortunately only the first keeps the value in an actual TIFF tag!

This means a reader cannot stop at the GDAL_NODATA tag; it has to handle the other cases as well.

From experience at my company, working closely with our geospatial SME on customer support issues of exactly this kind, you need to cover these cases as best you can in order to read the NODATA value semi-consistently from files encountered in the "wild". The reason is that geospatial software stores and writes NODATA information in a number of different ways.

Recap:

Without GDAL, using a custom reader (whether in C#, Python, or anything else), you need to:

  1. Read the GDAL_NODATA TIFF tag (number 42113, stored as ASCII text). Many programs and geospatial developers store the NODATA value there, so including this tag in your logic raises the odds of finding the value.
  2. Detect whether there is an alpha channel band. Other TIFF tags in the GeoTIFF header describe the number of bands and their data types; please reference the OGC document linked above for best practices on reading those values. If an alpha channel exists, follow the GitHub comment's recommendation to check whether it ACTUALLY represents NODATA, since this is not always the case.
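Step 1 might look something like the sketch below with LibTiff.NET. It assumes the file carries GDAL's private GDAL_NODATA tag (number 42113, ASCII). The names GdalNoDataTag and ReadRaw are made up for illustration, and the assumption that the tag data is the last element of the array returned by GetField should be checked against the library version you use.

```csharp
using System;
using BitMiracle.LibTiff.Classic;

public static class GdalNoDataTag
{
    // GDAL_NODATA is a private TIFF tag, number 42113, stored as ASCII text.
    // Casting the number avoids relying on the enum having a named member.
    private const TiffTag GdalNoData = (TiffTag)42113;

    // Returns the raw tag text (for example "-9999"), or null if the tag is absent.
    public static string ReadRaw(string fileToOpen)
    {
        using (Tiff image = Tiff.Open(fileToOpen, "r"))
        {
            if (image == null) throw new Exception("Could not open file.");

            FieldValue[] value = image.GetField(GdalNoData);
            if (value == null || value.Length == 0) return null;

            // Assumption: for tags without a built-in definition the data is the
            // last element (it may be preceded by a count), so take the last one.
            string text = value[value.Length - 1].ToString();

            // ASCII fields are NUL-terminated; strip the terminator and padding.
            return text == null ? null : text.Trim('\0', ' ');
        }
    }
}
```

If ReadRaw returns null, the file has no GDAL_NODATA tag, and the alpha channel check in step 2 is the next thing to try.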

As always, implement checks and conditions in your NODATA extraction logic to account for the non-existence of ANY values. The NODATA value is also often wrong, or unsupported by common visualization software. We have seen this too: you extract the value, but the value is inaccurate.
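The checks described above can be sketched as a small, defensive parser for the tag text. This is an illustration only and not part of any library: NoDataText.Parse is a hypothetical helper, and the spellings it accepts for NaN and infinity ("nan", "inf") are assumptions about what writers emit.

```csharp
using System;
using System.Globalization;

public static class NoDataText
{
    // Converts the ASCII text of a NODATA tag to a number.
    // Returns null when the text is missing, empty, or not numeric.
    public static double? Parse(string raw)
    {
        if (raw == null) return null;

        // ASCII TIFF fields are NUL-terminated; some writers also pad with whitespace.
        string text = raw.Trim('\0', ' ', '\t', '\r', '\n');
        if (text.Length == 0) return null;

        // Handle NaN and infinity explicitly, because double.TryParse is
        // case-sensitive about these symbols on some runtimes.
        string lower = text.ToLowerInvariant();
        if (lower == "nan" || lower == "+nan" || lower == "-nan") return double.NaN;
        if (lower == "inf" || lower == "+inf" || lower == "infinity") return double.PositiveInfinity;
        if (lower == "-inf" || lower == "-infinity") return double.NegativeInfinity;

        // Invariant culture: the tag always uses '.' as the decimal separator,
        // regardless of the machine's regional settings.
        double value;
        if (double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out value))
            return value;

        return null;
    }
}
```

Because the result can be NaN or a very large floating-point number, storing it in an int as in the question's snippet would lose information. When the result is NaN, compare pixels with double.IsNaN rather than ==.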

Sorry for the book of a response, but I have struggled with this problem A LOT, and I wanted to provide you with some support since there is not much to go off of out there. I am also new to the field of geospatial computing, and to software in general. BUT, I know that documenting these issues and helping each other leads to defining these kinds of problems, and can lead to a defined solution set.

Other geospatial devs! Please chime in and help/provide alternatives if they exist. Not using GDAL or other well worn libraries is generally not recommended, but sometimes you have no choice!