Convert Yolo output to a real-world coordinate system


We detected objects in UAV data using YOLOv5 and obtained bounding box coordinates (x1, y1, x2, y2) as pixel values relative to the origin of the satellite data. The results are returned as a tab-delimited text file and look like this:

[ 7953 11025  7978 11052]

[16777 10928 16817 10970]

[15670 10591 15685 10607]

The results are accompanied by a PNG and the PGW (world file) reads like this:

0.1617903116883119
0
0
-0.1617903116883119
655854.20159515587147325
2716038.70000312989577651

How can the bounding boxes be converted into real-world coordinates in the global projection EPSG:4328 so that they are usable in a GIS? Any hints towards a Python script are much appreciated.

There are 2 best solutions below

syed asad ali On
Philipp R On

I wrote this short function to convert the YOLO detections to real-world polygons. The YOLO detections.txt file needs to be read without the square brackets [].
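Reading the detections without the brackets could look roughly like the sketch below. It assumes every line holds exactly four integer pixel values, as in the sample from the question; `parse_detections` is a hypothetical helper name, not part of YOLO or any library.

```python
import re


def parse_detections(text):
    """Parse lines such as '[ 7953 11025  7978 11052]' into (x1, y1, x2, y2) tuples."""
    boxes = []
    for line in text.splitlines():
        # Pull out the integers and ignore brackets, tabs and spaces.
        numbers = re.findall(r"-?\d+", line)
        if len(numbers) == 4:  # skip blank or malformed lines
            boxes.append(tuple(int(n) for n in numbers))
    return boxes


# Sample rows copied from the question.
sample = """[ 7953 11025  7978 11052]
[16777 10928 16817 10970]
[15670 10591 15685 10607]"""

for box in parse_detections(sample):
    print(box)
```

The resulting tuples can be passed to the function one by one, or loaded into a pandas DataFrame first if the `apply` call is preferred.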

import geopandas as gpd
from shapely.geometry import Polygon

# function to return polygon
def bbox(x1, y1, x2, y2):
    # world file content
    # Line 1: A: x-component of the pixel width (x-scale)
    xscale = 0.1617903116883119
    # Line 2: D: y-component of the pixel width (y-skew)
    yskew = 0
    # Line 3: B: x-component of the pixel height (x-skew)
    xskew = 0
    # Line 4: E: y-component of the pixel height (y-scale), typically negative
    yscale = -0.1617903116883119
    # Line 5: C: x-coordinate of the center of the original image's upper left pixel transformed to the map
    xpos = 655854.20159515587147325
    # Line 6: F: y-coordinate of the center of the original image's upper left pixel transformed to the map
    ypos = 2716038.70000312989577651

    X_proj = xpos + (xscale * x1) + (xskew * y1)
    Y_proj = ypos + (yscale * y1) + (yskew * x1)

    X1_proj = xpos + (xscale * x2) + (xskew * y2)
    Y1_proj = ypos + (yscale * y2) + (yskew * x2)

    return Polygon([[X_proj, Y_proj],
                    [X1_proj, Y_proj],
                    [X1_proj, Y1_proj],
                    [X_proj, Y1_proj]])

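# dataset is assumed to be a pandas DataFrame with one detection per row
# and the columns x1, y1, x2, y2 in that order
outGDF = gpd.GeoDataFrame(
    geometry=dataset.apply(
        lambda g: bbox(int(g.iloc[0]), int(g.iloc[1]), int(g.iloc[2]), int(g.iloc[3])),
        axis=1),
    crs="EPSG:32638")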
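Instead of hard-coding the six values, the world file can also be parsed at run time. The following is a minimal sketch using only the standard library; it assumes the usual world file line order (A, D, B, E, C, F), and the names `parse_world_file`, `pixel_to_world` and `bbox_corners` are made up for illustration. Unlike the function above, it transforms each of the four corners separately, which only matters if the skew terms are non-zero (they are zero in this PGW).

```python
def parse_world_file(text):
    """Return the six world file coefficients (A, D, B, E, C, F) as floats."""
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError("a world file must contain exactly six numbers")
    return tuple(values)


def pixel_to_world(col, row, coeffs):
    """Apply the affine transform of the world file to one pixel position."""
    a, d, b, e, c, f = coeffs
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


def bbox_corners(x1, y1, x2, y2, coeffs):
    """Transform the four corners of a pixel bounding box one by one."""
    pixel_corners = ((x1, y1), (x2, y1), (x2, y2), (x1, y2))
    return [pixel_to_world(px, py, coeffs) for px, py in pixel_corners]


# PGW content copied from the question.
pgw_text = """0.1617903116883119
0
0
-0.1617903116883119
655854.20159515587147325
2716038.70000312989577651"""

coeffs = parse_world_file(pgw_text)
# First detection from the question.
for corner in bbox_corners(7953, 11025, 7978, 11052, coeffs):
    print(corner)
```

The corner list can be handed directly to `Polygon(...)`. The coordinates come out in whatever CRS the PNG/PGW pair is in (EPSG:32638 is what the snippet above assumes; the question does not state it). If a different target CRS is needed, the GeoDataFrame can be reprojected afterwards with `outGDF.to_crs(...)`.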