Sorting objects by coordinates from left to right, top to bottom


I am writing a small program that recognizes characters from a webcam. There are two types of symbols: single-line and double-line. It looks something like this:

[Image: first example]

I need to arrange all recognized characters from left to right and from top to bottom, that is, like this:

[Image: second example]

With a single-line photo everything is clear: we just sort by the x coordinate. With two-line photos, however, I run into difficulties. I tried a simple sort like this: sorted_df = df.sort_values(by=['y', 'x'], ascending=[False, True])

But this approach very often produces the wrong order. A further problem is that the input image with the symbols may be slightly tilted.

The input looks like this. I use pandas to work with it.

import pandas as pd

data = {
    "xmin": [73.728722, 58.541206, 43.370064, 18.349848, 84.141769, 74.219193, 63.876919, 32.109692, 13.477271],
    "ymin": [9.410283, 10.085771, 10.857979, 12.260820, 36.286518, 36.769310, 37.599922, 39.808289, 40.412071],
    "xmax": [85.914436, 70.791809, 56.026375, 33.629444, 92.453529, 82.558533, 72.851395, 47.012421, 27.849062],
    "ymax": [29.401623, 29.874952, 31.069559, 32.480732, 51.482807, 51.720161, 52.238033, 58.858406, 59.132389],
    "name": ["A", "B", "C", "D", "1", "2", "3", "4", "5"]
}

df = pd.DataFrame(data) 

Does anyone have a simple and effective solution to this? Which direction should I move in? I will be very grateful!


Answer by mozway (best answer)

You can first sort on ymin, compute a diff and form groups with cumsum and a threshold. Then sort again based on this group and xmin:

# Y-axis threshold at which one considers a new row
threshold = 10
out = (df.assign(y=df['ymin'].sort_values(ascending=False)
                             .diff().abs().gt(threshold).cumsum())
         .sort_values(by=['y', 'xmin'])
       )

Output:

        xmin       ymin       xmax       ymax name  y
8  13.477271  40.412071  27.849062  59.132389    5  0
7  32.109692  39.808289  47.012421  58.858406    4  0
6  63.876919  37.599922  72.851395  52.238033    3  0
5  74.219193  36.769310  82.558533  51.720161    2  0
4  84.141769  36.286518  92.453529  51.482807    1  0
3  18.349848  12.260820  33.629444  32.480732    D  1
2  43.370064  10.857979  56.026375  31.069559    C  1
1  58.541206  10.085771  70.791809  29.874952    B  1
0  73.728722   9.410283  85.914436  29.401623    A  1

[Graph: image not included]

NB: if you want to invert the Y-axis, so that the rows are numbered by increasing ymin, use df['ymin'].sort_values() instead of df['ymin'].sort_values(ascending=False):

threshold = 10
out = (df.assign(y=df['ymin'].sort_values()
                             .diff().abs().gt(threshold).cumsum())
         .sort_values(by=['y', 'xmin'])
       )

[Graph: image not included]

Answer by Oskar Hofmann

Assuming your tilt angle is not too large, i.e. characters in the first line are always clearly above the characters in the second line:

  • You can first sort your data by ymin.
  • You can then detect the jump to the second line (or whether a second line exists at all) by checking for a big jump in the sorted values of ymin. The threshold for that depends on your data. Based on your example, you could use the average character height (ymax - ymin), or a certain fraction of it, as a dynamic threshold.
  • Within each line, you can then simply sort the characters by x-value.

In code:

threshold = (df['ymax'] - df['ymin']).abs().mean()

# calculate the difference between consecutive values (diff)
# get the absolute value (abs)
# check if it is greater than the threshold (gt)
# and take the cumulative sum, so that all values after the jump get the next line number (cumsum)
df['line'] = df['ymin'].sort_values().diff().abs().gt(threshold).cumsum()

You can then sort by line and xmin:

df.sort_values(by=['line', 'xmin'], ascending=[True, False])

Output:

        xmin       ymin       xmax       ymax name  line
0  73.728722   9.410283  85.914436  29.401623    A     0
1  58.541206  10.085771  70.791809  29.874952    B     0
2  43.370064  10.857979  56.026375  31.069559    C     0
3  18.349848  12.260820  33.629444  32.480732    D     0
4  84.141769  36.286518  92.453529  51.482807    1     1
5  74.219193  36.769310  82.558533  51.720161    2     1
6  63.876919  37.599922  72.851395  52.238033    3     1
7  32.109692  39.808289  47.012421  58.858406    4     1
8  13.477271  40.412071  27.849062  59.132389    5     1

If the above assumption regarding the tilt angle is not valid, I would recommend some preprocessing of your data. This should be relatively easy because you expect the values to fall into distinct lines: you can "rotate" your x/y values by a certain angle until the y-values of consecutive characters barely change (except for the one potentially large jump between the lines). If you are recognizing characters anyway, you should also be able to get the tilt from that step.
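The rotation idea could be sketched roughly as follows. This is only a minimal illustration, not a tested solution for real webcam data: the helper name estimate_tilt, the ±15° search range, the 0.1° step and the scoring rule (sum of the gaps between sorted rotated y-values, minus the single largest gap, which is assumed to be the jump between the lines) are all my own assumptions. It also assumes at most two lines and at least two boxes.

```python
import numpy as np
import pandas as pd

data = {
    "xmin": [73.728722, 58.541206, 43.370064, 18.349848, 84.141769, 74.219193, 63.876919, 32.109692, 13.477271],
    "ymin": [9.410283, 10.085771, 10.857979, 12.260820, 36.286518, 36.769310, 37.599922, 39.808289, 40.412071],
    "xmax": [85.914436, 70.791809, 56.026375, 33.629444, 92.453529, 82.558533, 72.851395, 47.012421, 27.849062],
    "ymax": [29.401623, 29.874952, 31.069559, 32.480732, 51.482807, 51.720161, 52.238033, 58.858406, 59.132389],
    "name": ["A", "B", "C", "D", "1", "2", "3", "4", "5"],
}
df = pd.DataFrame(data)


def estimate_tilt(x, y, max_deg=15.0, step_deg=0.1):
    """Hypothetical helper: grid-search the angle (in radians) that flattens the lines.

    For each candidate angle the points are rotated and the gaps between the
    sorted rotated y-values are summed, ignoring the single largest gap
    (assumed to be the jump between two lines). The angle with the smallest
    remaining spread wins.
    """
    best_angle, best_score = 0.0, np.inf
    for deg in np.arange(-max_deg, max_deg + step_deg, step_deg):
        a = np.deg2rad(deg)
        y_rot = np.sort(x * np.sin(a) + y * np.cos(a))
        gaps = np.diff(y_rot)
        score = gaps.sum() - gaps.max()
        if score < best_score:
            best_angle, best_score = a, score
    return best_angle


x = df["xmin"].to_numpy()
y = df["ymin"].to_numpy()
angle = estimate_tilt(x, y)

# Rotate the top-left corners by the estimated angle
df["x_rot"] = x * np.cos(angle) - y * np.sin(angle)
df["y_rot"] = x * np.sin(angle) + y * np.cos(angle)

# Same line detection as above, but on the rotated y-values
threshold = (df["ymax"] - df["ymin"]).abs().mean()
df["line"] = df["y_rot"].sort_values().diff().abs().gt(threshold).cumsum()

# Same sort direction as in the code above
out = df.sort_values(by=["line", "x_rot"], ascending=[True, False])
print(np.rad2deg(angle))
print(out["name"].tolist())
```

A grid search is crude but easy to reason about. For larger tilts or more than two lines, a proper line fit per row (or the tilt reported by the character recognition step) would probably be more robust.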