I am working on a scanned image of a handwritten letter. My goal is to find the bounding box of every line of text in that image, and the bounding boxes must not overlap one another.

Input Image: Input image

Expected Image after drawing bounding boxes: Expected image

So my steps were:

  1. Read the image using OpenCV (cv2)
import cv2

image_binary = cv2.imread(input_path, cv2.IMREAD_UNCHANGED)
  2. Use MSER (Maximally Stable Extremal Regions) to detect regions, group them into lines, and draw bounding boxes.
mser = cv2.MSER_create()
gray = cv2.cvtColor(image_binary, cv2.COLOR_BGR2GRAY)
regions, _ = mser.detectRegions(gray)
regions = [region for region in regions if (cv2.boundingRect(region)[3] <= 100)]
hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions]

# Group the hull bounding boxes into lines by comparing their top y-coordinates
plines = []
for hull in hulls:
  x, y, w, h = cv2.boundingRect(hull)
  cnt_points = x, y, x+w, y+h
  added_to_line = False
  for pline in plines:
    if abs(pline[0][1] - cnt_points[1]) <= 50:  # top edge within 50 px of the line's first box
      pline.append(cnt_points)
      added_to_line = True
      break

  if not added_to_line:
    plines.append([cnt_points])
    plines = sorted(plines, key=lambda pline: pline[0][1])

# Merge each line's boxes into one enclosing rectangle and draw it for visualization
lines = []
output = image_binary.copy() 
for pline in plines:
  min_x = min([cnt[0] for cnt in pline])
  min_y = min([cnt[1] for cnt in pline])
  max_x = max([cnt[2] for cnt in pline])
  max_y = max([cnt[3] for cnt in pline])
  lines.append((min_x, min_y, max_x, max_y))
  cv2.rectangle(output, (min_x, min_y), (max_x, max_y), (0, 255, 0), 2)

cv2.imwrite('output.jpg', output)
print(lines)

MSER is not working well because some of the letters intersect the line above or below, and the gap between lines is not constant.
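To illustrate the overlap problem, here is a minimal sketch of one possible post-processing step: clip neighbouring line boxes at the midpoint of the band where they overlap. The function name `separate_line_boxes` is hypothetical and not part of OpenCV. It assumes the boxes are already one per line, in the same `(min_x, min_y, max_x, max_y)` format as `lines` above, and that each box ends below the one before it.

```python
def separate_line_boxes(boxes):
    """Clip vertically overlapping line boxes so that none overlap.

    boxes: list of (min_x, min_y, max_x, max_y) tuples, one per text line.
    Returns a new list sorted from top to bottom.
    """
    # Sort by top edge so that list neighbours are neighbouring lines.
    ordered = sorted(boxes, key=lambda b: b[1])
    result = []
    for min_x, min_y, max_x, max_y in ordered:
        if result:
            p_min_x, p_min_y, p_max_x, p_max_y = result[-1]
            if min_y < p_max_y:
                # Split the overlapping band at its midpoint (assumption:
                # a simple heuristic, not a true segmentation of the strokes).
                cut = min((min_y + p_max_y) // 2, max_y)
                result[-1] = (p_min_x, p_min_y, p_max_x, cut)
                min_y = cut
        result.append((min_x, min_y, max_x, max_y))
    return result
```

Calling `separate_line_boxes(lines)` before drawing would give boxes that at most share an edge. It does not fix a wrong grouping: if two lines were merged into one box, they stay merged.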

Here is my actual output: Actual output

I understand that tuning the maximum height of the detected regions (100 in my case) and the allowed line deviation (50 in my case) may produce better results. But my main concern is how to draw the bounding boxes in a real scenario.
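One way to avoid hard-coding these two numbers might be to derive them from the typical region height in each image. The sketch below is only an illustration: `adaptive_thresholds` and its two factors are hypothetical names and guessed values, not something I have validated on real letters.

```python
from statistics import median


def adaptive_thresholds(heights, height_factor=3.0, deviation_factor=0.5):
    """Derive grouping thresholds from the median region height.

    heights: list of bounding-box heights of the detected regions.
    Returns (max_height, max_deviation) in pixels.
    """
    typical = median(heights)
    # Regions much taller than a typical glyph probably span several lines.
    max_height = typical * height_factor
    # Boxes whose tops differ by less than this are treated as the same line.
    max_deviation = typical * deviation_factor
    return max_height, max_deviation
```

With the code above, the heights could be collected as `[cv2.boundingRect(r)[3] for r in regions]`, and the two returned values would replace the constants 100 and 50.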

Update: As suggested by @Christoph Rackwitz, the requirement is to determine the number of lines in this handwritten letter.
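For counting lines, a different technique from MSER is a horizontal projection profile: count the ink pixels in each row and treat every run of inked rows as one line. The following is a minimal sketch under the assumptions that the page is not skewed and that the input is a binarized image with text pixels nonzero. The name `find_line_bands` and the parameters `min_ink` and `min_height` are hypothetical.

```python
import numpy as np


def find_line_bands(binary, min_ink=1, min_height=3):
    """Find horizontal bands of text with a row-wise projection profile.

    binary: 2D array, text pixels nonzero and background zero.
    Returns a list of (top, bottom) row ranges, bottom exclusive.
    """
    ink_per_row = np.count_nonzero(binary, axis=1)
    is_text = ink_per_row >= min_ink
    bands = []
    start = None
    for y, flag in enumerate(is_text):
        if flag and start is None:
            start = y  # a band of inked rows begins
        elif not flag and start is not None:
            if y - start >= min_height:  # ignore specks thinner than min_height
                bands.append((start, y))
            start = None
    # Close a band that runs to the bottom edge of the image.
    if start is not None and len(is_text) - start >= min_height:
        bands.append((start, len(is_text)))
    return bands
```

The number of lines would then be `len(find_line_bands(binary))`, where `binary` could come from something like `cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]`. Because the bands are row ranges, they cannot overlap, but where ascenders and descenders touch the neighbouring line, `min_ink` would probably need to be raised well above 1 to separate the lines.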
