Multiple object recognition with OpenCV


I am trying to find all instances of an object in an image, given a template. Example image: [image]

Example template: [image]

So far I've come up with the following approach:

  1. Use some detector, e.g. SIFT, to find keypoints
  2. Match keypoints
  3. Cluster them

The code looks like this:

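import cv2
import matplotlib.pyplot as plt

# img is the scene image and query is the template, both loaded beforehand
sift = cv2.SIFT_create()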
# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(img,None)
kp2, des2 = sift.detectAndCompute(query,None)
# BFMatcher with default params
bf = cv2.BFMatcher()
matches = bf.knnMatch(des1,des2,k=2)
# Apply ratio test
good = []
for m,n in matches:
    if m.distance < 0.5*n.distance:
        good.append([m])
# cv.drawMatchesKnn expects list of lists as matches.
img3 = cv2.drawMatchesKnn(img,kp1,query,kp2,good,None,flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)
plt.imshow(img3)
plt.show()

with this outcome: [image]

But I am stuck here. How can I use these matches to actually find the bounding boxes of the objects present in the image? I tried creating a grid based on the keypoints and the size of the template:

[image]

I then used cv2.matchTemplate to look for the objects in the area around each cell (a sliding window), but it did not work well. How should I approach this?

Answer by David Serrano

I hope it is not too late, but it would be a good idea to close this question.

I have written a piece of code that solves your problem by following your approach.

First, I created a mask to identify the whitish zones. [image: white mask]

Then, I thresholded the V channel of the HSV color space and combined the result with the white mask. [image: general mask]
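The way the two masks combine can be sketched with NumPy alone. This is a minimal illustration, not the code used for the result: the function name `product_mask` and the fixed `v_threshold` are hypothetical stand-ins (the full code below uses Otsu's method to pick the threshold), and the V channel is computed as the per-pixel maximum of B, G and R, which is how HSV value is defined.

```python
import numpy as np

def product_mask(bgr, white_min=180, v_threshold=128):
    """Keep bright pixels that are not whitish (hypothetical helper)."""
    # Whitish pixels: every channel is at least white_min.
    white = np.all(bgr >= white_min, axis=-1)
    # The HSV value channel equals the maximum of the three color channels.
    v = bgr.max(axis=-1)
    # Fixed threshold used here as a stand-in for Otsu's automatic threshold.
    bright = v > v_threshold
    return (bright & ~white).astype(np.uint8)

# Toy 2x2 BGR image: white, saturated red, near black, teal.
toy = np.array([
    [[250, 250, 250], [30, 40, 200]],
    [[10, 10, 10], [200, 190, 60]],
], dtype=np.uint8)
toy_mask = product_mask(toy)
print(toy_mask)
```

White and dark pixels are both dropped, so only the colored product surfaces remain in the mask.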

Then, I find all the connected components of the mask. [image: connected components]

Then, I compute the SIFT descriptors for both the input image and the query image. For each good match, I take the position of the keypoint in the input image and link it to the connected component at that position. [image: filtered connected components]
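The lookup from keypoint to connected component is easy to get wrong because keypoint positions are (x, y) while NumPy arrays are indexed as (row, column). A small sketch on a toy label image, assuming a hypothetical helper `labels_hit_by_keypoints` and that label 0 is the background, as in the output of cv2.connectedComponentsWithStats:

```python
import numpy as np

def labels_hit_by_keypoints(labels, points):
    """Return the set of non-background labels under the given (x, y) points."""
    height, width = labels.shape
    hit = set()
    for x, y in points:
        col, row = int(x), int(y)
        # Ignore points that fall outside the label image.
        if 0 <= row < height and 0 <= col < width:
            label = int(labels[row, col])  # NumPy indexing is (row, col) = (y, x)
            if label != 0:  # 0 is the background label
                hit.add(label)
    return hit

# Toy label image: component 1 on the left, component 2 on the right.
toy_labels = np.array([
    [1, 1, 0, 2, 2],
    [1, 1, 0, 2, 2],
    [0, 0, 0, 0, 0],
])
toy_points = [(0.4, 1.2), (2.0, 0.0), (4.9, 0.7)]  # (x, y), like cv2.KeyPoint.pt
hit_labels = labels_hit_by_keypoints(toy_labels, toy_points)
print(hit_labels)
```

The second point lands on background and is ignored; the other two select components 1 and 2.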

The last step is to draw the bounding box of each connected component that has a keypoint assigned. [image: final result]
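For illustration, the bounding box of a single component can also be derived directly from the label image. This sketch assumes a hypothetical helper `bbox_of_label`; the full code below reads the same values from the stats array returned by cv2.connectedComponentsWithStats instead.

```python
import numpy as np

def bbox_of_label(labels, label):
    """Return (left, top, right, bottom) of one label, or None if it is absent."""
    rows, cols = np.nonzero(labels == label)
    if rows.size == 0:
        return None
    # right and bottom are exclusive, matching left + width and top + height.
    return int(cols.min()), int(rows.min()), int(cols.max()) + 1, int(rows.max()) + 1

toy_labels = np.zeros((6, 8), dtype=np.int32)
toy_labels[1:4, 2:5] = 1  # rows 1..3, columns 2..4
toy_labels[4:6, 6:8] = 2  # rows 4..5, columns 6..7
box1 = bbox_of_label(toy_labels, 1)
box2 = bbox_of_label(toy_labels, 2)
print(box1, box2)
```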

I also tried other methods, such as cv2.matchTemplate, but they did not work. I think the result could be better: I had to take screenshots of the images from your question, so I obtained fewer good keypoints. The drink cartons are extremely hard to segment individually, but if you find a better way to segment them, this approach should work well.
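If segmentation stays unreliable, the "cluster them" step from the question could be sketched without any mask by grouping matched keypoints that lie close together. This is only an illustrative assumption, not what the code below does: `cluster_points`, `cluster_bbox` and the `radius` value are hypothetical, and the resulting boxes cover only the matched keypoints, not necessarily the whole object.

```python
from math import hypot

def cluster_points(points, radius):
    """Group (x, y) points; points within `radius` of each other share a cluster."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (xi, yi) in enumerate(points):
        for j in range(i + 1, len(points)):
            xj, yj = points[j]
            if hypot(xi - xj, yi - yj) <= radius:
                parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i, point in enumerate(points):
        clusters.setdefault(find(i), []).append(point)
    return list(clusters.values())

def cluster_bbox(cluster):
    """Return (left, top, right, bottom) enclosing all points of a cluster."""
    xs = [x for x, _ in cluster]
    ys = [y for _, y in cluster]
    return min(xs), min(ys), max(xs), max(ys)

toy_points = [(10, 10), (12, 11), (14, 9), (100, 50), (103, 52)]
boxes = sorted(cluster_bbox(c) for c in cluster_points(toy_points, radius=5))
print(boxes)
```

A radius close to the template size is a plausible starting point, but it would need tuning when objects sit right next to each other.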

Hope it works!

import cv2
import matplotlib.pyplot as plt
import numpy as np

img = cv2.imread("stack2.png")
query = cv2.imread("stack3.png")

OBJECT_WIDTH_LIMITER = 200  # Variable to delimit the max width of the BBoxes

# Obtain a mask for identifying each product
# First obtain a mask with the whitish colours
white_mask = cv2.inRange(img, (180, 180, 180), (255, 255, 255))
white_mask = white_mask.astype(float) / 255
white_mask = cv2.morphologyEx(white_mask, cv2.MORPH_OPEN, np.ones((1, 10), np.uint8))

# Transform the image to HSV and threshold the V channel with Otsu's method
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
h, s, v = cv2.split(hsv)
_, mask = cv2.threshold(v, 0, 1, cv2.THRESH_OTSU)

# Remove the whitest parts of the image from the mask
mask[white_mask == 1] = 0

# Optional small opening to remove noise (disabled)
# mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((2, 2), np.uint8))

# Detect all the connected components
n, conComp, stats, centroids = cv2.connectedComponentsWithStats(mask)

# Create SIFT object
sift = cv2.SIFT_create()
# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(img, None)
kp2, des2 = sift.detectAndCompute(query, None)
# BFMatcher with default params
bf = cv2.BFMatcher()
matches = bf.knnMatch(des1, des2, k=2)
# Apply ratio test
good = []
for m, n in matches:
    if m.distance < 0.5 * n.distance:
        good.append([m])
# good holds one-element lists, the format cv2.drawMatchesKnn expects

# Iterate through each DMatch object and obtain the keypoint position of the good matches
good_keypoints = [kp1[match[0].queryIdx].pt for match in good]

# Obtain the connected component to which each keypoint belongs
# If the connected component is wider than OBJECT_WIDTH_LIMITER, crop the connected component
# Create a mask with all the connected components that belong to keypoints
cc_filtered = np.zeros((img.shape[0], img.shape[1]), np.uint8)
for kp in good_keypoints:
    ccNumber = conComp[int(kp[1]), int(kp[0])]

    mask = np.zeros((img.shape[0], img.shape[1]), np.uint8)
    if ccNumber != 0:
        # Clamp the horizontal window to the image; img.shape[1] is the width
        left_limit = max(int(kp[0]) - OBJECT_WIDTH_LIMITER, 0)
        right_limit = min(int(kp[0]) + OBJECT_WIDTH_LIMITER, img.shape[1])

        mask[conComp == ccNumber] = 1
        mask[:, right_limit:] = 0
        mask[:, :left_limit] = 0
        cc_filtered[mask == 1] = ccNumber

# Draw the BBoxes for each connected component
n, conComp, stats, centroids = cv2.connectedComponentsWithStats(cc_filtered)
for ccNumber in range(n):
    if ccNumber != 0:
        tl = (stats[ccNumber, cv2.CC_STAT_LEFT], stats[ccNumber, cv2.CC_STAT_TOP])
        br = (
            stats[ccNumber, cv2.CC_STAT_LEFT] + stats[ccNumber, cv2.CC_STAT_WIDTH],
            stats[ccNumber, cv2.CC_STAT_TOP] + stats[ccNumber, cv2.CC_STAT_HEIGHT],
        )
        cv2.rectangle(img, tl, br, (0, 255, 0), 5)

plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()