How can I track a specific object in a video using a model that detects multiple objects?


I am working on a computer vision task: drawing trajectories for detected objects in a cricket delivery video.

I am relatively new to computer vision and have been following documentation from Ultralytics on object tracking in videos. Specifically, I have a pre-trained model that can detect both the bowler (class name: 'bowlers') and the ball (class name: 'balls') from a cricket delivery video provided as input to the model.

Below is a snippet of the code I have been working on:

  import cv2
  import numpy as np
  from collections import defaultdict
  from ultralytics import YOLO
  from google.colab.patches import cv2_imshow

  # Load the model
  model = YOLO('path/to/model')

  # Open the video file, retrying until it can be opened
  cap = cv2.VideoCapture('path/to/video')
  while not cap.isOpened():
      cap = cv2.VideoCapture('path/to/video')
      cv2.waitKey(1000)
      print('Wait for the header')

  # Store the track history, keyed by track ID
  track_history = defaultdict(list)

  # Load class names (a dict mapping class index to class name)
  class_names = model.names
  print(class_names)

  # Loop through the video frames
  while cap.isOpened():
      # Read a frame from the video
      success, frame = cap.read()

      if success:
          # Run YOLOv8 tracking on the frame, persisting tracks between frames
          results = model.track(frame, persist=True)

          # Get the boxes and track IDs if any track IDs were assigned
          if results[0].boxes.id is not None:
              boxes = results[0].boxes.xywh.cpu()
              track_ids = results[0].boxes.id.int().cpu().tolist()

              print("Boxes: ", boxes)

              # Visualize the results on the frame
              annotated_frame = results[0].plot()

              # Plot the tracks
              for box, track_id in zip(boxes, track_ids):
                  print("Box: ", box)
                  x, y, w, h = box
                  track = track_history[track_id]
                  track.append((float(x), float(y)))  # x, y center point
                  if len(track) > 30:  # keep only the last 30 points per track
                      track.pop(0)

                  # Draw the tracking lines
                  points = np.hstack(track).astype(np.int32).reshape((-1, 1, 2))
                  cv2.polylines(annotated_frame, [points], isClosed=False, color=(230, 230, 230), thickness=10)

              # Display the annotated frame
              cv2_imshow(annotated_frame)
      else:
          # Break the loop if the end of the video is reached
          break

      # Break the loop if 'q' is pressed
      if cv2.waitKey(1) & 0xFF == ord("q"):
          break

  # Release video capture object and close windows
  cap.release()
  cv2.destroyAllWindows()

Currently, the code successfully draws trajectories for both the bowler and the ball. However, my objective is to draw trajectories specifically for the ball's positions on every frame of the video.

I would greatly appreciate any guidance on modifying the code so that trajectories are drawn only for the ball.
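One direction that might work, sketched under assumptions rather than tested against my model: Ultralytics exposes the class index of each detected box as `results[0].boxes.cls`, and `model.names` maps those indices to class names, so the boxes could be filtered by class before their centers are appended to the track history. The helper `keep_class` below is a hypothetical name of my own, and the sample data (including the assumption that 'balls' is index 0) is made up so the snippet runs without a model or video.

```python
def keep_class(boxes, track_ids, class_ids, names, wanted="balls"):
    """Return (box, track_id) pairs whose class name equals `wanted`.

    `names` is expected to look like model.names: {class_index: class_name}.
    """
    return [
        (box, track_id)
        for box, track_id, class_id in zip(boxes, track_ids, class_ids)
        if names[int(class_id)] == wanted
    ]


# Stand-in data shaped like one frame's detections (values are made up).
names = {0: "balls", 1: "bowlers"}  # assumed index order, check model.names
boxes = [(320.0, 240.0, 12.0, 12.0), (500.0, 300.0, 80.0, 200.0)]  # xywh
track_ids = [1, 2]
class_ids = [0, 1]

ball_only = keep_class(boxes, track_ids, class_ids, names)
print(ball_only)
```

In the loop above, `class_ids` would presumably come from `results[0].boxes.cls.int().cpu().tolist()`, and the drawing loop would iterate over `keep_class(boxes, track_ids, class_ids, model.names)` instead of `zip(boxes, track_ids)`. Note that `results[0].plot()` would still draw bounding boxes for every class; only the trajectory lines would be limited to the ball. Ultralytics also documents a `classes` argument that restricts predictions to given class indices, which may be worth checking as an alternative.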
