OpenCV object tracking

Object tracking in OpenCV refers to locating and following a specific object or multiple objects in a sequence of video frames. It involves analyzing the motion of objects in consecutive frames to estimate their position and track them over time.

Example code

The following example code inputs a video and lets us select the region of interest we want to track:

Note: After hitting "Run", you can select whether you want to track the Rubik's cube or the box of tissues. Simply drag to draw a box around the object and press the Enter key to play the video.

import cv2
import numpy as np

# Define the video file path
video_path = 'my-clip.mp4'

# Define the video capture object
video_capture = cv2.VideoCapture(video_path)

# Read the first frame from the video
ret, frame = video_capture.read()

# Select a region of interest (ROI) to track
bbox = cv2.selectROI("Object Tracking", frame, False)

# Initialize the tracker
roi = frame[bbox[1]:bbox[1] + bbox[3], bbox[0]:bbox[0] + bbox[2]]
roi_hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
roi_hist = cv2.calcHist([roi_hsv], [0, 1], None, [180, 256], [0, 180, 0, 256])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)

# Set termination criteria for the tracker
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

while True:
    # Read a new frame from the video
    ret, frame = video_capture.read()

    if not ret:
        break

    # Convert the frame to HSV color space
    frame_hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    # Calculate the back projection of the frame
    frame_backproj = cv2.calcBackProject([frame_hsv], [0, 1], roi_hist, [0, 180, 0, 256], 1)

    # Apply CAMShift to get the new bounding box
    ret, bbox = cv2.CamShift(frame_backproj, bbox, term_crit)

    # Draw the new bounding box on the frame
    pts = cv2.boxPoints(ret)
    pts = np.round(pts).astype(np.int32)
    cv2.polylines(frame, [pts], True, (0, 255, 0), 2)

    # Display the resulting frame
    cv2.imshow("Object Tracking", frame)

    # Exit if the 'q' key is pressed
    if cv2.waitKey(15) & 0xFF == ord('q'):
        break

# Release the video capture object and close windows
video_capture.release()
cv2.destroyAllWindows()
Example code for tracking an object in an input video

Code explanation

  • Lines 1–2: These lines import the necessary libraries.

  • Lines 5–8: Define the path of the video named my-clip.mp4 and load it using the VideoCapture() method of the cv2 library.

  • Line 11: The read() method reads the first frame and stores it in the frame variable. ret is a boolean variable that stores True if a frame has been read successfully. Otherwise, it stores False.

  • Line 14: The selectROI() method waits for the user to select the region of interest they want to track in the clip. It opens a window titled "Object Tracking" showing the frame and allows the user to draw a bounding box around the object they want to track.

  • Line 17: Extract the region of interest from the frame based on the selected bounding box coordinates. It creates a sub-image of the frame corresponding to the region of interest.

  • Line 18: This line converts the color space of the selected region of interest's image from BGR (Blue-Green-Red) to HSV (Hue-Saturation-Value). Converting to the HSV color space is often beneficial for image processing tasks, as it separates the color information from the intensity information.
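
    To see what this conversion produces, here is a minimal standalone sketch that converts a single pure-blue pixel (hypothetical data, not a frame from the clip above):

    ```python
    import cv2
    import numpy as np

    # A 1x1 BGR image containing one pure-blue pixel (hypothetical example data)
    pixel = np.uint8([[[255, 0, 0]]])

    # Convert from BGR to HSV; OpenCV stores hue in the range [0, 180)
    hsv = cv2.cvtColor(pixel, cv2.COLOR_BGR2HSV)

    print(hsv[0, 0])  # [120 255 255]: hue 120 (blue), full saturation, full value
    ```

    Note that OpenCV halves the conventional 0–360 degree hue range so it fits in an 8-bit channel, which is why the histogram in this example uses 180 hue bins.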

  • Line 19: This line calculates the histogram of the ROI in the HSV color space. The cv2.calcHist() function computes a histogram by taking the HSV image roi_hsv as input. It specifies [0, 1] as the channels for histogram calculation, indicating the Hue and Saturation channels. The histogram uses 180 bins for the Hue channel (range: 0 to 180) and 256 bins for the Saturation channel (range: 0 to 256).
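
    As a standalone illustration, the following sketch computes the same 2-D Hue-Saturation histogram on a small synthetic patch (hypothetical data standing in for the selected ROI):

    ```python
    import cv2
    import numpy as np

    # An 8x8 pure-blue patch standing in for a selected ROI (hypothetical data)
    roi = np.full((8, 8, 3), (255, 0, 0), dtype=np.uint8)
    roi_hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)

    # 2-D histogram over Hue (180 bins) and Saturation (256 bins)
    roi_hist = cv2.calcHist([roi_hsv], [0, 1], None, [180, 256], [0, 180, 0, 256])

    print(roi_hist.shape)  # (180, 256): one bin per (hue, saturation) pair
    print(roi_hist.sum())  # 64.0: every one of the 8x8 pixels lands in some bin
    ```

    Because the patch is a single uniform color, all 64 pixels fall into one bin; a real ROI spreads its pixels over many bins, forming a color "signature" of the object.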

  • Line 20: This line normalizes the calculated histogram to ensure that its values are within the range of 0 to 255. Normalization is performed using the cv2.normalize() function with the specified range of 0 to 255. The resulting normalized histogram (roi_hist) will be used for tracking purposes.
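
    The effect of min-max normalization can be seen on a tiny hand-made histogram (hypothetical bin counts, not derived from the video):

    ```python
    import cv2
    import numpy as np

    # A tiny stand-in histogram with hypothetical bin counts
    hist = np.array([[2.0], [4.0], [8.0]], dtype=np.float32)

    # Rescale in place so the smallest bin maps to 0 and the largest to 255
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

    print(hist.ravel())  # [  0.  85. 255.]
    ```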

  • Line 23: Defines the termination criteria for the iterative optimization process used by the tracking algorithm. The cv2.TERM_CRITERIA_EPS and cv2.TERM_CRITERIA_COUNT flags are combined using the bitwise OR operator (|). The second parameter 10 specifies the maximum number of iterations. The third parameter 1 sets the minimum change required between subsequent iterations for the algorithm to be considered converged.

  • Line 25: The loop keeps running until the video ends or the user quits.

  • Lines 27 and 33: These lines read each new video frame and convert its color space from BGR (Blue-Green-Red) to HSV (Hue-Saturation-Value).

  • Line 36: The calcBackProject() method in this line highlights the regions in an image with similar characteristics to the region of interest.
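
    A self-contained sketch (with hypothetical synthetic colors, not frames from the clip) shows how back projection scores each pixel by how well it matches the ROI histogram:

    ```python
    import cv2
    import numpy as np

    # Build the model histogram from a pure-blue patch (hypothetical ROI stand-in)
    roi = np.full((4, 4, 3), (255, 0, 0), dtype=np.uint8)
    roi_hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
    roi_hist = cv2.calcHist([roi_hsv], [0, 1], None, [180, 256], [0, 180, 0, 256])
    cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)

    # A synthetic frame: left half blue (matches the model), right half green
    frame = np.zeros((4, 8, 3), dtype=np.uint8)
    frame[:, :4] = (255, 0, 0)
    frame[:, 4:] = (0, 255, 0)
    frame_hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    # Each output pixel is the histogram value of its (hue, saturation) pair
    backproj = cv2.calcBackProject([frame_hsv], [0, 1], roi_hist,
                                   [0, 180, 0, 256], 1)

    print(backproj[0, 0], backproj[0, 7])  # 255 0: blue matches, green does not
    ```

    The bright regions of the back projection are exactly the candidate locations that the tracker then shifts toward.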

  • Line 39: The CamShift() method on this line applies the CAMShift (Continuously Adaptive Mean Shift) algorithm to estimate the new location and size of the object in the current frame based on the back projection image and the previous bounding box.
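
    CamShift() can also be exercised directly on a synthetic back projection, which makes its behavior easy to see: given a bright blob and an initial window that only partially overlaps it, the window shifts onto the blob (hypothetical data):

    ```python
    import cv2
    import numpy as np

    # Synthetic back projection: a bright 20x20 blob in a 100x100 field
    backproj = np.zeros((100, 100), dtype=np.uint8)
    backproj[40:60, 40:60] = 255

    # Stop after 10 iterations or when the window moves by less than 1 pixel
    term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

    # Initial search window (x, y, w, h) only partially covering the blob
    bbox = (30, 30, 30, 30)

    rot_rect, bbox = cv2.CamShift(backproj, bbox, term_crit)
    (cx, cy), (w, h), angle = rot_rect

    print(cx, cy)  # close to (49.5, 49.5), the center of the blob
    ```

    Unlike plain mean shift, CamShift also adapts the window's size and orientation, which is why it returns a rotated rectangle rather than an axis-aligned box.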

  • Lines 42–44: These lines compute the four corner points of the rotated bounding box returned by CamShift, convert them to integer coordinates, and draw the box over the region of interest on each video frame.

  • Line 47: Displays the results on the screen.

  • Lines 50–51: The waitKey() method waits 15 milliseconds for a keypress on each frame; the loop exits if the user presses the 'q' key.

  • Lines 54–55: Release the video capture object and close windows.

Conclusion

OpenCV provides a range of powerful techniques and algorithms for accurate object tracking across diverse videos. We can also track objects from a live camera feed.


Copyright ©2024 Educative, Inc. All rights reserved