Detect Objects Using Python and OpenCV

Nabarun Chakraborti
6 min read · Aug 10, 2020

This is a simple, introductory guide for anyone who has never done any video processing to detect objects such as cars, humans, and buses. If you have some free time and want to experiment, please follow along. I hope it gives you some joy as a beginner.

What is OpenCV?

OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine learning software library, used mainly for computer vision, machine learning, and image processing. The library has more than 2,500 optimized algorithms. It helps process images and videos to identify objects, faces, and handwriting, track camera movements, stitch images together, find similar images in an image database, and a lot more.

Here I will demonstrate how easily we can detect humans, cars, two-wheelers, and buses in a video file by combining OpenCV with Python.

I hope it will be fun to learn. I should admit up front, though, that the available classifiers will not give you accurate results. Still, as a beginner it is nice to see our program identify different objects in arbitrary image and video files.

What do you need to start?

  • Python installed on your system.
  • An IDE or editor for programming. I use PyCharm.
  • The opencv-python package, which can be installed with pip install opencv-python.
  • The classifiers for object detection. Search online for the following classifier files and download them to your local machine: cars.xml, pedestrian.xml, Bus_front.xml, two_wheeler.xml
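
Before going further, it can save some confusion to check that the four classifier files are where the program expects them. The sketch below is one possible way to do that using only the standard library. The helper name missing_classifiers is hypothetical, and it assumes the files sit in the current working directory unless another folder is passed in.

```python
from pathlib import Path

# Classifier file names used later in this article.
CLASSIFIER_FILES = ["cars.xml", "pedestrian.xml", "Bus_front.xml", "two_wheeler.xml"]


def missing_classifiers(folder=".", names=CLASSIFIER_FILES):
    """Return the names from `names` that are not present in `folder`.

    Hypothetical helper for illustration; it is not part of OpenCV.
    """
    base = Path(folder)
    return [name for name in names if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_classifiers()
    if missing:
        print("Missing classifier files:", ", ".join(missing))
    else:
        print("All classifier files found.")
```

A check like this is useful because a wrong path tends to surface only later, as an error from the detection call, rather than at the point where the classifier is loaded.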

Now we are all set to start. Before working on video files, let's look at how the logic and the library work on an image file.

Work on an image file first:

Let's say we have to process an image to identify the humans and cars in it.

We will use classifiers to identify the object types. A few kinds of classifier are available, and I'm using a Haar cascade classifier. The classifier is an XML file with a lot of definitions (patterns) inside. When some object matches those defined patterns, our code identifies and categorizes that object. The patterns are simple arrangements of light and dark rectangles.

To keep it simple, take the car from the image and apply the defined patterns to see whether any combination of them fits. One does, so we can say it's a car.
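
To make the idea of a "pattern" a little more concrete, here is a simplified illustration of one Haar-like feature: the sum of the pixels in the left half of a window minus the sum in the right half, computed quickly with a summed-area table (integral image). This is a sketch of the general idea only, not how OpenCV implements its cascades internally, and the function names and the tiny 4x4 patch are made up for the example.

```python
def integral_image(gray):
    """Build a summed-area table with an extra zero row and column.

    ii[r][c] holds the sum of all pixels above and to the left of (r, c).
    """
    rows, cols = len(gray), len(gray[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += gray[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_edge_feature(ii, x, y, w, h):
    """Left half minus right half; a large magnitude suggests a vertical edge."""
    half = w // 2
    left = rect_sum(ii, x, y, half, h)
    right = rect_sum(ii, x + half, y, half, h)
    return left - right


# A made-up 4x4 grey patch: dark on the left, bright on the right.
patch = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
ii = integral_image(patch)
print(two_rect_edge_feature(ii, 0, 0, 4, 4))  # prints -1520
```

A flat patch with no edge gives 0 for the same feature. A cascade combines many such features, at many positions and sizes, to decide whether a window looks like the object it was trained on.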

Step by Step Code Walk-through:

1. Read the file using OpenCV and create an image instance.

2. Define the classifiers.

3. Convert the color image to a grey image for faster processing. In most cases the patterns, not the colors, are what matter for identifying objects.

4. Create trackers for the individual entities (car, human, etc.) by passing the classifiers to OpenCV's CascadeClassifier constructor.

5. Apply the trackers to the grey image to find the positions of the objects (car, human, etc.).

Printing the result shows a multi-dimensional array.

The array contains the locations of the objects detected by the program, one row per object in the form (x, y, width, height).

6. Iterate through this multi-dimensional array and draw a rectangle around each object.

7. The final output is the original image with a labelled rectangle drawn around each detected object.
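
Steps 5 and 6 can be sketched without OpenCV to show what the detection array means. Each row is (x, y, w, h): the top-left corner of the box plus its width and height, which is why the drawing code uses (x, y) and (x + w, y + h) as the two opposite corners. The helper name to_corners and the sample numbers below are made up for illustration and stand in for real detection output.

```python
def to_corners(detections):
    """Convert (x, y, w, h) rows into ((x1, y1), (x2, y2)) corner pairs."""
    return [
        ((int(x), int(y)), (int(x + w), int(y + h)))
        for (x, y, w, h) in detections
    ]


# Made-up values in the same layout as the printed detection array.
sample = [[120, 80, 60, 40], [300, 150, 90, 70]]

for top_left, bottom_right in to_corners(sample):
    print(top_left, bottom_right)
# prints:
# (120, 80) (180, 120)
# (300, 150) (390, 220)
```

When nothing is detected the result is empty, so the loop simply does not run and no rectangle is drawn.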

The complete Image processing program:

import cv2

# Source data
img_file = "img1.jpg"

# Create an OpenCV image; imread returns None if the file cannot be read
img = cv2.imread(img_file)
if img is None:
    raise SystemExit("Could not read " + img_file)

# Pre-trained car and pedestrian classifiers
car_classifier = 'cars.xml'
pedestrian_classifier = 'pedestrian.xml'

# Convert the color image to a grey image
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Create trackers from the classifiers using OpenCV
car_tracker = cv2.CascadeClassifier(car_classifier)
pedestrian_tracker = cv2.CascadeClassifier(pedestrian_classifier)

# Detect cars and pedestrians
cars = car_tracker.detectMultiScale(gray_img)
pedestrian = pedestrian_tracker.detectMultiScale(gray_img)

# Display the coordinates of the detected objects (multi-dimensional arrays)
print(cars)
print(pedestrian)

# Draw a green rectangle and label around each car
for (x, y, w, h) in cars:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(img, 'Car', (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)

# Draw a red rectangle and label around each pedestrian
for (x, y, w, h) in pedestrian:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)
    cv2.putText(img, 'Human', (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)

# Finally display the image with the markings
cv2.imshow('my detection', img)

# Wait for a keystroke, then close the window
cv2.waitKey()
cv2.destroyAllWindows()

print("I'm done")
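
One small detail in the program above: each label is placed 10 pixels above its box, at (x, y - 10). If an object is detected right at the top of the image, that point ends up above the visible area and the label is cut off. A possible fix is to clamp the label position, as in the sketch below. The helper name label_origin and the min_y value of 20 are assumptions chosen for this example, not anything defined by OpenCV.

```python
def label_origin(x, y, offset=10, min_y=20):
    """Return the bottom-left point for a text label above a box at (x, y).

    If the box is too close to the top edge, the label is pushed down
    to `min_y` so that it stays inside the image.
    """
    return (int(x), max(int(y) - offset, min_y))


print(label_origin(50, 100))  # prints (50, 90)
print(label_origin(50, 5))    # prints (50, 20)
```

The returned point could then be passed to cv2.putText in place of (x, y - 10).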

Let's process a video file now:

The basic logic remains the same when we work with video files. Here we loop through the video file, treat each frame as an image, and apply the same logic.

That means we read the video file first.

Then we define the classifiers and trackers, as in the image processing program above.

After that we iterate through the video file until the end, reading it frame by frame. Each frame is converted to a grey image, the objects are detected, and rectangles are drawn.
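
The "read until the end" part of this loop can be sketched on its own. The generator below keeps calling read() until it reports failure, which is how the end of the file shows up. It should work with any object whose read() returns a (success, frame) pair, as cv2.VideoCapture does. The names iter_frames and FakeCapture are hypothetical; FakeCapture only exists so that the sketch runs without a video file.

```python
def iter_frames(capture):
    """Yield frames from `capture` until a read fails (end of the video)."""
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        yield frame


class FakeCapture:
    """Stand-in for a video capture object, used only for this sketch."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        # Mimic the (success, frame) pair returned by a real capture.
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


for frame in iter_frames(FakeCapture(["frame-1", "frame-2", "frame-3"])):
    print(frame)
```

With a real video, each yielded frame would be converted to grey and passed to the trackers exactly like the single image earlier.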

[Figure: a few sample screens captured from the processed video output]

The complete Video processing program:

import cv2

# Source data: video file
IP_file = 'Road3.mp4'

# Open the source video file
vid_file = cv2.VideoCapture(IP_file)

# Pre-trained classifiers
car_classifier = 'cars.xml'
pedestrian_classifier = 'pedestrian.xml'
bus_classifier = 'Bus_front.xml'
twowheeler_classifier = 'two_wheeler.xml'

# Trackers created from the classifiers
car_tracker = cv2.CascadeClassifier(car_classifier)
pedestrian_tracker = cv2.CascadeClassifier(pedestrian_classifier)
bus_tracker = cv2.CascadeClassifier(bus_classifier)
twowheeler_tracker = cv2.CascadeClassifier(twowheeler_classifier)

while True:
    # Read the video file frame by frame, like a sequence of images
    (read_successful, frame) = vid_file.read()

    # Stop at the end of the video (or if a frame cannot be read)
    if not read_successful:
        break

    # Convert to a grey scale image
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Detect cars, pedestrians, buses and two-wheelers
    # (second argument: scaleFactor, third argument: minNeighbors)
    cars = car_tracker.detectMultiScale(gray_frame, 1.1, 9)
    pedestrians = pedestrian_tracker.detectMultiScale(gray_frame, 1.1, 9)
    bus = bus_tracker.detectMultiScale(gray_frame, 1.1, 9)
    twowheeler = twowheeler_tracker.detectMultiScale(gray_frame, 1.1, 9)

    # Draw a rectangle around the cars
    for (x, y, w, h) in cars:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 0, 255), 2)
        cv2.putText(frame, 'Car', (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)

    # Draw a rectangle around the pedestrians
    for (x, y, w, h) in pedestrians:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, 'Human', (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)

    # Draw a rectangle around the buses
    for (x, y, w, h) in bus:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)
        cv2.putText(frame, 'Bus', (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)

    # Draw a rectangle around the two-wheelers
    for (x, y, w, h) in twowheeler:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (216, 255, 0), 2)
        cv2.putText(frame, 'Bike', (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)

    # Display the frame with the detected objects marked
    cv2.imshow('Detect Objects On Road', frame)

    # Capture a key press (wait 1 ms)
    key = cv2.waitKey(1)

    # Stop if Esc is pressed
    if key == 27:
        break

# Release the video capture object and close the window
vid_file.release()
cv2.destroyAllWindows()

print("That's it...")
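
The video program passes two extra values to detectMultiScale, 1.1 and 9. These are the scaleFactor and minNeighbors parameters. scaleFactor controls how much the search size changes between passes, and minNeighbors is how many overlapping candidate rectangles are needed before a detection is kept, so raising it usually means fewer but more reliable boxes. The sketch below gives a rough feel for scaleFactor by listing the window sizes that a search growing by 10% per step would cover. It is a simplification: OpenCV's internals differ in detail, and the base window of 24 pixels and the image side of 480 pixels are assumed values, as is the helper name window_sizes.

```python
def window_sizes(base, image_side, scale_factor=1.1):
    """List the approximate search-window sides tried for one image dimension.

    Starts at `base` and multiplies by `scale_factor` until the window
    no longer fits inside `image_side`.
    """
    sizes = []
    side = float(base)
    while side <= image_side:
        sizes.append(round(side))
        side *= scale_factor
    return sizes


# Assumed values: a 24-pixel base window and a 480-pixel image side.
sizes = window_sizes(base=24, image_side=480)
print(len(sizes), sizes[:5])  # prints 32 [24, 26, 29, 32, 35]
```

A larger scaleFactor such as 1.5 covers the same range in far fewer steps, which is faster but can skip the size at which an object would have matched.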

You can download any video file from YouTube or another source and try it.
