Welcome to yet another chapter in our Vision series. Computer vision has changed the course of Artificial Intelligence. The idea that a computer can see, interpret, and take actions just like a human would is really mind-blowing.
What is an Image?
An image, at a granular level, is made of pixels. The display resolutions you see in the market differ in the number of pixels along the width and the height.
- VGA (640 x 480)
- HD (1280 x 720)
- Full HD (1920 x 1080)
- 4K (3840 x 2160)
Say for example, if I want to write the number "9", it will look something like this on a 10 x 10 pixel matrix:
The boxes you see depict the pixels and the number written over it represents the light intensity at the pixel. We have given value 0 where the figure lies and 1 everywhere else. These kinds of images are also called "Binary Images".
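As a minimal sketch of the idea above, a binary image can be held in a small NumPy array. The figure drawn here (a short vertical bar rather than the "9") and the variable name `binary_img` are illustrative assumptions.

```python
import numpy as np

# Start with a 10 x 10 grid where every pixel is 1 (background).
binary_img = np.ones((10, 10), dtype=np.uint8)

# Set the pixels of a simple figure to 0: a vertical bar in column 5, rows 2 to 7.
binary_img[2:8, 5] = 0

print(binary_img)
print("Distinct pixel values:", np.unique(binary_img))
```

Printing the array gives the same kind of grid as the boxes described above, with 0 where the figure lies and 1 everywhere else.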
In practice, each pixel of a greyscale image is represented by an 8-bit value of light intensity, which places it between "0" and "255". "0" represents black and "255" represents white.
Color images on the other hand have 3 channels. A color image can be represented as a combination of Red, Green, and Blue color channels usually referred to as RGB channels. Light intensity for each color channel also lies between 0-255.
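The difference between the two representations shows up in the shape of the underlying array. The sketch below uses plain NumPy arrays with made-up sizes (2 x 3 pixels), so the names and dimensions are assumptions for illustration only.

```python
import numpy as np

# Greyscale: one 8-bit intensity per pixel, shape (height, width).
grey = np.zeros((2, 3), dtype=np.uint8)   # all black
grey[0, 0] = 255                          # one white pixel

# Color: three 8-bit values per pixel, shape (height, width, 3).
color = np.zeros((2, 3, 3), dtype=np.uint8)
color[0, 0] = (255, 0, 0)   # full intensity in the first channel only

print(grey.shape, grey.min(), grey.max())
print(color.shape, color.dtype)
```

Which color the first channel stands for depends on the library. OpenCV, introduced below, stores the channels in Blue, Green, Red order.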
To bridge the gap between human and machine interpretations, there are several libraries and frameworks available in the market. One of these widely used libraries is OpenCV.
OpenCV is an open-source library that is written in C and C++ and can run under Linux, Windows, and macOS. OpenCV is designed for computational efficiency and can take advantage of multi-core processors.
Enough theory. Let's get our hands dirty and have some fun with images using OpenCV.
Note: We will be using the Python version of the library.
You can install OpenCV using a simple pip command. Run the command below in your terminal.
pip3 install opencv-python
Now, let's import the library into your favorite IDE.
import cv2
print(cv2.__version__)
Once we have validated the version of the library, let's load an image and visualize it. An image can be loaded into memory using the "imread" function and visualized using the "imshow" function.
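import cv2
# Load the image from disk
img = cv2.imread("lena.jpg")
# Show it in a window and keep the window open until a key is pressed
cv2.imshow("Original Image", img)
cv2.waitKey(0)
cv2.destroyAllWindows()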
Now that we have loaded and visualized the image, what if we play around with the loaded image?
Oftentimes, the images we receive are of different sizes (height and width). We need to make them identical in size so that every image contributes the same number of pixel values. This can be done using the "resize" function in OpenCV.
img = cv2.resize(img, (500, 500))
As we mentioned before, color images have 3 channels, one each for Red, Green, and Blue. This makes color images heavier than single-channel greyscale images. Let's see how we can convert a color image into a greyscale image.
img_grey = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
The first line in the above code converts the color image into greyscale. The second line converts it into HSV format. We are making use of the cv2.cvtColor() function to change the color space of the image. There are more than 150 color-space conversion methods available in OpenCV.
An image can also contain a lot of noise. This noise can be reduced using several methods, one of which is Gaussian smoothing. Gaussian blurring is highly effective in removing Gaussian noise from an image. Let's see how we can apply it.
img_blur = cv2.GaussianBlur(src=img, ksize=(9, 9), sigmaX=0)
What happens if I modify the "kernel size" or "sigmaX" (the standard deviation of the Gaussian kernel along the X-axis)?
Why don't you try it on your own and see the effects of the modifications you make?
Do you remember when we learned about how the eyes work in 6th class Science? When we see an image, our eyes gather information about it and send it to our brain in the form of signals via neurons. Our brain then does some computation and identifies the image.
Have you ever wondered what kind of information is passed to your brain by the eyes?
Our eyes perceive the brightness of the light reflected from what we are looking at. This brightness varies over different areas of the image. The points at which image brightness changes sharply are typically organized into a set of curved line segments termed edges.
OpenCV provides a function that applies the Canny edge detection algorithm to an image to identify its edges. The algorithm was developed by John F. Canny in 1986. Let's see how we can apply it to our image.
img_canny = cv2.Canny(img_grey, 60, 60)
So far we have seen some of the basic functions that help a computer perceive an image the way a human would. But do we stop here?
What if we don't stop at edge detection and take it up a notch? Let's step into some advanced functions.
Many cool mobile applications apply numerous filters to images to enhance their look and feel. Consider an image of the former captain of the Indian cricket team, M. S. Dhoni.
Is it possible to extract only the jersey number and the name of the player?
Yes, it is.
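import cv2
import numpy as np

# Load the image; cv2.imread returns the pixels in BGR channel order.
img = cv2.imread("m_s_dhoni.jpg")

# Convert to HSV. The bounds below are tied to this conversion flag;
# with a different flag (for example COLOR_BGR2HSV) the hue values shift
# and the bounds would need to be retuned.
img_hsv = cv2.cvtColor(img, cv2.COLOR_RGB2HSV)

# Lower and upper HSV bounds of the color we want to keep.
lower = np.array([78, 101, 160])
upper = np.array([136, 255, 255])

# Build the mask: 255 where a pixel falls inside the bounds, 0 elsewhere.
mask = cv2.inRange(img_hsv, lower, upper)

# Keep the original pixels only where the mask is set.
color_mask = cv2.bitwise_and(img, img, mask=mask)

# Show the results and wait for a key press.
cv2.imshow("original", img)
cv2.imshow("mask", mask)
cv2.imshow("color mask", color_mask)
cv2.waitKey(0)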
In the above code, we are performing the following actions:
- Load the image
- Convert the image to HSV format
- Identify the right HSV combination for the color we want.
- Build a mask of the area which has the specified HSV configuration
- Apply the mask onto the image using bitwise_and function
Let's see the results.
Isn't it cool?
This was a brief introduction to the various functions available in OpenCV. These operations really open up new doors in the field of Artificial Intelligence.
In future articles, we will talk about some practical applications of these image operations and how they can be used to make smart solutions. Stay tuned for some awesome content.