Vision: Chapter 3 - Google Vision
In the previous chapters we explored what vision is and how we can leverage open-source tools such as OpenCV to perform certain actions on images.
In today's article we are going to look at a variety of computer vision use cases that we can tackle on images right out of the box, using the native services of one of the leading cloud platforms in AI: Google Cloud Platform (GCP).
Before we jump in, let's try to understand the various use cases that can come up for an AI designer dealing with computer vision:
Image Classification - Often, when building organisational products, it is essential to be able to tell explicit (NSFW) content apart from safe-for-work (SFW) content. This kind of categorisation falls under image classification.
Object Detection - Most firms and organisations deal with multiple vendors and clients at the same time, so it becomes critical that they can identify and organise their documents based on which vendor each one belongs to. Under such circumstances, leveraging object detection to identify a company's logo or trademark proves extremely beneficial.
Image Segmentation - Say you are setting up a security system and need to identify weapons in X-ray scans, mapping their contours to highlight and cross-verify them. We use masking neural networks to map those contours and to identify and tag the objects. This is called image segmentation.
Now that you have a rough understanding of the broad categories of problem statements possible under computer vision, how about we go ahead and explore a couple of them?
Let's follow the steps below to setup the necessary environment to run our tests.
Step 0: Setup Vision API
The basic requirement for setting up the API is procuring a cloud account. As this is a fairly detailed procedure and out of the scope of this article, you can follow this link to set up the API for your system.
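Once your cloud project and service-account key are in place, the client library discovers the key file through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. A minimal sketch; the key path below is a placeholder, substitute the path to your own JSON key:

```python
import os

# Tell the client library where your service-account key lives.
# The path below is a placeholder - use the path to your own key file.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/service-account-key.json"
```

You can equally export this variable in your shell before launching Python; either way it must be set before the client is created.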
Step 1: Install the necessary Packages
pip3 install google-cloud-vision
Step 2: Open your Code Editor and get yourself a cup of Coffee

Step 3: Import the necessary Packages
from google.cloud import vision
import io
Step 4: Define a Client for Vision API
client = vision.ImageAnnotatorClient()
Step 5: Define a function to read Images
def read_image_file(path: str):
    with io.open(path, "rb") as image_file:
        content = image_file.read()
    image = vision.Image(content=content)
    return image
Test 1 - Face Detection
def detect_faces(path: str):
    image = read_image_file(path=path)
    response = client.face_detection(image=image)
    faces = response.face_annotations

    # Names of likelihood from google.cloud.vision.enums
    likelihood_name = (
        "UNKNOWN",
        "VERY_UNLIKELY",
        "UNLIKELY",
        "POSSIBLE",
        "LIKELY",
        "VERY_LIKELY",
    )

    print("Faces:")
    for face in faces:
        print("anger: {}".format(likelihood_name[face.anger_likelihood]))
        print("joy: {}".format(likelihood_name[face.joy_likelihood]))
        print("surprise: {}".format(likelihood_name[face.surprise_likelihood]))

        vertices = [
            "({},{})".format(vertex.x, vertex.y)
            for vertex in face.bounding_poly.vertices
        ]
        print("face bounds: {}".format(",".join(vertices)))
Test 2 - Label Detection
def detect_labels(path: str):
    image = read_image_file(path=path)
    response = client.label_detection(image=image)
    labels = response.label_annotations

    print("Labels:")
    for label in labels:
        print(label.description)
Test 3 - Landmark Detection
def detect_landmarks(path: str):
    image = read_image_file(path=path)
    response = client.landmark_detection(image=image)
    landmarks = response.landmark_annotations

    print("Landmarks:")
    for landmark in landmarks:
        print(landmark.description)
        for location in landmark.locations:
            lat_lng = location.lat_lng
            print("Latitude {}".format(lat_lng.latitude))
            print("Longitude {}".format(lat_lng.longitude))
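Since landmark detection returns latitude/longitude pairs, one natural follow-up is to measure how far a detected landmark lies from a reference point. A minimal great-circle (haversine) sketch in plain Python, independent of the Vision API itself (the function name is my own):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two lat/lng points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))  # 6371 km = mean Earth radius
```

For example, feeding in the coordinates of two detected landmarks gives you their separation in kilometres.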
Let's take another sip of our deliciously brewed coffee and then resume.

Test 4 - Logo Detection
def detect_logos(path: str):
    image = read_image_file(path=path)
    response = client.logo_detection(image=image)
    logos = response.logo_annotations

    print("Logos:")
    for logo in logos:
        print(logo.description)
Test 5 - Multiple Object Detection
def localize_objects(path: str):
    # Localize objects in the local image.
    image = read_image_file(path=path)
    objects = client.object_localization(image=image).localized_object_annotations

    print("Number of objects found: {}".format(len(objects)))
    for object_ in objects:
        print("\n{} (confidence: {})".format(object_.name, object_.score))
        print("Normalized bounding polygon vertices: ")
        for vertex in object_.bounding_poly.normalized_vertices:
            print(" - ({}, {})".format(vertex.x, vertex.y))
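Note that the object localizer returns normalized vertices in the 0-1 range, so to draw bounding boxes you must scale them back up to pixels. A small helper sketch (the function name and the plain `(x, y)` tuple format are my own; with the API you would pass in `(vertex.x, vertex.y)` pairs):

```python
def to_pixel_coords(normalized_vertices, width: int, height: int):
    """Scale (x, y) pairs in [0, 1] up to pixel coordinates for a width x height image."""
    return [(round(x * width), round(y * height)) for x, y in normalized_vertices]
```

For a 640x480 image, `to_pixel_coords([(0.1, 0.2), (0.5, 0.8)], 640, 480)` yields `[(64, 96), (320, 384)]`.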
Test 6 - Explicit Content Detection
def detect_safe_search(path: str):
    image = read_image_file(path=path)
    response = client.safe_search_detection(image=image)
    safe = response.safe_search_annotation

    # Names of likelihood from google.cloud.vision.enums
    likelihood_name = (
        "UNKNOWN",
        "VERY_UNLIKELY",
        "UNLIKELY",
        "POSSIBLE",
        "LIKELY",
        "VERY_LIKELY",
    )

    print("Safe search:")
    print("adult: {}".format(likelihood_name[safe.adult]))
    print("medical: {}".format(likelihood_name[safe.medical]))
    print("spoofed: {}".format(likelihood_name[safe.spoof]))
    print("violence: {}".format(likelihood_name[safe.violence]))
    print("racy: {}".format(likelihood_name[safe.racy]))
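In a real moderation pipeline you would rarely just print these values; you would threshold them. Because the likelihoods are ordered integer enums (0 = UNKNOWN up to 5 = VERY_LIKELY), a simple gate can reject anything at LIKELY or above. A sketch in plain Python; the helper name and the choice of threshold are my own convention, not part of the API:

```python
LIKELY = 4  # index of "LIKELY" in the likelihood tuple above

def is_safe(likelihoods: dict, threshold: int = LIKELY) -> bool:
    """likelihoods maps a category name to its integer likelihood (0-5)."""
    return all(value < threshold for value in likelihoods.values())
```

You would populate the dictionary from the annotation, e.g. `{"adult": safe.adult, "racy": safe.racy, ...}`, tuning the threshold to how conservative your product needs to be.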
If you have followed all the steps above and declared the functions in a single Python file, then all that is left is to call any one of them, pass it an image path, and enjoy your results.
You can find a detailed taxonomy and the response structures, as well as additional features that may fit your use case, at the link below.

As a bonus, if you want to explore Google Cloud Vision further, click HERE!
Conclusion
From our discussion so far, we have seen that leveraging cloud-native services can provide state-of-the-art performance, even for niche problem statements, with far less hassle.
I hope this article finds you well. The variety of use cases and possible implementations is limited only by your imagination.
Keep Exploring and Keep Tinkering. STAY TUNED for more content. 😁