Vision: Chapter 4 - AWS Rekognition

Computer Vision Aug 30, 2021
As the saying goes, the thirst for knowledge can never be quenched.

In our previous chapter we explored performing Computer Vision tasks with a tool from one of the leading cloud platforms: the Vision API on Google Cloud Platform.

But one isn't enough.

So today we open another chapter in our journey through Vision. This one is about the offering of another leading platform: Rekognition on Amazon Web Services (AWS).

Before jumping into the implementation, let's look at the problem statements you may face when bringing Computer Vision into a real-world application.

Image Classification - Identifying images and categorising them into their respective classes
Object Detection - Pinpointing the exact location of a specific type of object, tagging it with its class and returning the coordinates of its bounding box
Label Detection - Categorising and tagging the objects in an image using general-purpose tags
Explicit Content Detection - Some content may be Not Safe for Work (NSFW) or adult in nature and shouldn't be shown to children. This type of detection comes in handy in such cases
Face Detection - Identifying faces is a very common use case in the security industry, for tracking, tagging and identifying people
Text Detection - Reading text from banners, flyers, menus or documents and translating it solves a significant real-world problem for travellers

Without wasting much time, let's get right to it.

Step 0: Set up the environment

If you already have an account and have set up your workspace, feel free to skip this step. First timers, please follow the link below to set up your account.

AWS CLI and SDK - Setup for Devs
Before one embarks on a Journey to explore the plethora of services, heading towards innovation, the biggest obstacle for any developer is setting up their system. Sometimes going through vast expanse of documentation can prove to be extremely exhaustive. This article aims at bringing together all t…

Step 1: Install Client Library

pip3 install boto3

Step 2: Import the necessary packages

import io
import json

import boto3

client = boto3.client("rekognition")

For this article we will be using Boto3, the Python SDK that AWS offers to developers.

Detailed documentation for it can be found right here.

Step 3: Read the Image

Since we are working with images from the local system, we create a small function that reads an image and returns it in the Bytes format the API expects, so that we can reuse it in every subsequent utility.

def read_image_file(path: str):
    with io.open(path, "rb") as image_file:
        content = image_file.read()
    return {"Bytes": content}
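The Image argument of Rekognition calls can also point at an object stored in S3 instead of carrying raw bytes. Here is a minimal sketch of a helper that builds either form. The name build_image_request, the bucket and key values, and the 5 MB ceiling for inline bytes are assumptions for illustration, so check the current service quotas before relying on the limit.

```python
import os
import tempfile
from typing import Optional

# Assumed ceiling for images sent inline as bytes; verify against current quotas
MAX_INLINE_BYTES = 5 * 1024 * 1024


def build_image_request(path: Optional[str] = None,
                        bucket: Optional[str] = None,
                        key: Optional[str] = None) -> dict:
    """Hypothetical helper: build the Image argument from a local file or an S3 object."""
    if path is not None:
        with open(path, "rb") as image_file:
            content = image_file.read()
        if len(content) > MAX_INLINE_BYTES:
            raise ValueError("Image too large to send inline; upload it to S3 instead")
        return {"Bytes": content}
    if bucket and key:
        # The service reads the object directly from S3
        return {"S3Object": {"Bucket": bucket, "Name": key}}
    raise ValueError("Provide either a local path or an S3 bucket and key")


# Demo with a throwaway file standing in for a real image
with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as tmp:
    tmp.write(b"not-a-real-image")
local_request = build_image_request(path=tmp.name)
os.remove(tmp.name)

# Placeholder bucket and key, not real resources
s3_request = build_image_request(bucket="my-example-bucket", key="photos/cat.jpg")
print(sorted(local_request), s3_request)
```

Either dictionary can be passed as the Image argument of the calls used in the tests that follow.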

Test 1 - Face Detection

def detect_faces(image_path: str):

    request_structure = read_image_file(path=image_path)
    # Attributes=['ALL'] is required to get AgeRange, Gender, Smile, Eyeglasses and Emotions
    response = client.detect_faces(Image=request_structure, Attributes=['ALL'])

    print('Detected faces for ' + image_path)
    for faceDetail in response['FaceDetails']:
        print('The detected face is between ' + str(faceDetail['AgeRange']['Low'])
              + ' and ' + str(faceDetail['AgeRange']['High']) + ' years old')

        print('Here are the other attributes:')
        print(json.dumps(faceDetail, indent=4, sort_keys=True))

        # Access predictions for individual face details and print them
        print("Gender: " + str(faceDetail['Gender']))
        print("Smile: " + str(faceDetail['Smile']))
        print("Eyeglasses: " + str(faceDetail['Eyeglasses']))
        # First emotion entry returned for this face
        print("Emotions: " + str(faceDetail['Emotions'][0]))

    return response
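The full face record is verbose, so it can help to condense each entry of FaceDetails into a few fields. The sketch below assumes a hypothetical helper named summarise_face and runs on a hand-made sample dictionary that only mimics the shape of a response; it is not real service output.

```python
def summarise_face(face_detail: dict) -> dict:
    """Hypothetical helper: reduce one FaceDetails entry to a few fields."""
    emotions = face_detail.get("Emotions", [])
    # Pick the emotion with the highest confidence instead of relying on list order
    top = max(emotions, key=lambda e: e["Confidence"]) if emotions else None
    age = face_detail.get("AgeRange", {})
    return {
        "age_range": (age.get("Low"), age.get("High")),
        "gender": face_detail.get("Gender", {}).get("Value"),
        "smiling": face_detail.get("Smile", {}).get("Value"),
        "top_emotion": top["Type"] if top else None,
    }


# Hand-made sample shaped like one FaceDetails entry (illustrative values only)
sample_face = {
    "AgeRange": {"Low": 25, "High": 35},
    "Gender": {"Value": "Female", "Confidence": 99.0},
    "Smile": {"Value": True, "Confidence": 95.0},
    "Emotions": [
        {"Type": "CALM", "Confidence": 12.5},
        {"Type": "HAPPY", "Confidence": 85.0},
    ],
}

face_summary = summarise_face(sample_face)
print(face_summary)
```

In practice you would call summarise_face on each item of response['FaceDetails'].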

Test 2 - Label Detection

def detect_labels_local_file(image_path: str):

    request_structure = read_image_file(path=image_path)
    response = client.detect_labels(Image=request_structure)
        
    print('Detected labels in ' + image_path)
    for label in response['Labels']:
        print (label['Name'] + ' : ' + str(label['Confidence']))
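Some labels also carry an Instances list whose bounding boxes are expressed as ratios of the image size. To draw or crop them you usually need pixel coordinates. The following is a rough sketch of that conversion; the helper names to_pixel_box and label_instances are hypothetical, and the sample response is hand-made to mimic the response shape.

```python
def to_pixel_box(box: dict, image_width: int, image_height: int) -> tuple:
    """Convert a ratio-based BoundingBox into (left, top, right, bottom) pixels."""
    left = round(box["Left"] * image_width)
    top = round(box["Top"] * image_height)
    right = round((box["Left"] + box["Width"]) * image_width)
    bottom = round((box["Top"] + box["Height"]) * image_height)
    return left, top, right, bottom


def label_instances(response: dict, image_width: int, image_height: int,
                    min_confidence: float = 80.0) -> list:
    """Hypothetical helper: list (label name, pixel box) for confident instances."""
    results = []
    for label in response.get("Labels", []):
        for instance in label.get("Instances", []):
            if instance["Confidence"] >= min_confidence:
                box = to_pixel_box(instance["BoundingBox"], image_width, image_height)
                results.append((label["Name"], box))
    return results


# Hand-made sample shaped like a label detection response (illustrative values only)
sample_labels = {
    "Labels": [
        {
            "Name": "Car",
            "Confidence": 98.0,
            "Instances": [
                {
                    "BoundingBox": {"Left": 0.25, "Top": 0.5, "Width": 0.5, "Height": 0.25},
                    "Confidence": 97.5,
                }
            ],
        },
        # Scene-level labels typically have no instances
        {"Name": "Sky", "Confidence": 90.0, "Instances": []},
    ]
}

boxes = label_instances(sample_labels, image_width=800, image_height=600)
print(boxes)
```

The image width and height are not part of the response, so they have to come from the image itself, for example by opening it with an imaging library.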

Let's take a moment here to appreciate the results we have got and cherish our freshly brewed coffee.

Test 3 - Explicit Content Detection

Sometimes we come across content that is not suited to the environment we are in or to the age group of the audience.

In such cases it becomes crucial to be able to filter out content that may be unsuitable for the given circumstances.

def detect_explicit_content(image_path: str):

    request_structure = read_image_file(path=image_path)
    response = client.detect_moderation_labels(Image=request_structure)
    
    print('Detected labels for ' + image_path)
    for label in response['ModerationLabels']:
        print (label['Name'] + ' : ' + str(label['Confidence']))
        print (label['ParentName'])
    return len(response['ModerationLabels'])
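Moderation labels form a two-level hierarchy through ParentName, so a simple way to act on them is to group the detailed labels under their top-level category and treat an empty result as safe. This is only a sketch: the helper names, the 60% threshold and the sample response are assumptions for illustration, not real service output.

```python
def group_moderation_labels(response: dict, min_confidence: float = 60.0) -> dict:
    """Hypothetical helper: map each top-level category to its detailed labels."""
    grouped = {}
    for label in response.get("ModerationLabels", []):
        if label["Confidence"] < min_confidence:
            continue
        parent = label.get("ParentName")
        if parent:
            grouped.setdefault(parent, []).append(label["Name"])
        else:
            # A label without a parent is itself a top-level category
            grouped.setdefault(label["Name"], [])
    return grouped


def is_safe(response: dict, min_confidence: float = 60.0) -> bool:
    """Treat the image as safe when no label reaches the threshold."""
    return not group_moderation_labels(response, min_confidence)


# Hand-made sample shaped like a moderation response (illustrative values only)
sample_moderation = {
    "ModerationLabels": [
        {"Name": "Violence", "ParentName": "", "Confidence": 91.0},
        {"Name": "Weapon Violence", "ParentName": "Violence", "Confidence": 91.0},
        {"Name": "Alcohol", "ParentName": "", "Confidence": 40.0},
    ]
}

grouped_labels = group_moderation_labels(sample_moderation)
print(grouped_labels, is_safe(sample_moderation))
```

The right threshold depends on how strict the audience requires the filter to be, so it is worth tuning on your own images.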

Test 4 - PPE (Personal Protective Equipment) Detection

Since the pandemic began, there has been a pressing need to verify that people are maintaining distance and taking the necessary precautions for their own and the public's safety.

Creating custom models to identify several different kinds of safety equipment might be a little overwhelming for most of us.

AWS made our lives easier by introducing the PPE detection feature as a part of the suite of services offered under Rekognition.

You can select from a range of Attributes that you might want to look for in an image. Further details can be found here.

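def display_summary(summary_type: str, summary: list):
    # Minimal helper (an assumption, adapt as needed): prints the person IDs in one summary category
    ids = ', '.join(str(person_id) for person_id in summary) if summary else 'None'
    print(summary_type + '\n\tIDs: ' + ids)

def detect_ppe(image_path: str):

    request_structure = read_image_file(path=image_path)
    response = client.detect_protective_equipment(
        Image=request_structure,
        SummarizationAttributes={
            'MinConfidence': 80,
            'RequiredEquipmentTypes': ['FACE_COVER', 'HAND_COVER', 'HEAD_COVER'],
        },
    )

    print('Detected PPE for people in image ' + image_path)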
    print('\nDetected people\n---------------')   
    for person in response['Persons']:
        
        print('Person ID: ' + str(person['Id']))
        print ('Body Parts\n----------')
        body_parts = person['BodyParts']
        if len(body_parts) == 0:
            print ('No body parts found')
        else:
            for body_part in body_parts:
                print('\t'+ body_part['Name'] + '\n\t\tConfidence: ' + str(body_part['Confidence']))
                print('\n\t\tDetected PPE\n\t\t------------')
                ppe_items = body_part['EquipmentDetections']
                if len(ppe_items) ==0:
                    print ('\t\tNo PPE detected on ' + body_part['Name'])
                else:    
                    for ppe_item in ppe_items:
                        print('\t\t' + ppe_item['Type'] + '\n\t\t\tConfidence: ' + str(ppe_item['Confidence'])) 
                        print('\t\tCovers body part: ' + str(ppe_item['CoversBodyPart']['Value']) + '\n\t\t\tConfidence: ' + str(ppe_item['CoversBodyPart']['Confidence']))
                        print('\t\tBounding Box:')
                        print ('\t\t\tTop: ' + str(ppe_item['BoundingBox']['Top']))
                        print ('\t\t\tLeft: ' + str(ppe_item['BoundingBox']['Left']))
                        print ('\t\t\tWidth: ' +  str(ppe_item['BoundingBox']['Width']))
                        print ('\t\t\tHeight: ' +  str(ppe_item['BoundingBox']['Height']))
                        print ('\t\t\tConfidence: ' + str(ppe_item['Confidence']))
            print()
        print()

    print('Person ID Summary\n----------------')
    display_summary('With required equipment',response['Summary']['PersonsWithRequiredEquipment'] )
    display_summary('Without required equipment',response['Summary']['PersonsWithoutRequiredEquipment'] )
    display_summary('Indeterminate',response['Summary']['PersonsIndeterminate'] )
   
    print()
    return len(response['Persons'])
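Beyond the summary of person IDs, it is often useful to know which required items a given person is missing. The sketch below walks one entry of Persons and collects the equipment types that are detected and reported as covering the body part. The helper name missing_equipment, the 80% threshold and the sample person are assumptions for illustration.

```python
def missing_equipment(person: dict, required_types: list,
                      min_confidence: float = 80.0) -> list:
    """Hypothetical helper: required equipment types not properly worn by one person."""
    worn = set()
    for body_part in person.get("BodyParts", []):
        for item in body_part.get("EquipmentDetections", []):
            covers = item.get("CoversBodyPart", {})
            # Count an item only when it is confidently detected and covers the body part
            if item["Confidence"] >= min_confidence and covers.get("Value"):
                worn.add(item["Type"])
    return sorted(set(required_types) - worn)


# Hand-made sample shaped like one Persons entry (illustrative values only)
sample_person = {
    "Id": 0,
    "BodyParts": [
        {
            "Name": "FACE",
            "Confidence": 99.0,
            "EquipmentDetections": [
                {
                    "Type": "FACE_COVER",
                    "Confidence": 99.0,
                    "CoversBodyPart": {"Value": True, "Confidence": 98.0},
                }
            ],
        },
        {"Name": "HEAD", "Confidence": 99.0, "EquipmentDetections": []},
        {
            "Name": "LEFT_HAND",
            "Confidence": 95.0,
            "EquipmentDetections": [
                {
                    "Type": "HAND_COVER",
                    "Confidence": 95.0,
                    "CoversBodyPart": {"Value": False, "Confidence": 90.0},
                }
            ],
        },
    ],
}

required = ["FACE_COVER", "HAND_COVER", "HEAD_COVER"]
missing = missing_equipment(sample_person, required)
print("Person", sample_person["Id"], "is missing:", missing)
```

Here a glove that is detected but does not cover the hand still counts as missing, which is usually the behaviour a safety check wants.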

Test 5 - OCR (Optical Character Recognition)

As mentioned above, digitising documents or reading text from banners, menus or flyers may prove extremely helpful as part of a broader picture.

Once the text has been extracted, it can feed into translation, text-to-speech and a lot more.

def detect_ocr(image_path: str):

    request_structure = read_image_file(path=image_path)
    response = client.detect_text(Image=request_structure)
    
    textDetections=response['TextDetections']
    print ('Detected text\n----------')
    for text in textDetections:
        print ('Detected text:' + text['DetectedText'])
        print ('Confidence: ' + "{:.2f}".format(text['Confidence']) + "%")
        print ('Id: {}'.format(text['Id']))
        if 'ParentId' in text:
            print ('Parent Id: {}'.format(text['ParentId']))
        print ('Type:' + text['Type'])
        print()
    return response
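Each detection carries a Type, which is either LINE or WORD, and words point back to their line through ParentId. For downstream steps such as translation you usually only want the lines, in order. A minimal sketch, assuming a hypothetical helper named extract_lines and a hand-made sample response that mimics the response shape:

```python
def extract_lines(response: dict, min_confidence: float = 80.0) -> list:
    """Hypothetical helper: return confident LINE detections as plain strings."""
    lines = [
        detection
        for detection in response.get("TextDetections", [])
        if detection["Type"] == "LINE" and detection["Confidence"] >= min_confidence
    ]
    # Keep the lines in the order of their Id values
    lines.sort(key=lambda detection: detection["Id"])
    return [detection["DetectedText"] for detection in lines]


# Hand-made sample shaped like a text detection response (illustrative values only)
sample_text = {
    "TextDetections": [
        {"Id": 0, "Type": "LINE", "DetectedText": "Daily Specials", "Confidence": 99.1},
        {"Id": 1, "Type": "LINE", "DetectedText": "Soup 4.50", "Confidence": 97.3},
        {"Id": 2, "Type": "LINE", "DetectedText": "~~", "Confidence": 35.0},
        {"Id": 3, "Type": "WORD", "DetectedText": "Daily", "Confidence": 99.0, "ParentId": 0},
        {"Id": 4, "Type": "WORD", "DetectedText": "Specials", "Confidence": 99.2, "ParentId": 0},
    ]
}

menu_lines = extract_lines(sample_text)
print("\n".join(menu_lines))
```

The joined string is what you would hand to a translation or text-to-speech step.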

Congratulations

If you were able to follow the steps above, you should have been able to explore a variety of Computer Vision related use cases.

As we saw, leveraging cloud platforms for the problem statements they cover can save you a lot of time and effort.

As a bonus, if you are looking for further details on AWS Rekognition, you can find them at the link below.

What is Amazon Rekognition? - Amazon Rekognition
Overview of Amazon Rekognition, a deep learning image analysis service.

Keep Exploring and Keep Tinkering. STAY TUNED for more content. 😁

Vaibhav Satpathy

AI Enthusiast and Explorer
