Video Intelligence Chapter 1: GCP

Video Intelligence Sep 22, 2021
Video is an electronic medium for the recording, copying, playback, broadcasting, and display of moving visual media.

The first-ever video was shot in the late 1800s, and the technology has been evolving rapidly ever since.

But we are not interested in history, are we?

So, let's talk about the present and the future. Since the introduction of the internet, the accumulation of data has drastically increased.

With high-speed internet connections and social media platforms like YouTube, Instagram, and Facebook, a huge amount of video content is introduced to the world every second.

With the huge pile of video content, there is a rising need to organize the content.

But maintaining and organizing it manually won't be a smart option, right?

Video Intelligence

The Cloud Video Intelligence API provides state-of-the-art results on video analysis. GCP offers two main ways to extract useful information from video.

Video Intelligence API

It provides pre-trained machine learning models that automatically recognize a vast number of objects, places, and actions in stored and streaming video.

Offering exceptional accuracy out-of-the-box, it’s highly efficient for common use cases and improves over time as new concepts are introduced.

AutoML Video Intelligence

But pre-trained models are not always enough. That brings us to AutoML Video Intelligence, which provides a graphical interface that makes it easy to train custom models to classify and track objects within videos, even if you have minimal machine learning experience.

It’s ideal for projects that require custom labels that aren’t covered by the pre-trained Video Intelligence API.

Setup and Usage

To start with Video Intelligence APIs, we need to first set up our GCP account with the necessary permissions and installations. Let's start.

Step 1: Create a GCP account

In our previous articles, we discussed in detail how to set up a GCP account. If you need help setting up the account, we recommend going through those articles before proceeding further.

GCP - Vertex AI Setup for Devs

The Video Intelligence API is available on the GCP free trial as well.

Step 2: Install necessary libraries

pip install --upgrade google-cloud-videointelligence

Step 3: Upload necessary files to Cloud Storage

To analyze a video file, upload it to Google Cloud Storage and note its URI; we will need it in the next steps.
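
The upload can also be done from Python. The sketch below is only one way to do it, and it assumes the separate google-cloud-storage package (pip install google-cloud-storage), which is not part of the install step above. The function names, the bucket name, and the file paths are hypothetical placeholders.

```python
def gcs_uri(bucket_name, blob_name):
    """Build the gs:// URI that the Video Intelligence API expects."""
    return f"gs://{bucket_name}/{blob_name}"


def upload_video(local_path, bucket_name, blob_name, credentials_path):
    """Upload a local video to Cloud Storage and return its gs:// URI.

    Assumption: the google-cloud-storage package is installed and the
    service account has write access to the bucket.
    """
    # Imported here so gcs_uri() works even without the package installed
    from google.cloud import storage

    client = storage.Client.from_service_account_json(credentials_path)
    blob = client.bucket(bucket_name).blob(blob_name)
    blob.upload_from_filename(local_path)
    return gcs_uri(bucket_name, blob_name)


# Hypothetical usage (placeholder names, needs real credentials to run):
# uri = upload_video("clip.mp4", "my-bucket", "videos/clip.mp4",
#                    "PATH_TO_CREDENTIAL_FILE.json")
print(gcs_uri("my-bucket", "videos/clip.mp4"))
```

The returned URI is the value to use for gcs_uri in the next step.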

Step 4: Get Analysis

Run the following Python script to get the video analysis.

"""All Video Intelligence API features run on a video stored on GCS."""
from google.cloud import videointelligence

gcs_uri = "gs://PATH_TO_VIDEO_FILE"
output_uri = "gs://PATH_TO_OUTPUT_JSON_FILE.json"

# Create the client from a service account key file
video_client = videointelligence.VideoIntelligenceServiceClient.from_service_account_file(
    "PATH_TO_CREDENTIAL_FILE.json"
)

# Getting results from all the features available in Video Intelligence API
features = [
    videointelligence.Feature.OBJECT_TRACKING,
    videointelligence.Feature.LABEL_DETECTION,
    videointelligence.Feature.SHOT_CHANGE_DETECTION,
    videointelligence.Feature.SPEECH_TRANSCRIPTION,
    videointelligence.Feature.LOGO_RECOGNITION,
    videointelligence.Feature.EXPLICIT_CONTENT_DETECTION,
    videointelligence.Feature.TEXT_DETECTION,
    videointelligence.Feature.FACE_DETECTION,
    videointelligence.Feature.PERSON_DETECTION,
]

# Transcription configurations
transcript_config = videointelligence.SpeechTranscriptionConfig(
    language_code="en-US", enable_automatic_punctuation=True
)

# Person Detection configurations
person_config = videointelligence.PersonDetectionConfig(
    include_bounding_boxes=True,
    include_attributes=False,
    include_pose_landmarks=True,
)

# Face Detection configurations
face_config = videointelligence.FaceDetectionConfig(
    include_bounding_boxes=True, include_attributes=True
)

# Video context bundling the per-feature configurations
video_context = videointelligence.VideoContext(
    speech_transcription_config=transcript_config,
    person_detection_config=person_config,
    face_detection_config=face_config,
)

operation = video_client.annotate_video(
    request={
        "features": features,
        "input_uri": gcs_uri,
        "output_uri": output_uri,
        "video_context": video_context,
    }
)

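# The request runs as a long-running operation; print its ID for reference
print("\nProcessing video.", operation)
print(f"\nOperation Id: {operation.operation.name}")

# Block until the analysis finishes (raises an error if it takes over 300 seconds)
result = operation.result(timeout=300)

print("\nFinished processing.")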

Let's understand what the above script is doing.

  1. Provide the GCS URI of the video file and the path where we want the output to be saved.
  2. Select the features we want to extract from the video.
  3. Provide the additional configuration some features need (here: speech transcription, person detection, and face detection).
  4. Send the request to extract the features, then sit back and relax while the long-running operation completes.

That was fairly simple, right?
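
Once operation.result() returns, the response can also be inspected directly in Python. The sketch below collects the segment-level labels. It assumes the 2.x Python client, where time offsets are exposed as datetime.timedelta objects. The function name, the confidence threshold, and the stand-in demo_result object (with made-up values, so the sketch runs without calling the API) are all hypothetical.

```python
from datetime import timedelta
from types import SimpleNamespace


def summarize_segment_labels(result, min_confidence=0.5):
    """Return (label, start_s, end_s, confidence) tuples from a result."""
    rows = []
    for annotation_result in result.annotation_results:
        for label in annotation_result.segment_label_annotations:
            for seg in label.segments:
                # Skip low-confidence detections
                if seg.confidence < min_confidence:
                    continue
                rows.append((
                    label.entity.description,
                    seg.segment.start_time_offset.total_seconds(),
                    seg.segment.end_time_offset.total_seconds(),
                    seg.confidence,
                ))
    return rows


def _fake_label(description, start_s, end_s, confidence):
    """Build a stand-in label annotation with the same attribute layout."""
    segment = SimpleNamespace(
        start_time_offset=timedelta(seconds=start_s),
        end_time_offset=timedelta(seconds=end_s),
    )
    return SimpleNamespace(
        entity=SimpleNamespace(description=description),
        segments=[SimpleNamespace(segment=segment, confidence=confidence)],
    )


# Made-up data standing in for the real `result` from the script above
demo_result = SimpleNamespace(
    annotation_results=[
        SimpleNamespace(
            segment_label_annotations=[
                _fake_label("sofa", 0, 12.5, 0.9),
                _fake_label("lamp", 0, 12.5, 0.2),  # filtered out
            ]
        )
    ]
)

for row in summarize_segment_labels(demo_result):
    print(row)
```

With the real script, passing result instead of demo_result should work the same way, provided label detection was among the requested features.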

Step 5: Time to visualize the results we got

Download the output JSON from the Cloud Storage path mentioned above. Upload your video and the output JSON to the tool linked below to visualize the results.
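
Before opening the visualiser, it can help to check what the downloaded JSON actually contains. The sketch below counts the entries in every list-valued field of the response. It assumes the file has a top-level "annotation_results" list with snake_case field names; the sample JSON is made up for illustration, and the function name is hypothetical.

```python
import json


def count_annotations(response):
    """Count entries in each list-valued field across annotation_results."""
    counts = {}
    for annotation_result in response.get("annotation_results", []):
        for key, value in annotation_result.items():
            # Only list-valued fields hold per-feature annotations
            if isinstance(value, list):
                counts[key] = counts.get(key, 0) + len(value)
    return counts


# Made-up sample mimicking the assumed layout of the output file
sample_json = """
{
  "annotation_results": [
    {
      "input_uri": "/my-bucket/videos/clip.mp4",
      "segment_label_annotations": [
        {"entity": {"description": "sofa"}},
        {"entity": {"description": "lamp"}}
      ],
      "shot_annotations": [
        {"start_time_offset": {}, "end_time_offset": {"seconds": 12}}
      ]
    }
  ]
}
"""

print(count_annotations(json.loads(sample_json)))

# For a real file (hypothetical local name):
# with open("output.json") as f:
#     print(count_annotations(json.load(f)))
```

A field that is missing or has a count of zero usually means the corresponding feature found nothing in the video or was not requested.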

Video Intelligence API Visualiser
Interactive visualiser for the Google Cloud Video Intelligence API.

See the sample video we processed, a clip from F.R.I.E.N.D.S. in which Joey tries to buy a birthday gift for his girlfriend. Enjoy!

Video Intelligence API results visualized
Isn't that cool?

Conclusion

As we have seen, the GCP Video Intelligence API is loaded with features and can be used as-is for a variety of use cases. The scripts mentioned in this article, and many more, are available in the GitHub repository shared below.

But that is not enough, is it?

Even though it covers the major use cases, there is still scope to improve the results and tailor them to our problem.

Don't worry, GCP has got us covered on that front too.

AutoML Video Intelligence can be used to train and customize the video intelligence results as per our needs. We will cover that in our future posts.

Till then, keep learning and stay tuned. :)

AI-kosh/video_intelligence/video_api at main · Chronicles-of-AI/AI-kosh
Archives of blogs on Chronicles of AI.


Arpit Jain

Machine Learning Engineer
