Getting hands-on with Amazon Comprehend – AWS Application Services for AI/ML – MLS-C01 Study Guide

Getting hands-on with Amazon Comprehend

In this section, you will build a pipeline that integrates AWS Lambda with Amazon Rekognition and Amazon Comprehend. The Lambda function reads an image file stored in an S3 bucket, extracts the text from the image, and detects the language of that text. The output is printed to CloudWatch Logs. The following is a diagram of the use case:

Figure 8.15 – Architecture diagram of the required use case

Let’s begin by creating an IAM role:

Navigate to the IAM console page.
Select Roles from the left-hand menu.
Select Create role.
Select Lambda as the trusted entity.
Add the following managed policies:
- AmazonS3ReadOnlyAccess
- AmazonRekognitionFullAccess
- ComprehendFullAccess
- CloudWatchFullAccess
Save the role as language-detection-from-image-role.
Now, let’s create the Lambda function. Navigate to Lambda > Functions > Create Function.
Name the function language-detection-from-image.
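Set the runtime to a currently supported Python 3 version. (This walkthrough was written for Python 3.6, which AWS Lambda no longer offers for new functions.)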
Use our existing role; that is, language-detection-from-image-role.
Download the code from https://github.com/PacktPublishing/AWS-Certified-Machine-Learning-Specialty-MLS-C01-Certification-Guide-Second-Edition/tree/main/Chapter08/Amazon%20Transcribe%20Demo/lambda_function, paste it into the function, and click Deploy.

This code reads the text from the image that you upload and detects the language of that text. It uses the detect_text API from Amazon Rekognition to extract text from the image and the batch_detect_dominant_language API from Amazon Comprehend to detect the language of the text.
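As a rough illustration of the flow just described, the following is a minimal sketch of what such a handler might look like. It is not the code from the book's repository. The helper names extract_lines and dominant_language are hypothetical, and the sketch assumes the standard S3 event shape and the boto3 response formats for detect_text and batch_detect_dominant_language.

```python
import urllib.parse


def extract_lines(rekognition, bucket, key):
    """Return the LINE-level text Rekognition finds in an image stored in S3."""
    response = rekognition.detect_text(
        Image={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    # Rekognition reports both LINE and WORD detections; keep only lines
    # so that each piece of text appears once.
    return [
        detection["DetectedText"]
        for detection in response["TextDetections"]
        if detection["Type"] == "LINE"
    ]


def dominant_language(comprehend, lines):
    """Return the highest-scoring language code for the text, or None."""
    if not lines:
        return None
    response = comprehend.batch_detect_dominant_language(
        TextList=[" ".join(lines)]
    )
    results = response["ResultList"]
    if not results or not results[0]["Languages"]:
        return None
    best = max(results[0]["Languages"], key=lambda lang: lang["Score"])
    return best["LanguageCode"]


def lambda_handler(event, context):
    # Imported here so the helpers above can be exercised without AWS access.
    import boto3

    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = urllib.parse.unquote_plus(record["object"]["key"])

    lines = extract_lines(boto3.client("rekognition"), bucket, key)
    language = dominant_language(boto3.client("comprehend"), lines)

    # print() output from a Lambda function ends up in CloudWatch Logs.
    print(f"Detected text: {lines}")
    print(f"Dominant language: {language}")
    return {"language": language}
```

Passing the clients in as arguments keeps the two helpers independent of AWS credentials, which makes them easy to try out with stand-in objects before deploying.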

Now, go to your AWS S3 console and create a bucket called language-detection-image.
Create a folder called input-image (in this example, you will only upload .jpg files).
Navigate to Properties > Events > Add notification.
Fill in the required fields in the Events section with the following information; then, click on Save:
- Name: image-upload-event
- Events: All object create events
- Prefix: input-image/
- Suffix: .jpg
- Send to: Lambda Function
- Lambda: language-detection-from-image
Navigate to Amazon S3 > language-detection-image > input-image. Upload the sign-image.jpg image to the folder. (This file is available in this book’s GitHub repository at https://github.com/PacktPublishing/AWS-Certified-Machine-Learning-Specialty-MLS-C01-Certification-Guide-Second-Edition/tree/main/Chapter08/Amazon%20Comprehend%20Demo/input_image).
This file upload will trigger the Lambda function. You can monitor the logs from CloudWatch > CloudWatch Logs > Log groups > /aws/lambda/language-detection-from-image.
Click on the streams and select the latest one. The detected language is printed in the logs, as shown in Figure 8.16:

Figure 8.16 – The logs in CloudWatch for verifying the output
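The S3 event notification configured in the console steps above can also be expressed with boto3. The following is a sketch under stated assumptions rather than part of the original walkthrough: the function names build_notification_config and apply_notification are hypothetical, and the Lambda function ARN must be supplied by you. The field values mirror the name, event type, prefix, and suffix entered in the console.

```python
def build_notification_config(lambda_arn):
    """Build the notification settings used in this walkthrough."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "Id": "image-upload-event",
                "LambdaFunctionArn": lambda_arn,
                # "All object create events" in the console.
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [
                            {"Name": "prefix", "Value": "input-image/"},
                            {"Name": "suffix", "Value": ".jpg"},
                        ]
                    }
                },
            }
        ]
    }


def apply_notification(bucket, lambda_arn):
    """Attach the notification to the bucket (requires AWS credentials)."""
    # Imported here so the config builder can be inspected without boto3.
    import boto3

    s3 = boto3.client("s3")
    # Note: this call replaces any notification configuration already
    # present on the bucket.
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification_config(lambda_arn),
    )
```

When the notification is set up through the API instead of the console, S3 also needs permission to invoke the function (for example, through Lambda's add_permission call), which this sketch does not cover.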

Important note

It is suggested that you use batch operations such as BatchDetectSentiment or BatchDetectDominantLanguage in your production environment. This is because calling the single-document API operations repeatedly can cause API-level throttling. More details are available here: https://docs.aws.amazon.com/comprehend/latest/dg/functionality.html.
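To make the batching advice concrete, here is a small sketch that groups many texts into as few BatchDetectDominantLanguage calls as possible. It assumes the documented limit of 25 documents per batch request and that every input text is non-empty; the names chunked and detect_languages are hypothetical, and entries reported in the response's ErrorList are simply left as None.

```python
MAX_BATCH = 25  # maximum number of documents per batch request


def chunked(items, size=MAX_BATCH):
    """Yield (start_index, sublist) pairs of at most `size` items each."""
    for start in range(0, len(items), size):
        yield start, items[start:start + size]


def detect_languages(comprehend, texts):
    """Return one dominant language code (or None) per input text, in order."""
    codes = [None] * len(texts)
    for start, batch in chunked(texts):
        response = comprehend.batch_detect_dominant_language(TextList=batch)
        for result in response["ResultList"]:
            languages = result["Languages"]
            if languages:
                best = max(languages, key=lambda lang: lang["Score"])
                # "Index" is the position of the document within this batch.
                codes[start + result["Index"]] = best["LanguageCode"]
    return codes
```

With a real client, this would be called as detect_languages(boto3.client("comprehend"), texts). Sixty texts would result in three requests instead of sixty.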

In this section, you learned how to use Amazon Comprehend to detect the language of text. The text is extracted from the image inside the Lambda function using Amazon Rekognition. In the next section, you will learn about translating the same text into English via Amazon Translate.

Translating documents with Amazon Translate

Most of the time, people prefer to communicate in their own language, even on digital platforms. Amazon Translate is a text translation service. You can provide documents or strings of text in various languages and get them back in a different language. It uses pre-trained deep learning models, so you do not need to worry about the models or how they are managed. You simply make API requests and get the results back.
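As an illustration of what such an API request can look like, the following sketch wraps the boto3 translate_text call. The function name translate_to_english is hypothetical, and the sketch assumes you pass in a client created with boto3.client("translate"). Setting the source language to "auto" asks the service to detect the source language itself.

```python
def translate_to_english(translate, text, source_language="auto"):
    """Translate `text` into English and return the translated string."""
    response = translate.translate_text(
        Text=text,
        SourceLanguageCode=source_language,
        TargetLanguageCode="en",
    )
    return response["TranslatedText"]
```

With AWS credentials configured, a call such as translate_to_english(boto3.client("translate"), "Bonjour le monde") would send the text to the service and return the English version.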

Some common uses of Amazon Translate include the following:

Preparing documents in multiple languages when there is an organization-wide requirement to do so. Translate can convert content from one language into many.
Translating online chat conversations in real time to provide a better customer experience.
Localizing website content into more languages faster and more affordably.
Applying sentiment analysis to text in different languages once it has been translated.
Providing non-English language support for a news publishing website.

Next, you will explore the benefits of Amazon Translate.