Important note
For production use cases, it is recommended to use AWS Lambda with AWS Step Functions if you have dependent services or a chain of services.
Using the same S3 bucket to store input and output objects is not recommended, because creating an output object in that bucket may trigger a recursive Lambda invocation. If you must use the same bucket, restrict the trigger event with a prefix and suffix filter, and store the output objects under a different prefix.
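The prefix and suffix advice in this note can be sketched as follows. This is a minimal illustration, not a definitive setup: the `input/` and `output/` prefixes, the `.jpg` suffix, and the helper names `build_trigger_config` and `output_key_for` are assumptions made for the example.

```python
def build_trigger_config(lambda_arn, prefix="input/", suffix=".jpg"):
    """Build an S3 notification configuration that fires the Lambda
    function only for new keys under `prefix` that end in `suffix`."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [
                            {"Name": "prefix", "Value": prefix},
                            {"Name": "suffix", "Value": suffix},
                        ]
                    }
                },
            }
        ]
    }


def output_key_for(input_key, input_prefix="input/", output_prefix="output/"):
    """Map an input key to a key under a different (hypothetical) prefix,
    so that writing the result cannot match the trigger filter again."""
    if not input_key.startswith(input_prefix):
        raise ValueError(f"unexpected key outside {input_prefix!r}: {input_key}")
    name = input_key[len(input_prefix):]
    return f"{output_prefix}{name}.txt"
```

You could pass the returned dictionary to `put_bucket_notification_configuration` on a boto3 S3 client as the `NotificationConfiguration` argument. Because every output key starts with `output/`, it never matches the `input/` filter, which avoids the recursive invocation.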
In this section, you learned how to combine multiple services and chain their outputs to achieve a particular use case outcome. You used Amazon Rekognition to detect text in an image, and then detected the language of that text with Amazon Comprehend. Next, you translated the same text into English with the help of Amazon Translate and printed the translated output to CloudWatch Logs for verification. In the next section, you will learn about Amazon Textract, which can be used to extract text from a document.
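As a compact recap, the chain summarized above might look like the following sketch. It is not the exact Lambda function from this section: the function name `detect_and_translate` and the returned dictionary layout are assumptions, and the three clients are passed in as arguments (for example, created with `boto3.client("rekognition")`, `boto3.client("comprehend")`, and `boto3.client("translate")`).

```python
def detect_and_translate(bucket, key, rekognition, comprehend, translate,
                         target="en"):
    """Detect text in an S3 image, find its language, and translate it."""
    # Step 1: Amazon Rekognition returns both LINE and WORD detections;
    # keep only the lines to avoid counting each word twice.
    response = rekognition.detect_text(
        Image={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    lines = [d["DetectedText"] for d in response["TextDetections"]
             if d["Type"] == "LINE"]
    text = " ".join(lines)
    if not text:
        return None  # nothing to translate

    # Step 2: Amazon Comprehend returns candidate languages with scores;
    # pick the highest-scoring one.
    languages = comprehend.detect_dominant_language(Text=text)["Languages"]
    if not languages:
        return None
    source = max(languages, key=lambda lang: lang["Score"])["LanguageCode"]

    # Step 3: Amazon Translate converts the text to the target language.
    translated = translate.translate_text(
        Text=text, SourceLanguageCode=source, TargetLanguageCode=target
    )["TranslatedText"]

    # Inside a Lambda function, printed output ends up in CloudWatch Logs.
    print(f"{source} -> {target}: {translated}")
    return {"source_language": source, "original": text,
            "translated": translated}
```

Passing the clients as parameters keeps the function easy to exercise with stand-in objects before you deploy it.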
Manually extracting information from documents is slow, expensive, and prone to errors. Traditional optical character recognition software needs a lot of customization, and it can still produce erroneous output. To avoid such manual processes and errors, you should use Amazon Textract. In a conventional pipeline, you convert the documents into images, detect bounding boxes around the text in those images, and then apply character recognition to read the text. Textract does all this for you, and it also extracts tables, forms, and other data with minimal effort. If Amazon Textract returns low-confidence results, you can send them for human review with Amazon Augmented AI (Amazon A2I).
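To make the idea of low-confidence results concrete, the following sketch separates the lines in a Textract response by confidence score. It assumes a response in the shape returned by the `detect_document_text` call on a boto3 Textract client; the function name `split_lines_by_confidence` and the threshold of 90 are illustrative assumptions, not recommended values.

```python
def split_lines_by_confidence(textract_response, threshold=90.0):
    """Split LINE blocks into confident text and text that needs review.

    `threshold` is a hypothetical cut-off on Textract's 0-100
    confidence score; tune it for your own documents.
    """
    confident, needs_review = [], []
    for block in textract_response.get("Blocks", []):
        # The response also contains PAGE and WORD blocks; skip them.
        if block.get("BlockType") != "LINE":
            continue
        if block.get("Confidence", 0.0) >= threshold:
            confident.append(block["Text"])
        else:
            needs_review.append(block["Text"])
    return confident, needs_review
```

The lines collected in `needs_review` are the natural candidates to hand over to a human review workflow, while the confident lines can go straight to downstream processing.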
Textract reduces the manual effort of extracting text from millions of scanned document pages. Once the information has been captured, actions can be taken on the text, such as storing it in different data stores, analyzing sentiments, or searching for keywords. The following diagram shows how Amazon Textract works:
Figure 8.19 – Block diagram representation of Amazon Textract and how it stores its output
Some common uses of Amazon Textract include the following:
Next, you will explore the benefits of Amazon Textract.
There are several reasons to use Textract: