Snowball, Snowball Edge, and Snowmobile – AWS Services for Data Migration and Processing – MLS-C01 Study Guide

Snowball, Snowball Edge, and Snowmobile belong to the same product family for physically transferring data between business operating locations and AWS. To move a large amount of data into and out of AWS, you can use any of the three. AWS DataSync is designed to move data from […]

AWS Storage Gateway – AWS Services for Data Migration and Processing – MLS-C01 Study Guide

Storage Gateway is a hybrid storage virtual appliance. It can run in three different modes: File Gateway, Tape Gateway, and Volume Gateway. It can be used to extend, migrate, and back up an on-premises data center to AWS.

Storing and transforming real-time data using Kinesis Data Firehose – AWS Services for Data Migration and Processing – MLS-C01 Study Guide

Many use cases require data to be streamed and stored for future analytics. To handle such cases, you can write a Kinesis consumer to read the Kinesis stream and store the data in S3. This solution needs an instance or […]
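As a rough illustration of the managed alternative named in the title, the sketch below sends events to a Kinesis Data Firehose delivery stream instead of running a custom consumer. The delivery stream name, the event shape, and the helper names are hypothetical; it assumes boto3 is installed and that the delivery stream already exists with an S3 destination.

```python
import json
from typing import Dict, Iterable, Iterator, List

# PutRecordBatch accepts at most 500 records per call.
MAX_RECORDS_PER_BATCH = 500


def to_batches(events: Iterable[dict],
               size: int = MAX_RECORDS_PER_BATCH) -> Iterator[List[Dict[str, bytes]]]:
    """Group events into Firehose-style record batches of at most `size`."""
    batch: List[Dict[str, bytes]] = []
    for event in events:
        # Newline-delimited JSON keeps objects separable once they land in S3.
        batch.append({"Data": (json.dumps(event) + "\n").encode("utf-8")})
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def send(events: Iterable[dict],
         stream_name: str = "example-delivery-stream") -> None:
    """Send events to Firehose (stream name is a hypothetical placeholder)."""
    import boto3  # assumption: boto3 and AWS credentials are available

    firehose = boto3.client("firehose")
    for batch in to_batches(events):
        response = firehose.put_record_batch(
            DeliveryStreamName=stream_name, Records=batch
        )
        # A production sender would retry the failed records; this sketch only reports them.
        if response["FailedPutCount"]:
            print("records not accepted:", response["FailedPutCount"])
```

Only `to_batches` runs without AWS access; `send` is where the network call happens, so the batching logic can be checked locally.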

Processing real-time data using Kinesis Data Streams – AWS Services for Data Migration and Processing – MLS-C01 Study Guide

Kinesis is Amazon’s streaming service and can be scaled based on requirements. It retains data for 24 hours by default, or optionally for up to 365 days. Kinesis Data Streams is used for large-scale data ingestion, analytics, and monitoring. Note: Amazon Kinesis shouldn’t […]
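The retention window described above (24 hours by default, up to 365 days) can be sketched as a small request builder. This is a minimal sketch, not the guide's own code: the stream name and helper names are hypothetical, and the second function assumes boto3 is installed and the stream exists.

```python
DEFAULT_RETENTION_HOURS = 24       # default retention stated in the text
MAX_RETENTION_HOURS = 365 * 24     # 365 days expressed in hours


def retention_request(stream_name: str, days: int) -> dict:
    """Build the parameters for a retention change, validating the range."""
    hours = days * 24
    if not DEFAULT_RETENTION_HOURS <= hours <= MAX_RETENTION_HOURS:
        raise ValueError("retention must be between 1 and 365 days")
    return {"StreamName": stream_name, "RetentionPeriodHours": hours}


def increase_retention(stream_name: str, days: int) -> None:
    """Apply the change; this call can only raise retention, not lower it."""
    import boto3  # assumption: boto3 and AWS credentials are available

    kinesis = boto3.client("kinesis")
    kinesis.increase_stream_retention_period(**retention_request(stream_name, days))
```

For example, `retention_request("example-stream", 7)` yields a request for 168 hours, while a value above 365 days is rejected before any API call is made.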

Querying S3 data using Athena – AWS Services for Data Migration and Processing – MLS-C01 Study Guide

Athena is a serverless service designed for querying data stored in S3. It is serverless because the client doesn’t manage the servers that are used for computation. To help you understand this, here’s an example, where you will use AwsDataCatalog created in AWS Glue on the S3 data and […]
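To make the idea of querying S3 data through a catalog concrete, here is a hedged sketch of submitting a query programmatically. The database name, SQL text, and results bucket are hypothetical placeholders; the catalog defaults to `AwsDataCatalog`, the name Athena uses for the Glue Data Catalog, and the second function assumes boto3 is installed.

```python
def athena_query_request(sql: str, database: str, output_location: str,
                         catalog: str = "AwsDataCatalog") -> dict:
    """Build the parameters for an Athena query against a catalog database."""
    if not output_location.startswith("s3://"):
        raise ValueError("Athena writes query results to an S3 location")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Catalog": catalog, "Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(sql: str, database: str, output_location: str) -> str:
    """Submit the query and return its execution ID (no servers to manage)."""
    import boto3  # assumption: boto3 and AWS credentials are available

    athena = boto3.client("athena")
    request = athena_query_request(sql, database, output_location)
    return athena.start_query_execution(**request)["QueryExecutionId"]
```

The query runs asynchronously, so a caller would typically poll the returned execution ID before reading results from the output location.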

Getting hands-on with AWS Glue ETL components – AWS Services for Data Migration and Processing – MLS-C01 Study Guide

In this section, you will use the Data Catalog components created earlier to build a job. You will start by creating a job: This is optional. Then, click on the Run job button. Figure 3.6: A screenshot of the AWS Glue ETL job. Figure 3.7: A […]

Features of AWS Glue – AWS Services for Data Migration and Processing – MLS-C01 Study Guide

AWS Glue is a fully managed, serverless ETL service on AWS. Central to it is the Data Catalog, and that’s the secret to its success. It helps with discovering data from data sources and understanding a bit about it. As you now have a brief idea of […]

Technical requirements – AWS Services for Data Migration and Processing – MLS-C01 Study Guide

You can download the data used in the examples from GitHub, available here: https://github.com/PacktPublishing/AWS-Certified-Machine-Learning-Specialty-MLS-C01-Certification-Guide-Second-Edition/tree/main/Chapter03. Creating ETL jobs on AWS Glue: In a modern data pipeline, there are multiple stages, such as generating data, collecting data, storing data, performing ETL, analyzing, and visualizing. In this section, you will cover each of these at a […]

Exam Readiness Drill – AWS Services for Data Storage – MLS-C01 Study Guide

For the first three attempts, don’t worry about the time limit. ATTEMPT 1: The first time, aim for at least 40%. Look at the answers you got wrong and read the relevant sections in the chapter again to fix your learning gaps. ATTEMPT 2: The second time, aim for at least 60%. […]

Amazon DynamoDB for NoSQL Database-as-a-Service – MLS-C01 Study Guide

Amazon DynamoDB is a NoSQL database-as-a-service product within AWS. It’s a fully managed key/value and document database. DynamoDB is accessed through its endpoint. Read and write throughput can be managed or scaled manually or automatically. It also supports data backup, point-in-time recovery, and data encryption. One example where […]
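To illustrate the key/value and document model, the sketch below converts a plain Python dictionary into the typed attribute format that DynamoDB's low-level API expects. The table name, key names, and helper names are hypothetical, only a few common types are handled, and the final function assumes boto3 is installed and the table exists.

```python
def to_attribute(value):
    """Convert a Python value to a DynamoDB typed attribute (subset of types)."""
    if isinstance(value, bool):           # check bool before int: bool is an int subclass
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}          # numbers travel as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, dict):           # nested document
        return {"M": {k: to_attribute(v) for k, v in value.items()}}
    if isinstance(value, list):
        return {"L": [to_attribute(v) for v in value]}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def to_item(record: dict) -> dict:
    """Convert a whole record into a DynamoDB item."""
    return {key: to_attribute(value) for key, value in record.items()}


def put_record(record: dict, table_name: str = "example-table") -> None:
    """Write the record through the DynamoDB endpoint (table name is hypothetical)."""
    import boto3  # assumption: boto3 and AWS credentials are available

    boto3.client("dynamodb").put_item(TableName=table_name, Item=to_item(record))
```

In practice, boto3's higher-level Table resource performs a similar conversion automatically; the explicit version is shown only to make the stored shape visible.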