37 Cards in this Set

  • Front
  • Back
John works at a company and has been given access to data with a fixed schema. He is tasked with running SQL queries and transactions on that data. What is the most suitable data store for this need?



QuickSight


DocumentDB


DynamoDB


RDS

RDS

Which Amazon service helps you store raw data collected from website traffic without having to structure it first?




AWS data lake


AWS Data Pipeline


AWS Kinesis


AWS REDSHIFT

AWS data lake

Your company uses Redshift as its data warehouse. You are tasked with running analytics and queries on the files. Which one of the following would you choose?




QuickShift


Apache Spark


Redshift Spectrum


Athena

Redshift Spectrum

You are working on an object detection project and your team has collected many images to train an artificial neural network. You are tasked with labeling the images. Which of the following AWS services can help you with this?




Data Labeling


SageMaker


Athena


RDS


SageMaker Ground Truth

SageMaker Ground Truth

You have to use an XML file containing image information in your project on AWS. What type of data is this?




Structured data


Semi-structured data


Unstructured data


Relational Structured data

Semi-structured data
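
To see why XML counts as semi-structured, here is a minimal sketch that parses an image annotation with Python's standard library. The file content and tag names (`annotation`, `object`, and so on) are hypothetical, made up for illustration. The tags describe the data, but a record can carry any number of `<object>` elements, so there is no fixed table schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation content; the tag names are assumptions for illustration.
ANNOTATION_XML = """
<annotation>
  <filename>cat_001.jpg</filename>
  <size><width>640</width><height>480</height></size>
  <object><name>cat</name></object>
  <object><name>sofa</name></object>
</annotation>
"""


def summarize_annotation(xml_text: str) -> dict:
    """Extract a few fields from a self-describing XML annotation."""
    root = ET.fromstring(xml_text.strip())
    return {
        "filename": root.findtext("filename"),
        "width": int(root.findtext("size/width")),
        "height": int(root.findtext("size/height")),
        # Any number of <object> elements may appear: no fixed columns.
        "labels": [obj.findtext("name") for obj in root.findall("object")],
    }


summary = summarize_annotation(ANNOTATION_XML)
print(summary)
```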

You have collected 10,000 images to train your image classifier. What is the type of this data?




Structured data


Semi-structured data


Unstructured data


Relational Structured data

Unstructured data

The startup you work for wants to store its data in a non-relational database as key-value pairs, because the data is unstructured and has no schema. Which AWS service is the best match for these criteria?




AWS RDS


DynamoDB


DocumentDB


AWS DMS

DynamoDB

Jack has uploaded PDF and JPEG files to his S3 bucket. He wants to access only the JPEG files. What is the most effective way to download them?




He can’t use S3 for this type of task


He can use AWS QUERY to get the JPEG files


He can manually select the required files one by one and download them


He can use S3 SELECT to download the JPEG files

He can use S3 SELECT to download the JPEG files

Which Amazon service helps you store both structured and unstructured data collected from your game application without any processing?




AWS data lake


AWS Data Pipeline


AWS Kinesis


AWS REDSHIFT

AWS data lake

You want to analyze the unstructured data in an S3 data lake. Which AWS service can be used?




QuickShift


Apache Spark


Redshift Spectrum


Streamer

Redshift Spectrum

OLTP Database

Relational store. Typically structured with a defined, relational schema.


Row-optimized for transactions, not analytics.




AWS: RDS, Aurora

Data Warehouse

Exists on top of other databases


Used for business intelligence.


Creates a layer optimized for data analytics. The schema is defined on import (schema on write).




AWS: Redshift



Data Lake

A data lake is a centralized repository for structured and unstructured data storage.


Store raw data as is without any structure (schema).


No need to perform ETL or transformation jobs on it.


Store arbitrary data types: machine learning model artifacts, real-time data, and analytics outputs.


Processing can be done on export, so the schema is defined on read.


S3


AWS data lake reference solution
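
"Schema is defined on read" can be illustrated with a minimal sketch in plain Python. The raw events, field names, and `READ_SCHEMA` below are hypothetical; the point is only that the records are stored untouched and the reader decides which columns and types to apply at query time.

```python
import json

# Raw events stored as-is, one JSON document per line; fields vary per record.
# All names and values here are made up for illustration.
RAW_EVENTS = [
    '{"user": "u1", "event": "click", "ts": "2024-01-01T10:00:00"}',
    '{"user": "u2", "event": "purchase", "amount": "19.99"}',
    '{"user": "u3"}',
]

# Schema chosen by the reader at query time: column -> (cast, default).
READ_SCHEMA = {
    "user": (str, ""),
    "event": (str, "unknown"),
    "amount": (float, 0.0),
}


def read_with_schema(lines, schema):
    """Apply a schema while reading; the stored data is never modified."""
    rows = []
    for line in lines:
        record = json.loads(line)
        row = {}
        for column, (cast, default) in schema.items():
            row[column] = cast(record[column]) if column in record else default
        rows.append(row)
    return rows


rows = read_with_schema(RAW_EVENTS, READ_SCHEMA)
for row in rows:
    print(row)
```

A different reader could apply a different schema to the same raw lines, which is the flexibility a data lake trades against the up-front modeling of a warehouse.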

S3

Amazon Simple Storage Service (Amazon S3)


Store and protect any amount and kind of data. Durable to 99.999999999% (11 9’s)

RDS Aurora

Fully managed by Amazon Relational Database Service (RDS).


Transactional style database.


MySQL and PostgreSQL-compatible


No administration tasks such as hardware provisioning, creating backups (automatic backups to S3), and database setup.


Replication across three Availability Zones. Many engines available to create database.

Redshift

Data warehouse for fast business analytics.


Columnar storage and data compression. Queries are run against data stored in Redshift storage or against data stored in S3.

Redshift Spectrum

Allows running SQL queries directly on data stored in Amazon S3 buckets.

Does not require transferring data from S3 to a database. Works well with unstructured S3 data lakes.


versus Athena?

DynamoDB

Fully managed NoSQL key-value and document database.

Scalable with low latency:
* 10 trillion requests/day
* 20 million requests/second
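
As a rough illustration of the key-value and document model, the sketch below converts a plain Python dict into the typed attribute-value format used by DynamoDB's low-level API (`S` for strings, `N` for numbers sent as strings, `BOOL`, `NULL`, `L` for lists, `M` for maps). The helper name `to_attribute_value` and the sample item are hypothetical, and sets and binary values are deliberately left out.

```python
def to_attribute_value(value):
    """Convert a Python value to a DynamoDB-style typed attribute value."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return {"BOOL": value}
    if value is None:
        return {"NULL": True}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


# Hypothetical item: only the key attribute is mandatory, the rest is free-form.
plain_item = {
    "player_id": "p-42",
    "score": 1200,
    "active": True,
    "inventory": ["sword", "shield"],
}
item = {name: to_attribute_value(v) for name, v in plain_item.items()}
print(item)
```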

You are building an image classifier using AWS SageMaker and you want to upload the training data to an S3 bucket. You want to access the data frequently for training and evaluation purposes. What storage class would you choose?




S3 Intelligent-Tiering


S3 Standard-Infrequent Access


S3 Standard


Amazon S3 Glacier

S3 Standard

Which policies can be created using object tags for S3 objects? (select 2)




IAM policies


S3 LifeCycle Policies


Restricted Policies


Associated Tags

IAM policies


S3 LifeCycle Policies

AWS Machine Learning_2

You want to store the backup of your laptop securely and you want to access the backup only when you wish to reboot the laptop. What storage class would you prefer for this purpose?





S3 Standard


S3 Intelligent-Tiering


S3 Standard-Infrequent Access


Amazon S3 Glacier

S3 Standard-Infrequent Access

Jim is working on a project and wants to store all his data in his S3 bucket, but he wants to keep the files in the bucket for only a year, after which they should be deleted automatically. What is the best option to achieve this?




Using Expiration role from Object Tags


Using Delete Action from lifecycle configuration


Using Delete tag available in S3


Using Expiration Action from lifecycle configuration

Using Expiration Action from lifecycle configuration

John stores all of his work-related documents on S3. He regularly needs the data he has uploaded in the past 15 days. Data uploaded before that is needed only once a year. What approach will help him lower his S3 bill?



He must manually choose the document from the bucket


He should use Transfer tags from Object tags to move the documents based on duration requirements


He could use Object Lifecycle Management feature per bucket to transfer the documents between S3 storage classes based on his duration requirements


He should use two different buckets to maintain frequently accessed documents and infrequently accessed documents

He could use Object Lifecycle Management feature per bucket to transfer the documents between S3 storage classes based on his duration requirements

You want to build an ML model in SageMaker. You want to make sure that the sensitive data used for training is secured. Which types of encryption will help you achieve this? (select 2)




SSE-S3


AWS SE


AWS SSD


AWS KMS

SSE-S3, AWS KMS

SSE-S3

Server-side encryption protects data at rest. Amazon S3 encrypts each object with a unique key and the key itself with a master key that it rotates regularly.

Since your model is being trained on very sensitive data, you have enabled inter-container traffic encryption. How will this affect the training?




Increase in training time when using distributed deep learning algorithms


Increase in training accuracy when using distributed deep learning algorithms


Decrease in training time when using distributed deep learning algorithms


Decrease in training accuracy when using distributed deep learning algorithms

Increase in training time when using distributed deep learning algorithms

You have created a model using the XGBoost algorithm and it has been running in production. You want to check the statistics reported by the inference algorithm. How can you check them?




Cloudtrail Logs


CloudWatch Events


Cloudtrail Events


CloudWatch Logs

CloudWatch Logs

You are training a machine learning model that you built using SageMaker. The training started throwing errors partway through. Which of the following tools can help troubleshoot this issue?




AWS Logs


CloudWatch


CloudSecurity


AWS Monitor

CloudWatch

You want to improve security and reduce egress costs for your S3 data. How can you achieve this?




Use VPC endpoint


Use inference endpoint


Use IAM policies to modify these


Use Cloudtrail

Use VPC endpoint
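
One common way to pair a VPC endpoint with tighter security is a bucket policy that denies requests not arriving through that endpoint, using the `aws:SourceVpce` condition key. The sketch below only builds the policy document as a Python dict. The bucket name and endpoint ID are hypothetical placeholders, and a blanket deny like this also blocks console access from outside the VPC, so treat it as an illustration rather than a ready-made policy.

```python
import json


def vpce_only_bucket_policy(bucket: str, vpce_id: str) -> dict:
    """Build a bucket policy that denies access from outside one VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # Deny whenever the request did NOT come through this endpoint.
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }


# Hypothetical bucket name and endpoint ID.
policy = vpce_only_bucket_policy("example-training-data", "vpce-0123456789abcdef0")
print(json.dumps(policy, indent=2))
```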

S3 Method to categorize objects?

Object Tags (key value pairs).


Example object tags: Project = Machinelearning, Classification = confidential, PHI = True




The maximum number of tags per object is 10.




Use cases:


Granting or denying permission.


Managing object lifecycle by creating a lifecycle rule based on associated tags.


Performing analytics.
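
A minimal sketch of the tag rules above: it turns a dict of tags into the `TagSet` structure that S3 tagging requests use and enforces the 10-tags-per-object limit before anything is sent. The helper name `build_tag_set` is hypothetical, and no AWS call is made here.

```python
MAX_TAGS_PER_OBJECT = 10  # limit stated in the card above


def build_tag_set(tags: dict) -> dict:
    """Validate object tags and shape them as an S3-style TagSet."""
    if len(tags) > MAX_TAGS_PER_OBJECT:
        raise ValueError(
            f"{len(tags)} tags given, at most {MAX_TAGS_PER_OBJECT} allowed per object"
        )
    return {"TagSet": [{"Key": k, "Value": str(v)} for k, v in tags.items()]}


# The example tags from the card.
tagging = build_tag_set(
    {"Project": "Machinelearning", "Classification": "confidential", "PHI": True}
)
print(tagging)
```

The resulting dict has the shape that a tagging request body expects, and the same key-value pairs can then be referenced from IAM policy conditions or lifecycle rule filters.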

Provide an example of data partitioning in S3

s3://bucket/data/year/month/day/hour/photo1.csv


s3://bucket/data/product/photo1.csv

Benefits of S3 data partitioning

Can reduce the cost of scanning, for example when querying with Amazon Athena.
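
The time-based layout from the partitioning example can be generated with a small helper. This is a sketch with hypothetical function names; it only builds key strings. A query that needs one day of data can then be limited to that day's prefix instead of scanning the whole bucket.

```python
from datetime import datetime


def partitioned_key(prefix: str, ts: datetime, filename: str) -> str:
    """Build an object key laid out as prefix/year/month/day/hour/filename."""
    return f"{prefix}/{ts:%Y}/{ts:%m}/{ts:%d}/{ts:%H}/{filename}"


def day_prefix(prefix: str, ts: datetime) -> str:
    """Prefix covering a single day, e.g. for listing or querying that day only."""
    return f"{prefix}/{ts:%Y}/{ts:%m}/{ts:%d}/"


ts = datetime(2024, 3, 7, 9, 30)
key = partitioned_key("data", ts, "photo1.csv")
print(key)
print(day_prefix("data", ts))
```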

S3 Storage Tiers

S3 Standard: works well with storage that is general purpose and frequently accessed


S3 Intelligent-Tiering: works well with data that has varying access patterns


S3 Standard-Infrequent Access (Standard-IA): for long-lived but less frequently accessed data


S3 One Zone-Infrequent Access (One Zone-IA): for long-lived, less frequently accessed data that can be stored in a single Availability Zone


Amazon S3 Glacier (S3 Glacier) and Amazon S3 Glacier Deep Archive (S3 Glacier Deep Archive): works well for long-term archived data.

S3 Lifecycles

Transition actions: policy that governs when an object transitions from one storage class to another. For example, transition objects to the STANDARD_IA storage class 30 days after creation, or transition objects to the GLACIER storage class after 1 year.




Expiration actions: policy that governs when objects expire (deleted by S3).
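
To make the two action types concrete, here is a small simulation of where an object would sit at a given age under the example transitions (STANDARD_IA at 30 days, GLACIER at 1 year). It is a sketch of the rule logic only, not of how S3 evaluates rules internally, and the optional expiration age is an assumption added for illustration.

```python
# Example transitions from the card, ordered from oldest threshold to newest.
TRANSITIONS = [(365, "GLACIER"), (30, "STANDARD_IA")]


def storage_class_at(age_days: int, expiration_days=None):
    """Return the storage class at a given object age, or None once expired."""
    if expiration_days is not None and age_days >= expiration_days:
        return None  # expiration action: the object has been deleted
    for min_age, storage_class in TRANSITIONS:
        if age_days >= min_age:
            return storage_class
    return "STANDARD"


for age in (10, 30, 400):
    print(age, storage_class_at(age))
print(800, storage_class_at(800, expiration_days=730))
```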

S3 Lifecycle Configuration

Lifecycles are managed via a lifecycle configuration file (XML file).


The XML file consists of rules/policies that S3 can apply to objects.


For each bucket, Amazon S3 keeps the configuration file attached to it.
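
As a sketch of what such a configuration file might look like, the helper below generates lifecycle XML with one rule using Python's standard library. The rule ID, the `logs/` prefix, and the 730-day expiration are hypothetical values chosen for illustration; the transitions reuse the 30-day and 1-year examples from the lifecycle card.

```python
import xml.etree.ElementTree as ET


def lifecycle_xml(rule_id, prefix, transitions, expiration_days):
    """Build a one-rule S3 lifecycle configuration as an XML string."""
    config = ET.Element("LifecycleConfiguration")
    rule = ET.SubElement(config, "Rule")
    ET.SubElement(rule, "ID").text = rule_id
    rule_filter = ET.SubElement(rule, "Filter")
    ET.SubElement(rule_filter, "Prefix").text = prefix
    ET.SubElement(rule, "Status").text = "Enabled"
    # Transition actions: move objects to another storage class after N days.
    for days, storage_class in transitions:
        transition = ET.SubElement(rule, "Transition")
        ET.SubElement(transition, "Days").text = str(days)
        ET.SubElement(transition, "StorageClass").text = storage_class
    # Expiration action: S3 deletes the objects after N days.
    expiration = ET.SubElement(rule, "Expiration")
    ET.SubElement(expiration, "Days").text = str(expiration_days)
    return ET.tostring(config, encoding="unicode")


xml_text = lifecycle_xml(
    "archive-then-expire",  # hypothetical rule ID
    "logs/",                # hypothetical key prefix
    [(30, "STANDARD_IA"), (365, "GLACIER")],
    730,                    # hypothetical expiration age
)
print(xml_text)
```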

SSE-KMS

Server-Side Encryption with Customer Master Keys (CMKs) Stored in AWS Key Management Service (SSE-KMS)

Similar to SSE-S3, but with some additional benefits and charges.


There are separate permissions for the use of a CMK that provide added protection against unauthorized access of your objects in Amazon S3.


SSE-KMS also provides you with an audit trail that shows when your CMK was used and by whom.


Can create and manage customer managed CMKs or use AWS managed CMKs that are unique to you, your service, and your Region.

S3 Encryption Options

Se


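SSE-S3 - Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3)


SSE-KMS - Server-Side Encryption with Customer Master Keys (CMKs) Stored in AWS Key Management Service (SSE-KMS)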


SSE-C - Server-Side Encryption with Customer-Provided Keys (SSE-C)
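
A minimal sketch of how the three options differ at upload time: the helper returns the extra keyword arguments one would pass to a boto3 `put_object` call for each mode. The function name `encryption_args`, the key ID, and the sample key bytes are hypothetical; no AWS call is made here.

```python
def encryption_args(mode: str, kms_key_id=None, customer_key=None) -> dict:
    """Return put_object-style keyword arguments for a server-side encryption mode."""
    if mode == "SSE-S3":
        # S3 manages the keys; AES-256 is applied to each object.
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id:
            # Optional: name a specific key; otherwise the AWS managed key is used.
            args["SSEKMSKeyId"] = kms_key_id
        return args
    if mode == "SSE-C":
        if customer_key is None:
            raise ValueError("SSE-C requires a customer-provided key")
        # The caller supplies the key with every request; S3 does not store it.
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unknown encryption mode: {mode}")


print(encryption_args("SSE-S3"))
print(encryption_args("SSE-KMS", kms_key_id="alias/example-key"))  # hypothetical key alias
```

With SSE-C the same key must be supplied again on every read, which is why key management stays entirely with the caller in that mode.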