Storing Data in AWS
In this sample chapter from AWS Certified Developer - Associate (DVA-C01) Cert Guide, you will review content related to the Development with AWS Services and Refactoring exam domains.
This chapter covers the following subjects:
Storing Static Assets in AWS: One of the key aspects of cloud computing is the ability to consume a virtually unlimited amount of resources. The first part of this chapter covers how to use the S3 and Glacier services to store and deliver unlimited amounts of static data in AWS.
Relational Versus Nonrelational Databases: To prepare you for the next two chapters, this section provides a short overview of the differences between relational and nonrelational databases and data types suitable for each database type.
Deploying Relational Databases in AWS: This section examines the deployment of relational databases using the Amazon Relational Database Service (RDS).
Handling Nonrelational Data in AWS: Some datasets are just not suitable for relational databases. When sustained and predictable performance for relatively simple datasets is required, you can use the DynamoDB service in AWS. This section examines the characteristics of DynamoDB and shows how to use DynamoDB in your applications.
Caching Data in AWS: The last part of this chapter covers the different options for caching data and accelerating the delivery of content from the storage systems covered in this chapter.
This chapter covers content important to the following exam domains:
Domain 3: Development with AWS Services
3.1 Write code for serverless applications.
3.2 Translate functional requirements into application design.
3.3 Implement application design into application code.
3.4 Write code that interacts with AWS services by using APIs, SDKs, and AWS CLI.
Domain 4: Refactoring
4.1 Optimize application to best use AWS services and features.
The challenge of maintaining and storing data in the most efficient manner has been plaguing enterprises for decades. There is never enough storage, and storage performance is quite often a factor in poor application performance. Moreover, storing data securely and preventing disastrous consequences of losing data can be a huge challenge. As the old saying goes, “If your data is not stored in three places at once, it does not exist persistently.”
A typical enterprise might make tremendous investments in data storage hardware, storage area networks, storage management software, replication, snapshots, backup software, virtual tape libraries, and all kinds of different solutions for storing different data types on different tiers, only to find itself needing to make more hefty investments a year later. I have personally witnessed millions of dollars being spent on data storage solutions with little effect on the final outcome over the long term. It seems the storage industry has no need to plan the obsolescence of its products, as storage is the one resource in computing for which demand keeps growing and growing.
So what is the solution? It is mostly about selecting the right storage back end for the right type of data. A data crisis cannot be solved with a one-size-fits-all service; rather, you need to take a multipronged approach that includes classifying your data, deciding which data is suitable for the cloud, and selecting the right type of cloud solution for storing that data. Some data might be bound by compliance, confidentiality, or governance requirements and might therefore need to stay on premises, but for most other data, storing it in the cloud is much more cost-effective. AWS offers several different services for storing your data, and this chapter takes a look at each of them.
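The classify-then-select approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Dataset class, its fields, and the choose_backend function are hypothetical names invented for this example, and the mapping simply pairs each data type with the service this chapter discusses for it. Treating archival static data as a Glacier candidate is likewise an assumption of the sketch, not a rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical description of a dataset that needs a storage back end."""
    name: str
    kind: str  # assumed categories: "static", "relational", "nonrelational"
    must_stay_on_premises: bool = False  # compliance, confidentiality, governance
    archival: bool = False  # rarely read, kept mainly for retention


def choose_backend(ds: Dataset) -> str:
    """Pick a back end for a dataset (illustrative mapping, not an AWS rule)."""
    # Step 1: decide whether the data is suitable for the cloud at all.
    if ds.must_stay_on_premises:
        return "on-premises"
    # Step 2: select the type of cloud storage that fits the data type.
    if ds.kind == "static":
        return "Glacier" if ds.archival else "S3"
    if ds.kind == "relational":
        return "RDS"
    if ds.kind == "nonrelational":
        return "DynamoDB"
    raise ValueError(f"unknown data kind: {ds.kind!r}")


# Example inventory; all names and classifications are made up.
inventory = [
    Dataset("website-images", "static"),
    Dataset("old-audit-logs", "static", archival=True),
    Dataset("orders", "relational"),
    Dataset("user-sessions", "nonrelational"),
    Dataset("patient-records", "relational", must_stay_on_premises=True),
]

for ds in inventory:
    print(f"{ds.name}: {choose_backend(ds)}")
```

In practice the classification step carries most of the effort; the selection step becomes straightforward once each dataset has been labeled.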
“Do I Know This Already?” Quiz
The “Do I Know This Already?” quiz allows you to assess whether you should read the entire chapter. Table 4-1 lists the major headings in this chapter and the “Do I Know This Already?” quiz questions covering the material in those headings so you can assess your knowledge of these specific areas. The answers to the “Do I Know This Already?” quiz appear in Appendix A, “Answers to the ‘Do I Know This Already?’ Quizzes and Q&A Sections.”
Table 4-1 “Do I Know This Already?” Foundation Topics Section-to-Question Mapping
Foundation Topics Section: Questions
Storing Static Data in AWS: 1, 2, 5, 10, 11
Deploying Relational Databases in AWS: 3, 6, 7
Handling Nonrelational Data in AWS: 4, 8, 12
Caching Data in AWS: 9, 13
You are asked to provide an HTTP-addressable data store that will have the ability to serve a static website. Which data back end would be the most suitable to complete this task?
Complete this sentence: The S3 service allows for storing an unlimited amount of data as long as individual files are not larger than _____ and any individual PUT commands do not exceed _____.
5 GB; 5 MB
5 GB; 5 GB
5 TB; 5 GB
5 TB; 5 MB
Which of these databases is not supported by RDS?
To determine the number of read capacity units required for your data, what do you need to consider?
Whether reads are performed in the correct sequence
Whether reads are strongly or eventually consistent
Whether reads are coming from one or multiple sources
All of these answers are correct.
Which of the following is not an S3 service tier?
S3 Accelerated Access
S3 Infrequent Access
S3 Reduced Redundancy Store
RDS has the ability to deliver a synchronous replica in another availability zone in which mode?
Your company is implementing a business intelligence (BI) platform that needs to retain end-of-month datasets for analytical purposes. You have been asked to create a script that will be able to create a monthly record of your complete database that can be used for analytics purposes only if required. What would be the easiest way of doing this?
In RDS, choose to create an automated backup procedure that will create a database snapshot every month. The snapshot can be restored to a working database if required by the BI software.
Write a script that will run on a predetermined day and hour of the month and snapshot the RDS database. The snapshot can be restored to a working database if required by the BI software.
Write a script that will offload all the monthly data from the database into S3. The data in S3 can be imported into a working database if required by the BI software.
In RDS, choose to create an automated export procedure that will offload all the monthly data from the database into S3. The data in S3 can be imported into a working database if required by the BI software.
If your application has unknown and very spiky read and write performance characteristics, which of the following should you consider choosing?
Using a NoSQL solution such as Memcached
Auto-scaling the DynamoDB capacity
Distributing data across multiple DynamoDB tables
Using the on-demand model for DynamoDB
Which service would you select to accelerate the delivery of video files?
S3 Accelerated Access
When uploading files to S3, it is recommended to do which of the following? (Choose all that apply.)
Split files of 100 MB or larger and upload them with multipart upload to increase performance
Use a WAN accelerator to increase performance
Add metadata when initiating the upload
Use a VPN connection to increase security
Use the S3 HTTPS front end to increase security
Add metadata after the upload has completed
Which of these data stores would be the least expensive way to store millions of log files that are kept for retention purposes?
DynamoDB reads are performed via:
HTTP NoSQL requests to the DynamoDB API.
HTTP HEAD requests to the DynamoDB API.
HTTP PUT requests to the DynamoDB API.
HTTP GET requests to the DynamoDB API.
Which ElastiCache engine can support Multi-AZ deployments?
All of these answers are correct.