AWS Interview Questions 5
- When you need to move data over long distances using the internet, for instance across countries or continents to your Amazon S3 bucket, which method or service will you use?
- Amazon Glacier
- Amazon CloudFront
- Amazon Transfer Acceleration
- Amazon Snowball
Explanation: You would not use Snowball because, for now, the Snowball service does not support cross-region data transfer, and since we are transferring across countries, Snowball cannot be used. Amazon S3 Transfer Acceleration is the right choice here, as it speeds up your data transfer by up to 300% compared to normal transfer speed by using optimized network paths and Amazon’s content delivery network.
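As a small illustration of how a client opts in to Transfer Acceleration, the sketch below builds the accelerate endpoint for a bucket. The helper name `accelerate_endpoint` and the bucket name are hypothetical. It assumes acceleration has already been enabled on the bucket and that the bucket name contains no dots.

```python
def accelerate_endpoint(bucket: str) -> str:
    """Return the S3 Transfer Acceleration endpoint URL for a bucket.

    Hypothetical helper for illustration; acceleration must already be
    enabled on the bucket for requests to this endpoint to succeed.
    """
    if "." in bucket:
        # Buckets with dots in their names cannot use Transfer Acceleration.
        raise ValueError("bucket name must not contain dots")
    return f"https://{bucket}.s3-accelerate.amazonaws.com"


if __name__ == "__main__":
    print(accelerate_endpoint("my-example-bucket"))
```

Uploads sent to this endpoint instead of the regular regional endpoint are routed through the nearest edge location.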
22. How can you speed up data transfer in Snowball?
Data transfer speed can be improved in the following ways:
- By performing multiple copy operations at one time, i.e. if the workstation is powerful enough, you can initiate multiple cp commands, each from a different terminal, on the same Snowball device.
- By copying from multiple workstations to the same Snowball.
- By transferring large files, or by batching small files together; this reduces the encryption overhead.
- By eliminating unnecessary hops, i.e. setting things up so that the source machine(s) and the Snowball are the only machines active on the switch being used; this can hugely improve performance.
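The batching tip can be sketched with the Python standard library. The function below is a hypothetical helper, not part of any Snowball tooling. It packs every file under a size threshold into a single uncompressed tar archive, so one large file is copied instead of many small ones.

```python
import tarfile
from pathlib import Path


def batch_small_files(src_dir: str, archive_path: str, max_bytes: int = 1_000_000) -> int:
    """Pack files smaller than max_bytes under src_dir into one tar archive.

    Returns the number of files added. Larger files are left alone so they
    can be copied individually.
    """
    src = Path(src_dir)
    archive = Path(archive_path).resolve()
    count = 0
    with tarfile.open(archive, "w") as tar:
        for path in sorted(src.rglob("*")):
            if not path.is_file() or path.resolve() == archive:
                continue  # skip directories and the archive itself
            if path.stat().st_size < max_bytes:
                # Store paths relative to src_dir inside the archive.
                tar.add(path, arcname=str(path.relative_to(src)))
                count += 1
    return count
```

The 1 MB default threshold is an arbitrary assumption. Pick a value that suits your data set.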
- If you want to launch Amazon Elastic Compute Cloud (EC2) instances and assign each instance a predetermined private IP address you should:
- Launch the instance from a private Amazon Machine Image (AMI).
- Assign a group of sequential Elastic IP address to the instances.
- Launch the instances in the Amazon Virtual Private Cloud (VPC).
- Launch the instances in a Placement Group.
Explanation: Launching the instances in a VPC lets you specify each instance's private IP address from the subnet's address range. A VPC is also the best way of connecting to your cloud resources (for example, EC2 instances) from your own data center (for example, a private cloud). Once you connect your data center to the VPC in which your instances are present, each instance has a private IP address that can be reached from your data center. Hence, you can access your public cloud resources as if they were on your own network.
24. Can I connect my corporate datacenter to the Amazon Cloud?
Yes, you can do this by establishing a VPN (Virtual Private Network) connection between your company’s network and your VPC (Virtual Private Cloud). This allows you to interact with your EC2 instances as if they were within your existing network.
- Is it possible to change the private IP addresses of an EC2 instance while it is running or stopped in a VPC?
The primary private IP address is attached to the instance throughout its lifetime and cannot be changed. However, secondary private IP addresses can be assigned, unassigned, or moved between interfaces or instances at any point.
- Why do you make subnets?
- Because there is a shortage of networks
- To efficiently utilize networks that have a large no. of hosts.
- Because there is a shortage of hosts.
- To efficiently utilize networks that have a small no. of hosts.
Explanation: If a network has a large number of hosts, managing all of them can be a tedious job. Therefore, we divide the network into subnets (sub-networks) so that managing these hosts becomes simpler.
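To make the idea concrete, here is a minimal sketch using Python's standard `ipaddress` module. The 10.0.0.0/16 range and the /24 subnet size are arbitrary choices for illustration.

```python
import ipaddress

# A large network with 65,536 addresses.
network = ipaddress.ip_network("10.0.0.0/16")

# Divide it into /24 subnets of 256 addresses each.
subnets = list(network.subnets(new_prefix=24))

print(len(subnets))              # 256
print(subnets[0])                # 10.0.0.0/24
print(subnets[0].num_addresses)  # 256
```

In an AWS VPC, five addresses in every subnet are reserved, so a /24 subnet leaves 251 addresses for your hosts.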
- Which of the following is true?
- You can attach multiple route tables to a subnet
- You can attach multiple subnets to a route table
- Both A and B
- None of these.
Explanation: Route tables are used to route network packets, so having multiple route tables for one subnet would lead to confusion about where a packet has to go. Therefore, a subnet is associated with only one route table at a time. A route table, on the other hand, can hold any number of routes, so attaching multiple subnets to one route table is possible.
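The one-to-many relationship can be modelled in a few lines. The class below is a toy in-memory model, not an AWS API, and the IDs are made up. Each subnet maps to exactly one route table, while a route table may serve many subnets.

```python
class RouteTableAssociations:
    """Toy model: each subnet has exactly one route table at a time."""

    def __init__(self):
        self._table_for_subnet = {}

    def associate(self, subnet_id: str, route_table_id: str) -> None:
        # A new association replaces the previous one for that subnet.
        self._table_for_subnet[subnet_id] = route_table_id

    def table_for(self, subnet_id: str) -> str:
        return self._table_for_subnet[subnet_id]

    def subnets_of(self, route_table_id: str) -> list:
        return sorted(
            subnet
            for subnet, table in self._table_for_subnet.items()
            if table == route_table_id
        )


assoc = RouteTableAssociations()
assoc.associate("subnet-a", "rtb-1")
assoc.associate("subnet-b", "rtb-1")  # many subnets, one route table
assoc.associate("subnet-a", "rtb-2")  # subnet-a moves; it never has two tables
```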
- In CloudFront what happens when content is NOT present at an Edge location and a request is made to it?
- An Error “404 not found” is returned
- CloudFront delivers the content directly from the origin server and stores it in the cache of the edge location
- The request is kept on hold till content is delivered to the edge location
- The request is routed to the next closest edge location
Explanation: CloudFront is a content delivery network that caches data at the edge location nearest to the user in order to reduce latency. If the data is not present at an edge location, the first request is served from the origin server and the content is cached at the edge; subsequent requests are served from the edge cache.
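The miss-then-hit behaviour can be sketched as a simple read-through cache. This is a toy model with made-up names that ignores expiry and invalidation. It is not how CloudFront is implemented.

```python
class EdgeLocation:
    """Toy read-through cache in front of an origin (a plain dict here)."""

    def __init__(self, origin: dict):
        self.origin = origin
        self.cache = {}
        self.origin_fetches = 0

    def get(self, path: str):
        if path in self.cache:
            return self.cache[path], "Hit"
        # Cache miss: fetch from the origin, then keep a copy at the edge.
        body = self.origin[path]
        self.origin_fetches += 1
        self.cache[path] = body
        return body, "Miss"


edge = EdgeLocation(origin={"/logo.png": b"PNG-bytes"})
first = edge.get("/logo.png")   # served from the origin, then cached
second = edge.get("/logo.png")  # served from the edge cache
```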
- If I’m using Amazon CloudFront, can I use Direct Connect to transfer objects from my own data center?
Yes. Amazon CloudFront supports custom origins, including origins outside of AWS. When you use AWS Direct Connect to transfer the objects, you are charged the applicable AWS Direct Connect data transfer rates.
30. If my AWS Direct Connect fails, will I lose my connectivity?
If a backup AWS Direct Connect connection has been configured, traffic will switch over to it in the event of a failure. It is recommended to enable Bidirectional Forwarding Detection (BFD) when configuring your connections to ensure faster detection and failover. On the other hand, if you have configured a backup IPsec VPN connection instead, all VPC traffic will fail over to the backup VPN connection automatically. Traffic to and from public resources such as Amazon S3 will be routed over the Internet. If you have neither a backup AWS Direct Connect link nor an IPsec VPN link, Amazon VPC traffic will be dropped in the event of a failure.
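The failover cases described above reduce to a small decision function. The function name and return strings are hypothetical and only summarize the answer.

```python
def vpc_traffic_after_dx_failure(has_backup_dx: bool, has_backup_vpn: bool) -> str:
    """Describe where VPC traffic goes when the primary Direct Connect fails."""
    if has_backup_dx:
        return "backup Direct Connect"
    if has_backup_vpn:
        return "backup IPsec VPN"
    # No redundant path was configured.
    return "dropped"
```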
- If I launch a standby RDS instance, will it be in the same Availability Zone as my primary?
- Only for Oracle RDS types
- Only if it is configured at launch
Explanation: No. The purpose of a standby instance is to protect against an infrastructure failure, so the standby instance is placed in a different Availability Zone, which is physically separate, independent infrastructure.
- When would I prefer Provisioned IOPS over Standard RDS storage?
- If you have batch-oriented workloads
- If you use production online transaction processing (OLTP) workloads.
- If you have workloads that are not sensitive to consistent performance
- All of the above
Explanation: Provisioned IOPS delivers high, consistent I/O rates, but it is also more expensive. It is therefore preferred for production OLTP workloads, which are sensitive to I/O latency and need consistent performance. Batch-oriented workloads, and workloads that are not sensitive to consistent performance, can usually run on standard storage.
33. How are Amazon RDS, DynamoDB and Redshift different?
Amazon RDS is a database management service for relational databases. It handles patching, upgrading, backups and similar tasks for you, without your intervention. RDS is a database management service for structured data only.
- DynamoDB, on the other hand, is a NoSQL database service; NoSQL databases handle unstructured (non-relational) data.
- Redshift is an entirely different service: it is a data warehouse product used for data analysis.
- If I am running my DB Instance as a Multi-AZ deployment, can I use the standby DB Instance for read or write operations along with primary DB instance?
- Only with MySQL based RDS
- Only for Oracle RDS instances
Explanation: No. The standby DB instance cannot be used in parallel with the primary DB instance, because it exists solely for standby purposes; it is not used unless the primary instance goes down.
- Your company’s branch offices are all over the world. They use software with a multi-regional deployment on AWS, and they use MySQL 5.6 for data persistence.
The task is to run an hourly batch process that reads data from every region to compute cross-regional reports, which will be distributed to all the branches. This should be done in the shortest time possible. How will you build the DB architecture to meet the requirements?
- For each regional deployment, use RDS MySQL with a master in the region and a read replica in the HQ region
- For each regional deployment, use MySQL on EC2 with a master in the region and send hourly EBS snapshots to the HQ region
- For each regional deployment, use RDS MySQL with a master in the region and send hourly RDS snapshots to the HQ region
- For each regional deployment, use MySQL on EC2 with a master in the region and use S3 to copy data files hourly to the HQ region
Explanation: We use an RDS instance as the master in each region, because RDS manages the database for us. Since the batch process has to read data from every region, we place a read replica of each regional master in the HQ region, where the data is read. Option C is not correct because a read replica is more efficient than a snapshot: a read replica can be promoted to an independent DB instance if needed, whereas a DB snapshot must always be restored to a separate DB instance before it can be used.
36. Can I run more than one DB instance for Amazon RDS for free?
Yes. You can run more than one Single-AZ Micro database instance for free. However, any use exceeding 750 instance hours, across all Amazon RDS Single-AZ Micro DB instances, all eligible database engines and all regions, will be billed at standard Amazon RDS prices. For example: if you run two Single-AZ Micro DB instances for 400 hours each in a single month, you will accumulate 800 instance hours of usage, of which 750 hours will be free. You will be billed for the remaining 50 hours at the standard Amazon RDS price.
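The arithmetic in this example can be written as a short function. The names `FREE_TIER_HOURS` and `billable_hours` are made up for illustration.

```python
FREE_TIER_HOURS = 750  # monthly free allowance, pooled across all eligible instances


def billable_hours(instance_hours: list) -> int:
    """Return the hours billed at the standard rate for one month."""
    total = sum(instance_hours)
    return max(0, total - FREE_TIER_HOURS)


print(billable_hours([400, 400]))  # 800 hours used, 750 free -> 50 billed
```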
- Which AWS services will you use to collect and process e-commerce data for near real-time analysis?
- Amazon ElastiCache
- Amazon DynamoDB
- Amazon Redshift
- Amazon Elastic MapReduce
Explanation: DynamoDB is a fully managed NoSQL database service. It can therefore be fed any type of unstructured data, including data from e-commerce websites, and the data can later be analyzed using Amazon Redshift. We are not using Elastic MapReduce, since near real-time analysis is needed.
- Can I retrieve only a specific element of the data, if I have a nested JSON data in DynamoDB?
Yes. When using the GetItem, BatchGetItem, Query or Scan APIs, you can define a Projection Expression to determine which attributes should be retrieved from the table. Those attributes can include scalars, sets, or elements of a JSON document.
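As a sketch of what such a request might look like, the helper below builds the keyword arguments for a `GetItem` call with a projection on nested attributes. The helper name, table key and attribute paths are hypothetical. It assumes attribute names contain no dots or brackets, and it uses placeholder names so that reserved words are not a problem.

```python
import re


def build_get_item_kwargs(key: dict, paths: list) -> dict:
    """Build GetItem arguments that project only the given document paths."""
    placeholders = {}  # attribute name -> "#n0", "#n1", ...
    projected = []
    for path in paths:
        segments = []
        for segment in path.split("."):
            match = re.fullmatch(r"([^\[\]]+)((?:\[\d+\])*)", segment)
            if match is None:
                raise ValueError(f"unsupported path segment: {segment!r}")
            name, index = match.groups()
            if name not in placeholders:
                placeholders[name] = f"#n{len(placeholders)}"
            # Keep any list index, e.g. orders[0] becomes #n2[0].
            segments.append(placeholders[name] + index)
        projected.append(".".join(segments))
    return {
        "Key": key,
        "ProjectionExpression": ", ".join(projected),
        "ExpressionAttributeNames": {v: k for k, v in placeholders.items()},
    }


kwargs = build_get_item_kwargs({"customer_id": "c-1"}, ["address.city", "orders[0].sku"])
# With boto3 this could be passed as table.get_item(**kwargs).
```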
- A company is deploying a new two-tier web application in AWS. The company has limited staff and requires high availability, and the application requires complex queries and table joins.
Which configuration provides the solution for the company’s requirements?
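Explanation: DynamoDB can scale further than RDS or any other relational database service, but it does not support table joins. Because the application requires complex queries and table joins, a managed relational database is the apt choice: Amazon RDS in a Multi-AZ deployment provides high availability while keeping the administrative burden low for a company with limited staff.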
40. What happens to my backups and DB Snapshots if I delete my DB Instance?
When you delete a DB instance, you have the option of creating a final DB snapshot, from which you can later restore your database. RDS retains this user-created final snapshot, along with all other manually created DB snapshots, after the instance is deleted. Automated backups are deleted; only manually created DB snapshots are retained.