Tuesday, 9 April 2019

AWS Concepts Learning

What are SNS and SES?

What are scale-up and scale-out of a DB instance based on CPU and memory utilization?
https://aws.amazon.com/blogs/database/scaling-your-amazon-rds-instance-vertically-and-horizontally/

https://www.quora.com/Can-I-add-CPU-and-memory-in-AWS-vertical-scaling-on-a-schedule-without-starting-any-new-instances-horizontal-scaling

When the load balancer detects a problem with an instance, it stops distributing traffic to it. When the instance is healthy again, the load balancer resumes
distributing traffic to it. This process allows your application to react automatically to failed instances without your having to be involved beyond configuring
the health check.


A load balancer accepts incoming traffic from clients and routes requests to its registered targets (such as EC2 instances) in one or more Availability Zones. The load balancer also monitors the health of its registered targets and ensures that it routes traffic only to healthy targets. When the load balancer detects an unhealthy target, it stops routing traffic to that target, and then resumes routing traffic to that target when it detects that the target is healthy again.

You configure your load balancer to accept incoming traffic by specifying one or more listeners. A listener is a process that checks for connection requests. It is configured with a protocol and port number for connections from clients to the load balancer and a protocol and port number for connections from the load balancer to the targets.
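To make the listener idea concrete, here is a minimal sketch (not the only way to do it) using the boto3 elbv2 client to add an HTTP listener that forwards client connections on port 80 to a target group. The load balancer and target group ARNs below are placeholders, not real resources.

    import boto3

    elbv2 = boto3.client('elbv2')

    # Placeholder ARNs - substitute your own load balancer and target group
    lb_arn = 'arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/my-alb/abc123'
    tg_arn = 'arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-targets/def456'

    # A listener pairs a protocol/port for client connections with a default
    # action that forwards matching requests to the registered targets.
    elbv2.create_listener(
        LoadBalancerArn=lb_arn,
        Protocol='HTTP',
        Port=80,
        DefaultActions=[{'Type': 'forward', 'TargetGroupArn': tg_arn}],
    )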

Elastic Load Balancing supports three types of load balancers: Application Load Balancers, Network Load Balancers, and Classic Load Balancers. There is a key difference between the way you configure these load balancers. With Application Load Balancers and Network Load Balancers, you register targets in target groups, and route traffic to the target groups. With Classic Load Balancers, you register instances with the load balancer.

When you enable an Availability Zone for your load balancer, Elastic Load Balancing creates a load balancer node in the Availability Zone. If you register targets in an Availability Zone but do not enable the Availability Zone, these registered targets do not receive traffic. Note that your load balancer is most effective if you ensure that each enabled Availability Zone has at least one registered target.

We recommend that you enable multiple Availability Zones. (Note that with an Application Load Balancer, we require you to enable multiple Availability Zones.) With this configuration, if one Availability Zone becomes unavailable or has no healthy targets, the load balancer can continue to route traffic to the healthy targets in another Availability Zone.

After you disable an Availability Zone, the targets in that Availability Zone remain registered with the load balancer, but the load balancer will not route traffic to them.

http://docs.aws.amazon.com/elasticloadbalancing/latest/userguide/how-elastic-load-balancing-works.html


What is the maximum length of an object key (file name) in S3? (It is 1,024 bytes of UTF-8; see the Keys notes in Chapter 2 below.)

https://aws.amazon.com/blogs/aws/latency-based-multi-region-routing-now-available-for-aws/

If you choose to create a Hardware VPN Connection to your VPC using a Virtual Private Gateway, you are charged for each "VPN Connection-hour" that your VPN connection is provisioned and available. Each partial VPN Connection-hour consumed is billed as a full hour.

-------------------------------------------------------

A private IP address is an IP address that's not reachable over the internet and can be resolved only within the network
When an instance is launched, the default network interface eth0 is assigned a private IP address and an internal DNS hostname which resolves to the private IP address and can be used for communication between the instances in the same network only
Private IP address and DNS hostname cannot be resolved outside the network that the instance is in
Private IP address behaviour
remains associated with the Instance when it is stopped or rebooted
is disassociated only when the instance is terminated
When an instance is launched, it can be assigned a private IP address, or EC2 will automatically assign one from the address range of the subnet
Additional private IP addresses, known as secondary private IP addresses, can also be assigned. Unlike primary private IP addresses, secondary private IP addresses can be reassigned from one instance to another.
A public IP address is reachable from the Internet
Each instance assigned a public IP address is also given an External DNS hostname. External DNS hostname resolves to the public IP address outside the network and to the private IP address within the network.
Public IP address is associated with the primary Private IP address through NAT
Within a VPC, an instance may or may not be assigned a public IP address depending upon the subnet Assign Public IP attribute
A public IP address is assigned to the instance from AWS's public IP address pool; it belongs to the pool, not to the AWS account. Once disassociated, it cannot be reused and is released back to the pool

Public IP address behaviour
cannot be manually associated with or disassociated from an instance
is released when an instance is stopped or terminated. Stopped instance when started receives a new public IP address
is released when an instance is assigned an Elastic IP address
is not assigned if more than one network interface is attached to the instance

Subnets:

Which of the following are characteristics of Amazon VPC subnets?
1) Each subnet maps to a single Availability Zone
2) By default, all subnets can route between each other, whether they are private or public

http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Subnets.html

http://jayendrapatil.com/tag/eni/

http://jayendrapatil.com/aws-ec2-instance-purchasing-option/

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-reserved-instances.html


Configuring VPC:

http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario1.html

http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Internet_Gateway.html

http://docs.aws.amazon.com/AmazonVPC/latest/NetworkAdminGuide/Introduction.html

http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario3.html
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_VPN.html


What is the minimum size of an s3 object?
Individual Amazon S3 objects can range in size from a minimum of 0 bytes to a maximum of 5 terabytes. The largest object that can be uploaded in a single PUT is 5 gigabytes. For objects larger than 100 megabytes, customers should consider using the Multipart Upload capability.


Eventual consistency is a consistency model used in distributed computing to achieve high availability that informally guarantees that, if no new updates are made to a given data item, eventually all accesses to that item will return the last updated value.

S3 bucket: A bucket is a logical unit of storage in the Amazon Web Services (AWS) object storage service, Simple Storage Service (S3). Buckets are used to store objects, which consist of data and metadata that describes the data

What is an S3 ACL (as opposed to an S3 bucket policy)?
An S3 ACL is a sub-resource that's attached to every S3 bucket and object. It defines which AWS accounts or groups are granted access and the type of access. When you create a bucket or an object, Amazon S3 creates a default ACL that grants the resource owner full control over the resource. (A bucket policy, by contrast, is a resource-based policy attached to the bucket; see the S3 access control notes in Chapter 2 below.)

Bootstrapping:

https://s3.amazonaws.com/cloudformation-examples/BoostrappingApplicationsWithAWSCloudFormation.pdf

 http://www.lynda.com/AWS-tutorials/Understanding-bootstrapping/163929/187356-4.html

 Application Services and HA/Disaster Recovery:

http://www.lynda.com/AWS-tutorials/Simple-Queue-Service-SQS/163929/187362-4.html


http://docs.aws.amazon.com/ses/latest/DeveloperGuide/Welcome.html

http://www.lynda.com/AWS-tutorials/Simple-Notification-Service-SNS/163929/187364-4.html


 https://aws.amazon.com/swf/
  http://www.lynda.com/AWS-tutorials/Simple-Workflow-Service-SWF/163929/187363-4.html


  http://media.amazonwebservices.com/AWS_Disaster_Recovery.pdf



  Migration to AWS Cloud:

  https://aws.amazon.com/whitepapers/migrating-your-existing-applications-to-the-aws-cloud-with-3-example-scenarios/
  http://www.lynda.com/AWS-tutorials/What-we-building-Overview-web-application-architecture/163929/187372-4.html

http://www.lynda.com/AWS-tutorials/Signing-up-AWS/163929/187373-4.html?

http://www.lynda.com/AWS-tutorials/Creating-new-IAM-user/163929/187374-4.html?

http://www.lynda.com/AWS-tutorials/Creating-key-pair/163929/187375-4.html?

http://www.lynda.com/AWS-tutorials/Configuring-security-group/163929/187376-4.html?

http://www.lynda.com/AWS-tutorials/Creating-ELB/163929/187377-4.html?

http://www.lynda.com/AWS-tutorials/Launching-EC2-instance-configuring-Apache-PHP-user-data/163929/187378-4.html?

http://www.lynda.com/AWS-tutorials/Connecting-EC2-instance-via-HTTP/163929/187379-4.html?

http://www.lynda.com/AWS-tutorials/Connecting-EC2-instance-via-SSH/163929/187380-4.html?

http://www.lynda.com/AWS-tutorials/Creating-MySQL-RDS-database/163929/187381-4.html?

http://www.lynda.com/AWS-tutorials/Creating-custom-server-image/163929/187382-4.html?

http://www.lynda.com/AWS-tutorials/Auto-Scaling/163929/187383-4.html?


Data storage and data services on AWS:
https://www.lynda.com/SharedPlaylist/a44db7b0e1104488897699fedd0940bd


Key points
----------
1) By placing resources in separate AZs, you can protect your website or application from a service disruption impacting a single location.
2) You can achieve high availability by deploying your application across multiple AZs. Redundant instances for each tier (e.g., DB or application) of an application should be placed in distinct AZs, thereby creating a multi-site solution. At a minimum, the goal is to have an independent copy of each application stack in two or more AZs.

3) Following is a partial list of many certifications and standards with which AWS complies:
Service Organization Controls (SOC) 1 / International Std on Assurance Engagements (ISAE) 3402, SOC2, SOC3
Federal Information Security Mgmt Act (FISMA), Department of Defense Information Assurance Certification and Accreditation Process (DIACAP), and
Federal Risk and Authorization Mgmt Program (FedRAMP)
Payment Card Industry Data Security Standard (PCI DSS) Level 1
International Organization for Standardization (ISO) 9001, ISO 27001, ISO 27018

4) AWS Cloud computing platform:
Platform services: Databases - Relational, NoSQL, Caching
   Analytics - Hadoop, Real-time, Data warehouses, Data workflows
   App services - Queuing, Orchestration, App streaming, Transcoding, Email, Search
   Deployment & Mgmt - Containers, DevOps tools, Resource templates, Usage tracking, Monitoring & logs
   Mobile services - Identity, Sync, Mobile Analytics, Notifications

Foundation services: Compute - VMs Auto scaling & LB
Storage - Object, Block and archive
Security & Access control
Networking

Infrastructure: Regions, AZ, Content Delivery Networks and Points of Presence


5) Accessing the AWS platform - there are 3 ways to access it: 1) AWS Mgmt console, 2) AWS CLI, 3) AWS SDKs
--------------------------------
Compute and Networking services
--------------------------------
6) EC2 - EC2 is a web service that provides resizable compute capacity in the cloud. Allows organizations to obtain and configure virtual servers in Amazon's data centers and to harness those resources to build and host software systems. Organizations can select from a variety of OS and resource configurations (memory, CPU, storage, etc.) that are optimal for the application profile of each workload.

7) AWS Lambda - It's a platform for back-end developers that runs your code for you on the AWS cloud and provides a fine-grained pricing structure. AWS Lambda runs your back-end code on its own AWS compute fleet of EC2 instances across multiple AZs in a region, which provides the high availability, security, performance and scalability of the AWS infrastructure.

8) Auto Scaling - Allows organizations to scale Amazon EC2 capacity up or down automatically according to conditions defined for the particular workload. Not only can it be used to help maintain application availability and ensure that the desired number of EC2 instances are running, but it also allows resources to scale in and out to match the demands of dynamic workloads. Instead of provisioning for peak load (i.e., paying for the maximum machines and resources while waiting for the highest peak), organizations can optimize costs and use only the capacity that is actually needed.
Auto Scaling is well suited both to applications that have stable demand patterns and to applications that experience hourly, daily or weekly variability in usage.

9) ELB - Automatically distributes incoming application traffic across multiple EC2 instances. Enables organizations to achieve greater fault tolerance in their applications, seamlessly providing the required amount of load balancing capacity needed to distribute application traffic.

10) Elastic Beanstalk - the fastest and simplest way to get a web app up and running on AWS. Developers can just upload their application code, and the service automatically handles all the details, like resource provisioning, load balancing, auto scaling and monitoring. It supports multiple platforms like PHP, Java, Python, Ruby, Node.js, .NET, Go, etc.

11) VPC - VPC lets organizations provision a logically isolated section of the AWS cloud where they can launch AWS resources in a virtual network that they define.

12) AWS Direct Connect - Allows organizations to establish a dedicated network connection from their data center to AWS. Using Direct Connect, organizations can establish private connectivity between AWS and their data center, office or colocation environment, which may reduce network costs, increase bandwidth throughput, and provide a more consistent network experience than internet-based VPN connections.

13) Route 53 - a highly available and scalable DNS web service. It's designed to give developers and businesses a reliable and cost-effective way to route end users to internet applications by translating human-readable names (e.g., www.google.com) into IP addresses like 192.0.2.1 that computers use to connect to each other. Route 53 also serves as a domain registrar, allowing you to purchase and manage domains directly from AWS.

--------------------------------
Storage and Content Delivery
--------------------------------
1) S3 - provides highly durable and scalable object storage that handles virtually unlimited amounts of data and large numbers of concurrent users. Organizations can store any number of objects (of any type - HTML pages, source code files, image files, encrypted data) and access them using HTTP-based protocols. S3 can be used for many use cases like backup and recovery, nearline archive, big data analytics, disaster recovery, cloud applications and content distribution.

2) Amazon Glacier - secure, durable and very low-cost storage service for data archiving and long-term backup. Organizations can store large or small amounts of data for a very low cost per GB per month. To keep costs low for customers, Glacier is optimized for infrequently accessed data where a retrieval time of several hours is suitable.

3) EBS - provides persistent block-level storage volumes for use with EC2 instances. Each EBS volume is automatically replicated within its AZ to protect organizations from component failure, offering high availability and durability. By delivering consistent, low-latency performance, EBS provides the disk storage needed to run a wide variety of workloads.
NOTE: Alfresco (when installed on AWS) supports storage on EBS, but the access key/secret key used to access EBS changes frequently (every few minutes or hours). In Alfresco, the access key and secret key are configured in the alf-global.properties file, so restarting Alfresco that frequently is not practical. S3 is therefore a better option with Alfresco. So the choice of storage mechanism also depends on the product's features.

4) AWS Storage Gateway - a service connecting an on-premises software appliance with cloud-based storage to provide seamless and secure integration between an organization's on-premises IT environment and AWS storage infrastructure. It provides low-latency performance by maintaining a cache of frequently accessed data on-premises while securely storing all of your data encrypted in S3 or Glacier.

5) Amazon CloudFront - a content delivery web service. It integrates with other AWS cloud services to give developers and businesses an easy way to distribute content to users across the world with low latency, high transfer speeds, and no minimum usage commitments. CloudFront can be used to deliver your entire website, including dynamic, static, streaming and interactive content, using a global network of edge locations. Requests for content are automatically routed to the nearest edge location, so content is delivered with the best possible performance to end users around the globe.

DB services
------------
RDS - provides a fully managed relational database service with support for many open-source and commercial DB engines. A cost-effective service allowing organizations to launch secure, highly available, fault-tolerant, production-ready databases in minutes. Because RDS manages backups, software patching, monitoring, scaling and replication, organizational resources can focus on revenue-generating applications and business instead of operational tasks.

DynamoDB - a fast and flexible NoSQL DB service for all applications that need consistent, single-digit-millisecond latency at any scale. It's a fully managed database and supports both document and key/value data models. Its flexible data model and reliable performance make it a great fit for web, mobile, gaming, IoT, etc.


Redshift - a fast, petabyte-scale data warehouse service that makes it simple and cost-effective to analyze structured data. It provides a standard SQL interface that lets organizations use existing BI tools. By leveraging columnar storage technology that improves I/O efficiency and parallelizing queries across multiple nodes, Redshift is able to deliver fast query performance. Redshift allows organizations to automate most of the common admin tasks associated with provisioning, configuring and monitoring a cloud data warehouse.

ElastiCache - a web service that simplifies deployment, operation and scaling of an in-memory cache in the cloud. It improves the performance of web apps by allowing organizations to retrieve information from fast, managed, in-memory caches instead of relying on slower disk-based databases. ElastiCache supports the Memcached and Redis cache engines as of now.

-----------
Mgmt Tools
-----------

CloudWatch - a monitoring service for cloud resources and the applications running on AWS (used to track metrics, collect and monitor log files, and set alarms). Organizations can gain system-wide visibility into resource utilization, application performance, and operational health.

CloudFormation - gives developers and system admins a way to create and manage a collection of related AWS resources, provisioning and updating them in an orderly and predictable fashion. It defines a JSON-based templating language that can be used to describe all the AWS resources that are necessary for a workload. Templates can be submitted to CloudFormation and the service will take care of provisioning and configuring those resources in the appropriate order.

CloudTrail - a service that records AWS API calls for an account and delivers log files for audit and review. The recorded information includes the identity of the API caller, the time of the API call, the source IP address of the API caller, the request parameters, and the response elements returned by the service.

AWS Config - a service that provides organizations with an AWS resource inventory, configuration history, and configuration change notifications to enable security and governance. With AWS Config, organizations can discover existing AWS resources, export an inventory of their AWS resources with all configuration details, and determine how a resource was configured at any point in time.

----------------
Security and Identity
---------------------
IAM - enables org to securely control access to AWS cloud services n resrc for their users. Org can create n manage users n groups n use permissions to allow n deny their access to AWS resources.

KMS (Key Mgmt Service) - service that makes it easy for org to create n control the encryption keys used to encrypt their data n uses Hardware Security Modules (HSM) to protect the security of ur keys.

AWS directory service - Allows org to setup and run Microsoft AD on AWS cloud or connect their AWS resources with an existing on-premise Microsoft AD. Org can use it to manage users n groups, provide SSO to applications n services, create n apply group policies, domain join EC2 instances, and simplify deployment n mgmt of linux n windows workloads.

AWS Certificate Manager - lets organizations easily provision, manage, and deploy SSL/TLS certificates for use with AWS cloud services. It removes the time-consuming manual process of purchasing, uploading and renewing SSL/TLS certificates. Organizations can quickly request a certificate, deploy it on AWS resources like ELBs or CloudFront distributions, and let AWS Certificate Manager handle certificate renewals.

AWS Web Application Firewall (WAF) - helps protect web apps from common attacks and exploits that could affect application availability, compromise security, or consume excessive resources. WAF allows organizations to control which traffic to allow or block to their web apps by defining customizable web security rules.

------------
Application services
-----------------------
Amazon API Gateway - a fully managed service that makes it easy for developers to create, publish, maintain, monitor and secure APIs at any scale. Organizations can create an API that acts as a front door for applications to access data, business logic or functionality from back-end services, like workloads running on EC2, code running
on AWS Lambda, or any web app. It handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls, including traffic management, authorization and access control, monitoring, and API version management.

Amazon Elastic Transcoder - a media transcoding service in the cloud. It converts (or transcodes) media files from their source formats into versions that will play back on devices like smartphones, tablets, and PCs.

Amazon SNS (Simple Notification Service) - a web service that coordinates and manages the delivery or sending of messages to recipients. In SNS, there are two types of clients - publishers and subscribers - also called producers and consumers. Publishers communicate asynchronously with subscribers by producing and sending a message to a topic, which is a logical access point and communication channel. Subscribers consume or receive the message or notification over one of the supported protocols when they are subscribed to the topic.

Amazon SES (Simple Email Service) - an email service that organizations can use to send transactional email, marketing messages, or any other type of content to their customers.
SES can also be used to receive messages and deliver them to an S3 bucket, call custom code via a Lambda function, or publish notifications to SNS.

SWF - Simple Workflow Service - helps developers build, run and scale background jobs that have parallel or sequential steps. SWF can be thought of as a fully managed state tracker and task coordinator in the cloud.

SQS - Simple Queue Service - a message-queuing service. It makes it simple and cost-effective to decouple the components of a cloud application. Organizations can transmit any volume of data, at any level of throughput, without losing messages or requiring other services to be always available.

---------
IMP POINTS:
-----------
1) AWS Provides highly available technology infrastructure platform with multiple locations worldwide. These locations r composed of regions n AZs. Each region is located in a separate geographic area n has multiple, isolated locations called AZs.

2) Region is a physical geographic location that consists of a cluster of data centers. AWS regions enable placement of resources n data in multiple locations around the globe. Each region is completely independent n is designed to be isolated from other regions. This achieves greatest possible fault tolerance n stability. Resources aren't replicated across regions unless organizations choose to do so.

3) An AZ is one or more data centers within a region that are designed to be isolated from failures in other AZs. AZs provide inexpensive, low-latency network connectivity to other zones in the same region. By placing resources in separate AZs, organizations can protect their website or application from a service disruption impacting a single location.

4) A hybrid deployment model is an architectural pattern providing connectivity for infrastructure and applications between cloud-based resources and existing resources that are not located in the cloud.


---------------
Questions:
---------------
1) Which of the following describes a physical location around the world where AWS clusters data centers?
Ans - Region

2) Each AWS region is composed of 2 or more locations that offer organizations the ability to operate production systems that are more highly available, fault tolerant, and scalable than would be possible using a single data center. What are these locations called?
Ans - AZ

3) What is the deployment term for an environment that extends an existing on-premises infrastructure to the cloud, connecting cloud resources to internal systems?
Ans. Hybrid deployment

4) Which AWS cloud service allows org to gain system-wide visibility into resrc utilization, application performance and operational health ?
Ans: Amazon CloudWatch

5) Which cloud services is a fully-managed NoSQL db service ?
Ans. DynamoDB

6) Your company experiences fluctuations in traffic patterns to its e-commerce website based on flash sales. What service can help your company dynamically match the required compute capacity to the spike in traffic during flash sales?
Ans. Auto scaling

7) Your company provides an online photo sharing service. The dev team is looking for ways to deliver image files with the lowest latency to end users so the website content is delivered with the best performance. What service can help speed up distribution of these image files to end users?
Ans. Amazon CloudFront

8) Your company runs an Amazon EC2 instance periodically to perform a batch processing job on a large and growing filesystem. At the end of the batch job, you shut down the EC2 instance to save money, but you need to persist the filesystem used by the EC2 instance from the previous batch runs. What cloud service can you leverage to meet these requirements?
Ans. EBS

9) What cloud service provides a logically isolated section of AWS cloud where org can launch AWS resrcs in a virtual network that they define ?
Ans. VPC

10) Your company provides a mobile voting app for a popular TV show, and 5 to 25 million viewers all vote in a 15-second timespan. What mechanism can you use to decouple the voting app from your back-end services that tally the votes?
Ans. SQS (Simple Queue Service)


================================================================
Chapter 2: S3 and Glacier
================================================================
Nearly any app running on AWS uses S3 either directly or indirectly. In order to control who has access to ur data, S3 provides a rich set of permissions, access controls, n encryption options.

Object Storage VS  Block Storage & File Storage
-----------------------------------------------
In traditional IT environments, two types of storage dominate: block storage (which operates at a lower level - the raw storage device level - and manages data as a set of numbered, fixed-size blocks) and file storage (which operates at a higher level - the OS level - and manages data as a named hierarchy of files and folders).
Block and file storage are mostly accessed over a network, in the form of a SAN for block storage (using protocols like iSCSI or Fibre Channel) or NAS for file storage (using protocols like CIFS or NFS). Whether directly attached or network attached, block or file storage is closely associated with the server and the OS that is using the storage.

Object storage is different; instead of being closely associated with a server, S3 is independent of any server and is accessed over the internet. Instead of managing data as blocks or files using SCSI, CIFS or NFS protocols, data is managed as objects using an API built on standard HTTP.

Each S3 obj contains both - data n metadata. Objects reside in containers called buckets, and each obj is identified by a unique user-specified key (filename).
A bucket is a simple flat folder with no file system hierarchy. That is, you can have multiple buckets, but you can't have a sub-bucket within a bucket. Each bucket can hold an unlimited number of objects.
Just think of an S3 object as a file and the key as the filename. In S3, you GET an object or PUT an object, operating on the whole object at once, instead of incrementally updating portions of it; you can't install an OS on S3 or run a DB on it.
You don't have to worry about data durability or replication across AZs - S3 objects are automatically replicated on multiple devices in multiple facilities within a region. Same with scalability - if your request rate grows steadily, S3 automatically partitions buckets to support high request rates and simultaneous access by many clients.

If u need traditional block or file storage along with S3, AWS provides EBS service for EC2 instances. Amazon EFS (Elastic File System) provides network attached shared file storage using NFS v4 protocol.

S3 basics
----------
A bucket is a web folder (container) for objects (files) stored in S3. Every S3 object is contained in a bucket.
Buckets form the top-level namespace for S3, and bucket names are global; your bucket name must be unique across all AWS accounts, much like DNS.
Bucket names can contain up to 63 lowercase letters, numbers, hyphens and periods (dots).
You can create and use multiple buckets; you can have up to 100 per account by default.

Best practice is to use bucket names that contain your domain name and conform to the rules for DNS names. This ensures that the bucket names are your own, can be used in all regions, and can host static websites.

AWS Regions:
Although the S3 bucket namespace is global, each S3 bucket is created in a specific region that you choose. This lets you control where your data is stored. You normally create and use buckets that are located close to a particular set of end users or customers to minimize latency, or located in a particular region to satisfy data locality and sovereignty concerns. You control the location of your data; data in an S3 bucket is stored in that region unless you explicitly copy it to another bucket located in a different region.

Objects:
Objects are the entities or files stored in S3 buckets. An object can store virtually any data in any format. Object size can range from 0 bytes to 5 TB, and a single bucket can store an unlimited number of objects. This means S3 can store an unlimited amount of data.
Each object consists of data (the file itself) and metadata (data about the file). The data portion of an object is opaque to S3; it is treated simply as a stream of bytes.
The metadata associated with an S3 object is a set of name/value pairs that describe the object. There are two types of metadata: system metadata and user metadata.
System metadata is created and used by S3 itself, and it includes things like the date last modified, object size, MD5 digest, content type, etc. User metadata is optional, and it can only be specified at the time an object is created. You can use custom metadata to tag your data with attributes.

Keys:
Every object stored in S3 is identified by a unique identifier called a key. You can think of the key as a filename.
A key can be up to 1024 bytes of UTF-8 characters, including slashes, backslashes, dots and dashes.
Keys must be unique within a single bucket, but different buckets can contain objects with the same key. The combination of bucket, key and optional version ID uniquely identifies an S3 object.

Obj URL:
S3 is storage for the internet, and every S3 obj can be addressed by a unique URL formed using web services endpoint, bucket name, and obj key.
Example:
http://mybucket.s3.amazonaws.com/jack.doc
mybucket is the bucket name, jack.doc is the key or filename.
Example 2 :
http://mybucket.s3.amazonaws.com/fee/fi/fo/fum/jack.doc
bucket name is still mybucket, but now the key or filename is the string fee/fi/fo/fum/jack.doc

A key may contain delimiter characters like slashes or backslashes to help you name and logically organize your S3 objects, but to S3 it is just a long key name in a flat namespace. There is no actual file and folder hierarchy.
For convenience, the S3 console and the Prefix and Delimiter feature allow you to navigate within an S3 bucket as if there were a folder hierarchy. But remember that a bucket is a single flat namespace of keys with no structure.

S3 Operations:
Create/Delete a bucket ; Write an obj ; Read an obj ; Delete an obj; List keys in bucket.
The native interface for S3 is a REST API. With the REST API, you use HTTP or HTTPS requests to create and delete buckets, list keys, and read and write objects.
REST maps standard HTTP verbs to familiar CRUD operations: create is HTTP PUT (sometimes POST), read is HTTP GET, delete is HTTP DELETE, and update is HTTP POST (sometimes PUT).
Always use HTTPS for S3 API requests to ensure that your requests and data are secure.

In most cases, users do not use the REST API directly, but interact with S3 using the higher-level interfaces available. These include the AWS SDKs (wrapper libraries) for iOS, Android, JS, Java, .NET, Node.js, PHP, Python, C++, etc.; the AWS CLI; and the AWS Management Console.

S3 originally supported a SOAP API in addition to REST, but you should use the REST API. The legacy HTTPS endpoint is still available, but new features are not supported.

Durability n Availability:
S3 std storage is designed for 99.999999999% durability and 99.99% availability of objects over a given year.
S3 achieves high availability by automatically storing data redundantly on multiple devices in multiple facilities within a region.
It is designed to sustain the concurrent loss of data in two facilities without loss of user data.
If you need to store non-critical or easily reproducible derived data (like image thumbnails) that doesn't require this high level of durability, you can choose to use RRS (Reduced Redundancy Storage) at a lower cost. RRS offers 99.99% durability, with a lower storage cost than standard S3 storage.

Data Consistency:
S3 is an eventually consistent system. Because your data is automatically replicated across multiple servers and locations within a region, changes to your data may take some time to propagate to all locations. As a result, there are some situations where information that you read immediately after an update may return stale data.
For PUTs to new objects, this is not a concern, as S3 provides read-after-write consistency. But for PUTs to existing objects (an object overwrite to an existing key) and for object DELETEs, S3 provides eventual consistency. This means that if you PUT new data to an existing key, a subsequent GET might return the old data. Similarly, if you DELETE an object, a subsequent GET for that object might still return the deleted object. In all cases, updates to a single key are atomic - for eventually consistent reads, you will get the new data or the old data, but never an inconsistent mix.

Access control:
S3 is secure by default; when you create a bucket or object in S3, only you have access. To let you give access to others, S3 provides both coarse-grained access controls (ACLs) and fine-grained access controls (S3 bucket policies, IAM policies, and query-string authentication).
S3 ACLs allow you to grant certain coarse-grained permissions - READ, WRITE, or FULL-CONTROL - at the object or bucket level. ACLs are a legacy access control mechanism created before IAM existed.
S3 bucket policies are the recommended access control mechanism and provide much finer-grained control. Bucket policies are similar to IAM policies but differ in that:
- They are associated with the bucket resource instead of an IAM principal.
- They include an explicit reference to the IAM principal in the policy. This principal can be associated with a different AWS account, so S3 bucket policies allow you to grant cross-account access to S3 resources.
Using an S3 bucket policy, you can specify who can access the bucket, from where, and at what time of day.
IAM policies may be associated directly with IAM principals to grant access to an S3 bucket, just as they can grant access to any AWS service and resource.
Obviously, you can only assign IAM policies to principals in AWS accounts that you control.
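As one illustration of the "who can access, from where" idea, here is a minimal sketch of attaching a bucket policy with boto3. The bucket name and the source IP range are placeholders; the policy allows GetObject only from that CIDR block.

    import boto3, json

    s3 = boto3.client('s3')
    bucket = 'my-example-bucket'   # hypothetical bucket name

    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowGetFromOfficeIPs",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
            # Condition limits access to requests coming from this IP range
            "Condition": {"IpAddress": {"aws:SourceIp": "203.0.113.0/24"}}
        }]
    }

    s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))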

Static website hosting:
A common use case for S3 is static website hosting (all pages contain static data, no dynamic data).
Static websites are very fast, scalable, and more secure than dynamic websites.
You get all the benefits of S3: durability, security, availability, and scalability.
Because every S3 object has a URL, it's straightforward to turn a bucket into a website. To host a static website, you simply configure a bucket for website hosting and then upload the content of the static website to the bucket.
To configure S3 bucket for static website hosting:
1) Create a bucket with same name as the desired website hostname.
2) Upload static files to the bucket.
3) Make all files public (World readable)
4) Enable static website hosting for the bucket. It includes specifying an index document and error document.
5) The website will now be avl at S3 website url.
    <bucket-name>.s3.website-<AWS-region>.amazonaws.com.
6) Create a friendly DNS name in ur own domain for the website using a DNS CNAME, or an Amazon Route 53 alias that resolves to the S3 website url.
7) The website will now be available at your website domain name (a minimal configuration sketch follows).
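A sketch of steps 2-5 with boto3, assuming a hypothetical bucket name and file names. Note that on newer accounts you may also have to relax the bucket's Block Public Access settings before a public-read ACL is accepted.

    import boto3

    s3 = boto3.client('s3')
    bucket = 'www.example.com'   # hypothetical; matches the desired website hostname

    # Steps 2 and 3: upload a page and make it world-readable
    s3.put_object(Bucket=bucket, Key='index.html',
                  Body=open('index.html', 'rb'), ContentType='text/html',
                  ACL='public-read')

    # Step 4: enable static website hosting with index and error documents
    s3.put_bucket_website(
        Bucket=bucket,
        WebsiteConfiguration={
            'IndexDocument': {'Suffix': 'index.html'},
            'ErrorDocument': {'Key': 'error.html'},
        },
    )
    # Step 5: the site is then served at the bucket's S3 website endpoint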

Prefixes and Delimiters:
S3 uses a flat structure in a bucket, but it supports using prefix and delimiter parameters when listing key names.
This feature lets you organize, browse, and retrieve the objects within a bucket hierarchically. You can use a slash or backslash as the delimiter and then use key names with delimiters to create a file-and-folder hierarchy within the flat object key namespace of a bucket.
Ex: you might want to store a series of server logs by server name (server42), but organized by year and month:
logs/2016/January/server42.log
logs/2016/feb/server42.log

The REST API, CLI, and SDKs all support using prefixes and delimiters.
Use delimiters and object prefixes to hierarchically organize the objects in your S3 buckets, but remember that S3 is not really a file system (see the listing sketch below).
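A small sketch of listing with Prefix and Delimiter in boto3, which makes the flat key space browse like folders. The bucket name is a placeholder.

    import boto3

    s3 = boto3.client('s3')

    # List "subfolders" and objects directly under logs/2016/ in a hypothetical bucket
    resp = s3.list_objects_v2(Bucket='my-log-bucket',
                              Prefix='logs/2016/', Delimiter='/')

    for p in resp.get('CommonPrefixes', []):   # e.g. logs/2016/January/
        print('folder:', p['Prefix'])
    for obj in resp.get('Contents', []):       # keys directly under the prefix
        print('object:', obj['Key'])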


Storage Classes:
S3 offers a range of storage classes for various use cases:
1) Amazon S3 Standard -> the most commonly used storage class; it offers high availability, low latency, and high-performance object storage for short-term or long-term storage of frequently accessed data.
2) Amazon S3 Standard - Infrequent Access (Standard-IA) -> Offers the same high availability, low latency, etc., but is designed for long-lived, less frequently accessed data.
   Standard-IA has a lower per-GB-month storage cost than Standard, a minimum object size of 128 KB, a minimum duration of 30 days, and per-GB retrieval costs, so it is best suited for infrequently accessed data that is stored for longer than 30 days.
3) Amazon S3 RRS (Reduced Redundancy Storage) - offers lower durability (4 nines) than Standard or Standard-IA at reduced cost. Suited for derived data that can be easily reproduced, like image thumbnails.
4) Amazon Glacier storage - extremely low-cost storage for data that does not require real-time access, like archives and long-term backups, where a retrieval time of several hours is suitable. To retrieve a Glacier object, you issue a restore command using one of the S3 APIs; 3 to 5 hours later, the Glacier object is copied to S3 RRS.
Note that the restore simply creates a copy in S3 RRS; the original data object remains in Glacier until explicitly deleted. Also, Glacier allows you to retrieve up to 5% of the S3 data stored in Glacier for free each month; restores beyond the daily restore allowance incur a cost.
Along with acting as a storage tier in S3, Glacier is also a standalone storage service with a separate API and some unique features. But if you use Glacier as a storage class of S3, you always interact with the data via the S3 API.

Obj Lifecycle mgmt:
Using S3 lifecycle configuration rules, u can significantly reduce ur storage costs by automatically transitioning data from one storage class to another or even automatically deleting after a period of time. For ex:, lifecycle rules for backup data might be:
1) Store backup data initially in S3 Standard
2) After 30 days, transition to Standard-IA
3) After 90 days, transition to Amazon Glacier
4) After 3 years, delete.
Lifecycle configurations are attached to the bucket and can apply to all objects in the bucket or only to objects specified by a prefix, as in the sketch below.
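The backup-data example above could be expressed roughly like this with boto3's put_bucket_lifecycle_configuration; the bucket name and prefix are placeholders, not a prescribed setup.

    import boto3

    s3 = boto3.client('s3')

    s3.put_bucket_lifecycle_configuration(
        Bucket='my-backup-bucket',                    # hypothetical bucket
        LifecycleConfiguration={'Rules': [{
            'ID': 'backup-tiering',
            'Filter': {'Prefix': 'backups/'},         # apply only to this prefix
            'Status': 'Enabled',
            'Transitions': [
                {'Days': 30, 'StorageClass': 'STANDARD_IA'},   # after 30 days
                {'Days': 90, 'StorageClass': 'GLACIER'},       # after 90 days
            ],
            'Expiration': {'Days': 1095},             # delete after ~3 years
        }]},
    )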


Encryption:
To encrypt your S3 data in flight, you can use the S3 Secure Sockets Layer (SSL) API endpoints. This ensures that all data sent to and from S3 is encrypted while in transit using HTTPS.
To encrypt ur S3 data at rest, u can use several variations of SSE (Server-Side Encryption).
S3 encrypts ur data at obj level as it writes it to disks in its data centers n decrypts it for u when u access it.
All SSE performed by S3 and AWS KMS (Key Mgmt Service) uses 256-bit Advanced Encryption Standard (AES).
U can also encrypt ur S3 data at rest using Client-side encryption, encrypting ur data on the client b4 sending it to S3.
1) SSE-S3
AWS handles key management and key protection for S3. Every object is encrypted with a unique key. The actual key itself is further encrypted by a separate master key. A new master key is issued at least monthly, with AWS rotating the keys. Encrypted data, encryption keys, and master keys are all stored separately on secure hosts in order to enhance protection.
2) SSE-KMS
AWS handles key management and key protection for S3, but here you manage the keys. SSE-KMS offers additional benefits compared to SSE-S3: there are separate permissions for using the master key, which provide protection against unauthorized access to your objects stored in S3 and an additional layer of control. Also, AWS KMS provides auditing, so you can see who used your key to access which object and when, as well as failed attempts to access data by users who did not have permission to decrypt the data.
3) SSE-C (Customer provided keys)
Used when you want to maintain your own encryption keys but don't want to manage or implement your own client-side encryption library. Here, AWS performs the encryption/decryption of your objects while you maintain full control of the keys used to encrypt/decrypt the objects in S3.
4) Client-Side encryption
Encrypting data on the client side of your app before sending it to S3. You have two options for the data encryption keys: 1) use an AWS KMS-managed customer master key, or 2) use a client-side master key.
With client-side encryption, you get end-to-end control of the encryption process, including management of the encryption keys.

For simplicity and ease of use, use server-side encryption with AWS-managed keys (SSE-S3 or SSE-KMS); a short sketch follows.
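A sketch of requesting server-side encryption per object with boto3: 'AES256' selects SSE-S3, and SSE-KMS is selected by passing 'aws:kms' plus an optional key identifier. The bucket, keys, and KMS key alias below are placeholders.

    import boto3

    s3 = boto3.client('s3')
    bucket = 'my-secure-bucket'   # hypothetical

    # SSE-S3: Amazon S3 manages the encryption keys
    s3.put_object(Bucket=bucket, Key='report.pdf', Body=b'...data...',
                  ServerSideEncryption='AES256')

    # SSE-KMS: encrypt with a customer-managed KMS key (placeholder alias)
    s3.put_object(Bucket=bucket, Key='report-kms.pdf', Body=b'...data...',
                  ServerSideEncryption='aws:kms',
                  SSEKMSKeyId='alias/my-app-key')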

Versioning:
S3 versioning allows you to protect your data against accidental or malicious deletion by keeping multiple versions of each object in the bucket, each identified by a unique version ID. Versioning allows you to preserve, retrieve and restore every version of every object stored in the bucket. You can restore an object by simply referencing its version ID along with the bucket and object key.
Versioning is turned on at the bucket level. Once enabled, versioning cannot be removed from a bucket; it can only be suspended (see the sketch below).
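Turning versioning on (and later suspending it) with boto3 looks roughly like this; the bucket name is a placeholder.

    import boto3

    s3 = boto3.client('s3')
    bucket = 'my-versioned-bucket'   # hypothetical

    # Enable versioning at the bucket level
    s3.put_bucket_versioning(Bucket=bucket,
                             VersioningConfiguration={'Status': 'Enabled'})

    # Versioning can only be suspended afterwards, never removed
    s3.put_bucket_versioning(Bucket=bucket,
                             VersioningConfiguration={'Status': 'Suspended'})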

MFA Delete:
MFA Delete provides data protection on top of bucket versioning. MFA Delete requires additional authentication to permanently delete an object version or change the versioning state of the bucket. In addition to normal security credentials, MFA Delete requires an OTP generated by a hardware or virtual MFA device.
NOTE: MFA Delete can only be enabled by the root account.

Pre-signed URLs:
All S3 objects are private by default (only the owner has access). But the object owner can share objects with others by creating a pre-signed URL (using their own security credentials) to grant time-limited permission to download the objects.
When you create a pre-signed URL for your object, you must provide your own credentials and specify a bucket name, an object key, the HTTP method (GET to download the object), and an expiration date and time.
Pre-signed URLs are valid only for the specified duration. This is useful to protect against 'content scraping' of web content like media files stored in S3.
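A sketch of generating a time-limited download URL with boto3; the bucket and key are placeholders, and the URL is signed with the caller's credentials and carries the expiry.

    import boto3

    s3 = boto3.client('s3')

    # URL is valid for one hour; anyone holding it can GET the object until it expires
    url = s3.generate_presigned_url(
        'get_object',
        Params={'Bucket': 'my-media-bucket', 'Key': 'videos/clip.mp4'},  # placeholders
        ExpiresIn=3600,
    )
    print(url)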


Multipart Upload:
To upload or copy large objects, S3 provides the Multipart Upload API. This allows you to upload large objects in parts, which gives better network utilization (through parallel transfers), the ability to pause and resume, and the ability to upload objects whose size is initially unknown.
Multipart upload is a three-step process: initiation, uploading the parts, and completion (or abort). Parts can be uploaded independently and in arbitrary order, with retransmission if needed. After all parts are uploaded, S3 assembles them to create the object.
Normally you should use multipart upload for objects over 100 MB, and you must use multipart upload for objects over 5 GB.
When using the low-level APIs, you must break the file into parts and keep track of them.
When using the high-level APIs, and the high-level S3 commands in the CLI (like aws s3 cp, aws s3 mv, aws s3 sync), multipart upload is automatically performed for large objects (see the sketch below).
You should set an object lifecycle policy on a bucket to abort incomplete multipart uploads after a specified number of days. This minimizes the storage costs associated with multipart uploads that were not completed.
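With the high-level boto3 transfer API, multipart upload happens automatically above a configurable threshold; a sketch with placeholder file and bucket names follows.

    import boto3
    from boto3.s3.transfer import TransferConfig

    s3 = boto3.client('s3')

    # Use multipart upload for anything over 100 MB, with parallel part uploads
    config = TransferConfig(multipart_threshold=100 * 1024 * 1024,
                            max_concurrency=8)

    s3.upload_file('backup.tar', 'my-backup-bucket', 'backups/backup.tar',
                   Config=config)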

Range GETs:
It's possible to download (GET) only a portion of an object in both S3 and Amazon Glacier by using a Range GET.
Using the Range HTTP header in a GET request, or the equivalent parameters in one of the SDK wrapper libraries, you specify a range of bytes of the object. This is useful when dealing with large objects over poor network connectivity, or when you want to download only a portion of a large Amazon Glacier backup (see the sketch below).
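A Range GET sketch with boto3, fetching only the first kilobyte of an object; the bucket and key are placeholders.

    import boto3

    s3 = boto3.client('s3')

    # Fetch only bytes 0-1023 of the object using the HTTP Range header
    resp = s3.get_object(Bucket='my-big-data-bucket', Key='logs/huge.log',
                         Range='bytes=0-1023')
    first_kb = resp['Body'].read()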

Cross-Region Replication:
Allows you to asynchronously replicate all new objects from a source bucket in one AWS region to a target bucket in another region. The metadata and ACLs associated with each object are also part of the replication.
After you set up cross-region replication on your source bucket, any change to the data, metadata or ACLs of an object in the source bucket triggers a new replication to the destination bucket.
To enable cross-region replication, versioning must be turned on for both the source and destination buckets, and you must use an IAM policy to give S3 permission to replicate objects on your behalf (see the sketch below).
Cross-region replication is commonly used to reduce the latency required to access objects in S3, by placing objects closer to a set of users, or to meet requirements to store backup data at a certain distance from the original source data.
NOTE: If turned on for an existing bucket, cross-region replication will only replicate new objects. Existing objects will not be replicated and must be copied to the new bucket via a separate command.
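A sketch of enabling cross-region replication with boto3, assuming versioning is already on for both buckets and that an IAM role with replication permissions already exists. All names and ARNs are placeholders; newer replication rule schemas use Filter/Priority instead of the Prefix field shown here.

    import boto3

    s3 = boto3.client('s3')

    s3.put_bucket_replication(
        Bucket='my-source-bucket',                            # source bucket (versioned)
        ReplicationConfiguration={
            'Role': 'arn:aws:iam::123456789012:role/s3-crr-role',   # placeholder role
            'Rules': [{
                'ID': 'replicate-all-new-objects',
                'Prefix': '',                                 # empty prefix = everything
                'Status': 'Enabled',
                'Destination': {'Bucket': 'arn:aws:s3:::my-dest-bucket'},
            }],
        },
    )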

Logging:
To track requests to your S3 bucket, you can enable S3 server access logs. Logging is off by default, but it can be turned on easily.
When you turn on logging for a source bucket, you specify where the logs will be stored (the target bucket). You can store the logs in the same bucket or in a different bucket (a configuration sketch follows the field list below).
Best practice is to specify a prefix (like logs/ or yourbucketname/logs/) so you can identify your logs (but this is optional).
Once enabled, logs are delivered on a best-effort basis (with a slight delay).
Logs include:
Requestor account and IP address
Bucket name
Request time
Action (GET, PUT, LIST, etc)
Response status or error code
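Turning on server access logging with boto3 might look like this; the source and target bucket names and prefix are placeholders, and the target bucket must already allow the S3 log delivery service to write to it.

    import boto3

    s3 = boto3.client('s3')

    s3.put_bucket_logging(
        Bucket='my-app-bucket',                   # source bucket being logged
        BucketLoggingStatus={
            'LoggingEnabled': {
                'TargetBucket': 'my-log-bucket',  # where log files are delivered
                'TargetPrefix': 'logs/my-app-bucket/',
            }
        },
    )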


Event notifications:
S3 event notifications can be sent in response to actions taken on objects uploaded or stored in S3. Event notifications enable you to run workflows, send alerts, or perform other actions in response to changes in your objects stored in S3.
You can use S3 event notifications to set up triggers that perform actions like transcoding media files when they are uploaded (converting a media file to a device-supported format), processing data files when they become available, or syncing S3 objects with other data stores.
S3 event notifications are set at the bucket level, and you can configure them through the S3 console, the REST API, or the AWS SDKs.
S3 can publish notifications when new objects are created (by a PUT, POST, COPY or multipart upload completion), when objects are removed (by DELETE), or when S3 detects that an RRS object was lost. You can also set up event notifications based on object name prefixes and suffixes.
Notification messages can be sent through either SNS (Simple Notification Service) or SQS, or delivered directly to AWS Lambda to invoke Lambda functions (see the sketch below).
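A sketch of wiring object-created events for .jpg uploads to a Lambda function via boto3; the bucket name and function ARN are placeholders, and the function must already grant S3 permission to invoke it.

    import boto3

    s3 = boto3.client('s3')

    s3.put_bucket_notification_configuration(
        Bucket='my-photo-bucket',                 # hypothetical bucket
        NotificationConfiguration={
            'LambdaFunctionConfigurations': [{
                'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:123456789012:function:make-thumbnail',
                'Events': ['s3:ObjectCreated:*'],
                'Filter': {'Key': {'FilterRules': [
                    {'Name': 'suffix', 'Value': '.jpg'},   # only act on .jpg keys
                ]}},
            }],
        },
    )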

Best practices:
Its a common pattern to use S3 storage in hybrid IT envts and apps; For ex: data in on-premise file systems, dbs, and archives can easily b backed up over the internet to S3 or Glacier, while the primary app or db storage remains on-premise.
Another common pattern is to use S3 as bulk BLOB storage for data, while keeping an index to that data in another service like DynamoDB or RDS. This allows  quick searches n complex queries on key names without listing keys continually.
S3 will scale automatically to support high request rates, automatically re-partitioning your buckets as needed. If you need request rates higher than 100 requests per second, review the S3 best practices guidelines in the Developer Guide. To support higher request rates, it's best to ensure some level of random distribution of keys, e.g. by including a hash as a prefix to key names.
TIP: If you are using S3 in a GET-intensive mode (where nearly all requests are GETs), such as static website hosting, for best performance you should consider using a CloudFront distribution as a caching layer in front of your S3 bucket.


Amazon Glacier:
Extremely low-cost storage service for data archiving and online backup. A retrieval time of 3 to 5 hours is expected.
Can store unlimited data.
Common use cases: replacement of traditional tape solutions for long-term backup and archive, and storage of data required for compliance purposes. In most cases, the data in Glacier consists of large TAR (Tape Archive) or ZIP files.
Like S3, Glacier is extremely durable, storing data on multiple devices across multiple facilities in a region.
Designed for 99.999999999% durability of objects over a given year.

Archives:
In Glacier, data is stored in archives. An archive can contain up to 40 TB of data, and you can have an unlimited number of archives.
Each archive is assigned a unique archive ID at the time of creation. Unlike an S3 object key, you cannot specify a user-friendly archive name.
All archives are automatically encrypted and are immutable - after an archive is created, it cannot be modified.

Vaults:
Vaults are containers for archives. Each AWS account can have up to 1000 vaults. You can control access to your vaults, and the actions allowed on them, using IAM policies or vault access policies.

Vault Locks:
You can easily deploy and enforce compliance controls for Glacier vaults with a vault lock policy.

Data Retrieval:
You can retrieve up to 5% of your data stored in Glacier for free each month, calculated on a pro-rata basis. Above 5%, you incur additional charges. To eliminate those fees, you can set a data retrieval policy on a vault to limit your retrievals to the free tier or to a specified data rate.


Glacier VS S3:
Glacier supports 40 TB archives vs 5 TB objects in S3.
Archives in Glacier are identified by system-generated archive IDs, whereas S3 lets you use user-friendly key names.
Archives are automatically encrypted, while encryption at rest is optional in S3.
However, by using Glacier as an S3 storage class together with object lifecycle policies, you can use the S3 interface to get most of the benefits of Glacier without learning a new interface.

IMP:
S3 Standard Storage is designed for 11 nines durability and 4 nines availability of objects over a year. Other storage classes differ.

--------------
Questions:
--------------
1) In what ways does S3 differ from block and file storage?
Ans. Objs r stored in buckets; Objs contain both data n metadata

2) Which is not correct for S3 ?
Ans.
Storing a file system mounted to an EC2 instance
Primary storage for a database
NOTE: S3 cannot be mounted to EC2 instance like a file system and should not serve as primary db storage.

3) Key chars of S3:
Ans.
All objs have a URL
S3 uses REST API
S3 can store unlimited amount of data.
NOTE: Storage in bucket does not need to be pre-allocated

4) Which features can b used to restrict access to S3 data
Ans.
Create a pre-signed URL for an obj
Use S3 ACL on a bucket or obj
Use S3 bucket policy

5) Ur application stores critical data in S3, which must be protected against intentional deletion. How can this data b protected ?
Ans.
Enable versioning on the bucket
Enable MFA Delete on the bucket
NOTE: Cross-region replication will not prevent intentional deletion; neither will a lifecycle policy that migrates data to Glacier.

6) Ur company stores documents in S3, but wants to minimize cost. Most docs r used actively for only about a month, then much less frequently. However, all
data needs to be avl within minutes when requested. How to meet these requirements ?
Ans.
Migrate the data to S3 Standard-IA (Infrequent Access) after 30 days.
NOTE: Migrating to Glacier is cost-effective, but retrieval can take hours.
And RRS should only be used for easily replicated data, not critical data.

7) How is data stored in S3 for high durability
Ans.
Data is automatically replicated within a region.
NOTE: Replication to other regions and versioning is optional. S3 data is not backed up to tape.

8) Which is correct for:
https://bucket1.abc.com.s3.amazonaws.com/folderx/myfile.doc
Ans.
The obj "folderx/myfile.doc" is stored in bucket - "bucket1.abc.com."

9) To have a record of who accessed ur S3 data and from where, u should:
Ans. Enable server logs on the bucket

10) Reason to enable cross-region replication on S3 bucket
Ans.
You have a set of users/customers who can access the second bucket with lower latency.
For compliance reasons, you need to store the data in a location at least 300 miles away from the first region.
NOTE: S3 is designed for 11 nines durability for objs in a single region, so 2nd region wont increase durability. And cross region replication does not
protect against accidental deletion

11) Ur company requires that all data sent to external storage be encrypted before being sent. Which S3 encryption solution will meet this ?
Ans.
Client-side encryption with customer-managed keys.
If data is to be encrypted b4 being sent to S3, client side encryption is needed.

12) U have a web app that accesses data stored in S3 bucket. U expect the access to be very read-intensive, with expected request rates upto 500 GETs per second from many clients. How can u increase the performance and scalability of S3 in this case
Ans.
Ensure randomness in the namespace by including a hash prefix to key names.
NOTE: S3 scales automatically, but for request rates above 100 GETs per second, it helps to make sure there is some randomness in the key space. Replication and logging will not affect performance or scalability. Using sequential key names could have a negative effect on performance or scalability.


13) What is needed before u enable cross-region replication on S3 bucket
Ans.
Enable versioning on bucket
Create an AWS IAM policy to allow S3 to replicate objs on ur behalf.
NOTE for other options mentioned:  Lifecycle rules migrate data from one storage class to another, not from one bucket to another.

14) Ur company has 100TB of financial records that need to be stored for 7 years by law. Experience shows that any record more than 7 year old is unlikely to be accessed. Which storage plans meets these needs in most effective manner ?
Ans.
Store the data on S3 with lifecycle policies that change the storage class to Glacier after 1 year and delete the obj after 7 years.

15) S3 bucket policies can restrict access to an S3 bucket and objs by:
Ans.
IP Address Range, AWS Account, Objects with a specific prefix

16) S3 is an eventually consistent storage system. For what kinds of operations is it possible to get stale data as a result of eventual consistency ?
Ans.
GET after overwrite PUT (PUT to an existing key)
GET or LIST after a DELETE
NOTE for other options mentioned: S3 provides read-after-write consistency for PUTs to new objs (new key), but eventual consistency for GETs and DELETEs of existing objs (existing keys)

17) What must be done to host a static website in S3 bucket ?
Ans.
Configure the bucket for static hosting and specify an index and an error document
Create a bucket with same name as the website
Make objs in bucket world-readable
NOTE: S3 does not support FTP transfers, and HTTP does not need to be enabled (its by default enabled)

18) You have valuable media files hosted on AWS and want them to be served only to authenticated users of your web application. You are concerned that your content could be stolen and distributed for free. How can you protect your content?
Ans.
Generate pre-signed URLs for content in the web application (see the sketch after this note)
NOTE: Pre-signed URLs allow you to grant time-limited permission to download objects from an S3 bucket.
 Static website hosting generally requires world-readable access to all content.
 AWS IAM policies do not know who the authenticated users of the web app are.
 Logging can help track content loss, not prevent it.
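A minimal boto3 sketch of generating such a pre-signed URL; the bucket, key, and expiry are placeholders for illustration:

import boto3

s3 = boto3.client("s3")

# Time-limited download link for a single object; after ExpiresIn seconds the URL stops working.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "example-media-bucket", "Key": "videos/episode-01.mp4"},
    ExpiresIn=300,  # 5 minutes
)
print(url)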


19) Glacier is suited for long-term archival storage and not suited to data that needs immediate access or short-lived data that is erased within 90 days

20) What is true about Glacier?
Ans.
Glacier takes 3 to 5 hours to restore data
Glacier vaults can be locked
Glacier can be used as a standalone service and as an S3 storage class
NOTE: Glacier stores data in archives, which are contained in vaults. Archives are identified by system-generated archive IDs, not key names.
This statement is incorrect: "Glacier stores data in objects that live in archives."


======================================================================================================================================
Chapter-3 EC2 and EBS
======================================================================================================================================
EC2 is a primary AWS web service that provides resizable compute capacity in the cloud.
EC2 allows you to acquire compute by launching virtual servers called instances.
When you launch an instance, you can use the compute as you wish.
Because you are paying for the computing power of the instance, you are charged per hour while the instance is running. When you stop the instance, you are no longer charged.
Two key concepts are involved in launching instances on AWS:
a) the amount of virtual hardware dedicated to the instance
b) the software loaded on the instance
These two dimensions are controlled by the instance type and the AMI.


Instance types:
The instance type defines the virtual hardware supporting an EC2 instance. There are many instance types available, varying in the following dimensions:
Virtual CPUs (vCPUs)
Memory
Storage (size and type)
Network performance

Instance types are grouped into families based on the ratio of these values to each other. For example, the m4 family provides a balance of compute, memory, and network resources, and is a good choice for many applications.
An m4.xlarge instance (current) costs twice as much as an m4.large instance.
Sample instance type families:
c4  -  Compute optimized  -  For workloads requiring significant processing
r3 (current db.r3.large)  -  Memory optimized  -  For memory-intensive workloads
i2  -  Storage optimized  -  For workloads requiring high amounts of fast SSD storage
g2  -  GPU-based instances  -  Intended for graphics and general-purpose GPU compute workloads

AWS keeps adding new instance families; check the website for the latest.
Another variable to consider when choosing an instance type is network performance. For most instance types, AWS publishes a relative measure of network performance: low, moderate, or high.
Some instance types specify a network performance of 10 Gbps. Network performance increases within a family as the instance type grows.

For workloads requiring high network performance, many instance types support enhanced networking. Enhanced networking reduces the impact of virtualization on network performance by enabling a capability called Single Root I/O Virtualization (SR-IOV). This results in more packets per second (PPS), lower latency and less jitter.
Enhanced networking is avl only for instances launched in Amazon VPC.


Typical configuration of an m4.xlarge instance:
vCPUs: 4
Processor: 2.3 GHz Intel Xeon E5 processor
Memory: 16 GB
Storage: EBS-only
Dedicated EBS bandwidth: 750 Mbps
Network performance: High
Support for enhanced networking: Yes
------------------------------------
Configuration of C5 (compute optimized) instances, example c5.xlarge:

vCPUs: 4
Processor: 3 GHz Intel Xeon Platinum processor
Memory: 8 GB
Storage: EBS-only
Dedicated EBS bandwidth: 3,500 Mbps
Network performance: Up to 10 Gbps
Support for enhanced networking: Yes
-----------------------------------------------------
Configuration of R5 (memory optimized) instances, example r5.xlarge:

vCPUs: 4
Processor: 3.1 GHz Intel Xeon Platinum 8000 processor
Memory: 32 GB
Storage: EBS-only
Dedicated EBS bandwidth: 3,500 Mbps
Network performance: Up to 10 Gbps
Support for enhanced networking: Yes


-----------------------------------------------------

NOTE:
Instance families h1, i3, d2 are storage-optimized instances.
Instance families r5, r4, x1e, x1, z1d are memory-optimized instances.
Instance families c5, c4, c5n are compute-optimized instances.
Instance families t3, t2, m5, m4, m5a, t3a are general-purpose instances (providing a balance of compute, memory, and network resources) and a good choice for many applications.
Instance families g3, f1, p2, p3 are accelerated-computing (GPU) instances, needed mainly for high-graphics and GPU compute workloads.


AMIs (Amazon Machine Images):
An AMI defines the initial software that will be on an instance when it is launched. The AMI defines every aspect of the software state at instance launch, including:
The OS and its configuration
The initial state of any patches
Application or system software
All AMIs are based on x86 OSs, either Linux or Windows.
There are four sources of AMIs:
1) Published by AWS
2) The AWS Marketplace
3) Generated from existing instances - commonly used
4) Uploaded virtual servers


Addressing an instance:
There are several ways an instance may be addressed over the web upon creation:
1) Public DNS name:
When you launch an instance, AWS creates a DNS name that can be used to access the instance. This DNS name is generated automatically and cannot be specified by the customer. The DNS name persists only while the instance is running and cannot be transferred to another instance.
2) Public IP:
A launched instance can have a public IP assigned. It is assigned from the addresses reserved by AWS and cannot be specified. This IP is unique on the internet, persists only while the instance is running, and cannot be transferred to another instance.
3) Elastic IP:
An Elastic IP address is an address unique on the internet that you reserve independently and associate with an EC2 instance. It differs from a public IP in that it persists until the customer releases it and is not tied to the lifetime or state of an instance. Because it can be transferred to a replacement instance in case of an instance failure, it is a public address that can be shared externally without coupling clients to a particular instance.

Other methods of addressing an instance are: 4) Private IP addresses  5) Elastic Network Interfaces (ENIs) - discussed in Chapter 4 (VPC).


Initial Access:
EC2 uses public-key cryptography to encrypt and decrypt login information. Public-key cryptography uses a public key to encrypt a piece of data and an associated private key to decrypt the data. These two keys together are called a key pair.
Key pairs can be created through the AWS Management Console, CLI, or API, or customers can upload their own key pairs.
AWS stores the public key, and the private key is kept by the customer. The private key is essential to gain secure access to an instance for the first time.

Store your private keys securely. When EC2 launches a Linux instance, the public key is stored in the ~/.ssh/authorized_keys file of an initial user created on the instance. The initial user can vary depending on the OS; for example, on Amazon Linux the initial user is ec2-user. Initial access to the instance is obtained by using ec2-user and the private key to log in via SSH. At this point, you can configure other users and enroll in a directory such as LDAP.

(Similar to the concept used for logging in via WinSCP/PuTTY to the DEV and Prod environments; and
the same concept as GlobalScape and the SAP team sharing their public key, which we installed on our system so they can access our servers with their private keys.)
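A small boto3 sketch of creating a key pair; the key name and file path are placeholders. AWS keeps only the public key, and the private key material is returned just once in this response:

import boto3

ec2 = boto3.client("ec2")

resp = ec2.create_key_pair(KeyName="example-keypair")

# Save the private key locally and protect it; AWS does not store it.
with open("example-keypair.pem", "w") as f:
    f.write(resp["KeyMaterial"])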

When launching a Windows instance, EC2 generates a random password for the local administrator account and encrypts the password using the public key.
Initial access to the instance is obtained by decrypting the password with the private key, either in the console or through the API. The decrypted password can be used to log in to the instance with the local administrator account via RDP. At this point, you can create other local users and/or connect to an AD domain.
Best practice is to change the initial local administrator password.

Virtual Firewall Protection:
AWS allows you to control traffic in and out of your instances through virtual firewalls called security groups (SGs). SGs allow you to control traffic based on port, protocol, and source/destination.
SGs are associated with instances when they are launched. Every instance must have at least one SG, but can have more.
An SG is default-deny, i.e., it does not allow any traffic that is not explicitly allowed by an SG rule.
A rule is defined by three attributes:
Port (e.g., 80 for HTTP traffic)
Protocol (e.g., TCP or UDP)
Source/destination: Identifies the other end of the communication, the source for incoming traffic rules or the destination for outgoing traffic rules. The source/destination can be defined in two ways: 1) a CIDR block (x.x.x.x/x style, defining a range of IP addresses)  2) an SG - includes any instance that is associated with the given SG. This helps prevent coupling SG rules to specific IP addresses.

When an instance is associated with multiple SGs, the rules are aggregated and all traffic allowed by each SG is allowed. For example, if SG 'A' allows RDP traffic from 72.58.0.0/16 and SG 'B' allows HTTP and HTTPS traffic from 0.0.0.0/0 and your instance is associated with both SGs, then both RDP and HTTP/HTTPS traffic will be allowed into your instance.

An SG is a stateful firewall (i.e., an outgoing message is remembered so that the response is allowed through the SG without an explicit inbound rule being required).
SGs are applied at the instance level (as opposed to a traditional on-premises firewall that protects at the perimeter). The advantage is that instead of having to breach a single perimeter to access all instances, an attacker would have to breach the SG separately for each individual instance.
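A hedged boto3 sketch of adding the kind of rules described above to an existing security group; the group ID and CIDR ranges are placeholders:

import boto3

ec2 = boto3.client("ec2")

ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",
    IpPermissions=[
        {  # HTTPS from anywhere
            "IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
            "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
        },
        {  # SSH only from a corporate range
            "IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
            "IpRanges": [{"CidrIp": "203.0.113.0/24"}],
        },
    ],
)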

Lifecycle of instances:
1) Bootstrapping:
In the cloud, you can script virtual hardware management in a way that is not possible with on-premises hardware. There has to be some way to configure instances and install applications programmatically when an instance is launched. The process of providing code to be run on an instance at launch is called bootstrapping.
One of the parameters when an instance is launched is a string value called UserData. This string is passed to the OS to be executed as part of the launch process the first time the instance is booted. On Linux instances it can be a shell script; on Windows it can be a batch script or PowerShell script.
These scripts can perform tasks such as:
Applying patches and updates to the OS
Enrolling in a directory service
Installing application software
Copying a longer script or program from storage to be run on the instance
Installing Chef or Puppet and assigning the instance a role so the configuration management software can configure the instance.
NOTE: UserData is stored with the instance and is not encrypted, so it is important not to include any secrets such as passwords or keys in UserData.
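A minimal boto3 sketch of passing UserData at launch, assuming an Amazon Linux 2 AMI; the AMI ID, key name, and security group ID are placeholders:

import boto3

ec2 = boto3.client("ec2")

# Bootstrap script executed on first boot (Amazon Linux 2 assumed).
user_data = """#!/bin/bash
yum update -y
yum install -y httpd
systemctl enable --now httpd
"""

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",
    InstanceType="t2.micro",
    MinCount=1,
    MaxCount=1,
    KeyName="example-keypair",
    SecurityGroupIds=["sg-0123456789abcdef0"],
    UserData=user_data,  # not encrypted, so never put secrets here
)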

2) VM Export/Import:
VM Import/Export enables you to import virtual machines from your existing environment as EC2 instances (and as AMIs) and export them back to your on-premises environment. You can only export previously imported EC2 instances; instances launched within AWS from AMIs cannot be exported.

3) Instance metadata:
Instance metadata is data about your instance that you can use to configure or manage the running instance. It is available from within the instance via an HTTP call to http://169.254.169.254/latest/meta-data/. Instance metadata includes attributes like:
Associated SGs
Instance ID
Instance type
The AMI used to launch the instance


Managing instances:
When the number of instances begins to rise, it becomes difficult to keep track of them. TAGS can help you manage the instances.
Tags are key/value pairs you can associate with your instance or other services. Tags can be used to identify attributes of an instance like project, environment (dev, test, and so on), billing information, etc. You can apply up to 10 tags per instance.
Sample tags:
Key           Value
Project       TimeEntry
Environment   Production
BillingCode   4004
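A boto3 sketch of applying the sample tags above to an instance; the instance ID is a placeholder:

import boto3

ec2 = boto3.client("ec2")

ec2.create_tags(
    Resources=["i-0123456789abcdef0"],
    Tags=[
        {"Key": "Project", "Value": "TimeEntry"},
        {"Key": "Environment", "Value": "Production"},
        {"Key": "BillingCode", "Value": "4004"},
    ],
)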

Monitoring instances:
Amazon CloudWatch provides monitoring and alerting for EC2 instances, as well as other AWS infrastructure.

Modifying an instance:
There are many aspects of an instance that can be modified after launch.
a) Instance type:
If the compute needs prove to be higher or lower than expected, instances can be changed to a different size more appropriate to the workload. Instances can be resized using the AWS Management Console, CLI, or API.
To resize an instance, set the state to Stopped, choose 'Change instance type', select the desired instance type, and restart the instance.
b) Security groups:
If an instance is running in a VPC, you can change which SGs are associated with the instance while it is running. For instances outside of a VPC (called EC2-Classic), the SG association cannot be changed after launch.

Termination protection:
To prevent termination via the AWS Management Console, CLI, or API, termination protection can be enabled for an instance.
When enabled, calls to terminate the instance will fail until termination protection is disabled. This prevents accidental termination through human error.
NOTE: This only protects against termination calls from the AWS Management Console, CLI, or API. It does not prevent termination triggered by an OS shutdown command, termination from an Auto Scaling group, or termination of a Spot instance due to Spot price changes.
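A short boto3 sketch of turning termination protection on (and back off) for an instance; the instance ID is a placeholder:

import boto3

ec2 = boto3.client("ec2")

# Enable termination protection; console/CLI/API terminate calls will now fail.
ec2.modify_instance_attribute(
    InstanceId="i-0123456789abcdef0",
    DisableApiTermination={"Value": True},
)

# Disable it again before a deliberate termination.
ec2.modify_instance_attribute(
    InstanceId="i-0123456789abcdef0",
    DisableApiTermination={"Value": False},
)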

Pricing Options:
You are charged for EC2 instances for each hour that they are in a running state, but the amount you are charged per hour can vary based on three pricing options:
On-Demand instances, Reserved instances, Spot instances
On-Demand instances:
The price per hour for On-Demand instances is published on the AWS website. This is the most flexible pricing option, as it requires no up-front commitment, and the customer has control over when the instance is launched and when it is terminated. It is the least cost-effective of the three pricing options per compute hour, but its flexibility allows customers to save by provisioning a variable level of compute for unpredictable workloads.
Reserved Instances:
Enable customers to make capacity reservations for predictable workloads. By using Reserved instances for these workloads, customers can save up to 75% over the On-Demand hourly rate. When purchasing a Reserved instance, the customer specifies the instance type and AZ for that Reserved instance and achieves a lower effective hourly price for that instance for the duration of the reservation. An additional benefit is that capacity in the AWS data centers is reserved for that customer.
Two factors determine the cost of the reservation: 1) term commitment  2) payment option
The term commitment is the duration of the reservation and can be either one or three years. The longer the commitment, the bigger the discount.
There are three different payment options for Reserved instances:
All Upfront: Pay for the entire reservation up front. There is no monthly charge for the customer during the term.
Partial Upfront: Pay part of the reservation charge up front and the rest in monthly installments for the duration of the term.
No Upfront: Pay the entire reservation charge in monthly installments for the duration of the term.
The discount is greater the more the customer pays up front.
Example: Assume a 3-year reservation on the effective hourly cost of an m4.2xlarge instance. The cost of running one instance continuously for 3 years (i.e., 26,280 hours) under both pricing options is shown below:
Pricing                        Effective hourly rate                    Total 3-year cost
On-Demand                      $0.479/hour                              $0.479/hour * 26,280 hours = $12,588.12
3-year All Upfront reservation $4,694 / 26,280 hours = $0.1786/hour     $4,694

So the savings is almost 63%.

When your computing needs change, you can modify your Reserved instances and continue to benefit from your capacity reservation. Modification does not change the remaining term of your Reserved instances; their end dates remain the same. There is no fee, and you do not receive any new bills or invoices.
You can make modifications such as:
Switch AZs within the same region
Change between EC2-VPC and EC2-Classic
Change the instance type within the same instance family (Linux instances only)
Spot Instances:
For workloads that are not time-critical and are tolerant of interruption, Spot instances offer the greatest discount. Customers specify the price they are willing to pay for a certain instance type. When the customer's bid price is above the current Spot price, the customer will receive the requested instance(s).
These instances operate like all other EC2 instances, and the customer only pays the Spot price for the hours that the instance(s) run.
The instances run until:
The customer terminates them.
The Spot price goes above the customer's bid price.
There is not enough unused capacity to meet the demand for Spot instances.
If EC2 needs to terminate a Spot instance, the instance will receive a termination notice providing a two-minute warning prior to EC2 terminating the instance.
Because of the possibility of interruption, Spot instances should only be used for workloads tolerant of interruption.

Architectures with different pricing models:
For instance, a website that averages 5,000 visits per day but ramps up to 20,000 per day during peak periods may purchase two Reserved instances to handle the average traffic, but depend on On-Demand instances to fulfill compute needs during peak times.

Tenancy options:
There are several tenancy options for EC2 instances that can help customers achieve security and compliance goals:
Shared tenancy:
The default tenancy model for all EC2 instances. Shared tenancy means a single host machine may house instances from different customers. As AWS isolates instances from other instances on the same host, this is a secure tenancy model.
Dedicated Instances:
Dedicated Instances run on hardware that is dedicated to a single customer. As a customer runs more Dedicated Instances, more underlying hardware may be dedicated to their account. Other instances in the account (not designated as dedicated) will run on shared tenancy and will be isolated at the hardware level from the Dedicated Instances in the account.
Dedicated Host:
An EC2 Dedicated Host is a physical server with EC2 instance capacity fully dedicated to a single customer's use. Dedicated Hosts can help you address licensing requirements and reduce costs by allowing you to use your existing server-bound software licenses. The customer has complete control over which specific host runs an instance at launch. This differs from Dedicated Instances in that a Dedicated Instance can launch on any hardware that has been dedicated to the account.

Placement groups:
A placement group is a logical grouping of instances within a single AZ. Placement groups enable applications to participate in a low-latency, 10 Gbps network.
Placement groups are recommended for applications that benefit from low network latency, high network throughput, or both. Remember that this represents network connectivity between instances. To fully use this network performance for your placement group, choose an instance type that supports enhanced networking and 10 Gbps network performance.

Instance Stores:
An instance store (also called ephemeral storage) provides temporary block-level storage for your instance. This storage is located on disks that are physically attached to the host computer. It is ideal for temporary storage of information that changes frequently (like buffers, caches, scratch data, and other temporary content), or for data that is replicated across a fleet of instances, like a load-balanced pool of web servers.
The size and type of instance stores available with an EC2 instance depend on the instance type. The storage available with various instance types ranges from none up to 24 instance stores of 2 TB each. The instance type also determines the type of hardware for the instance store volumes: some provide HDD (hard disk drive) instance stores, while other instance types use SSDs (solid state drives) to deliver high random I/O performance.
Instance stores are included in the cost of the EC2 instance. A key aspect of instance stores is that they are temporary.
Data in an instance store is lost when:
The underlying disk drive fails
The instance stops (data will persist if the instance reboots)
The instance terminates
So don't rely on instance stores for valuable, long-term data. Instead, build redundancy via RAID or use a file system that supports redundancy and fault tolerance like Hadoop's HDFS, and back up the data to more durable storage like S3 or EBS.



Amazon EBS (Elastic Block Store)
While instance stores are an economical way to fulfill appropriate workloads, their limited persistence makes them ill-suited for many other workloads.
For workloads requiring more durable block storage, Amazon provides EBS.
EBS Basics:
EBS provides persistent block-level storage volumes for use with EC2 instances.
Each EBS volume is replicated automatically within its AZ to protect you from component failure, offering high availability and durability.
EBS volumes are available in a variety of types that differ in performance characteristics and price.
Multiple EBS volumes can be attached to a single EC2 instance, although a volume can only be attached to a single instance at a time.
Types of EBS volumes
Magnetic volumes:
Magnetic volumes have the lowest performance characteristics, so they cost the least per GB. They are an excellent cost-effective solution for appropriate workloads.
A magnetic volume can range in size from 1 GB to 1 TB and will average 100 IOPS, but has the ability to burst to hundreds of IOPS.
Best suited for:
Workloads where data is accessed infrequently
Sequential reads
Situations where low-cost storage is a requirement
Magnetic volumes are billed based on the amount of space provisioned, regardless of how much data you actually store on the volume.
General-Purpose SSD:
Offers cost-effective storage that is ideal for a broad range of workloads.
A GP-SSD volume can range in size from 1 GB to 16 TB and provides a baseline performance of 3 IOPS per GB provisioned, capping at 10,000 IOPS.
For example, if you provision a 1 TB volume, you can expect a baseline performance of 3,000 IOPS. A 5 TB volume will not provide a 15,000 IOPS baseline, as it would hit the cap at 10,000 IOPS.
GP-SSD volumes under 1 TB also feature the ability to burst up to 3,000 IOPS for extended periods of time.
For example, if you have a 500 GB volume you can expect a baseline of 1,500 IOPS. Whenever you are not using these IOPS, they are accumulated as I/O credits. When your volume then has heavy traffic, it will use the I/O credits at a rate of up to 3,000 IOPS until they are used up. At that point, your performance reverts to 1,500 IOPS. (A small worked calculation follows below.)
At 1 TB, the baseline performance of the volume is already 3,000 IOPS, so the bursting behaviour does not apply.
GP-SSD volumes are billed based on the amount of space provisioned, regardless of how much data you store on the volume.
They are suited for a wide range of workloads where the very highest disk performance is not critical, such as:
System boot volumes
Small to medium sized DBs
Development and test environments
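A tiny worked calculation of the baseline numbers quoted above (3 IOPS per GB, capped at 10,000 IOPS); plain Python, no AWS calls:

def gp2_baseline_iops(size_gb: int) -> int:
    # Baseline for a General-Purpose SSD volume: 3 IOPS per provisioned GB, capped at 10,000.
    return min(3 * size_gb, 10_000)

for size in (500, 1000, 5000):
    print(f"{size} GB -> {gp2_baseline_iops(size)} IOPS baseline")
# 500 GB -> 1500, 1 TB -> 3000, 5 TB -> 10000 (cap reached)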
Provisioned IOPS SSD:
PIOPS SSD volumes are designed to meet the needs of I/O-intensive workloads, particularly database workloads that are sensitive to storage performance and consistency in random access I/O throughput.
These are the most expensive EBS volume type per GB, but they provide the highest performance of any EBS volume type.
A PIOPS SSD volume can range in size from 4 GB to 16 TB.
When you provision a PIOPS SSD volume, you specify not just the size but also the desired IOPS, up to the lower of 30 times the number of GB of the volume, or 20,000 IOPS.
You can stripe multiple volumes together in a RAID 0 configuration for larger size and greater performance.
EBS delivers within 10% of the provisioned IOPS performance 99.99% of the time over a given year.
Pricing is based on the size of the volume and the amount of IOPS reserved.
The cost per GB is slightly more than for GP-SSD volumes and is applied based on the size of the volume, not the amount of the volume used to store data.
An additional monthly fee is applied based on the number of IOPS provisioned, whether they are consumed or not.
PIOPS SSD volumes provide high performance and are best suited for:
Critical business applications that need sustained IOPS performance
Large DB workloads

Comparison of EBS volume types:

GP-SSD:
  Use cases: System boot volumes; virtual desktops; small-to-medium sized DBs; development and test environments
  Volume size: 1 GB - 16 TB
  Max throughput: 160 MB/s
  IOPS performance: Baseline of 3 IOPS/GB (up to 10,000 IOPS), with the ability to burst to 3,000 IOPS for volumes under 1 TB

Provisioned IOPS SSD:
  Use cases: Critical business apps that need sustained IOPS performance, or more than 10,000 IOPS / 160 MB/s of throughput per volume; large DB workloads
  Volume size: 4 GB - 16 TB
  Max throughput: 320 MB/s
  IOPS performance: Consistently performs at the provisioned level, up to 20,000 IOPS max

Magnetic:
  Use cases: Cold workloads where data is infrequently accessed; where the lowest storage cost is important
  Volume size: 1 GB - 1 TB
  Max throughput: 40-90 MB/s
  IOPS performance: Averages 100 IOPS, with the ability to burst to hundreds of IOPS

AWS has released two new HDD volume types: 1) Throughput-Optimized HDD  2) Cold HDD.
Over time, it is expected that the Magnetic volume type will be decommissioned.
Throughput-Optimized HDD volumes are low-cost HDD volumes designed for frequently accessed, throughput-intensive workloads like big data, data warehouses, and log processing. Volumes can be up to 16 TB with a maximum of 500 IOPS and a maximum throughput of 500 MB/s. These volumes are less expensive than GP-SSD volumes.
Cold HDD volumes are designed for less frequently accessed workloads, such as colder data requiring fewer scans per day. Volumes can be up to 16 TB with a maximum of 250 IOPS and a maximum throughput of 250 MB/s. These volumes are less expensive than Throughput-Optimized HDD volumes.

Amazon EBS-Optimized Instances:
It is important to use Amazon EBS-optimized instances to ensure that the EC2 instance is prepared to take advantage of the I/O of the EBS volume.
An EBS-optimized instance uses an optimized configuration stack and provides additional, dedicated capacity for Amazon EBS I/O. This optimization provides the best performance for your EBS volumes by minimizing contention between EBS I/O and other traffic from your instance. When you select EBS-optimized for an instance, you pay an additional hourly charge for that instance.

Backup/Recovery (Snapshots):
You can back up the data stored on your EBS volumes by taking point-in-time snapshots. Snapshots are incremental backups, which means that only the blocks on the device that have changed since your most recent snapshot are saved.
You can take a snapshot through the CLI, the AWS Management Console, the API, or by setting up a schedule of regular snapshots.
Data for the snapshot is stored using S3 technology. The action of taking a snapshot is free; you pay only for the storage costs of the snapshot data.
When you request a snapshot, the point-in-time snapshot is created immediately and the volume may continue to be used, but the snapshot may remain in a pending status until all modified blocks are transferred to S3.
While snapshots are stored using S3, they are stored in AWS-controlled storage and not in your account's S3 buckets, so you cannot manipulate them like other S3 objects.
Snapshots are constrained to the region in which they are created, meaning you can use them to create new volumes only in the same region. If you need to restore a snapshot in a different region, you can copy the snapshot to another region.
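A minimal boto3 sketch of taking a snapshot; the volume ID and description are placeholders:

import boto3

ec2 = boto3.client("ec2")

snap = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",
    Description="Nightly backup of the data volume",
)

# The volume stays usable immediately, but the snapshot remains 'pending'
# until all modified blocks have been copied to S3-backed snapshot storage.
print(snap["SnapshotId"], snap["State"])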

Creating a volume from a snapshot:
To use a snapshot, you create a new EBS volume from the snapshot. When you do this, the volume is created immediately but the data is loaded lazily. This means that the volume can be accessed upon creation, and if the data being requested has not yet been restored, it will be restored upon first request. Because of this, it is a best practice to initialize a volume created from a snapshot by accessing all the blocks in the volume.
Snapshots can also be used to increase the size of an EBS volume. To do this, take a snapshot of the volume, then create a new volume of the desired size from the snapshot and replace the original volume with the new volume.
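A boto3 sketch of restoring (and growing) a volume from a snapshot; the snapshot ID, AZ, and size are placeholders:

import boto3

ec2 = boto3.client("ec2")

vol = ec2.create_volume(
    SnapshotId="snap-0123456789abcdef0",
    AvailabilityZone="ap-south-1a",
    VolumeType="gp2",
    Size=200,  # GiB; may be larger than the snapshot to grow the volume
)
print(vol["VolumeId"])  # data is loaded lazily from the snapshot on first access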

Recovering volumes:
Because EBS volumes persist beyond the lifetime of an instance, it is possible to recover data if an instance fails. If an EBS-backed instance fails and there is data on the boot drive, it is straightforward to detach the volume from the instance. Unless the 'DeleteOnTermination' flag for the volume is set to false, the volume should be detached before the instance is terminated. The volume can then be attached as a data volume to another instance.

Encryption Options:
EBS offers native encryption on all volume types.
When you launch an encrypted EBS volume, Amazon uses AWS KMS to handle key management. A new master key will be created unless you select a master key that you created separately in the service. Your data and the associated keys are encrypted using the AES-256 algorithm. The encryption occurs on the servers that host EC2 instances, so the data is actually encrypted in transit between the host and the storage media and also at rest on the media.
Encryption is transparent, so all data access is the same as with unencrypted volumes, and you can expect the same IOPS performance on encrypted volumes as you would with unencrypted volumes, with a minimal effect on latency. Snapshots that are taken from encrypted volumes are automatically encrypted, as are volumes that are created from encrypted snapshots.

Exam essentials:
1) To launch an instance, you must specify an AMI, which defines the software on the instance at launch, and an instance type, which defines the virtual hardware supporting the instance (memory, vCPUs, and so on).
2) Spot instances are best suited for workloads that can accommodate interruption. Reserved instances are best for consistent, long-term compute needs. On-Demand instances provide flexible compute to respond to scaling needs.


======================================================================================================================================
Chapter-4 VPC - Virtual Private Cloud
======================================================================================================================================
VPC is a custom-defined virtual network within the AWS Cloud.
VPC is the networking layer for EC2, and it allows you to build your own virtual network within AWS. You control various aspects of the VPC, including the IP address range, creating your subnets, route tables, network gateways, and security settings.
Within a region, you can create multiple VPCs, and each VPC is logically isolated even if it shares its IP address space.

When you create a VPC, you must specify the IPv4 address range by choosing a CIDR (Classless Inter-Domain Routing) block, such as 10.0.0.0/16.
The address range of a VPC cannot be changed after the VPC is created.
A VPC address range may be as large as /16 (65,536 available addresses) or as small as /28 (16 available addresses) and should not overlap any other network with which it is to be connected.
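A small boto3 sketch of creating a VPC with a /16 CIDR and carving one subnet out of it; the CIDR values are example choices:

import boto3

ec2 = boto3.client("ec2")

# The VPC's CIDR block cannot be changed after creation, so choose it carefully.
vpc = ec2.create_vpc(CidrBlock="10.0.0.0/16")
vpc_id = vpc["Vpc"]["VpcId"]

# One /24 subnet inside the VPC's range.
ec2.create_subnet(VpcId=vpc_id, CidrBlock="10.0.1.0/24")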

The VPC service was released after the EC2 service, so there are two different networking platforms available within AWS: EC2-Classic and EC2-VPC. AWS accounts created after December 2013 only support launching instances using EC2-VPC.
AWS accounts that support EC2-VPC will have a default VPC created in each region with a default subnet created in each AZ. The assigned CIDR block of the default VPC is 172.31.0.0/16.
A VPC consists of:
Subnets
Route tables
DHCP (Dynamic Host Configuration Protocol) option sets
Security groups
Network ACLs
A VPC has the following optional components:
Internet Gateways (IGWs)
Elastic IP addresses (EIPs)
Elastic Network Interfaces (ENIs)
Endpoints
Peering
Network Address Translation (NAT) instances and NAT gateways
Virtual Private Gateways (VPGs), Customer Gateways (CGWs), and VPNs

--------
Queries
--------
1) Difference between Region, AZ, and data center
A Region is a geographic location
2) What are Points of Presence?
Ans. Basically, where AWS's presence exists around the globe. Mainly used in the context of Amazon CloudFront (CDN).
POPs are used both by CloudFront, to deliver content to end users at high speed, and by Lambda@Edge, to run Lambda functions with low latency.

3) Difference between the storage mechanisms of S3 and EBS (object storage vs block storage)
4) Difference between DynamoDB, RDS, and Redshift


AWS Questions and Learning:

What types of licensing options are available with Amazon RDS for Oracle?

There are two types of licensing options available for using Amazon RDS for Oracle:

Bring Your Own License (BYOL): In this licensing model, you can use your existing Oracle Database licenses to run Oracle deployments on Amazon RDS. To run a DB instance under the BYOL model, you must have the appropriate Oracle Database license (with Software Update License & Support) for the DB instance class and Oracle Database edition you wish to run. You must also follow Oracle's policies for licensing Oracle Database software in the cloud computing environment. DB instances reside in the Amazon EC2 environment, and Oracle's licensing policy for Amazon EC2 is located here.
License Included: In the "License Included" service model, you do not need separately purchased Oracle licenses; the Oracle Database software has been licensed by AWS. "License Included" pricing is inclusive of software, underlying hardware resources, and Amazon RDS management capabilities.


A user has created a VPC with public and private subnets using the VPC wizard. The VPC has CIDR 20.0.0.0/16.
The private subnet uses CIDR 20.0.0.0/24.
The NAT instance ID is i-a12345.
Which of the below entries are required in the main route table attached to the private subnet to allow instances to connect to the internet?

EBS snapshots occur:
Asynchronously

An existing application stores sensitive info on a non-boot Amazon EBS data volume attached to an EC2 instance. Which of the following approaches
would protect the sensitive data on the EBS volume?

A media company produces new video files on-premises every day with a total size of around 100 GB after compression. All files have a size of 1-2 GB and need to be uploaded to S3 every night in a fixed time window between 3 am and 5 am. The current upload takes 3 hours, although less than half of the available bandwidth is used.
What steps would ensure that the file uploads are able to complete in the allotted time window?

Your company is getting ready to do a major announcement of a social media site on AWS. The website is running on EC2 instances deployed across multiple AZs with a Multi-AZ RDS MySQL Extra Large DB instance.
The site performs a high number of small reads and writes per second and relies on an eventual consistency model.
After comprehensive tests you discover that there is read contention on RDS MySQL. Which are the best approaches to meet these requirements?
Options:
Increase the RDS MySQL instance size and implement provisioned IOPS
Add an RDS MySQL read replica in each AZ
Implement sharding to distribute the load across multiple RDS MySQL instances
Deploy ElastiCache in-memory cache running in each AZ

A user has created a VPC with public and private subnets using the VPC wizard. The VPC has CIDR 20.0.0.0/16.
The private subnet has CIDR 20.0.0.0/24. Which entries are required in the main route table to allow instances in the VPC to communicate with each other?

What will be the status of snapshot until the snapshot is complete

A user has created a VPC with CIDR 20.0.0.0/24. The user has used all the IPs of the CIDR and wants to increase the size of the VPC. The user has two subnets:
public (20.0.0.0/28) and private (20.0.1.0/28).
How can he change the size of the VPC?

When you PUT objects to S3, what is the indication that the object was successfully stored?

When should you choose provisioned IOPS over standard RDS storage?

You would like to create a mirror image of your production environment in another region for DR purposes. Which AWS resources do not need to be recreated in the second region?

A company is deploying a new two-tier web application in AWS. The company has limited staff and requires high availability, and the application requires complex queries and table joins. Which configuration provides the solution for the company's requirements?

A company is building a two-tier app to serve dynamic transaction-based content. The data tier is leveraging an Online Transaction Processing (OLTP) database.
What services should you leverage to enable an elastic and scalable web tier?

Your web application front end consists of multiple EC2 instances behind an ELB. You configured the ELB to perform health checks on these EC2 instances; if an instance fails to pass the health checks, which statement will be true?

You are designing a web app that stores static assets in an S3 bucket. You expect this bucket to immediately receive over 150 PUT requests per second. What should you do to ensure optimal performance?

Which approach provides the lowest cost for EBS snapshots while giving you the ability to fully restore data?

When an EC2 instance backed by an S3-backed AMI is terminated, what happens to the data on the root volume?

Your business is building a new app that will store its entire customer database on an RDS MySQL DB, and will have various apps and users that will query that data for different purposes. Large analysis jobs on the DB are likely to cause other apps to not be able to get the query results they need before they time out.
Also, as your data grows, these analysis jobs will start to take more time, increasing the negative effect on the other apps.
How do you solve the contention issues between the different workloads on the same data?

Can I detach the primary (eth0) network interface when the instance is running or stopped?


------
FYI (from tutorialspoint)
------
Amazon S3 stores data as objects within resources called buckets. The user can store as many objects as per requirement within the bucket, and can read, write and delete objects from the bucket.

Amazon EBS is effective for data that needs to be accessed as block storage and requires persistence beyond the life of the running instance, such as database partitions and application logs.

Amazon EBS volumes can be up to 1 TB in size, and these volumes can be striped for larger volumes and increased performance.

Amazon EBS currently supports up to 1,000 IOPS per volume. We can stripe multiple volumes together to deliver thousands of IOPS per instance to an application.

On AWS:
network devices like firewalls, routers, and load-balancers for AWS applications no longer reside on physical devices and are replaced with software solutions.

Multiple options are available to ensure quality software solutions. For load balancing choose Zeus, HAProxy, Nginx, Pound, etc. For establishing a VPN connection choose OpenVPN, OpenSwan, Vyatta, etc.

What is Route 53 ?

When you create an account in AWS, AWS assigns two unique IDs to each AWS account: 1) the AWS account ID  2) the canonical user ID.
The AWS account ID is a 12-digit number used to construct Amazon Resource Names (ARNs). This ID helps to distinguish our resources from resources in other AWS accounts.

Canonical user ID: It is a long string of alphanumeric characters like 1234abcdef1234. This ID is used in Amazon S3 bucket policies for cross-account access, i.e., to access resources in another AWS account.

Account alias - it's a friendly name for your sign-in page URL; by default the sign-in URL contains the account ID.
You can manage your alias from: https://console.aws.amazon.com/iam/
Example sign-in URL - https://332227776006.signin.aws.amazon.com/console


Users can create/edit alias, or enable MFA (Multi Factor Authentication) from above url only - https://console.aws.amazon.com/iam/

AWS Identity & Access Management (IAM)
An IAM user is an entity we create in AWS to represent a person that uses AWS with limited access to resources. Hence, we do not have to use the root account in our day-to-day activities, as the root account has unrestricted access to our AWS resources.

5 users created from   https://332227776006.signin.aws.amazon.com/console
praveenverma
mohitsingh
ashilashkari
pankajverma

Pwd: Alfresco@123
NOTE: All of the above users were created with a default password; they also have their own access key and secret key for logging in through PuTTY/WinSCP and
accessing AWS through the CLI/Linux command prompt.

All users were added to the previously created group - Alfresco.
If no group exists, create a new group, attach a policy to it, and add users to it.


Every group and user created has a unique ARN
Ex: Alfresco ARN - arn:aws:iam::332227776006:group/Alfresco
Mohit singh ARN - arn:aws:iam::332227776006:user/mohitsingh

Same with policies of AWS -
Ex: arn:aws:iam::aws:policy/aws-service-role/AmazonRDSBetaServiceRolePolicy (OOTB policy)
If u create ur own custom policy, it will have a unique ARN.


EC2 instances can be resized, and the number of instances scaled up or down as per our requirement. These instances can be launched in one or more geographical locations or regions, and Availability Zones (AZs). Each region comprises several AZs at distinct locations, connected by low-latency networks within the same region.

VPC - new VPC can be created from https://console.aws.amazon.com/vpc/
U can create new VPC or choose default VPC.


STEPS TO USE EC2:
Step 1 − Sign-in to AWS account and open IAM console by using the following link https://console.aws.amazon.com/iam/.

Step 2 − In the navigation Panel, create/view groups and follow the instructions.

Step 3 − Create IAM user. Choose users in the navigation pane. Then create new users and add users to the groups.

Step 4 − Create a Virtual Private Cloud using the following instructions.

Open the Amazon VPC console by using the following link − https://console.aws.amazon.com/vpc/

Select VPC from the navigation panel. Then select the same region in which we have created key-pair.

Select start VPC wizard on VPC dashboard.

Select the VPC configuration page and make sure that "VPC with a single public subnet" is selected. Then choose Select.

VPC with a single public subnet page will open. Enter the VPC name in the name field and leave other configurations as default.

Select create VPC, then select Ok.
Step 5 − Create WebServerSG security groups and add rules using the following instructions.

On the VPC console, select Security groups in the navigation panel.

Select create security group and fill the required details like group name, name tag, etc.

Select your VPC ID from the menu. Then select yes, create button.

Now a group is created. Select the edit option in the inbound rules tab to create rules.

Step 6 − Launch EC2 instance into VPC using the following instructions.

Open EC2 console by using the following link − https://console.aws.amazon.com/ec2/

Select launch instance option in the dashboard.

A new page will open. Choose Instance Type and provide the configuration. Then select Next: Configure Instance Details.

A new page will open. Select VPC from the network list. Select subnet from the subnet list and leave the other settings as default.

Click Next until the Tag Instances page appears.

Step 7 − On the Tag Instances page, provide a tag with a name to the instances. Select Next: Configure Security Group.

Step 8 − On the Configure Security Group page, choose the Select an existing security group option. Select the WebServerSG group that we created previously, and then choose Review and Launch.

Step 9 − Check Instance details on Review Instance Launch page then click the Launch button.

Step 10 − A pop up dialog box will open. Select an existing key pair or create a new key pair. Then select the acknowledgement check box and click the Launch Instances button.

-----------------------
Auto-Scaling:
As the name suggests, auto scaling allows you to scale your Amazon EC2 instances up or down automatically as per the instructions set by the user. Parameters like minimum and maximum number of instances are set by the user. Using this, the number of Amazon EC2 instances you’re using increases automatically as the demand rises to maintain the performance, and decreases automatically as the demand decreases to minimize the cost.

Auto Scaling is particularly effective for those applications that fluctuate on hourly, daily, or weekly usage. Auto Scaling is enabled by Amazon CloudWatch and is available at no extra cost. AWS CloudWatch can be used to measure CPU utilization, network traffic, etc.

Elastic Load Balancing
Elastic Load Balancing (ELB) automatically distributes incoming request traffic across multiple Amazon EC2 instances and results in achieving higher fault tolerance. It detects unhealthy instances and automatically reroutes traffic to healthy instances, in a round-robin manner, until the unhealthy instances have been restored. However, if we need more complex routing algorithms, then choose other services like Amazon Route 53.


ELB Features:
ELB is designed to handle an effectively unlimited number of requests per second with a gradually increasing load pattern.

We can configure EC2 instances and load balancers to accept traffic.

We can add/remove load balancers as per requirement without affecting the overall flow of information.

It is not designed to handle sudden increases in requests, such as online exams, online trading, etc.

Customers can enable Elastic Load Balancing within a single Availability Zone or across multiple zones for even more consistent application performance.

----------------------------------------
Amazon Workspaces:
Amazon WorkSpaces is a fully managed desktop computing service in the cloud that allows its customers to provide cloud-based desktops to their end users. Through this, end users can access documents, applications, and resources using devices of their choice such as laptops, iPads, Kindle Fire, or Android tablets. This service was launched to meet customers' rising demand for cloud-based 'Desktop as a Service' (DaaS).

How It Works?
Each WorkSpace is a persistent Windows Server 2008 R2 instance that looks like Windows 7, hosted on the AWS cloud. Desktops are streamed to users via PCoIP, and the data is backed up every 12 hours by default.

User Requirements
An Internet connection with TCP and UDP open ports is required at the user’s end. They have to download a free Amazon WorkSpaces client application for their device.

You can create an AWS WorkSpace from https://console.aws.amazon.com/workspaces/
Steps involved: 1) Select a VPC 2) Create an AD 3) Create the workspace (name/title prompt) 4) Create users 5) Test the workspace (by downloading and installing the Amazon client application from https://clients.amazonworkspaces.com/)

--------------------------------------
AWS Lambda:
-------------------------------------
AWS Lambda is a responsive cloud service that inspects actions within the application and responds by deploying the user-defined code, known as functions. It automatically manages the compute resources across multiple Availability Zones and scales them when new actions are triggered.
i.e., if you want some custom code to be executed automatically when an event is triggered, you can do it using Lambda (like a rule in Alfresco).

AWS Lambda supports code written in Java, Python, and Node.js, and the service can launch processes in languages supported by Amazon Linux (including Bash, Go, and Ruby).

recommended tips while using AWS Lambda:
Write your Lambda function code in a stateless style.

Never declare any function variable outside the scope of the handler.

Make sure to have a set of +rx permissions on your files in the uploaded ZIP to ensure Lambda can execute code on your behalf.

Delete old Lambda functions when no longer required.

Steps to configure Lambda:
1) Select a blueprint (optional) - you can skip this step
2) Create the Lambda function (you can write Node.js/Java/Python, etc., code here)
3) After creating the Lambda function, select the 'Event Sources' tab. Add at least one source for the Lambda function to work.
4) Select the stream tab and associate it with the Lambda function.
5) Now, for example, if you have set the source event as adding an entry in DynamoDB, then add some entries in the table. When the entry gets added and saved, the
Lambda service should trigger the function. It can be verified using the Lambda logs.
6) To verify the Lambda logs, select the Lambda service and click the Monitoring tab. Then click View Logs in CloudWatch.
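A minimal Python Lambda handler matching step 5 above (a DynamoDB stream as the event source); the record fields follow the standard DynamoDB Streams event shape, and the return value is just illustrative:

def lambda_handler(event, context):
    # Each invocation receives a batch of stream records; log what changed.
    records = event.get("Records", [])
    for record in records:
        print(record["eventName"], record["dynamodb"].get("Keys"))
    return {"processed": len(records)}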

Benefits of Lambda:
1) Lambda tasks do not need to be registered like Amazon SWF activity types.
2) We can use any existing Lambda functions that we have already defined in workflows.
3) Lambda functions are called directly by Amazon SWF; there is no need to design a program to implement and execute them.
4) Lambda provides us the metrics and logs for tracking function executions.


Q: What is Amazon SWF?
Amazon Simple Workflow Service (SWF) is a web service that makes it easy to coordinate work across distributed application components. Amazon SWF enables applications for a range of use cases, including media processing, web application back-ends, business process workflows, and analytics pipelines, to be designed as a coordination of tasks. Tasks represent invocations of various processing steps in an application which can be performed by executable code, web service calls, human actions, and scripts.

-------------------------------------------------------------------------

My AWS:
----------------------------------------------------------
Root account:
username: sanket2008@gmail.com
pwd: $...@
IAM Link:https://332227776006.signin.aws...
User: administrator
pwd: Admin@myaws
Group name: Alfresco

My Key Pair Names: 1) sanket2008@gmail.com  2) sanket2008@yahoo.co.in

Created instance ID: i-05eb4188c64c88853
Public DNS name (IPv4): ec2-13-126-129-156.ap-south-1.compute.amazonaws.com
Instance type: t2.micro
Availability Zone: ap-south-1a
AMI ID: amzn-ami-hvm-2017.03.1.20170812-x86_64-gp2 (ami-d7abd1b8)
Private IP: 172.31.29.58
Private DNS: ip-172-31-29-58.ap-south-1.compute.internal
Security groups: launch-wizard-1
Key pair name: sanket2008@gmail.com
Owner (AWS account number of the AMI owner) : 332227776006 
VPC ID: vpc-8999ade0
Subnet ID: subnet-3e6b5457
Root device type: ebs
Root device: /dev/xvda
Block devices: /dev/xvda

PuTTYgen: Unique keyphrase: testaws

Elastic BeanStalk: https://www.lynda.com/Amazon-Web-Services-tutoria...


CloudFormation:
https://tcsltd.skillport.com/skillportfe/main.acti...
https://tcsltd.skillport.com/skillportfe/main.acti...
https://tcsltd.skillport.com/skillportfe/main.acti...
https://tcsltd.skillport.com/skillportfe/main.acti...
ELB: http://techbus.safaribooksonline.com/video/operati...

http://docs.aws.amazon.com/ElasticLoadBalancing/la...
http://www.lynda.com/AWS-tutorials/Elastic-Load-Ba...
Bootstrapping: http://www.lynda...
https://s3.amazonaws.com/cloudformation-examples/B...
Dedicated instances: http://docs.aws.amazon.com/AWSEC2/latest/UserGuid...
ELB pricing: http://docs.aws.amazon.com/AWSEC2/latest/UserGuid...
Security groups:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuid...
http://docs.aws.amazon.com/AmazonVPC/latest/UserG...
http://docs.aws.amazon.com/AWSEC2/latest/UserGuid...

AWS provides root or system privileges only for a limited set of services, which includes:
Elastic Cloud Compute (EC2)
Elastic MapReduce (EMR)
Elastic Beanstalk
OpsWorks
AWS does not provide root privileges for managed services like RDS, DynamoDB, S3, Glacier, etc.
For RDS, if you need admin privileges or want to use features not enabled by RDS, you can go with the database-on-EC2 approach.
Courtesy: http://jayendrapatil.com/aws-root-access-enabled-...

EBS vs Instance store:
http://jayendra-patil.blogspot.in/2016/03/aws-ebs-...
http://jayendrapatil.com/aws-ebs-vs-instance-store...

Routing tables with VPC:
http://docs.aws.amazon.com/AmazonVPC/latest/UserGu...

https://stackoverflow.com/questions/36608349/aws-e...
AWS EBS performance: http://jayendrapatil...
https://dzone.com/articles/when-amazon-ebs-optimiz...
CloudFront:
http://docs.aws.amazon.com/AmazonCloudFront/latest...
Route 53:
Amazon Route 53 is a highly available and scalable cloud Domain Name System (DNS) web service. It is designed to give developers and businesses an extremely reliable and cost-effective way to route end users to Internet applications by translating names like www.example.com into the numeric IP addresses like 192.0.2.1 that computers use to connect to each other. Amazon Route 53 is fully compliant with IPv6 as well.

Amazon Route 53 effectively connects user requests to infrastructure running in AWS – such as Amazon EC2 instances, Elastic Load Balancing load balancers, or Amazon S3 buckets – and can also be used to route users to infrastructure outside of AWS. You can use Amazon Route 53 to configure DNS health checks to route traffic to healthy endpoints or to independently monitor the health of your application and its endpoints.

Amazon Route 53 Traffic Flow makes it easy for you to manage traffic globally through a variety of routing types, including Latency Based Routing, Geo DNS, Geoproximity, and Weighted Round Robin, all of which can be combined with DNS Failover in order to enable a variety of low-latency, fault-tolerant architectures. Using Amazon Route 53 Traffic Flow's simple visual editor, you can easily manage how your end users are routed to your application's endpoints, whether in a single AWS region or distributed around the globe.

Amazon Route 53 also offers Domain Name Registration – you can purchase and manage domain names such as example.com, and Amazon Route 53 will automatically configure DNS settings for your domains.
