* Price calculations using AWS Price List API
In previous articles, I’ve written about how to save money in AWS by purchasing Reserved capacity for EC2, RDS and EMR. In this article I will focus on Amazon Redshift, a popular and potentially very expensive AWS service.
Amazon Redshift is a managed Data Warehouse service offered by AWS. With Redshift you launch a cluster, which consists of a number of nodes optimized for compute and storage. You can then store data in the cluster and query it using Redshift’s own engine, which is based on PostgreSQL. In many cases, Redshift users query data stored inside the cluster, but Redshift also gives the option to query data stored in S3 (Redshift Spectrum).
How does pricing work in Redshift?
When you launch a Redshift cluster, you choose a number of nodes and their instance type. Then you pay an hourly compute fee based on the instance type and number of nodes.
In addition to compute fees, you pay for data transfer, backup storage and optionally for features such as Concurrency Scaling. In the case of Redshift Spectrum, in addition to compute fees, you pay for the amount of data scanned in S3.
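To make the compute portion concrete, here is a minimal Python sketch. The hourly rate is a placeholder I made up for illustration, not an actual AWS price, and the 720-hour month is an assumption. Data transfer, backup storage, Concurrency Scaling and Spectrum charges are left out.

```python
HOURS_PER_MONTH = 720  # assumption: a 30-day month


def monthly_compute_cost(hourly_rate_per_node, node_count, hours=HOURS_PER_MONTH):
    """On Demand compute fee: hourly rate x number of nodes x hours running."""
    return hourly_rate_per_node * node_count * hours


# Hypothetical rate of $6.80 per node-hour; look up the real rate for your region.
cost = monthly_compute_cost(hourly_rate_per_node=6.80, node_count=4)
print(f"Monthly compute cost: ${cost:,.2f}")
```

At that placeholder rate, a 4-node cluster running all month would cost $19,584 in compute alone.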
The price dimension relevant to Reserved pricing is Instance Type. Unlike other services, such as EC2, RDS or EMR, there are not a lot of instance types available in Redshift. There are only two instance families: Dense Compute (dc) and Dense Storage (ds). For each family, there are only 3 instance sizes: large, xlarge and 8xlarge.
As you can see below, the On Demand cost for a single 8xlarge instance ranges from approximately $4,800 to $8,000 per month, depending on the AWS region. That’s why Redshift Reserved can easily result in savings worth several thousand dollars in your AWS bill.
In the graph below, you can select the AWS region relevant to you.
Redshift Reserved Options
Reserved instances in Redshift work much like they do in other services, such as EC2 or RDS. You save money by committing to pay for servers for a period of time, either 1 year or 3 years.
Terms and Payment Options
Redshift offers the following payment options for Reserved instances:
- All Upfront. Pay for the full term (1 year or 3 years) in a single payment, at the beginning of the purchase period. This is the option where you save the most money.
- Partial Upfront. Pay a portion of the full amount upfront and the rest monthly.
- No Upfront. Pay a fee every month and commit to pay for the full term. Only supported for 1-year term in Redshift.
As usual, you save more money by committing to a 3-year term. Also, the more money you pay upfront, the higher the savings. In Redshift, there are cases where committing to a 3-year term can save you close to 75% compared to On Demand.
As you can see below, a single Redshift node can cost thousands of dollars per year and paying upfront might be a considerable investment for your business. Therefore it’s very important to compare the savings percentage for each instance type and payment option - just hover on each bar and you’ll see the savings percentage vs. On Demand.
Note: Unlike EC2, RDS or EMR, in Redshift there are cases with a considerable savings difference (10%-20%) if you purchase All-Upfront vs. No-Upfront. This range varies by region and instance type, therefore it’s important to calculate savings before making a purchase. This makes No-Upfront a less attractive option in Redshift than in other services.
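One way to run this comparison yourself is sketched below in Python. All dollar amounts are hypothetical per-node figures I picked so the arithmetic is easy to follow; they are not AWS prices. Replace them with the rates for your region and instance type.

```python
def reserved_total_cost(upfront_fee, monthly_fee, term_months):
    """Total paid over the term: upfront payment plus recurring monthly fees."""
    return upfront_fee + monthly_fee * term_months


def savings_pct(on_demand_monthly, upfront_fee, monthly_fee, term_months):
    """Percentage saved vs. paying On Demand for the whole term."""
    on_demand_total = on_demand_monthly * term_months
    reserved_total = reserved_total_cost(upfront_fee, monthly_fee, term_months)
    return 100 * (on_demand_total - reserved_total) / on_demand_total


# All figures are hypothetical, per node, for a 1-year term.
ON_DEMAND_MONTHLY = 5000.0
options = {
    "All Upfront": {"upfront_fee": 33000.0, "monthly_fee": 0.0},
    "Partial Upfront": {"upfront_fee": 18000.0, "monthly_fee": 1500.0},
    "No Upfront": {"upfront_fee": 0.0, "monthly_fee": 3500.0},
}
savings = {
    name: savings_pct(ON_DEMAND_MONTHLY, term_months=12, **fees)
    for name, fees in options.items()
}
for name, pct in savings.items():
    print(f"{name}: {pct:.1f}% saved vs. On Demand")
```

With these made-up inputs, the gap between All Upfront and No Upfront is 15 percentage points, which is the kind of difference worth checking before you buy.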
Lack of Instance Size Flexibility and no Convertible Offering Class
Redshift doesn’t automatically apply Reserved Instance Size Flexibility and it doesn’t give you the option to select a Convertible Offering Class.
- If you purchase a Reserved Redshift instance for a particular family and size (e.g. dc2.8xlarge) and you provision an instance of the same family but a different size (e.g. dc2.xlarge), you don’t automatically get the Reserved discount.
- If you need to switch to a different instance family in the middle of the reservation period, Redshift doesn’t give you the option to make this change and still receive Reserved discount. For example, if you purchase a ds2.xlarge but you need to switch to a dc2.xlarge, you won’t get any Reserved discount applied to any nodes that are not ds2.xlarge.
You get less flexibility in Redshift when it comes to Reserved purchases, compared to other services such as EC2, RDS and EMR. In addition to that, the AWS Reserved Instance Marketplace doesn’t support Redshift reservations, which means you can’t sell your Redshift reservations to other AWS customers (unlike EC2 Reserved purchases).
As a result, once you buy a Redshift Reserved instance, there’s not much you can do to correct any provisioning inefficiencies. That’s why it’s essential to have a solid process in place before committing to Redshift Reserved instances.
If you have AWS Consolidated Billing enabled, Reserved purchases for Redshift instances are applied to all running nodes in your linked accounts, up to the number of Reserved instances you’ve purchased - beyond that, you pay On-Demand rates for any running nodes.
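The "up to the number of Reserved instances" rule can be sketched in a few lines of Python. This is a simplified illustration, not how AWS billing is implemented: the function name is hypothetical, the node counts are invented, and it assumes both counts refer to the same node type.

```python
def split_covered_nodes(running_nodes, reserved_nodes):
    """Return (nodes billed at the Reserved rate, nodes billed On Demand).

    Both arguments are counts for the SAME node type; a reservation
    for one type does not cover nodes of another type.
    """
    covered = min(running_nodes, reserved_nodes)
    return covered, running_nodes - covered


# Hypothetical: linked accounts run 7 dc2.8xlarge nodes in total, 5 are reserved.
covered, on_demand = split_covered_nodes(running_nodes=7, reserved_nodes=5)
print(f"Reserved rate: {covered} nodes, On Demand rate: {on_demand} nodes")
```

Note the opposite case as well: if you run fewer nodes than you reserved, the unused reservations are still paid for.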
Savings vary by AWS Region
In Redshift Reserved, cost and percentage of savings compared to On Demand vary by region.
The following graph shows the difference between Reserved Standard All Upfront vs. On Demand. You can select a different instance type from the drop-down and hover on the chart to see more details on savings.
Inaccurate Provisioning is the most important risk when purchasing Redshift Reserved
This is the highest risk in Redshift, for a number of reasons:
- Redshift instances can cost you tens of thousands of dollars per year. Therefore, making the wrong Reserved purchase can become a VERY expensive mistake.
- Redshift Reserved offers no flexibility when it comes to instance type or instance family. This means you can’t easily correct the purchase of a wrong instance type for your workload.
- If you don’t purchase the right Redshift instances for your workload, you can’t sell them in the AWS Reserved Instance Marketplace.
The way to mitigate this risk is to systematically monitor your applications in test and Production environments, execute load tests, gather all relevant metrics and optimize your applications before making a Redshift Reserved purchase.
On the other hand, given the high cost of Redshift nodes, if you purchase the right Redshift Reserved instances for your applications, you’ll save thousands of dollars per year.
Important Numbers when considering Redshift Reserved
Similar to other Reserved purchases, the following numbers are relevant in your decision:
- Savings vs. On Demand. Dollar and percentage amount saved by purchasing Reserved compared to On Demand. Calculate for the whole period: 1 year or 3 years.
- Upfront Fee and Savings. This applies to All Upfront and Partial Upfront. Since this fee can result in thousands of dollars spent upfront, it’s important to be clear on what the amount will be and the long-term savings you’ll get in return compared to the No Upfront option. In Redshift, paying All Upfront can have a significant impact on savings compared to the No Upfront option (10%-20% difference depending on region and instance type).
- Months to Recover. How much time you have to wait before you start to see savings compared to On Demand pricing. This is relevant for All Upfront and Partial Upfront, since No Upfront results in savings from day-1.
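Months to Recover can be estimated with a short calculation. The Python sketch below assumes a simple model: you pay the upfront fee on day one, then a fixed monthly fee, and you compare that against a fixed On Demand monthly cost. The function name and all dollar figures are hypothetical.

```python
import math


def months_to_recover(on_demand_monthly, upfront_fee, monthly_fee):
    """First whole month in which cumulative Reserved cost is no higher than On Demand."""
    monthly_saving = on_demand_monthly - monthly_fee
    if monthly_saving <= 0:
        raise ValueError("Reserved monthly fee must be lower than On Demand")
    return math.ceil(upfront_fee / monthly_saving)


# Hypothetical per-node figures for a 1-year term.
all_upfront_mtr = months_to_recover(5000.0, upfront_fee=33000.0, monthly_fee=0.0)
partial_upfront_mtr = months_to_recover(5000.0, upfront_fee=18000.0, monthly_fee=1500.0)
no_upfront_mtr = months_to_recover(5000.0, upfront_fee=0.0, monthly_fee=3500.0)
print(all_upfront_mtr, partial_upfront_mtr, no_upfront_mtr)
```

With no upfront fee the result is 0, which matches the point above that No Upfront produces savings from the first day.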
Comparing All Options
In the following chart you’ll get a visual of the different Purchase Options (All Upfront, Partial Upfront, No Upfront) and terms (1 year vs. 3 years). You can select instance type, region, term and number of nodes from the drop-down.
The chart displays how cost accumulates throughout your commitment period and it shows Months to Recover (MtR) as vertical annotations. Hover over the chart to see more pricing details.
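If you want to reproduce this kind of chart with your own numbers, the underlying data is just a running total per month. Here is a minimal Python sketch, assuming fixed monthly amounts and using hypothetical prices; it is not the code behind the chart in this article.

```python
def cumulative_costs(upfront_fee, monthly_fee, term_months):
    """Cumulative spend at the end of each month, for months 1..term_months."""
    return [upfront_fee + monthly_fee * month for month in range(1, term_months + 1)]


# Hypothetical per-node figures, 1-year term.
on_demand = cumulative_costs(upfront_fee=0.0, monthly_fee=5000.0, term_months=12)
all_upfront = cumulative_costs(upfront_fee=33000.0, monthly_fee=0.0, term_months=12)

# The month where the Reserved line drops to or below the On Demand line.
crossover = next(
    (
        month
        for month, (od, ri) in enumerate(zip(on_demand, all_upfront), start=1)
        if ri <= od
    ),
    None,  # None means the reservation never pays off within the term
)
print(f"Reserved becomes cheaper in month {crossover}")
```

Each list can be plotted as one line, with the crossover month drawn as the vertical annotation.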
Differences between Redshift and EMR that impact Reserved purchases
When it comes to data analysis tools, EMR and Redshift are two very popular services. Even though EMR and Redshift solve very similar problems, there are a few differences that are worth mentioning, particularly related to compute and Reserved purchases.
- EMR supports a wide variety of EC2 instance families, such as c5, m5, r4, r5, h1, cg1 and g2, among others. As a result, in EMR you can choose instance types from dozens of options. In contrast, Redshift supports only two instance families: Dense Storage (ds) and Dense Compute (dc) and 3 instance sizes: large, xlarge and 8xlarge. Therefore, instance type options in Redshift are significantly more limited compared to EMR.
- Both services run on EC2 infrastructure; however, Redshift instance families (ds1, dc1, ds2, dc2) aren’t available as standalone EC2 instances - only as Redshift nodes - and are subject to Redshift restrictions regarding Reserved purchases.
- EMR doesn’t offer Reserved instances directly. Reserved purchases for EMR clusters are managed through EC2, and EMR charges a management fee that varies by instance type and amounts to approximately 20% of EC2 On Demand compute cost.
- Redshift only offers No-Upfront for the 1-year term option, not for the 3-year term. EC2 instances deployed in an EMR cluster support No-Upfront for both 1-year and 3-year.
- In EMR, the difference between All-Upfront and No-Upfront is typically 3%-5%, while in Redshift it ranges between 10%-20%.
- Due to EMR’s per-instance fee, Reserved savings in EMR are lower than those you can get in Redshift. For example, in EMR it’s common to see savings in the 30% range for 1-year All-Upfront Reserved purchases, while in Redshift it’s common to see savings above 40%. For 3-year terms, Redshift savings can get close to 75%, while EMR savings can be between 45%-50%, depending on instance type and region.
- EC2 instances used in an EMR cluster support Instance Size Flexibility and Convertible Offering Class, which gives users more flexibility in terms of how Reserved discounts are applied. Redshift doesn’t offer Instance Size Flexibility or Convertible Offering Class.
- If for some reason you no longer need a particular EC2 Reserved instance in an EMR cluster, you can either repurpose it in a different application component or you can sell the Reserved EC2 instance in the AWS Reserved Instance Marketplace. Redshift Reserved instances cannot be re-allocated outside of Redshift and cannot be sold in the marketplace.
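The effect of the EMR fee on overall savings is easy to see with a bit of arithmetic. The sketch below assumes the Reserved discount applies only to the EC2 portion of the bill and that the EMR fee is a flat fraction of EC2 On Demand cost (about 20%, as mentioned above). The function name and the 40% EC2 discount are hypothetical.

```python
def emr_effective_savings_pct(ec2_discount_pct, emr_fee_ratio=0.20):
    """Savings on the total EMR bill when only the EC2 part is discounted.

    On Demand total  = ec2 * (1 + emr_fee_ratio)
    Reserved total   = ec2 * (1 - discount) + ec2 * emr_fee_ratio
    Savings fraction = discount / (1 + emr_fee_ratio)
    """
    return ec2_discount_pct / (1 + emr_fee_ratio)


# Hypothetical: a 40% EC2 Reserved discount on instances running in EMR.
effective = emr_effective_savings_pct(40.0)
print(f"Effective savings on the EMR bill: {effective:.1f}%")
```

Under these assumptions a 40% EC2 discount shrinks to roughly 33% of the total EMR bill, which is why EMR savings tend to sit below Redshift savings.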
A Step-by-Step Process for purchasing Redshift Reserved
Given the high cost of Redshift Reserved, it’s particularly important to have a repeatable, step-by-step process to choose the right instance type. Below is a process that applies to Redshift; the steps might be similar to other AWS services, but I’m including points that are specific to Redshift.
1. Gather All Relevant Data
Given the high cost of Redshift and the lack of flexibility once a Reserved purchase is completed, it’s essential to make a decision based on system metrics after you have optimized your queries and cluster configuration.
Gather and Analyze System Metrics
Wait until you’ve provisioned a Redshift cluster that can handle the type of workload you expect throughout the Reservation period. Then gather and analyze the following Redshift metrics:
- QueryExecuting (read/insert/delete/update/ctas)
Ask the following questions, based on analyzed metrics and Business Requirements:
- What are the current and future storage needs (number of GB/TB)?
- Are queries compute intensive?
- Are queries storage intensive (i.e. is there a high number of read/write IOPS)?
- Would performance likely be improved if a different node type or size is selected for the cluster?
- If you already have an operational Redshift cluster, is it being under-utilized?
- Can queries be optimized?
Redshift instance types are either optimized for compute (dc types) or storage (ds types). Based on the collected metrics in the previous step, you’ll be able to figure out the storage and compute requirements for your workloads.
If you’re not sure which node type to choose, keep in mind that Dense Storage offers magnetic hard drives (HDD) while Dense Compute comes with SSD storage. DC is faster, but comes with less storage space compared to DS. Also, DS nodes are more expensive than DC ones (you can check out the charts in this article to make a comparison).
2. Apply Optimizations and Wait
After analyzing metrics and optimizing queries (if applicable), you might need to update your cluster’s node type. If this is the case you’ll have to run your queries again and monitor metrics.
- Confirm whether Dense Compute or Dense Storage node types are the right choice for your workload.
- Confirm the updated node size is optimal for your queries.
- Confirm nodes are not under-utilized.
3. Determine Usage Periods and Resizing Requirements - calculate Monthly Node Hours
Redshift clusters aren’t as flexible as EC2 Auto Scaling groups, but they can be resized. You can do so based on usage requirements. For example, you can decrease size during weekends or nights. However, since data is stored inside the cluster, adjusting its size results in some downtime while data is redistributed across different nodes. In some cases, you might not be able to reduce cluster size due to insufficient disk space in the reduced size cluster. This means you might have to create a snapshot and terminate the cluster if you need to resize.
Based on your business requirements, resizing may or may not be a suitable option. There are cases where a cluster needs to constantly receive and analyze incoming data. In these situations, regular resizing might not be an option. But there are cases where data needs to be analyzed infrequently and it might make sense to resize or launch a new cluster from an existing snapshot.
The main objective in this step is to determine the monthly node hours that a Redshift cluster will consume:
monthlyNodeHours = (Uptime Number of Nodes) * (Uptime Hours) + (ReducedSize Number of Nodes) * (ReducedSize Hours)
For example, a cluster needs 10 nodes, 2 hours per day during weekdays (40 hours per month) and it can be reduced to 5 nodes when not in use. In this case, assuming 720 hours in a month:
monthlyNodeHours = (10 nodes) * (40 hours) + (5 nodes) * (680 hours) = 3,800 monthly node hours
Dividing by 720 hours in one month, this is equivalent to approximately 5.28 Redshift nodes running full-time (3,800 / 720). In this case, we’d be looking at 5 Reserved Redshift purchases.
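The same calculation as a small Python sketch, in case you want to try different usage patterns. The function and variable names are my own, and rounding down to whole nodes is an assumption that matches the example above; any node hours beyond the reserved count would be billed at On Demand rates.

```python
import math

HOURS_PER_MONTH = 720  # assumption: a 30-day month


def monthly_node_hours(uptime_nodes, uptime_hours, reduced_nodes,
                       hours_per_month=HOURS_PER_MONTH):
    """Node hours per month for a cluster that alternates between two sizes."""
    reduced_hours = hours_per_month - uptime_hours
    return uptime_nodes * uptime_hours + reduced_nodes * reduced_hours


# 10 nodes for 40 hours per month, 5 nodes for the remaining 680 hours.
node_hours = monthly_node_hours(uptime_nodes=10, uptime_hours=40, reduced_nodes=5)
average_nodes = node_hours / HOURS_PER_MONTH
reserved_nodes = math.floor(average_nodes)  # round down to stay conservative

print(f"{node_hours:,} node hours, {average_nodes:.2f} average nodes, "
      f"{reserved_nodes} Reserved purchases")
```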
4. Calculate Term and Purchase Plan
This is where you choose:
- All Upfront vs. Partial Upfront vs. No Upfront. In Redshift, going All-Upfront vs. No-Upfront can result in a 10%-20% difference, depending on the instance type and region. The higher the upfront fee, the higher the savings. Also, No-Upfront is only available for the 1-year term. Take a look again at this chart and evaluate the different options.
- 1-year vs 3-years. The longer the term, the higher the savings - in some cases up to about 75% compared to On Demand.
This diagram summarizes the steps described in this article: