Here’s a simple guide to creating a new Amazon Redshift cluster within a personal AWS test environment.
Warning: if you’re testing this AWS service, you have to be rigorous about managing the cluster for billing reasons. I delete my cluster as soon as I’m done with it, because pausing a cluster only stops compute charges and you’ll still be charged for storage.
# Create Redshift Cluster
# Connect to Redshift via DBeaver
Create Redshift Cluster
1. Open the Amazon Redshift service in the AWS console.
2. Click to create a cluster.
3. Give the new cluster a name and select the free trial option.
If your ‘organization’ has never created an Amazon Redshift cluster, you’re eligible for a two-month free trial of the dc2.large node.
This trial allows 750 hours per month for free, enough to continuously run one dc2.large node with 160GB of compressed SSD storage. You can also build clusters with multiple nodes to test larger data sets, which will consume your free hours more quickly. Once your two-month free trial expires or your usage exceeds 750 hours per month, you’ll be charged at the standard On-Demand rate.
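To make the "consume your free hours more quickly" point concrete, here is a minimal sketch of the arithmetic. It assumes free hours are counted per node (one node-hour per running node per hour), which is how I read the trial terms; the function name is just for illustration.

```python
# Sketch: how quickly a multi-node cluster uses up the free trial hours.
FREE_NODE_HOURS_PER_MONTH = 750  # figure quoted in the post


def free_runtime_hours(node_count):
    """Wall-clock hours per month a cluster can run before leaving the free allowance."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Assumption: every running node consumes one free hour per hour online.
    return FREE_NODE_HOURS_PER_MONTH / node_count


for nodes in (1, 2, 4):
    hours = free_runtime_hours(nodes)
    print(f"{nodes} node(s): {hours:.1f} free hours, roughly {hours / 24:.1f} days")
```

A single node fits inside a whole month, while a two-node cluster would use up the allowance in about half of it.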
If you’re no longer eligible for the free trial, the cheapest production option at the time of writing is the same as above: a single-node dc2.large cluster.
Warning! As mentioned at the top of this post, keep an eye on cost. The console estimate shows that leaving this minimum-spec, single-node cluster online for every hour of a month will cost about $230 at $0.32 per hour. Storage and snapshots will increase costs, and pausing a cluster only stops compute costs, not storage. I recommend deleting the cluster after use. Ideally, you’d do this on an employer’s AWS account, though.
More on billing! Remember to have billing alerts in place when working with cloud labs. My environment has $10/15/25 alerts and my expenses are around $10 per month, mostly for storage (EBS/S3). When testing Redshift, my $25 alert triggered for the first time in about a year!
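The $230 figure is simply the hourly rate multiplied by the hours in a month (0.32 × 720 ≈ 230.40 for a 30-day month). A quick sketch for estimating what a shorter test session costs follows. The rate is the one quoted in this post and may have changed, and storage and snapshot charges are not included.

```python
# Sketch: compute-only cost estimate for an on-demand cluster.
HOURLY_RATE_USD = 0.32  # dc2.large rate quoted in the post; check current pricing


def compute_cost(hours_running, node_count=1, hourly_rate=HOURLY_RATE_USD):
    """Estimated compute charge in USD, excluding storage and snapshots."""
    if hours_running < 0 or node_count < 1:
        raise ValueError("hours must be >= 0 and node_count >= 1")
    return round(hours_running * node_count * hourly_rate, 2)


print("Full 30-day month:", compute_cost(30 * 24))
print("Three-hour test session:", compute_cost(3))
```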
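If you’d rather script the alerts than click through the console, something like the following should work with boto3. Treat it as a sketch: the SNS topic ARN is a placeholder, the alarm names are made up, and it assumes billing alerts are enabled on the account (the billing metric lives in us-east-1).

```python
# Sketch: CloudWatch billing alarms at the $10/15/25 thresholds from this post.
def billing_alarm_params(threshold_usd, sns_topic_arn):
    """Build the arguments for CloudWatch put_metric_alarm (no AWS call made here)."""
    return {
        "AlarmName": f"billing-over-{threshold_usd:g}-usd",  # hypothetical naming scheme
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # six hours; billing data only updates a few times a day
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def create_billing_alarms(thresholds, sns_topic_arn):
    """Create one alarm per threshold. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
    for threshold in thresholds:
        cloudwatch.put_metric_alarm(**billing_alarm_params(threshold, sns_topic_arn))
```

Calling `create_billing_alarms([10, 15, 25], "<your SNS topic ARN>")` would set up the same three alerts I use.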
4. The default database name is dev, which is suitable here. Enter a username and password, and leave the database port at its default (5439); there’s no need to change it for a lab cluster.
5. For this simple cluster create guide, there’s no need for an IAM role. I’m confident I’ll update this with a new post soon.
6. Select your lab VPC and Security Group, ensuring your local machine has access over port 5439 as configured above.
Enabling Enhanced VPC Routing won’t increase cost, but it might result in additional complexity in network configuration.
I’m making my cluster publicly accessible as my VPC is set up for external addresses.
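For the security group rule, here is a hedged sketch of opening port 5439 to a single IP address with boto3. The security group ID and IP address you pass in are your own; the rule description and function names are purely illustrative. Restricting the rule to a /32 keeps a publicly accessible cluster reachable only from your machine.

```python
# Sketch: allow one client IP to reach Redshift on port 5439.
import ipaddress

REDSHIFT_PORT = 5439


def redshift_ingress_params(security_group_id, client_ip):
    """Build arguments for EC2 authorize_security_group_ingress (no AWS call here)."""
    ip = ipaddress.ip_address(client_ip)  # raises ValueError for malformed input
    if ip.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    return {
        "GroupId": security_group_id,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": REDSHIFT_PORT,
                "ToPort": REDSHIFT_PORT,
                # /32 limits the rule to exactly one address
                "IpRanges": [{"CidrIp": f"{ip}/32", "Description": "Redshift from my machine"}],
            }
        ],
    }


def allow_redshift_access(security_group_id, client_ip):
    """Apply the rule. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(**redshift_ingress_params(security_group_id, client_ip))
```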
7. Use the default parameter group; no encryption is needed for now.
8. The maintenance window is only 30 minutes, so you can fit that somewhere within your week easily.
Note: In production environments you may want to configure trailing updates (the trailing maintenance track), just to take a cautious approach. AWS do roll out new changes thick and fast, but that said, I haven’t seen or heard of any issues with new features. It’s pretty solid.
9. No need for monitoring atm.
10. Select minimum snapshot retention, which is 1 day.
Note: You can set the automated snapshot retention period to a maximum of 35 days, although manual snapshots can be retained indefinitely.
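If you script the cluster rather than using the console, the retention settings map onto two parameters of the Redshift create_cluster API. The sketch below validates the 1 to 35 day range mentioned above. To my understanding the API uses -1 to mean "keep manual snapshots indefinitely", so treat that value as an assumption to verify against the current docs.

```python
# Sketch: snapshot retention settings as create_cluster parameters.
MIN_AUTOMATED_DAYS = 1   # console minimum mentioned in the post
MAX_AUTOMATED_DAYS = 35  # maximum mentioned in the post
KEEP_INDEFINITELY = -1   # assumption: API value for indefinite manual retention


def snapshot_retention_params(automated_days=1, manual_days=None):
    """Return retention parameters; manual_days=None keeps manual snapshots indefinitely."""
    if not MIN_AUTOMATED_DAYS <= automated_days <= MAX_AUTOMATED_DAYS:
        raise ValueError("automated retention must be between 1 and 35 days")
    if manual_days is not None and manual_days < 1:
        raise ValueError("manual retention must be at least 1 day, or None")
    return {
        "AutomatedSnapshotRetentionPeriod": automated_days,
        "ManualSnapshotRetentionPeriod": KEEP_INDEFINITELY if manual_days is None else manual_days,
    }
```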
11. Click to create the new cluster. It’ll take a few minutes.
12. Click into the cluster when it’s ready and copy the endpoint address for the next section: connecting to the cluster with a SQL client app.
Connect to Redshift via DBeaver
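The console steps above can also be approximated with boto3. This is a sketch under the same choices made in this post (single-node dc2.large, dev database, port 5439, publicly accessible, no encryption, one-day snapshot retention), not a definitive script. The cluster name, credentials, security group IDs and subnet group name are all placeholders you would supply yourself.

```python
# Sketch: the console choices from steps 1-12 as a boto3 create_cluster call.
def create_cluster_params(cluster_id, username, password, security_group_ids, subnet_group_name):
    """Build create_cluster arguments mirroring the settings chosen in this post."""
    return {
        "ClusterIdentifier": cluster_id,
        "ClusterType": "single-node",
        "NodeType": "dc2.large",
        "DBName": "dev",
        "MasterUsername": username,
        "MasterUserPassword": password,
        "Port": 5439,
        "VpcSecurityGroupIds": list(security_group_ids),
        "ClusterSubnetGroupName": subnet_group_name,
        "PubliclyAccessible": True,
        "EnhancedVpcRouting": False,
        "Encrypted": False,
        "AutomatedSnapshotRetentionPeriod": 1,
    }


def endpoint_from_response(response):
    """Pull (host, port) out of a describe_clusters response."""
    endpoint = response["Clusters"][0]["Endpoint"]
    return endpoint["Address"], endpoint["Port"]


def create_and_wait(params):
    """Create the cluster, wait until it is available, return its endpoint.

    Requires boto3 and AWS credentials; billing starts once the cluster is up.
    """
    import boto3  # imported lazily so the helpers above work without boto3

    redshift = boto3.client("redshift")
    redshift.create_cluster(**params)
    redshift.get_waiter("cluster_available").wait(ClusterIdentifier=params["ClusterIdentifier"])
    response = redshift.describe_clusters(ClusterIdentifier=params["ClusterIdentifier"])
    return endpoint_from_response(response)
```

The host returned by `create_and_wait` is the same endpoint address you would copy from the console in step 12.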
1. Open DBeaver and click to create a new database connection.
2. Search and select Redshift.
3. Enter the details defined when creating the cluster (username, password, database name and host address). The host is the cluster endpoint address copied in step 12 above.
Nothing else should need changing in here if you’re on a relatively up-to-date version of DBeaver.
4. Click to test connectivity before saving this as a new connection.
5. Start querying the cluster!
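Behind the connection form, DBeaver talks to the cluster over a JDBC URL built from the same three values: host, port and database. If a connection test fails, it can help to see exactly what that URL should look like. The helper below is only an illustration, and the host in the example is a made-up endpoint.

```python
# Sketch: the JDBC URL that corresponds to the DBeaver connection details.
def redshift_jdbc_url(host, database="dev", port=5439):
    """Build a Redshift JDBC URL from the endpoint address, port and database name."""
    host = host.strip()
    if not host:
        raise ValueError("host (the cluster endpoint address) is required")
    return f"jdbc:redshift://{host}:{port}/{database}"


# Hypothetical endpoint, for illustration only.
print(redshift_jdbc_url("lab-cluster.abc123example.eu-west-1.redshift.amazonaws.com"))
# prints jdbc:redshift://lab-cluster.abc123example.eu-west-1.redshift.amazonaws.com:5439/dev
```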
Remember to either pause the cluster or delete it when you’re finished.