Getting Started on AWS

These instructions walk you through getting the Certified RO Labs version of Apache Airflow up and running on AWS. To begin, launch an instance of the AMI from the AWS Marketplace.

Choose Instance Type

Select the instance type of your choice. If you are looking to test out Apache Airflow and work with sample data, a t2.large instance will be sufficient. However, if you are working with larger datasets, it's recommended you use a t2.xlarge as a starting point and scale up accordingly.

aws instance type

Add Storage

Amazon Web Services defaults to a disk size of 8 GB. Increase this to at least 40 GB to leave room for the Airflow dependencies and the underlying operating system.

aws add storage
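
For readers who prefer to script the launch rather than click through the console, the instance type and storage choices above can be sketched as a set of launch parameters. This is a minimal sketch, not the vendor's tooling: the function name `build_launch_params`, the AMI ID placeholder, and the root device name `/dev/sda1` are all assumptions. Take the real AMI ID from the Marketplace listing and the real device name from the AMI's details.

```python
def build_launch_params(ami_id, large_datasets=False, volume_gib=40,
                        root_device="/dev/sda1"):
    """Build keyword arguments in the shape used by the EC2 RunInstances API.

    ami_id and root_device are placeholders (assumptions); replace them
    with the values shown for the Marketplace AMI you subscribed to.
    """
    if volume_gib < 40:
        # The guide recommends at least 40 GB for Airflow and the OS.
        raise ValueError("use at least 40 GiB for the root volume")
    return {
        "ImageId": ami_id,
        # t2.large for sample data, t2.xlarge as a starting point otherwise.
        "InstanceType": "t2.xlarge" if large_datasets else "t2.large",
        "MinCount": 1,
        "MaxCount": 1,
        "BlockDeviceMappings": [
            {"DeviceName": root_device, "Ebs": {"VolumeSize": volume_gib}},
        ],
    }


if __name__ == "__main__":
    # "ami-EXAMPLE" is a hypothetical placeholder, not a real AMI ID.
    print(build_launch_params("ami-EXAMPLE"))
```

If boto3 is installed and credentials are configured, a dictionary like this could be passed as `boto3.client("ec2").run_instances(**params)`, along with a `KeyName` and `SecurityGroupIds` for the key pair and security group.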

Configure Security Group

Select a preexisting security group or create a new one. The only port that must be open is 8080, for the HTTP webserver. If you need SSH access to the server, keep port 22 open as well. The Standalone version also launches the Airflow Scheduler on port 8793. However, this port should not be exposed to the internet without proper whitelisting.

aws configure security group
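
The port rules above can also be written down as data. The sketch below builds inbound rules in the `IpPermissions` shape used by the EC2 security group API: port 8080 for the webserver, port 22 only when an SSH source range is given, and nothing for port 8793. The function name `build_ingress_rules` and the example CIDR ranges are assumptions for illustration.

```python
import ipaddress


def build_ingress_rules(web_cidr, ssh_cidr=None):
    """Return inbound rules: 8080 always, 22 only if ssh_cidr is given.

    Port 8793 is deliberately left out so it is never exposed.
    """
    def rule(port, cidr, description):
        # Raises ValueError if the CIDR string is malformed.
        ipaddress.ip_network(cidr)
        return {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": cidr, "Description": description}],
        }

    rules = [rule(8080, web_cidr, "Airflow webserver (HTTP)")]
    if ssh_cidr is not None:
        rules.append(rule(22, ssh_cidr, "SSH access"))
    return rules


if __name__ == "__main__":
    # 203.0.113.0/24 is a documentation-only range used as a stand-in.
    for r in build_ingress_rules("203.0.113.0/24", ssh_cidr="203.0.113.7/32"):
        print(r["FromPort"], r["IpRanges"][0]["CidrIp"])
```

With boto3, a list like this could be supplied as the `IpPermissions` argument of `authorize_security_group_ingress`. Restricting SSH to a single address rather than `0.0.0.0/0` is a common precaution, though the right ranges depend on your network.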

Review and Launch

Look over your instance details and make sure everything is correct. If you need to make any changes, go back and adjust the settings. Once you're happy with everything, click the Launch button. Lastly, you'll be asked to choose an existing key pair or create a new one for this instance. Once you're ready, click the Launch Instance button.

aws confirmation

Once the EC2 instance has been deployed, you're ready to start working with Apache Airflow. Open up your EC2 instances in your AWS console, and select the Airflow instance to view its details.

Logging In

aws instance details

At launch, a password for the Airflow admin user is generated that is unique to your instance. To log in to Airflow for the first time, open the instance's Public IPv4 address or Public IPv4 DNS name in your browser on port 8080.

NOTE: HTTPS is not supported at this time, so you must access your instance over HTTP. For example, in the screenshot above, the address would be http://3.16.47.245:8080.

When you are redirected to the login screen, enter admin as the username and your Instance ID as the password. The Instance ID is shown in your instance details and has the form i-XXXXXXXXXXXXXXXXX. Refer to the image above for more details.

aws instance details
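
The first-login details described above can be assembled in a small helper. This is a sketch under stated assumptions: the function names are hypothetical, and the pattern assumes the Instance ID is `i-` followed by 17 lowercase hexadecimal characters, matching the form shown above.

```python
import re

# Assumed shape of the Instance ID: "i-" plus 17 lowercase hex characters.
INSTANCE_ID_PATTERN = re.compile(r"^i-[0-9a-f]{17}$")


def airflow_url(public_host, port=8080):
    """Build the webserver address; plain HTTP, since HTTPS is not supported."""
    return f"http://{public_host}:{port}"


def initial_credentials(instance_id):
    """Return the first-login credentials: user 'admin', password = Instance ID."""
    if not INSTANCE_ID_PATTERN.match(instance_id):
        raise ValueError(f"unexpected Instance ID format: {instance_id!r}")
    return {"username": "admin", "password": instance_id}


if __name__ == "__main__":
    # The IP is the example from the guide; the Instance ID is made up.
    print(airflow_url("3.16.47.245"))
    print(initial_credentials("i-0123456789abcdef0")["username"])
```

Validating the ID format before use helps catch copy-and-paste mistakes, such as a trailing space or a truncated value, before a failed login attempt.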

Once you've successfully logged in, go to Settings and change the password.