Working with zero-ETL integrations - Amazon Redshift

This topic includes prerelease documentation for Aurora PostgreSQL and RDS for MySQL zero-ETL integrations with Amazon Redshift, which are in preview release. The documentation and the features are both subject to change. We recommend that you use RDS for MySQL and Aurora PostgreSQL zero-ETL integrations only in test environments and not in production environments. For preview terms and conditions, see Betas and Previews in AWS Service Terms.

Zero-ETL integration is a fully managed solution that makes transactional or operational data available in Amazon Redshift in near real time. With this solution, you can configure an integration from your source to an Amazon Redshift data warehouse. You don't need to maintain an extract, transform, and load (ETL) pipeline. We take care of the ETL for you by automating the creation and management of data replication from the data source to the Amazon Redshift cluster or Redshift Serverless namespace. You can continue to update and query your source data while simultaneously using Amazon Redshift for analytic workloads, such as reporting and dashboards.

The following sources are currently supported for zero-ETL integrations:

  • Aurora MySQL-Compatible Edition

  • Aurora PostgreSQL-Compatible Edition (preview)

  • RDS for MySQL (preview)

To create a zero-ETL integration, you specify an integration source and an Amazon Redshift data warehouse as the target. The integration replicates data from the source to the target data warehouse. The data becomes available in Amazon Redshift within seconds. The integration monitors the health of the data pipeline and recovers from issues when possible. You can create integrations from sources of the same type into a single Amazon Redshift data warehouse to derive holistic insights across multiple applications.
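As a rough sketch of the source-and-target pairing described above, the following Python assembles a request for the Amazon RDS CreateIntegration API as exposed by boto3 (`create_integration` with `SourceArn`, `TargetArn`, and `IntegrationName`). The ARNs, account ID, and integration name are placeholders, and the helper function names are hypothetical. Prerequisites such as source parameter settings and the authorization policy on the target are not shown.

```python
def build_integration_request(source_arn: str, target_arn: str, name: str) -> dict:
    """Assemble keyword arguments for the RDS CreateIntegration call."""
    for label, arn in (("source", source_arn), ("target", target_arn)):
        if not arn.startswith("arn:"):
            raise ValueError(f"{label} must be an ARN, got {arn!r}")
    return {
        "SourceArn": source_arn,
        "TargetArn": target_arn,
        "IntegrationName": name,
    }


def create_integration(request: dict) -> dict:
    """Send the request. Needs boto3 and AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the builder above works without boto3

    rds = boto3.client("rds")
    return rds.create_integration(**request)


# Placeholder values for illustration only.
request = build_integration_request(
    source_arn="arn:aws:rds:us-east-1:123456789012:cluster:my-source-cluster",
    target_arn=(
        "arn:aws:redshift-serverless:us-east-1:123456789012:"
        "namespace/11111111-2222-3333-4444-555555555555"
    ),
    name="my-integration",
)
```

Keeping the request builder separate from the API call makes it possible to validate the source and target before any AWS call is made.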

With the data in Amazon Redshift, you can use the analytics capabilities that Amazon Redshift provides, such as built-in machine learning (ML), materialized views, data sharing, and direct access to multiple data stores and data lakes. A zero-ETL integration keeps your compute resources isolated from your data resources, so you can use the most efficient tools to process data. For data engineers, zero-ETL integration provides access to time-sensitive data that might otherwise be delayed by intermittent errors in complex data pipelines. You can run analytical queries and ML models on transactional data to derive near real-time insights for time-sensitive events and business decisions.

You can create an Amazon Redshift event notification subscription so you can be notified when an event occurs for a given zero-ETL integration. To view the list of integration-related event notifications, see Zero-ETL integration event notifications with Amazon EventBridge. The simplest way to create a subscription is with the Amazon SNS console. For information on creating an Amazon SNS topic and subscribing to it, see Getting started with Amazon SNS in the Amazon Simple Notification Service Developer Guide.

As you get started with zero-ETL integrations, consider the following concepts:

  • A source database is the database from which data is replicated into Amazon Redshift.

  • A target data warehouse is the Amazon Redshift provisioned cluster or Redshift Serverless workgroup to which data is replicated.

  • A destination database is the database that you create from a zero-ETL integration in the target data warehouse.
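To make the destination database concept concrete, the sketch below builds the Amazon Redshift SQL statement `CREATE DATABASE ... FROM INTEGRATION`, which creates a database from an existing integration. The helper name, database name, and integration ID are hypothetical placeholders, and some sources might need additional clauses that are not shown here, so check the CREATE DATABASE reference for your source type.

```python
import re

# Conservative patterns for illustration; actual Redshift naming rules are broader.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_INTEGRATION_ID = re.compile(r"^[0-9A-Za-z-]+$")


def create_database_sql(destination_db: str, integration_id: str) -> str:
    """Return SQL that creates a destination database from an integration."""
    if not _IDENTIFIER.match(destination_db):
        raise ValueError(f"unsafe database name: {destination_db!r}")
    if not _INTEGRATION_ID.match(integration_id):
        raise ValueError(f"unexpected integration ID: {integration_id!r}")
    return f"CREATE DATABASE {destination_db} FROM INTEGRATION '{integration_id}';"


# Placeholder values for illustration only.
sql = create_database_sql("sales_dest", "11111111-2222-3333-4444-555555555555")
```

You run the resulting statement in the target data warehouse, not in the source database.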

You can monitor your zero-ETL integrations by querying the following system views in Amazon Redshift.
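The list of views is not reproduced in this section. As a hedged illustration of the monitoring step, the sketch below reads every row from one such view over any Python DB-API 2.0 connection. It assumes a view named `svv_integration` is available in your data warehouse. The function name is hypothetical, and the columns returned depend on your Amazon Redshift version.

```python
import re

_VIEW_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def fetch_view_rows(conn, view: str = "svv_integration") -> list:
    """Return all rows of a system view as dictionaries keyed by column name.

    `conn` is any DB-API 2.0 connection. The default view name is an
    assumption, not something this section confirms.
    """
    if not _VIEW_NAME.match(view):
        raise ValueError(f"unsafe view name: {view!r}")
    cur = conn.cursor()
    cur.execute(f"SELECT * FROM {view}")
    columns = [col[0] for col in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]
```

Returning dictionaries instead of tuples keeps the caller independent of column order, which helps if the view gains columns in a later release.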

For pricing information for zero-ETL integrations, see the pricing page for your integration source, such as Amazon Aurora pricing or Amazon RDS pricing.

For more information about zero-ETL integration sources, see the zero-ETL integration topics in the Amazon Aurora User Guide and the Amazon RDS User Guide.