
Guide: Azure Blob Storage Egress

Written by Kevin Mattice

Summary

Cherre Egress for Azure allows Cherre clients to access their connected data in their own Azure Blob Storage account, making it easier to fold that data into larger data strategies and combine it with other datasets across their organization.

The implementation of the egress process is straightforward and leverages Azure Blob Storage’s secure data transfer capabilities. The basic steps of the implementation are:

  1. Set Up an Azure Storage Container for Data Delivery

  2. Configure Access Control

  3. Provide Azure Blob Storage Account Information to Cherre

  4. Cherre Schedules Automated Data Transfers to Azure

  5. Verify Data Transfer and Integrity

Once these steps are completed, clients will have direct access to their Cherre data in Azure Blob Storage.

Introduction to Cherre Azure Egress

The Cherre Egress for Azure enables the delivery of read-only copies of data connected through Cherre into a client-managed Azure Blob Storage container. This allows seamless integration with Azure-based analytics pipelines, including Azure Synapse, Databricks, and Power BI.

The initial dataset will be seeded with historical data and then incrementally updated according to the client’s configured egress schedule (e.g., daily, hourly, or near real-time). Access to the data is controlled by the client’s own Azure security policies (e.g., SAS tokens).

Implementation Checklist

  • Create or Configure an Azure Blob Storage Container

  • Grant Cherre Access via SAS Token

  • Provide Azure Blob Storage Account Information to Cherre

  • Verify Successful Data Transfer

Implementation Steps

Provide Azure Blob Storage Account Information & Desired Tables

Before Cherre can begin setting up the egress, the following details need to be provided by the client:

  • Azure Storage Account Name

  • Azure Container Name where the data should be delivered

  • Preferred authentication method (SAS Token)

  • List of datasets/tables to be included in the egress

  • Desired update frequency (e.g., daily, hourly)

Once Cherre receives this information, the implementation process can begin.
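As a convenience, the details above can be collected into a single handoff document before sending them to Cherre. The snippet below is purely illustrative (it is not an official Cherre intake format), and every value is a placeholder:

```json
{
  "storage_account_name": "contosodata",
  "container_name": "cherre-egress",
  "auth_method": "sas_token",
  "tables": ["properties", "transactions", "valuations"],
  "update_frequency": "daily"
}
```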

Set Up Azure Storage Containers for Data Delivery

Connector Stages

Cherre delivers data through 3 different Connector “Stages” that allow clients, partners and Cherre team members to test and verify data in a Connector before it moves to production.

DEV Stage

The DEV stage is meant to allow for the review of Connector updates, including schema changes, transformation updates, and other updates that could impact downstream systems and products.

QC Stage

The QC stage is meant to support pre-production environments for clients and partners. Once updates are verified within the DEV stage, QC stage pipelines are meant to support what is often referred to as a UAT environment for products.

PUB Stage

The PUB stage represents a production environment pipeline that is stable and is subject to SLAs within our agreements.

The client must create 3 containers in their Azure Blob Storage account to receive Cherre data through the 3 Connector Stages:

  1. DEV

  2. QC

  3. PUB

Data from each Connector stage will be delivered to the corresponding Container.

To create a container using Azure CLI:

az storage container create --name cherre-egress --account-name <storage_account_name>

Alternatively, this can be done via the Azure Portal under Storage Accounts → Containers.

Configure Access Control

Cherre needs write access to the storage container to transfer data. Clients can provide access via:

Shared Access Signature (SAS) Token

  1. In the Azure Portal, go to the Storage Account → Containers (under the Data Storage dropdown)

  2. Find the container you created above (e.g., cherre-egress)

  3. Under the container’s “more options” (three dot menu), go to Access Policy and create a policy

    1. The permissions we need are Read, Write, Add, Create, List

    2. Set an expiry date (e.g., 30 days, depending on your security policies; the policy’s expiry can be extended later without modifying the SAS URL)

    3. Alternatively, this can be created from the Azure CLI

az storage container policy create --name cherre-write --account-name <storage_account_name> --container-name cherre-egress --permission rwacl --expiry YYYY-MM-DDTHH:MM:SSZ

  4. Again under the container’s more options, go to Generate SAS

    1. Set the Storage Access Policy as the one created above

    2. Click Generate SAS Token and URL

    3. Alternatively, this can be created from the Azure CLI

az storage container generate-sas --name cherre-egress --account-name <storage_account_name> --policy-name cherre-write

  5. Share the Blob SAS URL with Cherre.

    1. Note: if you choose to generate the SAS with the Azure CLI, then it will provide you with the SAS Token instead of the URL, which is fine because we can derive the URL from the token, storage account name, and container name.
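The derivation mentioned above is mechanical: a container SAS URL is simply the container endpoint with the SAS token appended as a query string. A minimal Python sketch (the account, container, and token values are placeholders):

```python
def build_sas_url(account_name: str, container_name: str, sas_token: str) -> str:
    """Combine a storage account, container, and SAS token into a Blob SAS URL."""
    # A container SAS URL is the container endpoint plus the token as its query string.
    return f"https://{account_name}.blob.core.windows.net/{container_name}?{sas_token.lstrip('?')}"

# Example with placeholder values:
url = build_sas_url(
    "mystorageaccount",
    "cherre-egress",
    "sv=2022-11-02&si=cherre-write&sr=c&sig=XYZ...",
)
print(url)
```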

Example SAS URL:

https://<storage_account_name>.blob.core.windows.net/cherre-egress?sv=2022-11-02&si=cherre-write&sr=c&sig=XYZ...

Example SAS Token:

sv=2022-11-02&si=cherre-write&sr=c&sig=XYZ...

Schedule and Automate Data Transfers from Cherre to Azure

Cherre will configure an automated pipeline to deliver data based on the agreed schedule. Data will be transferred in Parquet format.

Verify Data Transfer

Once data delivery begins, we recommend that clients confirm file delivery by checking Azure Blob Storage:

az storage blob list --container-name cherre-egress --account-name <storage_account_name> --sas-token "<SAS_TOKEN>"

Monitoring Cherre Egress for Azure Blob Storage

The volume and frequency of data egress depends on the dataset and agreement between Cherre and the client. Clients can integrate Azure Monitor, Log Analytics, or custom alerts to track data consistency and availability.
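As one lightweight custom-alert option, the container’s List Blobs REST operation (the same operation `az storage blob list` calls) can be polled with the SAS URL to check delivery freshness. The sketch below is illustrative Python, not a Cherre-provided tool; the staleness threshold is an assumption you would tune to your agreed schedule:

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

def parse_blob_listing(xml_text: str) -> list:
    """Extract (name, last_modified) pairs from a List Blobs XML response."""
    root = ET.fromstring(xml_text)
    results = []
    for blob in root.iter("Blob"):
        name = blob.findtext("Name")
        # Last-Modified uses RFC 1123 format, e.g. "Mon, 03 Jun 2024 12:00:00 GMT"
        raw = blob.findtext("Properties/Last-Modified")
        ts = datetime.strptime(raw, "%a, %d %b %Y %H:%M:%S %Z").replace(tzinfo=timezone.utc)
        results.append((name, ts))
    return results

def stale_blobs(listing, max_age: timedelta, now=None) -> list:
    """Return names of blobs whose last delivery is older than max_age."""
    now = now or datetime.now(timezone.utc)
    return [name for name, ts in listing if now - ts > max_age]

# To fetch the listing (requires network and a valid SAS URL):
#   import urllib.request
#   with urllib.request.urlopen(sas_url + "&restype=container&comp=list") as resp:
#       listing = parse_blob_listing(resp.read().decode("utf-8"))
```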

Cherre offers 2 options for data updates:

  1. Full Replace - Each file delivered represents all of the data in a table from the latest run of the Connector

  2. Incremental Appends - Each file delivered represents only the inserted or updated records in the table since the last refresh. The consuming system is then responsible for processing these changes, including applying updates to existing data and performing any required deduplication.
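With Incremental Appends, the consuming system must merge each delivered batch into its existing copy of the table. A minimal in-memory sketch of that upsert-and-deduplicate logic, assuming each record carries a unique key column (here called `id`, a placeholder for your table’s actual key):

```python
def apply_increment(current: dict, batch: list, key: str = "id") -> dict:
    """Merge a batch of inserted/updated records into the current table state.

    `current` maps key -> record. Later records for the same key overwrite
    earlier ones, which also deduplicates repeats within a single batch.
    """
    for record in batch:
        current[record[key]] = record
    return current

# Seed load, then two incremental deliveries:
table = {}
table = apply_increment(table, [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}])
table = apply_increment(table, [{"id": 2, "value": "b2"}, {"id": 3, "value": "c"}])
print(sorted(table))  # [1, 2, 3]
```

In a real pipeline the same merge would typically be expressed as a MERGE/upsert in the warehouse rather than in application memory.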

Final Thoughts

Cherre Egress for Azure Blob Storage enables seamless data integration into Azure-based architectures, ensuring efficient access to Cherre-transformed data. By leveraging secure data transfer methods and automated pipelines, organizations can incorporate Cherre data into their broader analytics and AI ecosystems.

Next Steps:

  • Ensure Azure Blob Storage is configured & accessible

  • Provide SAS URL/Token

  • Define data update frequency & monitoring strategy

  • Verify successful data transfer once enabled

Need help automating this process? 🚀 Let us know, and we can help optimize your Cherre Azure egress pipeline!
