Guide: Amazon S3 Ingest Setup

When delivering data to Cherre via an Amazon S3 bucket, use this guide to get set up.

Written by Kevin Mattice
Updated over 3 weeks ago

Summary

Cherre can ingest data from an Amazon S3 bucket. The bucket can be set up in either the client’s AWS environment or Cherre’s. This guide outlines the steps required to give Cherre access to the bucket so that it can ingest the data.

The implementation relies on standard AWS roles and S3 permissions. The two delivery options are outlined below:

Cherre S3 Bucket

  1. The partner or client uploads data to a bucket owned and managed by Cherre

  2. Cherre handles all aspects of access and permission management

  3. Organized bucket structures (e.g., using prefixes or specific paths) within Cherre-owned buckets are mutually agreed upon

  4. The partner or client shares the ARN of their own AWS user, and Cherre grants that user permission to assume a role with access to the Cherre bucket

Client S3 Bucket

  1. The partner or client creates an IAM Role in their AWS account

    1. In the trust policy, specify Cherre’s AWS account/user/role as the trusted principal

  2. Attach permissions to the Role

    1. Attach a policy granting the necessary S3 permissions (e.g., s3:GetObject, s3:ListBucket) to the bucket

  3. Share the Role ARN with Cherre

    1. This allows Cherre to assume the role from Cherre’s own AWS user

  4. Cherre’s ingest can then assume the Role and access data

  5. Nice to have: Place all files that Cherre needs to ingest in a clean folder structure within the shared bucket, so that Cherre’s ingestion can point to a single location in the bucket
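
The trust policy and permissions policy from steps 1 and 2 above can be sketched as JSON documents. The following is a minimal illustration using only Python's standard library. The account ID, user name, and bucket name are hypothetical placeholders, not real Cherre values; the actual principal to trust should be the ARN that Cherre provides.

```python
import json

# Hypothetical placeholder values, for illustration only.
CHERRE_PRINCIPAL_ARN = "arn:aws:iam::111122223333:user/example-cherre-ingest"
BUCKET_NAME = "example-client-bucket"


def build_trust_policy(principal_arn):
    """Trust policy that lets the given principal assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_read_policy(bucket_name):
    """Permissions policy granting read-only access to a single bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                # Reading applies to the objects inside the bucket.
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_trust_policy(CHERRE_PRINCIPAL_ARN), indent=2))
    print(json.dumps(build_read_policy(BUCKET_NAME), indent=2))
```

Note that s3:ListBucket is granted on the bucket ARN while s3:GetObject is granted on the objects under it (the trailing /*). Putting both actions on only one of the two resources is a common cause of access-denied errors.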

Introduction to Amazon S3 Buckets

Amazon S3 offers a scalable, secure, and reliable solution for storing and managing data. S3 buckets can be set up, owned, and managed by Cherre’s partners or clients, but Cherre also hosts S3 buckets that can be used for data delivery where needed. Throughout this implementation guide, Amazon Resource Names (ARNs) are used to identify AWS resources such as users, roles, and buckets. Details about ARNs can be found in the AWS documentation.
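
To make the ARN format concrete, here is a small sketch that splits an ARN into its standard parts (arn:partition:service:region:account-id:resource). The example ARNs in the comments are hypothetical, and parse_arn is an illustrative helper rather than part of any AWS library.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(arn: str) -> Arn:
    """Split an ARN string into its five fields after the 'arn' prefix."""
    # Split into at most six parts so colons inside the resource are preserved.
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    return Arn(*parts[1:])


# Hypothetical examples:
#   parse_arn("arn:aws:s3:::example-bucket")
#     S3 bucket ARNs leave the region and account ID empty.
#   parse_arn("arn:aws:iam::111122223333:role/example-role")
#     IAM ARNs leave the region empty; the resource is "role/example-role".
```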

Implementation Checklist

The steps for Cherre to ingest data from an S3 bucket vary slightly depending on who owns the bucket. Both paths are outlined below:

Cherre S3 Bucket

  • The partner or client creates an AWS user

  • The partner or client shares the user’s ARN with Cherre

  • Cherre grants the user ARN permission to access our bucket

  • The partner or client provides data to the Cherre S3 bucket

  • Cherre’s ingest pulls the data using the AWS access key and secret
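
As one possible way to carry out the "provides data" step, the sketch below uploads local files under an agreed prefix. This guide does not prescribe a tool, so the use of boto3 here is an assumption, and the bucket name, prefix, and file path in the usage comment are hypothetical. The function takes an S3 client as an argument; with boto3 that would be boto3.client("s3"), whose upload_file(Filename, Bucket, Key) method is what the function calls.

```python
import os


def upload_files(s3_client, bucket, prefix, local_paths):
    """Upload each local file under the agreed prefix and return the keys written."""
    clean_prefix = prefix.strip("/")
    keys = []
    for path in local_paths:
        name = os.path.basename(path)
        # Keep the key free of a leading slash when no prefix is used.
        key = f"{clean_prefix}/{name}" if clean_prefix else name
        s3_client.upload_file(path, bucket, key)
        keys.append(key)
    return keys


# Example usage (assumes boto3 is installed and credentials for the AWS user
# whose ARN was shared with Cherre are configured; all names are hypothetical):
#   import boto3
#   s3 = boto3.client("s3")
#   upload_files(s3, "example-cherre-bucket", "example-partner/daily", ["./export.csv"])
```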

Client S3 Bucket

  • Cherre shares a user ARN with the client for the purposes of data access

  • The partner or client creates an AWS role, grants Cherre's user ARN permission to assume the role, and grants the role permission to access and read data from the bucket

  • The partner or client shares the role ARN and bucket id with Cherre

  • Cherre’s ingest pulls the data using the AWS access key, secret, and role ARN
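
Before sharing the role ARN, it can be useful to see what assuming the role and reading the bucket involves. The sketch below is not Cherre's actual ingest code. It assumes boto3-style clients are passed in (boto3.client("sts") and boto3.client("s3")), and the role ARN, session name, and bucket name are hypothetical.

```python
def assume_role_credentials(sts_client, role_arn, session_name="example-ingest-session"):
    """Exchange the caller's own credentials for temporary role credentials."""
    response = sts_client.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
    creds = response["Credentials"]
    # Returned in the keyword form that boto3.client(...) accepts.
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def list_keys(s3_client, bucket, prefix=""):
    """List every object key under a prefix, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages with no matching objects have no "Contents" entry.
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys


# Example usage (assumes boto3 is installed and the caller is the trusted user):
#   import boto3
#   sts = boto3.client("sts")
#   creds = assume_role_credentials(sts, "arn:aws:iam::444455556666:role/example-role")
#   s3 = boto3.client("s3", **creds)
#   print(list_keys(s3, "example-client-bucket", "deliveries/"))
```

If the list call fails with an access-denied error, the usual suspects are the role's trust policy (wrong principal) or a permissions policy that omits s3:ListBucket on the bucket.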

Best Practices

  • Bucket Ownership by Data Partner or Client: It is generally smoother for the data partner or client to own the S3 bucket.

  • Organized Bucket Structure: Maintaining a well-organized bucket structure with specific paths can significantly simplify data ingestion, regardless of bucket ownership.
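
As an illustration of an organized bucket structure, the sketch below builds object keys from a dataset name and delivery date. This particular layout (dataset/YYYY/MM/DD/filename) is only an assumption for the example; the actual structure should be whatever is agreed with Cherre.

```python
from datetime import date


def build_object_key(dataset: str, delivery_date: date, filename: str) -> str:
    """Build a date-partitioned key so each delivery lands in a predictable path."""
    return f"{dataset}/{delivery_date:%Y/%m/%d}/{filename}"


# Hypothetical example:
#   build_object_key("leases", date(2024, 1, 15), "leases.csv")
#   returns "leases/2024/01/15/leases.csv"
```

With a layout like this, an ingest can be pointed at a single prefix (for example, leases/) and pick up every delivery beneath it.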