Connect to Amazon Athena and query data in S3 via the Glue Data Catalog. Reeflow uses the AWS SDK with IAM access keys, so the credentials you create here are the same ones any AWS SDK client would use.
Before creating a connection, you need:
An AWS account with at least one Athena workgroup (the primary workgroup is created automatically with every new account)
One or more databases registered with the Glue Data Catalog (or a federated catalog)
An S3 bucket the IAM user can write query results to, unless your workgroup enforces its own output location
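For security, create a dedicated IAM user with read-only access to your data rather than reusing an admin account. The policy below lets that user run Athena queries, read the Glue Data Catalog, read the S3 bucket holding your data, and read and write the S3 bucket for query results. In the policy, my-data-lake is the S3 bucket holding your data and my-athena-results is the S3 bucket for query results; replace both with your own bucket names: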
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "athena:BatchGetQueryExecution",
        "athena:GetQueryExecution",
        "athena:GetQueryResults",
        "athena:GetWorkGroup",
        "athena:ListDatabases",
        "athena:ListDataCatalogs",
        "athena:ListTableMetadata",
        "athena:ListWorkGroups",
        "athena:StartQueryExecution",
        "athena:StopQueryExecution"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetTable",
        "glue:GetTables",
        "glue:GetPartition",
        "glue:GetPartitions"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::my-data-lake",
        "arn:aws:s3:::my-data-lake/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::my-athena-results",
        "arn:aws:s3:::my-athena-results/*"
      ]
    }
  ]
}
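If you manage several environments, it can help to generate the policy rather than hand-edit bucket names. The sketch below rebuilds the same four statements from two bucket names; the function name build_athena_policy and its parameters are illustrative assumptions, not part of Reeflow or the AWS SDK.

```python
import json
from typing import Dict, List


def build_athena_policy(data_bucket: str, results_bucket: str) -> Dict:
    """Return the policy document shown above for the given bucket names."""

    def bucket_arns(bucket: str) -> List[str]:
        # Bucket-level actions (s3:ListBucket) match the bucket ARN;
        # object-level actions (s3:GetObject, s3:PutObject) match the /* ARN.
        return [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "athena:BatchGetQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                    "athena:GetWorkGroup",
                    "athena:ListDatabases",
                    "athena:ListDataCatalogs",
                    "athena:ListTableMetadata",
                    "athena:ListWorkGroups",
                    "athena:StartQueryExecution",
                    "athena:StopQueryExecution",
                ],
                "Resource": "*",
            },
            {
                "Effect": "Allow",
                "Action": [
                    "glue:GetDatabase",
                    "glue:GetDatabases",
                    "glue:GetTable",
                    "glue:GetTables",
                    "glue:GetPartition",
                    "glue:GetPartitions",
                ],
                "Resource": "*",
            },
            {
                # Data bucket: read-only.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": bucket_arns(data_bucket),
            },
            {
                # Results bucket: Athena also needs to write query output.
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:GetObject",
                    "s3:ListBucket",
                    "s3:PutObject",
                ],
                "Resource": bucket_arns(results_bucket),
            },
        ],
    }


if __name__ == "__main__":
    policy = build_athena_policy("my-data-lake", "my-athena-results")
    print(json.dumps(policy, indent=2))
```

Running the script prints a JSON document you can paste into the IAM console as an inline policy.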
When creating an Athena connection in Reeflow, provide the following:
| Field | Description |
| --- | --- |
| Region | AWS region where your Athena workgroup lives, for example us-east-1 |
| Access Key ID | IAM access key ID, 20 uppercase alphanumerics starting with AKIA (long-lived) or ASIA (temporary STS) |
| Secret Access Key | The secret half of the IAM access key |
| Session Token | Optional. Required only when using temporary STS credentials |
| Workgroup | Athena workgroup. The default workgroup is primary |
| Output Location | Optional. S3 URI where query results are written, for example s3://my-athena-results/. Leave blank if the workgroup enforces its own output location. |
| Catalog | Data catalog. The default is AwsDataCatalog, the Glue-backed catalog |
| Default Database | Optional. Unqualified table references in queries resolve against this database |
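The field rules above can be checked before you paste values into the form. This is a minimal sketch, assuming only the constraints stated in the table; the AthenaConnectionFields class and validate function are hypothetical helpers, not a Reeflow API, and the key IDs in the usage note are placeholder values.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# 20 uppercase alphanumerics starting with AKIA (long-lived) or ASIA (STS).
ACCESS_KEY_RE = re.compile(r"(AKIA|ASIA)[A-Z0-9]{16}")


@dataclass
class AthenaConnectionFields:
    region: str
    access_key_id: str
    secret_access_key: str
    session_token: Optional[str] = None
    workgroup: str = "primary"
    output_location: Optional[str] = None
    catalog: str = "AwsDataCatalog"
    default_database: Optional[str] = None


def validate(fields: AthenaConnectionFields) -> List[str]:
    """Return a list of human-readable problems; empty means no issues found."""
    problems = []
    if not fields.region:
        problems.append("Region is required, for example us-east-1.")
    if not ACCESS_KEY_RE.fullmatch(fields.access_key_id):
        problems.append("Access Key ID must be 20 characters starting with AKIA or ASIA.")
    elif fields.access_key_id.startswith("ASIA") and not fields.session_token:
        problems.append("Session Token is required with temporary (ASIA) credentials.")
    if not fields.secret_access_key:
        problems.append("Secret Access Key is required.")
    if fields.output_location and not fields.output_location.startswith("s3://"):
        problems.append("Output Location must be an S3 URI such as s3://my-athena-results/.")
    return problems
```

For example, validate(AthenaConnectionFields(region="us-east-1", access_key_id="ASIAIOSFODNN7EXAMPLE", secret_access_key="...")) reports the missing session token, because an ASIA key is only usable together with one.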
Create an Athena connection
Add Amazon Athena as a data source in Reeflow.
Navigate to Connections in the main navigation, then click New Connection.
Enter a Name for the connection and an optional Description.
Select Amazon Athena as the connection type.
Enter your Region, Access Key ID, and Secret Access Key.
Pick a Workgroup. Reeflow auto-discovers workgroups the IAM user can see.
Pick a Catalog and an optional Default Database. Databases populate from the chosen catalog.
Click Test Connection to verify your credentials are correct.
Click Create Connection to save. The connection appears in your connections list.
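If Test Connection fails, you can reproduce a similar check outside Reeflow to see whether the credentials or the policy are at fault. The sketch below is an assumption about what such a check might involve, not Reeflow's actual implementation: it looks up the workgroup, works out where results would be written, and lists the databases in the catalog. The function check_connection is hypothetical; it takes any client object exposing the Athena get_work_group and list_databases operations, such as a boto3 Athena client.

```python
from typing import Any, Dict, List, Optional


def check_connection(
    athena: Any,
    workgroup: str = "primary",
    catalog: str = "AwsDataCatalog",
    output_location: Optional[str] = None,
) -> Dict[str, Any]:
    """Verify the workgroup is readable and list databases in the catalog."""
    # Requires athena:GetWorkGroup.
    wg = athena.get_work_group(WorkGroup=workgroup)["WorkGroup"]
    config = wg.get("Configuration", {})
    enforced = config.get("EnforceWorkGroupConfiguration", False)
    wg_output = config.get("ResultConfiguration", {}).get("OutputLocation")

    # When the workgroup enforces its configuration, its output location
    # takes precedence over whatever the client supplies.
    if enforced and wg_output:
        effective_output = wg_output
    else:
        effective_output = output_location or wg_output
    if not effective_output:
        raise ValueError("No output location: set one on the connection or the workgroup.")

    # Requires athena:ListDatabases; follow NextToken until the last page.
    databases: List[str] = []
    kwargs: Dict[str, Any] = {"CatalogName": catalog}
    while True:
        page = athena.list_databases(**kwargs)
        databases.extend(db["Name"] for db in page.get("DatabaseList", []))
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token

    return {
        "workgroup": workgroup,
        "output_location": effective_output,
        "databases": databases,
    }
```

With boto3 installed, a client can be created with boto3.client("athena", region_name=..., aws_access_key_id=..., aws_secret_access_key=..., aws_session_token=...) and passed as the first argument. An access-denied error from either call usually points at a missing action in the IAM policy rather than at the keys themselves.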