How to export data from Redshift to S3 using Airflow?
I am running a query that selects data from Redshift and unloads it into an S3 bucket. The Redshift connection is correct, because print(hook.get_records(sql_one)) works, but when I run the Airflow DAG with the UNLOAD it gets stuck. Does anyone know what I am doing wrong here? Thanks
from airflow.hooks.postgres_hook import PostgresHook
from airflow.hooks.S3_hook import S3Hook

redshift = PostgresHook('redshift_creds')

# S3Hook is used only to look up the AWS credentials
s3 = S3Hook('s3_conn')  # connection id is a placeholder, not shown in the original
credentials = s3.get_credentials()

sql_one = """
SELECT * FROM (SELECT * FROM mytable LIMIT 10);
"""

query = """
UNLOAD ('{select_query}')
TO 's3://{s3_bucket}/{s3_key}'
CREDENTIALS
'aws_access_key_id={access_key};aws_secret_access_key= {secret_key}'
REGION AS 'region_type'
DELIMITER AS ',';
""".format(select_query=sql_one,
           s3_bucket='bucket_name',
           s3_key='redshift_data.csv',
           access_key=credentials.access_key,
           secret_key=credentials.secret_key)

redshift.run(query)
amazon-web-services amazon-redshift airflow
asked Jan 2 at 17:27
Michael (164)
To me your SQL query seems syntactically incorrect, if it's not a typo: SELECT * FROM (SELECT * FROM mytable LIMIT 10);
– Red Boy
Jan 2 at 19:33
As @RedBoy points out, the query seems off. Everything else looks fine. Have you tried running this UNLOAD command directly on Redshift? What happens?
– vishnu narayanan
Jan 2 at 19:48
@RedBoy that SQL is syntactically correct. The LIMIT clause needs to be used inside a subquery in an unload statement.
– Nathan Griffiths
Jan 2 at 20:48
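The restriction Nathan describes is documented for Redshift: UNLOAD rejects a LIMIT clause in the outermost SELECT, so it has to be pushed into a subquery. A minimal sketch of the two forms, using the table, bucket, and key names from the question (the authorization clause is omitted here because only the SELECT part differs):

# Sketch only: Redshift rejects a LIMIT in the top-level SELECT of an UNLOAD,
# but accepts the same LIMIT when it is wrapped in a subquery.
rejected_form = "UNLOAD ('SELECT * FROM mytable LIMIT 10') TO 's3://bucket_name/redshift_data.csv'"
accepted_form = "UNLOAD ('SELECT * FROM (SELECT * FROM mytable LIMIT 10)') TO 's3://bucket_name/redshift_data.csv'"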
@Michael can you elaborate on what happens when this is run in Airflow? "It gets stuck" isn't much to go on. Do you have any log output from Airflow or Redshift?
– Nathan Griffiths
Jan 2 at 20:49
@Nathan the output from the Airflow logs is: it successfully connects to Redshift and starts to execute the UNLOAD query; while executing it, it prints the query and keeps running without showing any errors. I left it running for 30 min and it is not writing any records at all. Do you think the problem is with Redshift?
– Michael
Jan 3 at 9:19
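One way to tell whether the hang is on the Redshift side, following the earlier suggestion to test the UNLOAD outside Airflow, is to query Redshift's system views from the same hook while the task is running. This is a minimal sketch assuming the same redshift_creds connection; stv_recents and svv_transactions are standard Redshift system views, nothing here comes from the question's DAG:

from airflow.hooks.postgres_hook import PostgresHook

redshift = PostgresHook('redshift_creds')

# Queries currently executing on the cluster -- the UNLOAD should show up here.
print(redshift.get_records(
    "SELECT pid, user_name, starttime, query FROM stv_recents WHERE status = 'Running';"
))

# Open transactions and the locks they hold -- an exclusive lock on mytable
# would block the UNLOAD's read.
print(redshift.get_records(
    "SELECT txn_owner, pid, lock_mode, granted, relation FROM svv_transactions;"
))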
0 Answers