How to export data from Redshift to S3 using Airflow?
I am running a query that selects data from Redshift and inserts it into an S3 bucket. The Redshift connection is correct, because print(hook.get_records(sql_one)) returns results, but when I run the Airflow DAG with the UNLOAD statement it gets stuck. Does anyone know what I am doing wrong here? Thanks.



from airflow.hooks.postgres_hook import PostgresHook
from airflow.hooks.S3_hook import S3Hook

redshift = PostgresHook('redshift_creds')
# S3Hook is used only to fetch AWS credentials; the connection id is assumed,
# the original post only said "S3Hook for credentials"
s3 = S3Hook('s3_creds')
credentials = s3.get_credentials()

sql_one = """
SELECT * FROM (SELECT * FROM mytable LIMIT 10);
"""

query = """
UNLOAD ('{select_query}')
TO 's3://{s3_bucket}/{s3_key}'
CREDENTIALS
'aws_access_key_id={access_key};aws_secret_access_key={secret_key}'
REGION AS 'region_type'
DELIMITER AS ',';
""".format(select_query=sql_one,
           s3_bucket='bucket_name',
           s3_key='redshift_data.csv',
           access_key=credentials.access_key,
           secret_key=credentials.secret_key)

redshift.run(query)
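
For reference, once .format() runs, the statement sent to Redshift looks roughly like this (credential values elided). Note that the quoted inner query keeps its surrounding newlines and its trailing semicolon, and that 'region_type' is passed through as a literal placeholder rather than a real AWS region such as 'us-east-1':

UNLOAD ('
SELECT * FROM (SELECT * FROM mytable LIMIT 10);
')
TO 's3://bucket_name/redshift_data.csv'
CREDENTIALS
'aws_access_key_id=...;aws_secret_access_key=...'
REGION AS 'region_type'
DELIMITER AS ',';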

amazon-web-services amazon-redshift airflow

asked Jan 2 at 17:27
Michael · 164

  • To me, your SQL query seems syntactically incorrect, if it's not a typo: SELECT * FROM (SELECT * FROM mytable LIMIT 10);

    – Red Boy
    Jan 2 at 19:33

  • As @RedBoy points out, the query seems off. Everything else looks fine. Have you tried running this UNLOAD command directly on Redshift? What happens?

    – vishnu narayanan
    Jan 2 at 19:48

  • @RedBoy that SQL is syntactically correct. The LIMIT clause needs to be used inside a subquery in an UNLOAD statement.

    – Nathan Griffiths
    Jan 2 at 20:48
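
    As background for this comment: Redshift's UNLOAD does not accept a LIMIT clause in the outermost SELECT, so the subquery wrapper in the question is the documented workaround. A minimal sketch of the two forms (bucket name and trailing options are placeholders):

    -- Rejected by Redshift: LIMIT in the outer SELECT
    UNLOAD ('SELECT * FROM mytable LIMIT 10')
    TO 's3://bucket_name/redshift_data.csv' ...;

    -- Accepted: LIMIT inside a nested subquery
    UNLOAD ('SELECT * FROM (SELECT * FROM mytable LIMIT 10)')
    TO 's3://bucket_name/redshift_data.csv' ...;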

  • @Michael can you elaborate on what happens when this is run in Airflow? "It gets stuck" isn't much to go on. Do you have any log output from Airflow or Redshift?

    – Nathan Griffiths
    Jan 2 at 20:49

  • @Nathan the output from the Airflow logs is: it successfully connects to Redshift and starts executing the UNLOAD query; it prints the query and keeps running without showing any errors. I left it running for 30 minutes and it hasn't written any records at all. Do you think the problem is with Redshift?

    – Michael
    Jan 3 at 9:19
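
    To check from the Redshift side whether the UNLOAD is actually executing or sitting in a queue, the cluster's system tables can help. A minimal sketch using standard Redshift system tables (these queries are illustrative, not from the original post):

    -- Is the UNLOAD still running?
    SELECT pid, duration, trim(user_name) AS user_name, trim(query) AS query_text
    FROM stv_recents
    WHERE status = 'Running';

    -- After completion, what did UNLOAD write to S3?
    SELECT query, path, line_count, transfer_size
    FROM stl_unload_log
    ORDER BY query DESC
    LIMIT 10;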