Set up Airflow for Multiple Environments

What's the idiomatic way to set up Airflow so that, with two environments such as Production-East and Production-West, each environment shows only its own DAGs while all the DAGs live in a single repository?

Tags: python, airflow

edited Nov 21 '18 at 8:07 by Meghdeep Ray
asked Nov 20 '18 at 20:16 by Rob

2 Answers

One way to achieve this is with named queues.
Set up multiple workers, some for the Production-East environment and some for the Production-West environment. The DAGs from both environments still show up in the UI, but their tasks execute only on the worker machines that belong to the matching environment.
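
As a rough illustration of the routing described above, the sketch below builds the `default_args` dictionary that would pin every task in a DAG to one environment's queue. It assumes the CeleryExecutor and relies on `queue` being an operator argument. The environment keys, the queue names (`prod_east`, `prod_west`), and the helper `default_args_for` are hypothetical and do not come from the original answer.

```python
# Hypothetical mapping from environment name to Celery queue name.
# Neither the keys nor the queue names come from the original post.
QUEUE_BY_ENVIRONMENT = {
    "production-east": "prod_east",
    "production-west": "prod_west",
}


def default_args_for(environment):
    """Return default_args that send every task of a DAG to one queue."""
    if environment not in QUEUE_BY_ENVIRONMENT:
        raise ValueError("unknown environment: %r" % (environment,))
    # `queue` is an operator argument, so putting it in default_args
    # applies it to each task that does not set its own queue.
    return {"queue": QUEUE_BY_ENVIRONMENT[environment]}


if __name__ == "__main__":
    # In a DAG file this dict would be passed as DAG(..., default_args=...).
    print(default_args_for("production-east"))
```

Passing the result as `default_args` when constructing a DAG should route all of its tasks to that queue, while an individual task can still set its own `queue`.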



From the documentation for queues:

"When using the CeleryExecutor, the celery queues that tasks are sent to can be specified. queue is an attribute of BaseOperator, so any task can be assigned to any queue. The default queue for the environment is defined in the airflow.cfg’s celery -> default_queue. This defines the queue that tasks get assigned to when not specified, as well as which queue Airflow workers listen to when started.

"Workers can listen to one or multiple queues of tasks. When a worker is started (using the command airflow worker), a set of comma delimited queue names can be specified (e.g. airflow worker -q spark). This worker will then only pick up tasks wired to the specified queue(s).

"This can be useful if you need specialized workers, either from a resource perspective (for say very lightweight tasks where one worker could take thousands of tasks without a problem), or from an environment perspective (you want a worker running from within the Spark cluster itself because it needs a very specific environment and security rights)."
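
The worker side of the quoted passage can be sketched as a small helper that assembles the command line for a worker bound to specific queues. The function name `worker_command` and the queue name `prod_east` are assumptions made for illustration. The command uses the `airflow worker -q` form quoted above; newer Airflow releases spell it `airflow celery worker`, so check the CLI of the version in use.

```python
import shlex


def worker_command(queues):
    """Build the argv for a worker that listens only to the given queues."""
    if not queues:
        raise ValueError("at least one queue name is required")
    # Queue names are passed to -q as a single comma-delimited value.
    return ["airflow", "worker", "-q", ",".join(queues)]


if __name__ == "__main__":
    # Hypothetical queue name for the Production-East workers.
    print(" ".join(shlex.quote(arg) for arg in worker_command(["prod_east"])))
```

Each Production-East machine would start its worker with the East queue only, and each Production-West machine with the West queue only.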







answered Nov 21 '18 at 8:07 by Meghdeep Ray

• The poster asked for a solution in which the processes for an environment appear only in the UI for that environment. Using a queue to segregate the items won't accomplish this. – joeb, Nov 21 '18 at 19:29

• I agree with your assessment, but this is actually very good advice. I will probably implement the dags_folder answer as a temporary solution, and then set this as a longer-term goal. – Rob, Nov 26 '18 at 16:35

• Perhaps we could combine these into a single response that takes this into account before I accept? I would like to acknowledge the effort you put into this answer. – Rob, Nov 26 '18 at 16:36

Put the DAG files for each environment in their own subfolder, then set dags_folder on each server to point to the subfolder for that server's environment.
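
One possible way to wire this up, sketched under assumptions: the repository keeps one subfolder per environment under `dags/`, and each server exports the `AIRFLOW__CORE__DAGS_FOLDER` environment variable (Airflow's environment-variable override for `dags_folder` in the `[core]` section) before starting its Airflow processes. The directory layout, the folder names, the checkout path, and the helper `dags_folder_env` are hypothetical.

```python
from pathlib import PurePosixPath


def dags_folder_env(repo_root, environment):
    """Return the environment variable that points Airflow at one subfolder."""
    # Assumed layout: <repo_root>/dags/<environment>/..., for example
    #   dags/production_east/  and  dags/production_west/
    folder = PurePosixPath(repo_root) / "dags" / environment
    # AIRFLOW__CORE__DAGS_FOLDER overrides dags_folder from airflow.cfg.
    return {"AIRFLOW__CORE__DAGS_FOLDER": str(folder)}


if __name__ == "__main__":
    # Hypothetical checkout path and environment folder name.
    print(dags_folder_env("/opt/airflow-repo", "production_east"))
```

Because each server parses only its own subfolder, only that environment's DAGs should appear in its UI, while both subfolders stay in the same repository.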







answered Nov 21 '18 at 1:41 by joeb

• I appreciate the simplicity of this answer, but please also see my comment on the other answer. – Rob, Nov 26 '18 at 16:36