Multiprocessing in Python leading to infinite execution
I am seeing some strange behavior while executing my program. Let me explain.



I have written a multiprocessing class, which goes like this:



from multiprocessing import Process


class ProcessManager:

    def __init__(self, spark, logger):
        self.spark = spark
        self.logger = logger

    def applyMultiProcessExecution(self, func_arguments, targetFunction, iterableList):
        self.logger.info("Function Arguments : {}".format(func_arguments))
        jobs = []
        for x in iterableList:
            try:
                p = Process(target=targetFunction, args=(x,), kwargs=func_arguments)
                jobs.append(p)
                p.start()
            except Exception:
                raise RuntimeError("Unable to create process for GL : {}".format(x))

        for job in jobs:
            job.join()
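
When it hangs, I would like to know which child is stuck, so one idea is to replace the plain join loop at the end of applyMultiProcessExecution with something like this (only a sketch; the timeout value is an arbitrary guess):

# Join with a timeout and log which children are still alive, so a hang
# points at a specific GL instead of blocking forever.
for job in jobs:
    job.join(timeout=600)  # seconds; illustrative value only
    if job.is_alive():
        self.logger.info("Process {} (pid {}) is still alive after the timeout".format(job.name, job.pid))
    else:
        self.logger.info("Process {} exited with code {}".format(job.name, job.exitcode))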


Now, I have a method called detect, which goes like this:



def detect(self, gl, inputFolder, modelFolder, outputFolder, readWriteUtils, region):
    # Reads data from inputFolder and modelFolder using readWriteUtils, based on gl and region
    # Does computation over the data
    # Writes the results to outputFolder


Now I call this method like this:



pm = ProcessManager(spark=spark, logger=logger)
pm.applyMultiProcessExecution(func_arguments=arguments,
                              targetFunction=detect,
                              iterableList=GL_LIST)
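
For completeness, this call sits in the driver script that spark-submit runs. A trimmed-down sketch of the entry point (the SparkSession/logging wiring here is illustrative only; arguments, detect and GL_LIST come from my actual module):

from pyspark.sql import SparkSession
import logging

if __name__ == "__main__":
    # Illustrative wiring -- the real script builds these from the CLI arguments
    spark = SparkSession.builder.appName("PCAModelTestExecution").getOrCreate()
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger(__name__)

    pm = ProcessManager(spark=spark, logger=logger)
    pm.applyMultiProcessExecution(func_arguments=arguments,
                                  targetFunction=detect,
                                  iterableList=GL_LIST)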


This is run on an EMR cluster using a spark-submit step.



Now, the weird behavior:
Sometimes this executes perfectly within 1 minute.
Sometimes it goes into infinite processing, and when I cancel the operation with Ctrl+C, I can see the data has been computed, but the process did not close on its own.
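
To figure out where the parent gets stuck, I am thinking of adding a signal handler to the driver that dumps the stack of every thread (again, just a sketch, not in my code yet):

import signal
import sys
import traceback

def dump_stacks(signum, frame):
    # Print the stack of every thread in the driver process, e.g. to see
    # whether it is blocked inside job.join().
    for thread_id, stack in sys._current_frames().items():
        print("Thread {}:".format(thread_id))
        traceback.print_stack(stack)

# Trigger the dump with: kill -USR1 <driver pid>
signal.signal(signal.SIGUSR1, dump_stacks)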



For this Spark step, my controller log looks like this:



2019-01-01T08:22:18.145Z INFO Ensure step 23 jar file command-runner.jar
2019-01-01T08:22:18.145Z INFO StepRunner: Created Runner for step 23
INFO startExec 'hadoop jar /var/lib/aws/emr/step-runner/hadoop-jars/command-runner.jar spark-submit --py-files /mnt/road-artifacts/ROAD.zip /mnt/road-artifacts/com/amazon/road/model-executor/PCAModelTestExecution.py --inputFolder=/tmp/split_data --modelFolder=/tmp/model --outputFolder=/tmp/output --region=NA'
INFO Environment:
PATH=/sbin:/usr/sbin:/bin:/usr/bin:/usr/local/sbin:/opt/aws/bin
LESS_TERMCAP_md=[01;38;5;208m
LESS_TERMCAP_me=[0m
HISTCONTROL=ignoredups
LESS_TERMCAP_mb=[01;31m
AWS_AUTO_SCALING_HOME=/opt/aws/apitools/as
UPSTART_JOB=rc
LESS_TERMCAP_se=[0m
HISTSIZE=1000
HADOOP_ROOT_LOGGER=INFO,DRFA
JAVA_HOME=/etc/alternatives/jre
AWS_DEFAULT_REGION=us-east-1
AWS_ELB_HOME=/opt/aws/apitools/elb
LESS_TERMCAP_us=[04;38;5;111m
EC2_HOME=/opt/aws/apitools/ec2
TERM=linux
runlevel=3
LANG=en_US.UTF-8
AWS_CLOUDWATCH_HOME=/opt/aws/apitools/mon
MAIL=/var/spool/mail/hadoop
LESS_TERMCAP_ue=[0m
LOGNAME=hadoop
PWD=/
LANGSH_SOURCED=1
HADOOP_CLIENT_OPTS=-Djava.io.tmpdir=/mnt/var/lib/hadoop/steps/s-3INAXV6LAS9A4/tmp
_=/etc/alternatives/jre/bin/java
CONSOLETYPE=serial
RUNLEVEL=3
LESSOPEN=||/usr/bin/lesspipe.sh %s
previous=N
UPSTART_EVENTS=runlevel
AWS_PATH=/opt/aws
USER=hadoop
UPSTART_INSTANCE=
PREVLEVEL=N
HADOOP_LOGFILE=syslog
HOSTNAME=ip-172-32-0-233
HADOOP_LOG_DIR=/mnt/var/log/hadoop/steps/s-3INAXV6LAS9A4
EC2_AMITOOL_HOME=/opt/aws/amitools/ec2
SHLVL=5
HOME=/home/hadoop
HADOOP_IDENT_STRING=hadoop
INFO redirectOutput to /mnt/var/log/hadoop/steps/s-3INAXV6LAS9A4/stdout
INFO redirectError to /mnt/var/log/hadoop/steps/s-3INAXV6LAS9A4/stderr
INFO Working dir /mnt/var/lib/hadoop/steps/s-3INAXV6LAS9A4
INFO ProcessRunner started child process 36375 :
hadoop 36375 6380 0 08:22 ? 00:00:00 /etc/alternatives/jre/bin/java -Xmx1000m -server -XX:OnOutOfMemoryError=kill -9 %p -Dhadoop.log.dir=/mnt/var/log/hadoop/steps/s-3INAXV6LAS9A4 -Dhadoop.log.file=syslog -Dhadoop.home.dir=/usr/lib/hadoop -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,DRFA -Djava.library.path=:/usr/lib/hadoop-lzo/lib/native:/usr/lib/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Djava.io.tmpdir=/mnt/var/lib/hadoop/steps/s-3INAXV6LAS9A4/tmp -Dhadoop.security.logger=INFO,NullAppender -Dsun.net.inetaddr.ttl=30 org.apache.hadoop.util.RunJar /var/lib/aws/emr/step-runner/hadoop-jars/command-runner.jar spark-submit --py-files /mnt/road-artifacts/ROAD.zip /mnt/road-artifacts/com/amazon/road/model-executor/PCAModelTestExecution.py --inputFolder=/tmp/split_data --modelFolder=/tmp/model --outputFolder=/tmp/output --region=NA
2019-01-01T08:22:22.152Z INFO HadoopJarStepRunner.Runner: startRun() called for s-3INAXV6LAS9A4 Child Pid: 36375
INFO Synchronously wait child process to complete : hadoop jar /var/lib/aws/emr/step-runner/hadoop-...
INFO Process still running
INFO Process still running
INFO Process still running


I read something about deadlocks with multiprocessing queues, but that does not apply here since I am not putting anything into a queue that needs to be retrieved. This looks quite strange to me, as I am not able to root-cause it.
Can anyone suggest something?
Tags: python-2.7 pyspark amazon-emr





