Running Spark job on local standalone cluster runs indefinitely
I have set up a local Spark cluster on my Windows 7 machine (a master and a worker node). I have created a simple Scala application that I build with sbt and try to run with spark-submit. Please find the resources below.
Scala code:
package example1

import java.io._
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.expr
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

object HelloWorld {
  def main(args: Array[String]): Unit = {
    println("===============================================")
    println("===============================================")
    println("Hello, world!")
    val pw = new PrintWriter(new File("d:\\hello.txt"))
    pw.write("Hello, world")
    println("===============================================")
    println("===============================================")
    val session = SparkSession.builder.getOrCreate()
    val filesmall = "file:///D:/_Work/azurepoc/samplebigdata/ds2.csv"
    //val df = session.read.format("csv").option("header", "true").load(filesmall)
    println("===============================================")
    pw.write("Hello, world some more information ")
    pw.close()
  }
}
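A side note on the file path, since it bit me while writing this: backslashes in Windows paths must be escaped in Scala string literals (a bare `\h` is not a valid escape sequence), or you can use the `raw` interpolator or forward slashes. A minimal sketch (names `p1`/`p2`/`p3` are just illustrative):

```scala
// Three equivalent ways to write the Windows path d:\hello.txt in Scala:
val p1 = "d:\\hello.txt"     // escaped backslash
val p2 = raw"d:\hello.txt"   // raw interpolator: no escape processing
val p3 = "d:/hello.txt"      // java.io also accepts forward slashes on Windows
```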
Spark cluster master output:
C:\Windows\system32>spark-class org.apache.spark.deploy.master.Master
2019-01-03 16:49:16 INFO Master:2612 - Started daemon with process name: 23940@ws-amalhotra
2019-01-03 16:49:16 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-01-03 16:49:16 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 16:49:16 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 16:49:16 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 16:49:16 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 16:49:16 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); users with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 16:49:17 INFO Utils:54 - Successfully started service 'sparkMaster' on port 7077.
2019-01-03 16:49:17 INFO Master:54 - Starting Spark master at spark://192.168.8.101:7077
2019-01-03 16:49:17 INFO Master:54 - Running Spark version 2.3.2
2019-01-03 16:49:17 INFO log:192 - Logging initialized @1412ms
2019-01-03 16:49:17 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 16:49:17 INFO Server:419 - Started @1489ms
2019-01-03 16:49:17 INFO AbstractConnector:278 - Started ServerConnector@16391414{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
2019-01-03 16:49:17 INFO Utils:54 - Successfully started service 'MasterUI' on port 8080.
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@204e3825{/app,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@748394e8{/app/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@19b99890{/,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5c0f561c{/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3443bda1{/static,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@54541f46{/app/kill,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6e8c3d12{/driver/kill,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO MasterWebUI:54 - Bound MasterWebUI to 0.0.0.0, and started at http://ws-amalhotra.domain.co.in:8080
2019-01-03 16:49:17 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@22eb9260{/,null,AVAILABLE}
2019-01-03 16:49:17 INFO AbstractConnector:278 - Started ServerConnector@636eb125{HTTP/1.1,[http/1.1]}{192.168.8.101:6066}
2019-01-03 16:49:17 INFO Server:419 - Started @1558ms
2019-01-03 16:49:17 INFO Utils:54 - Successfully started service on port 6066.
2019-01-03 16:49:17 INFO StandaloneRestServer:54 - Started REST server for submitting applications on port 6066
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1a4c3e84{/metrics/master/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5a3b4746{/metrics/applications/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO Master:54 - I have been elected leader! New state: ALIVE
2019-01-03 16:49:21 INFO Master:54 - Registering worker 192.168.8.101:8089 with 8 cores, 14.9 GB RAM
My worker node:
C:\Windows\system32>spark-class org.apache.spark.deploy.worker.Worker spark://192.168.8.101:7077 -p 8089
2019-01-03 16:49:20 INFO Worker:2612 - Started daemon with process name: 16264@ws-amalhotra
2019-01-03 16:49:21 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-01-03 16:49:21 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 16:49:21 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 16:49:21 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 16:49:21 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 16:49:21 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); users with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 16:49:21 INFO Utils:54 - Successfully started service 'sparkWorker' on port 8089.
2019-01-03 16:49:21 INFO Worker:54 - Starting Spark worker 192.168.8.101:8089 with 8 cores, 14.9 GB RAM
2019-01-03 16:49:21 INFO Worker:54 - Running Spark version 2.3.2
2019-01-03 16:49:21 INFO Worker:54 - Spark home: C:\spark
2019-01-03 16:49:21 INFO log:192 - Logging initialized @1471ms
2019-01-03 16:49:21 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 16:49:21 INFO Server:419 - Started @1518ms
2019-01-03 16:49:21 INFO AbstractConnector:278 - Started ServerConnector@44629c8f{HTTP/1.1,[http/1.1]}{0.0.0.0:8081}
2019-01-03 16:49:21 INFO Utils:54 - Successfully started service 'WorkerUI' on port 8081.
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@36f34cce{/logPage,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@447fb46{/logPage/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3b027ba{/,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5396b0bb{/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6830ec44{/static,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5eb28ff8{/log,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO WorkerWebUI:54 - Bound WorkerWebUI to 0.0.0.0, and started at http://ws-amalhotra.domain.co.in:8081
2019-01-03 16:49:21 INFO Worker:54 - Connecting to master 192.168.8.101:7077...
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@36cc352{/metrics/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO TransportClientFactory:267 - Successfully created connection to /192.168.8.101:7077 after 26 ms (0 ms spent in bootstraps)
2019-01-03 16:49:21 INFO Worker:54 - Successfully registered with master spark://192.168.8.101:7077
Now I build and package the Scala code into a JAR with sbt. My build.sbt file looks like this:
version := "1.0"
scalaVersion := "2.11.8"

val sparkVersion = "2.0.0"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-streaming" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion
)
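For reference, one thing I noticed while writing this up: the master and worker logs above report Spark 2.3.2, while my build.sbt pins `sparkVersion` to 2.0.0. I have not verified whether this matters, but a build.sbt aligned with the cluster's reported version would look like:

```scala
// build.sbt — hypothetical variant pinning the dependencies to the
// Spark version (2.3.2) that the master/worker logs above report.
version := "1.0"
scalaVersion := "2.11.8"

val sparkVersion = "2.3.2"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-streaming" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion
)
```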
This creates a JAR, which I submit with spark-submit as follows:
C:\Users\amalhotra>spark-submit --deploy-mode cluster --master spark://192.168.8.101:6066 --class "example1.HelloWorld" "D:\_Work\azurepoc\sbtexample\target\scala-2.11\sbtexample_2.11-1.0.jar"
Everything works fine. Now I change a single line of code in my script and repeat the same cycle: compile -> sbt package -> spark-submit (same command as above). The only change is that I uncomment this line:
//val df = session.read.format("csv").option("header", "true").load(filesmall)
When I run spark-submit again, the worker keeps executing forever, and the file on my D: drive is never written. Worker logs below:
C:\Windows\system32>spark-class org.apache.spark.deploy.worker.Worker spark://192.168.8.101:7077 -p 8089
2019-01-03 17:24:38 INFO Worker:2612 - Started daemon with process name: 24952@ws-amalhotra
2019-01-03 17:24:39 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-01-03 17:24:39 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:24:39 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:24:39 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:24:39 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:24:39 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); users with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:24:39 INFO Utils:54 - Successfully started service 'sparkWorker' on port 8089.
2019-01-03 17:24:39 INFO Worker:54 - Starting Spark worker 192.168.8.101:8089 with 8 cores, 14.9 GB RAM
2019-01-03 17:24:39 INFO Worker:54 - Running Spark version 2.3.2
2019-01-03 17:24:39 INFO Worker:54 - Spark home: C:\spark
2019-01-03 17:24:39 INFO log:192 - Logging initialized @1512ms
2019-01-03 17:24:39 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 17:24:39 INFO Server:419 - Started @1561ms
2019-01-03 17:24:39 INFO AbstractConnector:278 - Started ServerConnector@51e2ccae{HTTP/1.1,[http/1.1]}{0.0.0.0:8081}
2019-01-03 17:24:39 INFO Utils:54 - Successfully started service 'WorkerUI' on port 8081.
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3d96670b{/logPage,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@48e02860{/logPage/json,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@758918a3{/,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1643bea5{/json,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5f293725{/static,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@339a8612{/log,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO WorkerWebUI:54 - Bound WorkerWebUI to 0.0.0.0, and started at http://ws-amalhotra.domain.co.in:8081
2019-01-03 17:24:39 INFO Worker:54 - Connecting to master 192.168.8.101:7077...
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@196e9c2a{/metrics/json,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO TransportClientFactory:267 - Successfully created connection to /192.168.8.101:7077 after 29 ms (0 ms spent in bootstraps)
2019-01-03 17:24:40 INFO Worker:54 - Successfully registered with master spark://192.168.8.101:7077
2019-01-03 17:25:17 INFO Worker:54 - Asked to launch driver driver-20190103172517-0000
2019-01-03 17:25:17 INFO DriverRunner:54 - Copying user jar file:/D:/_Work/azurepoc/sbtexample/target/scala-2.11/sbtexample_2.11-1.0.jar to C:\spark\work\driver-20190103172517-0000\sbtexample_2.11-1.0.jar
2019-01-03 17:25:17 INFO Utils:54 - Copying D:\_Work\azurepoc\sbtexample\target\scala-2.11\sbtexample_2.11-1.0.jar to C:\spark\work\driver-20190103172517-0000\sbtexample_2.11-1.0.jar
2019-01-03 17:25:17 INFO DriverRunner:54 - Launch Command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf;C:\spark\jars\*" "-Xmx1024M" "-Dspark.master=spark://192.168.8.101:7077" "-Dspark.driver.supervise=false" "-Dspark.submit.deployMode=cluster" "-Dspark.jars=file:/D:/_Work/azurepoc/sbtexample/target/scala-2.11/sbtexample_2.11-1.0.jar" "-Dspark.app.name=example1.HelloWorld" "org.apache.spark.deploy.worker.DriverWrapper" "spark://Worker@192.168.8.101:8089" "C:\spark\work\driver-20190103172517-0000\sbtexample_2.11-1.0.jar" "example1.HelloWorld"
2019-01-03 17:25:19 INFO Worker:54 - Asked to launch executor app-20190103172519-0000/0 for example1.HelloWorld
2019-01-03 17:25:19 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:25:19 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:25:19 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:25:19 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:25:19 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); users with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:25:19 INFO ExecutorRunner:54 - Launch command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf;C:\spark\jars\*" "-Xmx1024M" "-Dspark.driver.port=55786" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "0" "--hostname" "192.168.8.101" "--cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
2019-01-03 17:25:43 INFO Worker:54 - Executor app-20190103172519-0000/0 finished with state EXITED message Command exited with code 1 exitStatus 1
2019-01-03 17:25:43 INFO Worker:54 - Asked to launch executor app-20190103172519-0000/1 for example1.HelloWorld
2019-01-03 17:25:43 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:25:43 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:25:43 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:25:43 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:25:43 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); users with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:25:43 INFO ExecutorRunner:54 - Launch command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf;C:\spark\jars\*" "-Xmx1024M" "-Dspark.driver.port=55786" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "1" "--hostname" "192.168.8.101" "--cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
2019-01-03 17:26:05 INFO Worker:54 - Executor app-20190103172519-0000/1 finished with state EXITED message Command exited with code 1 exitStatus 1
2019-01-03 17:26:05 INFO Worker:54 - Asked to launch executor app-20190103172519-0000/2 for example1.HelloWorld
2019-01-03 17:26:05 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:26:05 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:26:05 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:26:05 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:26:05 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); users with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:26:05 INFO ExecutorRunner:54 - Launch command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf;C:\spark\jars\*" "-Xmx1024M" "-Dspark.driver.port=55786" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "2" "--hostname" "192.168.8.101" "--cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
2019-01-03 17:26:28 INFO Worker:54 - Executor app-20190103172519-0000/2 finished with state EXITED message Command exited with code 1 exitStatus 1
2019-01-03 17:26:28 INFO Worker:54 - Asked to launch executor app-20190103172519-0000/3 for example1.HelloWorld
2019-01-03 17:26:28 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:26:28 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:26:28 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:26:28 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:26:28 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); users with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:26:28 INFO ExecutorRunner:54 - Launch command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf;C:\spark\jars\*" "-Xmx1024M" "-Dspark.driver.port=55786" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "3" "--hostname" "192.168.8.101" "--cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
This keeps running forever, with the same log lines repeated every few seconds. It's unclear what's going on; the logs don't say much, and I couldn't find any full-length example of running such a job on a local standalone cluster.
I have setup a local spark cluster on my windows 7 machine ( a master and worker node). I have created a simple scala script which i build with sbt and try to run with spark-submit. Please find the resources below
Scala code :
package example1
import java.io._
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.expr
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession
object HelloWorld {
def main(args: Array[String]): Unit = {
println("===============================================")
println("===============================================")
println("Hello, world!")
val pw = new PrintWriter(new File("d:\hello.txt" ))
pw.write("Hello, world")
println("===============================================")
println("===============================================")
val session = SparkSession.builder.getOrCreate()
var filesmall = "file:///D:/_Work/azurepoc/samplebigdata/ds2.csv"
//val df = session.read.format("csv").option("header", "true").load(filesmall)
println("===============================================")
pw.write("Hello, world some more information ")
pw.close
}
}
Spark cluster Master script :
C:Windowssystem32>spark-class org.apache.spark.deploy.master.Master
2019-01-03 16:49:16 INFO Master:2612 - Started daemon with process name: 23940@ws-amalhotra
2019-01-03 16:49:16 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-01-03 16:49:16 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 16:49:16 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 16:49:16 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 16:49:16 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 16:49:16 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); user
s with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 16:49:17 INFO Utils:54 - Successfully started service 'sparkMaster' on port 7077.
2019-01-03 16:49:17 INFO Master:54 - Starting Spark master at spark://192.168.8.101:7077
2019-01-03 16:49:17 INFO Master:54 - Running Spark version 2.3.2
2019-01-03 16:49:17 INFO log:192 - Logging initialized @1412ms
2019-01-03 16:49:17 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 16:49:17 INFO Server:419 - Started @1489ms
2019-01-03 16:49:17 INFO AbstractConnector:278 - Started ServerConnector@16391414{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
2019-01-03 16:49:17 INFO Utils:54 - Successfully started service 'MasterUI' on port 8080.
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@204e3825{/app,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@748394e8{/app/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@19b99890{/,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5c0f561c{/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3443bda1{/static,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@54541f46{/app/kill,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6e8c3d12{/driver/kill,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO MasterWebUI:54 - Bound MasterWebUI to 0.0.0.0, and started at http://ws-amalhotra.domain.co.in:8080
2019-01-03 16:49:17 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@22eb9260{/,null,AVAILABLE}
2019-01-03 16:49:17 INFO AbstractConnector:278 - Started ServerConnector@636eb125{HTTP/1.1,[http/1.1]}{192.168.8.101:6066}
2019-01-03 16:49:17 INFO Server:419 - Started @1558ms
2019-01-03 16:49:17 INFO Utils:54 - Successfully started service on port 6066.
2019-01-03 16:49:17 INFO StandaloneRestServer:54 - Started REST server for submitting applications on port 6066
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1a4c3e84{/metrics/master/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5a3b4746{/metrics/applications/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO Master:54 - I have been elected leader! New state: ALIVE
2019-01-03 16:49:21 INFO Master:54 - Registering worker 192.168.8.101:8089 with 8 cores, 14.9 GB RAM
My Worker node :
C:Windowssystem32>spark-class org.apache.spark.deploy.worker.Worker spark://192.168.8.101:7077 -p 8089
2019-01-03 16:49:20 INFO Worker:2612 - Started daemon with process name: 16264@ws-amalhotra
2019-01-03 16:49:21 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-01-03 16:49:21 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 16:49:21 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 16:49:21 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 16:49:21 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 16:49:21 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); user
s with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 16:49:21 INFO Utils:54 - Successfully started service 'sparkWorker' on port 8089.
2019-01-03 16:49:21 INFO Worker:54 - Starting Spark worker 192.168.8.101:8089 with 8 cores, 14.9 GB RAM
2019-01-03 16:49:21 INFO Worker:54 - Running Spark version 2.3.2
2019-01-03 16:49:21 INFO Worker:54 - Spark home: C:spark
2019-01-03 16:49:21 INFO log:192 - Logging initialized @1471ms
2019-01-03 16:49:21 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 16:49:21 INFO Server:419 - Started @1518ms
2019-01-03 16:49:21 INFO AbstractConnector:278 - Started ServerConnector@44629c8f{HTTP/1.1,[http/1.1]}{0.0.0.0:8081}
2019-01-03 16:49:21 INFO Utils:54 - Successfully started service 'WorkerUI' on port 8081.
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@36f34cce{/logPage,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@447fb46{/logPage/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3b027ba{/,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5396b0bb{/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6830ec44{/static,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5eb28ff8{/log,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO WorkerWebUI:54 - Bound WorkerWebUI to 0.0.0.0, and started at http://ws-amalhotra.domain.co.in:8081
2019-01-03 16:49:21 INFO Worker:54 - Connecting to master 192.168.8.101:7077...
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@36cc352{/metrics/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO TransportClientFactory:267 - Successfully created connection to /192.168.8.101:7077 after 26 ms (0 ms spent in bootstraps)
2019-01-03 16:49:21 INFO Worker:54 - Successfully registered with master spark://192.168.8.101:7077
Now I build and package the scala code with sbt that packages it into a JAR. My build.sbt file looks like below
version := "1.0"
scalaVersion := "2.11.8"
val sparkVersion = "2.0.0"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % sparkVersion,
"org.apache.spark" %% "spark-streaming" % sparkVersion,
"org.apache.spark" %% "spark-sql" % sparkVersion
)
It creates a jar and I submit it using the spark submit command as below :
C:Usersamalhotra>spark-submit --deploy-mode cluster --master spark://192.168.
8.101:6066 --class "example1.HelloWorld" "D:_Workazurepocsbtexampletargets
cala-2.11sbtexample_2.11-1.0.jar"
Everything works fine and now i just change a single line of code in my script and again follow the compile -> sbt package code -> spark-submit (same as above). The code change is I uncomment the below line :
//val df = session.read.format("csv").option("header", "true").load(filesmall)
When I again run the above with spark-submit, the worker executes forever. Also , the file in my D drive is not getting written. Worker logs below
C:Windowssystem32>spark-class org.apache.spark.deploy.worker.Worker spark://192.168.8.101:7077 -p 8089
2019-01-03 17:24:38 INFO Worker:2612 - Started daemon with process name: 24952@ws-amalhotra
2019-01-03 17:24:39 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-01-03 17:24:39 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:24:39 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:24:39 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:24:39 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:24:39 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); user
s with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:24:39 INFO Utils:54 - Successfully started service 'sparkWorker' on port 8089.
2019-01-03 17:24:39 INFO Worker:54 - Starting Spark worker 192.168.8.101:8089 with 8 cores, 14.9 GB RAM
2019-01-03 17:24:39 INFO Worker:54 - Running Spark version 2.3.2
2019-01-03 17:24:39 INFO Worker:54 - Spark home: C:spark
2019-01-03 17:24:39 INFO log:192 - Logging initialized @1512ms
2019-01-03 17:24:39 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 17:24:39 INFO Server:419 - Started @1561ms
2019-01-03 17:24:39 INFO AbstractConnector:278 - Started ServerConnector@51e2ccae{HTTP/1.1,[http/1.1]}{0.0.0.0:8081}
2019-01-03 17:24:39 INFO Utils:54 - Successfully started service 'WorkerUI' on port 8081.
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3d96670b{/logPage,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@48e02860{/logPage/json,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@758918a3{/,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1643bea5{/json,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5f293725{/static,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@339a8612{/log,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO WorkerWebUI:54 - Bound WorkerWebUI to 0.0.0.0, and started at http://ws-amalhotra.domain.co.in:8081
2019-01-03 17:24:39 INFO Worker:54 - Connecting to master 192.168.8.101:7077...
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@196e9c2a{/metrics/json,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO TransportClientFactory:267 - Successfully created connection to /192.168.8.101:7077 after 29 ms (0 ms spent in bootstraps)
2019-01-03 17:24:40 INFO Worker:54 - Successfully registered with master spark://192.168.8.101:7077
2019-01-03 17:25:17 INFO Worker:54 - Asked to launch driver driver-20190103172517-0000
2019-01-03 17:25:17 INFO DriverRunner:54 - Copying user jar file:/D:/_Work/azurepoc/sbtexample/target/scala-2.11/sbtexample_2.11-1.0.jar to C:sparkworkdriver-20190103172517-0000sbtexamp
le_2.11-1.0.jar
2019-01-03 17:25:17 INFO Utils:54 - Copying D:_Workazurepocsbtexampletargetscala-2.11sbtexample_2.11-1.0.jar to C:sparkworkdriver-20190103172517-0000sbtexample_2.11-1.0.jar
2019-01-03 17:25:17 INFO DriverRunner:54 - Launch Command: "C:Program FilesJavajdk1.8.0_181binjava" "-cp" "C:sparkbin..conf;C:sparkjars*" "-Xmx1024M" "-Dspark.master=spark://19
2.168.8.101:7077" "-Dspark.driver.supervise=false" "-Dspark.submit.deployMode=cluster" "-Dspark.jars=file:/D:/_Work/azurepoc/sbtexample/target/scala-2.11/sbtexample_2.11-1.0.jar" "-Dspark.ap
p.name=example1.HelloWorld" "org.apache.spark.deploy.worker.DriverWrapper" "spark://Worker@192.168.8.101:8089" "C:sparkworkdriver-20190103172517-0000sbtexample_2.11-1.0.jar" "example1.He
lloWorld"
2019-01-03 17:25:19 INFO Worker:54 - Asked to launch executor app-20190103172519-0000/0 for example1.HelloWorld
2019-01-03 17:25:19 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:25:19 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:25:19 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:25:19 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:25:19 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); users with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:25:19 INFO ExecutorRunner:54 - Launch command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf;C:\spark\jars\*" "-Xmx1024M" "-Dspark.driver.port=55786" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "0" "--hostname" "192.168.8.101" "--cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
2019-01-03 17:25:43 INFO Worker:54 - Executor app-20190103172519-0000/0 finished with state EXITED message Command exited with code 1 exitStatus 1
2019-01-03 17:25:43 INFO Worker:54 - Asked to launch executor app-20190103172519-0000/1 for example1.HelloWorld
2019-01-03 17:25:43 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:25:43 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:25:43 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:25:43 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:25:43 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); users with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:25:43 INFO ExecutorRunner:54 - Launch command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf;C:\spark\jars\*" "-Xmx1024M" "-Dspark.driver.port=55786" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "1" "--hostname" "192.168.8.101" "--cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
2019-01-03 17:26:05 INFO Worker:54 - Executor app-20190103172519-0000/1 finished with state EXITED message Command exited with code 1 exitStatus 1
2019-01-03 17:26:05 INFO Worker:54 - Asked to launch executor app-20190103172519-0000/2 for example1.HelloWorld
2019-01-03 17:26:05 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:26:05 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:26:05 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:26:05 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:26:05 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); users with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:26:05 INFO ExecutorRunner:54 - Launch command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf;C:\spark\jars\*" "-Xmx1024M" "-Dspark.driver.port=55786" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "2" "--hostname" "192.168.8.101" "--cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
2019-01-03 17:26:28 INFO Worker:54 - Executor app-20190103172519-0000/2 finished with state EXITED message Command exited with code 1 exitStatus 1
2019-01-03 17:26:28 INFO Worker:54 - Asked to launch executor app-20190103172519-0000/3 for example1.HelloWorld
2019-01-03 17:26:28 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:26:28 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:26:28 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:26:28 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:26:28 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); users with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:26:28 INFO ExecutorRunner:54 - Launch command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf;C:\spark\jars\*" "-Xmx1024M" "-Dspark.driver.port=55786" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "3" "--hostname" "192.168.8.101" "--cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
This keeps running forever, with the same logs repeated every few seconds. It's unclear what's going on; the logs aren't saying much. I also couldn't find any full-length examples showing how to run such jobs on a local standalone cluster.
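One place that might say more than the worker log is each failed executor attempt's own stderr, which a standalone worker writes under its work directory. Assuming Spark's default `C:\spark\work` layout and the app/executor ids from the logs above (both are illustrative; use whatever ids your worker log shows):

```shell
REM Each executor attempt gets its own numbered directory under the worker's
REM work dir; its stderr file holds the actual exception behind "exit code 1".
type C:\spark\work\app-20190103172519-0000\0\stderr
```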
scala apache-spark sbt spark-submit
Looks like the Spark application is retrying something or waiting for some event. Is it because you have the file open and Spark is waiting for the lock to be released?
– Sc0rpion
Jan 3 at 18:13
No, I have not opened the file at all.
– ankur
Jan 3 at 19:04
Have you tried to completely remove the usage of PrintWriter, and work from there?
– Shikkou
Jan 4 at 13:16
@AlexGrigore On the contrary: when I keep only the PrintWriter-related code, it works perfectly.
– ankur
Jan 7 at 9:29
@AlexGrigore Just resolved the above issue. I tried the same thing on my personal machine and everything worked fine: it was a port/firewall-related issue. But I have a serious question now. You have seen the logs above, and so have I. Why is it so unclear what the error is? It should have been obvious from the logs alone that something was wrong with my firewall settings. Maybe I am doing something wrong; I am not sure.
– ankur
Jan 7 at 15:03
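For anyone hitting the same symptom: by default the driver and block-manager listen on random ephemeral ports (e.g. the `-Dspark.driver.port=55786` visible in the launch commands above), which a host firewall can silently block, leaving executors to exit and be relaunched forever. A sketch of a workaround, assuming you pin those ports to fixed values (the port numbers here are arbitrary examples) and open only those in the firewall:

```shell
REM Pin the otherwise-random ports so firewall rules can allow them.
REM Port numbers below are illustrative, not required values.
spark-submit --deploy-mode cluster --master spark://192.168.8.101:6066 ^
  --conf spark.driver.port=51000 ^
  --conf spark.blockManager.port=51100 ^
  --class example1.HelloWorld D:\_Work\azurepoc\sbtexample\target\scala-2.11\sbtexample_2.11-1.0.jar
```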
I have setup a local spark cluster on my windows 7 machine ( a master and worker node). I have created a simple scala script which i build with sbt and try to run with spark-submit. Please find the resources below
Scala code :
package example1
import java.io._
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.expr
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession
object HelloWorld {
def main(args: Array[String]): Unit = {
println("===============================================")
println("===============================================")
println("Hello, world!")
val pw = new PrintWriter(new File("d:\hello.txt" ))
pw.write("Hello, world")
println("===============================================")
println("===============================================")
val session = SparkSession.builder.getOrCreate()
var filesmall = "file:///D:/_Work/azurepoc/samplebigdata/ds2.csv"
//val df = session.read.format("csv").option("header", "true").load(filesmall)
println("===============================================")
pw.write("Hello, world some more information ")
pw.close
}
}
Spark cluster Master script :
C:Windowssystem32>spark-class org.apache.spark.deploy.master.Master
2019-01-03 16:49:16 INFO Master:2612 - Started daemon with process name: 23940@ws-amalhotra
2019-01-03 16:49:16 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-01-03 16:49:16 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 16:49:16 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 16:49:16 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 16:49:16 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 16:49:16 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); user
s with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 16:49:17 INFO Utils:54 - Successfully started service 'sparkMaster' on port 7077.
2019-01-03 16:49:17 INFO Master:54 - Starting Spark master at spark://192.168.8.101:7077
2019-01-03 16:49:17 INFO Master:54 - Running Spark version 2.3.2
2019-01-03 16:49:17 INFO log:192 - Logging initialized @1412ms
2019-01-03 16:49:17 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 16:49:17 INFO Server:419 - Started @1489ms
2019-01-03 16:49:17 INFO AbstractConnector:278 - Started ServerConnector@16391414{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
2019-01-03 16:49:17 INFO Utils:54 - Successfully started service 'MasterUI' on port 8080.
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@204e3825{/app,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@748394e8{/app/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@19b99890{/,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5c0f561c{/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3443bda1{/static,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@54541f46{/app/kill,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6e8c3d12{/driver/kill,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO MasterWebUI:54 - Bound MasterWebUI to 0.0.0.0, and started at http://ws-amalhotra.domain.co.in:8080
2019-01-03 16:49:17 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@22eb9260{/,null,AVAILABLE}
2019-01-03 16:49:17 INFO AbstractConnector:278 - Started ServerConnector@636eb125{HTTP/1.1,[http/1.1]}{192.168.8.101:6066}
2019-01-03 16:49:17 INFO Server:419 - Started @1558ms
2019-01-03 16:49:17 INFO Utils:54 - Successfully started service on port 6066.
2019-01-03 16:49:17 INFO StandaloneRestServer:54 - Started REST server for submitting applications on port 6066
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1a4c3e84{/metrics/master/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5a3b4746{/metrics/applications/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO Master:54 - I have been elected leader! New state: ALIVE
2019-01-03 16:49:21 INFO Master:54 - Registering worker 192.168.8.101:8089 with 8 cores, 14.9 GB RAM
My Worker node :
C:Windowssystem32>spark-class org.apache.spark.deploy.worker.Worker spark://192.168.8.101:7077 -p 8089
2019-01-03 16:49:20 INFO Worker:2612 - Started daemon with process name: 16264@ws-amalhotra
2019-01-03 16:49:21 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-01-03 16:49:21 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 16:49:21 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 16:49:21 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 16:49:21 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 16:49:21 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); user
s with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 16:49:21 INFO Utils:54 - Successfully started service 'sparkWorker' on port 8089.
2019-01-03 16:49:21 INFO Worker:54 - Starting Spark worker 192.168.8.101:8089 with 8 cores, 14.9 GB RAM
2019-01-03 16:49:21 INFO Worker:54 - Running Spark version 2.3.2
2019-01-03 16:49:21 INFO Worker:54 - Spark home: C:spark
2019-01-03 16:49:21 INFO log:192 - Logging initialized @1471ms
2019-01-03 16:49:21 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 16:49:21 INFO Server:419 - Started @1518ms
2019-01-03 16:49:21 INFO AbstractConnector:278 - Started ServerConnector@44629c8f{HTTP/1.1,[http/1.1]}{0.0.0.0:8081}
2019-01-03 16:49:21 INFO Utils:54 - Successfully started service 'WorkerUI' on port 8081.
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@36f34cce{/logPage,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@447fb46{/logPage/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3b027ba{/,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5396b0bb{/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6830ec44{/static,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5eb28ff8{/log,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO WorkerWebUI:54 - Bound WorkerWebUI to 0.0.0.0, and started at http://ws-amalhotra.domain.co.in:8081
2019-01-03 16:49:21 INFO Worker:54 - Connecting to master 192.168.8.101:7077...
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@36cc352{/metrics/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO TransportClientFactory:267 - Successfully created connection to /192.168.8.101:7077 after 26 ms (0 ms spent in bootstraps)
2019-01-03 16:49:21 INFO Worker:54 - Successfully registered with master spark://192.168.8.101:7077
Now I build and package the scala code with sbt that packages it into a JAR. My build.sbt file looks like below
version := "1.0"
scalaVersion := "2.11.8"
val sparkVersion = "2.0.0"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % sparkVersion,
"org.apache.spark" %% "spark-streaming" % sparkVersion,
"org.apache.spark" %% "spark-sql" % sparkVersion
)
It creates a jar and I submit it using the spark submit command as below :
C:Usersamalhotra>spark-submit --deploy-mode cluster --master spark://192.168.
8.101:6066 --class "example1.HelloWorld" "D:_Workazurepocsbtexampletargets
cala-2.11sbtexample_2.11-1.0.jar"
Everything works fine and now i just change a single line of code in my script and again follow the compile -> sbt package code -> spark-submit (same as above). The code change is I uncomment the below line :
//val df = session.read.format("csv").option("header", "true").load(filesmall)
When I again run the above with spark-submit, the worker executes forever. Also , the file in my D drive is not getting written. Worker logs below
C:Windowssystem32>spark-class org.apache.spark.deploy.worker.Worker spark://192.168.8.101:7077 -p 8089
2019-01-03 17:24:38 INFO Worker:2612 - Started daemon with process name: 24952@ws-amalhotra
2019-01-03 17:24:39 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-01-03 17:24:39 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:24:39 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:24:39 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:24:39 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:24:39 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); user
s with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:24:39 INFO Utils:54 - Successfully started service 'sparkWorker' on port 8089.
2019-01-03 17:24:39 INFO Worker:54 - Starting Spark worker 192.168.8.101:8089 with 8 cores, 14.9 GB RAM
2019-01-03 17:24:39 INFO Worker:54 - Running Spark version 2.3.2
2019-01-03 17:24:39 INFO Worker:54 - Spark home: C:spark
2019-01-03 17:24:39 INFO log:192 - Logging initialized @1512ms
2019-01-03 17:24:39 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 17:24:39 INFO Server:419 - Started @1561ms
2019-01-03 17:24:39 INFO AbstractConnector:278 - Started ServerConnector@51e2ccae{HTTP/1.1,[http/1.1]}{0.0.0.0:8081}
2019-01-03 17:24:39 INFO Utils:54 - Successfully started service 'WorkerUI' on port 8081.
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3d96670b{/logPage,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@48e02860{/logPage/json,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@758918a3{/,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1643bea5{/json,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5f293725{/static,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@339a8612{/log,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO WorkerWebUI:54 - Bound WorkerWebUI to 0.0.0.0, and started at http://ws-amalhotra.domain.co.in:8081
2019-01-03 17:24:39 INFO Worker:54 - Connecting to master 192.168.8.101:7077...
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@196e9c2a{/metrics/json,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO TransportClientFactory:267 - Successfully created connection to /192.168.8.101:7077 after 29 ms (0 ms spent in bootstraps)
2019-01-03 17:24:40 INFO Worker:54 - Successfully registered with master spark://192.168.8.101:7077
2019-01-03 17:25:17 INFO Worker:54 - Asked to launch driver driver-20190103172517-0000
2019-01-03 17:25:17 INFO DriverRunner:54 - Copying user jar file:/D:/_Work/azurepoc/sbtexample/target/scala-2.11/sbtexample_2.11-1.0.jar to C:sparkworkdriver-20190103172517-0000sbtexamp
le_2.11-1.0.jar
2019-01-03 17:25:17 INFO Utils:54 - Copying D:_Workazurepocsbtexampletargetscala-2.11sbtexample_2.11-1.0.jar to C:sparkworkdriver-20190103172517-0000sbtexample_2.11-1.0.jar
2019-01-03 17:25:17 INFO DriverRunner:54 - Launch Command: "C:Program FilesJavajdk1.8.0_181binjava" "-cp" "C:sparkbin..conf;C:sparkjars*" "-Xmx1024M" "-Dspark.master=spark://19
2.168.8.101:7077" "-Dspark.driver.supervise=false" "-Dspark.submit.deployMode=cluster" "-Dspark.jars=file:/D:/_Work/azurepoc/sbtexample/target/scala-2.11/sbtexample_2.11-1.0.jar" "-Dspark.ap
p.name=example1.HelloWorld" "org.apache.spark.deploy.worker.DriverWrapper" "spark://Worker@192.168.8.101:8089" "C:sparkworkdriver-20190103172517-0000sbtexample_2.11-1.0.jar" "example1.He
lloWorld"
2019-01-03 17:25:19 INFO Worker:54 - Asked to launch executor app-20190103172519-0000/0 for example1.HelloWorld
2019-01-03 17:25:19 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:25:19 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:25:19 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:25:19 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:25:19 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); user
s with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:25:19 INFO ExecutorRunner:54 - Launch command: "C:Program FilesJavajdk1.8.0_181binjava" "-cp" "C:sparkbin..conf;C:sparkjars*" "-Xmx1024M" "-Dspark.driver.port=557
86" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "0" "--hostname" "192.168.8.101" "--
cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
2019-01-03 17:25:43 INFO Worker:54 - Executor app-20190103172519-0000/0 finished with state EXITED message Command exited with code 1 exitStatus 1
2019-01-03 17:25:43 INFO Worker:54 - Asked to launch executor app-20190103172519-0000/1 for example1.HelloWorld
2019-01-03 17:25:43 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:25:43 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:25:43 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:25:43 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:25:43 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); user
s with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:25:43 INFO ExecutorRunner:54 - Launch command: "C:Program FilesJavajdk1.8.0_181binjava" "-cp" "C:sparkbin..conf;C:sparkjars*" "-Xmx1024M" "-Dspark.driver.port=557
86" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "1" "--hostname" "192.168.8.101" "--
cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
2019-01-03 17:26:05 INFO Worker:54 - Executor app-20190103172519-0000/1 finished with state EXITED message Command exited with code 1 exitStatus 1
2019-01-03 17:26:05 INFO Worker:54 - Asked to launch executor app-20190103172519-0000/2 for example1.HelloWorld
2019-01-03 17:26:05 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:26:05 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:26:05 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:26:05 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:26:05 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); user
s with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:26:05 INFO ExecutorRunner:54 - Launch command: "C:Program FilesJavajdk1.8.0_181binjava" "-cp" "C:sparkbin..conf;C:sparkjars*" "-Xmx1024M" "-Dspark.driver.port=557
86" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "2" "--hostname" "192.168.8.101" "--
cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
2019-01-03 17:26:28 INFO Worker:54 - Executor app-20190103172519-0000/2 finished with state EXITED message Command exited with code 1 exitStatus 1
2019-01-03 17:26:28 INFO Worker:54 - Asked to launch executor app-20190103172519-0000/3 for example1.HelloWorld
2019-01-03 17:26:28 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:26:28 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:26:28 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:26:28 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:26:28 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); user
s with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:26:28 INFO ExecutorRunner:54 - Launch command: "C:Program FilesJavajdk1.8.0_181binjava" "-cp" "C:sparkbin..conf;C:sparkjars*" "-Xmx1024M" "-Dspark.driver.port=557
86" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "3" "--hostname" "192.168.8.101" "--
cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
This keeps running forever with same logs repeated every few seconds. Its unclear whats going on. The logs are not saying much. There are no full length examples which show running such jobs on a local standalone cluster
scala apache-spark sbt spark-submit
I have setup a local spark cluster on my windows 7 machine ( a master and worker node). I have created a simple scala script which i build with sbt and try to run with spark-submit. Please find the resources below
Scala code :
package example1
import java.io._
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.expr
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession
object HelloWorld {
def main(args: Array[String]): Unit = {
println("===============================================")
println("===============================================")
println("Hello, world!")
val pw = new PrintWriter(new File("d:\hello.txt" ))
pw.write("Hello, world")
println("===============================================")
println("===============================================")
val session = SparkSession.builder.getOrCreate()
var filesmall = "file:///D:/_Work/azurepoc/samplebigdata/ds2.csv"
//val df = session.read.format("csv").option("header", "true").load(filesmall)
println("===============================================")
pw.write("Hello, world some more information ")
pw.close
}
}
Spark cluster Master script :
C:Windowssystem32>spark-class org.apache.spark.deploy.master.Master
2019-01-03 16:49:16 INFO Master:2612 - Started daemon with process name: 23940@ws-amalhotra
2019-01-03 16:49:16 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-01-03 16:49:16 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 16:49:16 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 16:49:16 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 16:49:16 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 16:49:16 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); user
s with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 16:49:17 INFO Utils:54 - Successfully started service 'sparkMaster' on port 7077.
2019-01-03 16:49:17 INFO Master:54 - Starting Spark master at spark://192.168.8.101:7077
2019-01-03 16:49:17 INFO Master:54 - Running Spark version 2.3.2
2019-01-03 16:49:17 INFO log:192 - Logging initialized @1412ms
2019-01-03 16:49:17 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 16:49:17 INFO Server:419 - Started @1489ms
2019-01-03 16:49:17 INFO AbstractConnector:278 - Started ServerConnector@16391414{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
2019-01-03 16:49:17 INFO Utils:54 - Successfully started service 'MasterUI' on port 8080.
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@204e3825{/app,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@748394e8{/app/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@19b99890{/,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5c0f561c{/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3443bda1{/static,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@54541f46{/app/kill,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6e8c3d12{/driver/kill,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO MasterWebUI:54 - Bound MasterWebUI to 0.0.0.0, and started at http://ws-amalhotra.domain.co.in:8080
2019-01-03 16:49:17 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@22eb9260{/,null,AVAILABLE}
2019-01-03 16:49:17 INFO AbstractConnector:278 - Started ServerConnector@636eb125{HTTP/1.1,[http/1.1]}{192.168.8.101:6066}
2019-01-03 16:49:17 INFO Server:419 - Started @1558ms
2019-01-03 16:49:17 INFO Utils:54 - Successfully started service on port 6066.
2019-01-03 16:49:17 INFO StandaloneRestServer:54 - Started REST server for submitting applications on port 6066
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1a4c3e84{/metrics/master/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5a3b4746{/metrics/applications/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:17 INFO Master:54 - I have been elected leader! New state: ALIVE
2019-01-03 16:49:21 INFO Master:54 - Registering worker 192.168.8.101:8089 with 8 cores, 14.9 GB RAM
My Worker node :
C:Windowssystem32>spark-class org.apache.spark.deploy.worker.Worker spark://192.168.8.101:7077 -p 8089
2019-01-03 16:49:20 INFO Worker:2612 - Started daemon with process name: 16264@ws-amalhotra
2019-01-03 16:49:21 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-01-03 16:49:21 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 16:49:21 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 16:49:21 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 16:49:21 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 16:49:21 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); user
s with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 16:49:21 INFO Utils:54 - Successfully started service 'sparkWorker' on port 8089.
2019-01-03 16:49:21 INFO Worker:54 - Starting Spark worker 192.168.8.101:8089 with 8 cores, 14.9 GB RAM
2019-01-03 16:49:21 INFO Worker:54 - Running Spark version 2.3.2
2019-01-03 16:49:21 INFO Worker:54 - Spark home: C:spark
2019-01-03 16:49:21 INFO log:192 - Logging initialized @1471ms
2019-01-03 16:49:21 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 16:49:21 INFO Server:419 - Started @1518ms
2019-01-03 16:49:21 INFO AbstractConnector:278 - Started ServerConnector@44629c8f{HTTP/1.1,[http/1.1]}{0.0.0.0:8081}
2019-01-03 16:49:21 INFO Utils:54 - Successfully started service 'WorkerUI' on port 8081.
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@36f34cce{/logPage,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@447fb46{/logPage/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3b027ba{/,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5396b0bb{/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6830ec44{/static,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5eb28ff8{/log,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO WorkerWebUI:54 - Bound WorkerWebUI to 0.0.0.0, and started at http://ws-amalhotra.domain.co.in:8081
2019-01-03 16:49:21 INFO Worker:54 - Connecting to master 192.168.8.101:7077...
2019-01-03 16:49:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@36cc352{/metrics/json,null,AVAILABLE,@Spark}
2019-01-03 16:49:21 INFO TransportClientFactory:267 - Successfully created connection to /192.168.8.101:7077 after 26 ms (0 ms spent in bootstraps)
2019-01-03 16:49:21 INFO Worker:54 - Successfully registered with master spark://192.168.8.101:7077
Now I build and package the scala code with sbt that packages it into a JAR. My build.sbt file looks like below
version := "1.0"
scalaVersion := "2.11.8"
val sparkVersion = "2.0.0"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % sparkVersion,
"org.apache.spark" %% "spark-streaming" % sparkVersion,
"org.apache.spark" %% "spark-sql" % sparkVersion
)
It creates a jar, which I submit with spark-submit as follows:
C:\Users\amalhotra>spark-submit --deploy-mode cluster --master spark://192.168.8.101:6066 --class "example1.HelloWorld" "D:\_Work\azurepoc\sbtexample\target\scala-2.11\sbtexample_2.11-1.0.jar"
Everything works fine. Now I change a single line of code in the script and repeat the same cycle: compile -> sbt package -> spark-submit (same command as above). The change is that I uncomment this line:
//val df = session.read.format("csv").option("header", "true").load(filesmall)
When I run spark-submit again, the worker executes forever, and the file on my D: drive is never written. Worker logs below:
C:\Windows\system32>spark-class org.apache.spark.deploy.worker.Worker spark://192.168.8.101:7077 -p 8089
2019-01-03 17:24:38 INFO Worker:2612 - Started daemon with process name: 24952@ws-amalhotra
2019-01-03 17:24:39 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-01-03 17:24:39 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:24:39 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:24:39 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:24:39 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:24:39 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); users with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:24:39 INFO Utils:54 - Successfully started service 'sparkWorker' on port 8089.
2019-01-03 17:24:39 INFO Worker:54 - Starting Spark worker 192.168.8.101:8089 with 8 cores, 14.9 GB RAM
2019-01-03 17:24:39 INFO Worker:54 - Running Spark version 2.3.2
2019-01-03 17:24:39 INFO Worker:54 - Spark home: C:\spark
2019-01-03 17:24:39 INFO log:192 - Logging initialized @1512ms
2019-01-03 17:24:39 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-03 17:24:39 INFO Server:419 - Started @1561ms
2019-01-03 17:24:39 INFO AbstractConnector:278 - Started ServerConnector@51e2ccae{HTTP/1.1,[http/1.1]}{0.0.0.0:8081}
2019-01-03 17:24:39 INFO Utils:54 - Successfully started service 'WorkerUI' on port 8081.
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3d96670b{/logPage,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@48e02860{/logPage/json,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@758918a3{/,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1643bea5{/json,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5f293725{/static,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@339a8612{/log,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO WorkerWebUI:54 - Bound WorkerWebUI to 0.0.0.0, and started at http://ws-amalhotra.domain.co.in:8081
2019-01-03 17:24:39 INFO Worker:54 - Connecting to master 192.168.8.101:7077...
2019-01-03 17:24:39 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@196e9c2a{/metrics/json,null,AVAILABLE,@Spark}
2019-01-03 17:24:39 INFO TransportClientFactory:267 - Successfully created connection to /192.168.8.101:7077 after 29 ms (0 ms spent in bootstraps)
2019-01-03 17:24:40 INFO Worker:54 - Successfully registered with master spark://192.168.8.101:7077
2019-01-03 17:25:17 INFO Worker:54 - Asked to launch driver driver-20190103172517-0000
2019-01-03 17:25:17 INFO DriverRunner:54 - Copying user jar file:/D:/_Work/azurepoc/sbtexample/target/scala-2.11/sbtexample_2.11-1.0.jar to C:\spark\work\driver-20190103172517-0000\sbtexample_2.11-1.0.jar
2019-01-03 17:25:17 INFO Utils:54 - Copying D:\_Work\azurepoc\sbtexample\target\scala-2.11\sbtexample_2.11-1.0.jar to C:\spark\work\driver-20190103172517-0000\sbtexample_2.11-1.0.jar
2019-01-03 17:25:17 INFO DriverRunner:54 - Launch Command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf;C:\spark\jars\*" "-Xmx1024M" "-Dspark.master=spark://192.168.8.101:7077" "-Dspark.driver.supervise=false" "-Dspark.submit.deployMode=cluster" "-Dspark.jars=file:/D:/_Work/azurepoc/sbtexample/target/scala-2.11/sbtexample_2.11-1.0.jar" "-Dspark.app.name=example1.HelloWorld" "org.apache.spark.deploy.worker.DriverWrapper" "spark://Worker@192.168.8.101:8089" "C:\spark\work\driver-20190103172517-0000\sbtexample_2.11-1.0.jar" "example1.HelloWorld"
2019-01-03 17:25:19 INFO Worker:54 - Asked to launch executor app-20190103172519-0000/0 for example1.HelloWorld
2019-01-03 17:25:19 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:25:19 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:25:19 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:25:19 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:25:19 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); users with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:25:19 INFO ExecutorRunner:54 - Launch command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf;C:\spark\jars\*" "-Xmx1024M" "-Dspark.driver.port=55786" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "0" "--hostname" "192.168.8.101" "--cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
2019-01-03 17:25:43 INFO Worker:54 - Executor app-20190103172519-0000/0 finished with state EXITED message Command exited with code 1 exitStatus 1
2019-01-03 17:25:43 INFO Worker:54 - Asked to launch executor app-20190103172519-0000/1 for example1.HelloWorld
2019-01-03 17:25:43 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:25:43 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:25:43 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:25:43 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:25:43 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); users with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:25:43 INFO ExecutorRunner:54 - Launch command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf;C:\spark\jars\*" "-Xmx1024M" "-Dspark.driver.port=55786" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "1" "--hostname" "192.168.8.101" "--cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
2019-01-03 17:26:05 INFO Worker:54 - Executor app-20190103172519-0000/1 finished with state EXITED message Command exited with code 1 exitStatus 1
2019-01-03 17:26:05 INFO Worker:54 - Asked to launch executor app-20190103172519-0000/2 for example1.HelloWorld
2019-01-03 17:26:05 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:26:05 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:26:05 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:26:05 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:26:05 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); users with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:26:05 INFO ExecutorRunner:54 - Launch command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf;C:\spark\jars\*" "-Xmx1024M" "-Dspark.driver.port=55786" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "2" "--hostname" "192.168.8.101" "--cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
2019-01-03 17:26:28 INFO Worker:54 - Executor app-20190103172519-0000/2 finished with state EXITED message Command exited with code 1 exitStatus 1
2019-01-03 17:26:28 INFO Worker:54 - Asked to launch executor app-20190103172519-0000/3 for example1.HelloWorld
2019-01-03 17:26:28 INFO SecurityManager:54 - Changing view acls to: admin
2019-01-03 17:26:28 INFO SecurityManager:54 - Changing modify acls to: admin
2019-01-03 17:26:28 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-03 17:26:28 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-03 17:26:28 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(admin); groups with view permissions: Set(); users with modify permissions: Set(admin); groups with modify permissions: Set()
2019-01-03 17:26:28 INFO ExecutorRunner:54 - Launch command: "C:\Program Files\Java\jdk1.8.0_181\bin\java" "-cp" "C:\spark\bin\..\conf;C:\spark\jars\*" "-Xmx1024M" "-Dspark.driver.port=55786" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ws-amalhotra.domain.co.in:55786" "--executor-id" "3" "--hostname" "192.168.8.101" "--cores" "7" "--app-id" "app-20190103172519-0000" "--worker-url" "spark://Worker@192.168.8.101:8089"
This keeps running forever, with the same log lines repeated every few seconds. It's unclear what's going on; the logs don't say much, and I couldn't find any complete example of running such a job on a local standalone cluster.
scala apache-spark sbt spark-submit
asked Jan 3 at 12:08
ankur
Looks like the spark application is retrying something or awaiting for some event. Is it because you have the file opened and spark is waiting for the lock to get released.
– Sc0rpion
Jan 3 at 18:13
no, i have not opened the file at all
– ankur
Jan 3 at 19:04
Have you tried to completely remove the usage of PrintWriter, and work from there?
– Shikkou
Jan 4 at 13:16
@AlexGrigore On the contrary, when i keep only printwriter related code, it works perfectly.
– ankur
Jan 7 at 9:29
@AlexGrigore just resolved the above issue. tried the same on my personal machine and everything worked just fine. It was a port firewall related issue. But i really have a serious question now. You must have observed the above logs, i have done that as well... Why is it unclear what the error is? It should have been intuitive from just the logs that something is wrong with my firewall settings.. may be i am doing something wrong. I am not very sure.
– ankur
Jan 7 at 15:03
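As the last comment notes, the hang turned out to be a firewall/port issue: in cluster mode the driver listens on a randomly chosen port (55786 in the logs above), and if executors can't reach it they exit with code 1 and get relaunched forever. A hedged sketch of how one might pin Spark's ports and open them in Windows Firewall — the port numbers here are illustrative, not taken from the question:

```shell
REM Sketch: pin the driver and block-manager ports so they are predictable,
REM instead of letting Spark pick a random port per run.
spark-submit ^
  --deploy-mode cluster ^
  --master spark://192.168.8.101:6066 ^
  --conf spark.driver.port=51000 ^
  --conf spark.blockManager.port=51010 ^
  --class "example1.HelloWorld" "D:\_Work\azurepoc\sbtexample\target\scala-2.11\sbtexample_2.11-1.0.jar"

REM Then allow the pinned range through Windows Firewall (run as administrator):
netsh advfirewall firewall add rule name="Spark ports" dir=in action=allow protocol=TCP localport=51000-51020
```

With the ports fixed, a connection failure also becomes much easier to diagnose, since the port to test (e.g. with telnet) is known in advance.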