20/03/25 10:28:07 WARN UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:org.apache.spark.SparkException: Exception thrown in awaitResult
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1930)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:284)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:202)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:67)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:66)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
    ... 4 more
Caused by: java.io.IOException: Failed to connect to localhost/127.0.0.1:41640
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:191)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:41640
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:257)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:291)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:640)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
    ... 1 more

LogType:stderr
Log Upload Time:Wed Mar 25 10:31:22 +0800 2020
LogLength:63452
Log Contents:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/yarn/nm/usercache/root/filecache/4996/__spark_libs__5125379819107399169.zip/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.14.0-1.cdh5.14.0.p0.24/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
20/03/25 10:29:18 INFO SignalUtils: Registered signal handler for TERM
20/03/25 10:29:18 INFO SignalUtils: Registered signal handler for HUP
20/03/25 10:29:18 INFO SignalUtils: Registered signal handler for INT
20/03/25 10:29:19 INFO ApplicationMaster: Preparing Local resources
20/03/25 10:29:19 INFO ApplicationMaster: ApplicationAttemptId: appattempt_1585020115190_0150_000002
20/03/25 10:29:19 INFO SecurityManager: Changing view acls to: yarn,root
20/03/25 10:29:19 INFO SecurityManager: Changing modify acls to: yarn,root
20/03/25 10:29:19 INFO SecurityManager: Changing view acls groups to:
20/03/25 10:29:19 INFO SecurityManager: Changing modify acls groups to:
20/03/25 10:29:19 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, root); groups with view permissions: Set(); users with modify permissions: Set(yarn, root); groups with modify permissions: Set()
20/03/25 10:29:19 INFO ApplicationMaster: Starting the user application in a separate Thread
20/03/25 10:29:19 INFO ApplicationMaster: Waiting for spark context initialization...
20/03/25 10:29:21 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
20/03/25 10:29:21 INFO RMProxy: Connecting to ResourceManager at df1/172.16.252.11:8030
20/03/25 10:29:28 WARN YarnAllocator: Container marked as Failed: container_1585020115190_0150_02_000003 on host: df3. Exit status: 1. Diagnostics: Exception from container-launch.
Container id: container_1585020115190_0150_02_000003
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:604)
    at org.apache.hadoop.util.Shell.run(Shell.java:507)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:789)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Container exited with a non-zero exit code 1
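
In a trace this long, the innermost "Caused by:" line usually says the most. The following is a minimal Python sketch, not part of the original troubleshooting, that pulls the cause chain out of a log and reports the endpoint named in the last cause. The helper names (`cause_chain`, `root_cause_endpoint`, `is_loopback`) are hypothetical, and `LOG_EXCERPT` is an abbreviated copy of the log above with the stack frames left out.

```python
import re

# Abbreviated excerpt of the log above (stack frames omitted for brevity).
LOG_EXCERPT = """\
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult
Caused by: java.io.IOException: Failed to connect to localhost/127.0.0.1:41640
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:41640
"""

# "Caused by: <exception class>: <message>"
CAUSED_BY = re.compile(r"^Caused by: (?P<cls>[\w.$]+): (?P<msg>.*)$", re.MULTILINE)
# "<host>/<ipv4>:<port>", the form these Netty/Spark messages use for an address
ENDPOINT = re.compile(r"(?P<host>[\w.-]+)/(?P<ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<port>\d+)")


def cause_chain(log_text):
    """Return (exception class, message) for each 'Caused by:' line, outermost first."""
    return [(m.group("cls"), m.group("msg").strip()) for m in CAUSED_BY.finditer(log_text)]


def root_cause_endpoint(log_text):
    """Return (host, ip, port) from the innermost cause, or None if there is none."""
    chain = cause_chain(log_text)
    if not chain:
        return None
    match = ENDPOINT.search(chain[-1][1])
    if match is None:
        return None
    return match.group("host"), match.group("ip"), int(match.group("port"))


def is_loopback(ip):
    """True for IPv4 loopback addresses (127.0.0.0/8)."""
    return ip.startswith("127.")


if __name__ == "__main__":
    for cls, msg in cause_chain(LOG_EXCERPT):
        print(cls, "->", msg)
    endpoint = root_cause_endpoint(LOG_EXCERPT)
    if endpoint is not None and is_loopback(endpoint[1]):
        print("root cause points at a loopback address:", endpoint)
```

For this log the innermost cause is a refused connection to `localhost/127.0.0.1:41640`. The frames above it go through `CoarseGrainedExecutorBackend` and `RpcEnv.setupEndpointRefByURI`, so the executor container on `df3` appears to have been given a loopback address to connect back to. That reading is an inference from the trace, not something the log states outright.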
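Problem recap:
After writing the program, I ran it from IntelliJ IDEA on my local machine against the remote test environment, and everything worked.
I then submitted the program to the test environment and ran it in Spark local mode, and everything worked.
Running it in cluster mode produced error after error.

Approach:
Because local mode ran fine in the test environment, my first thought was an environment problem. Other programs ran normally there, though, so the environment was probably not at fault.
That left the code as the likely culprit, but reading through it turned up nothing, so I fell back on the brute-force approach and rewrote the program as the simplest one possible.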