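Hadoop Learning Problem Log: The Basics

Purpose

A record of the basic problems I ran into while learning Hadoop, however big or small, and however long they took to resolve.

Problem 1: Running a MapReduce program in a fully distributed environment throws java.net.NoRouteToHostException: 没有到主机的路由 (No route to host)

Running a MapReduce program in a fully distributed environment throws java.net.NoRouteToHostException: 没有到主机的路由 (No route to host), yet the same configuration and the same program run without problems in pseudo-distributed mode. The full exception output is as follows: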

2019-09-14 15:37:44,018 INFO mapreduce.Job: Running job: job_1568442070466_0003
2019-09-14 15:43:46,231 INFO mapreduce.Job: Job job_1568442070466_0003 running in uber mode : false
2019-09-14 15:43:46,233 INFO mapreduce.Job:  map 0% reduce 0%
2019-09-14 15:43:46,273 INFO mapreduce.Job: Job job_1568442070466_0003 failed with state FAILED due to: Application application_1568442070466_0003 failed 2 times due to Error launching appattempt_1568442070466_0003_000002. Got exception: java.net.NoRouteToHostException: No Route to Host from  master/192.168.212.132 to slave1:45816 failed on socket timeout exception: java.net.NoRouteToHostException: 没有到主机的路由; For more details see:  http://wiki.apache.org/hadoop/NoRouteToHost
        at sun.reflect.GeneratedConstructorAccessor56.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:782)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1515)
        at org.apache.hadoop.ipc.Client.call(Client.java:1457)
        at org.apache.hadoop.ipc.Client.call(Client.java:1367)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
        at com.sun.proxy.$Proxy83.startContainers(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:128)
        at sun.reflect.GeneratedMethodAccessor88.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
        at com.sun.proxy.$Proxy84.startContainers(Unknown Source)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:123)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:308)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.NoRouteToHostException: 没有到主机的路由
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:690)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:794)
        at org.apache.hadoop.ipc.Client$Connection.access$3700(Client.java:411)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1572)
        at org.apache.hadoop.ipc.Client.call(Client.java:1403)
        ... 19 more
. Failing the application.
2019-09-14 15:43:46,332 INFO mapreduce.Job: Counters: 0

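Locating the problem

1. Although the exception names only one slave node, all nodes are known to be configured identically, so this is probably a problem common to all of them.

2. Taken literally, the exception means that the master node, which submitted the job, cannot reach the slave node. That can only come down to one of the following cases: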

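  • The hosts configuration on the current host is wrong
    Running ping slave1 succeeds, so this case can be ruled out.
  • The target port (45816) on the destination host is not open
    Open port 45816 on all slave nodes, then resubmit the job and observe the result.
    After resubmitting, the job still fails, but the port is now 43910, which we never touched and which is therefore still closed. The changing port suggests that the IPC port used when a job is scheduled is picked at random by the framework on each submission, so opening individual ports is clearly not a workable fix.
    There are two possible approaches: 1. on each slave node, allow all traffic from the master node's IP; 2. pin the IPC port used by the framework to a fixed value and open that port. (I have not yet found how to configure this.)
    Taking approach 1, run the following commands on each slave node to open access for the master node: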
    firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.212.132" accept'
    firewall-cmd --reload
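
A successful ping only shows that the host name resolves and that ICMP gets through; it says nothing about whether a particular TCP port is reachable. As a minimal sketch, the port itself can be probed directly. The helper name check_port is hypothetical, and the sketch assumes bash and the coreutils timeout command are available on the node:

```shell
#!/usr/bin/env bash
# check_port HOST PORT [TIMEOUT_SECONDS]
# Hypothetical helper: returns 0 if a TCP connection to HOST:PORT can be
# opened within the timeout (default 3 seconds), non-zero otherwise.
check_port() {
    local host="$1" port="$2" wait="${3:-3}"
    # /dev/tcp/HOST/PORT is a bash feature that opens a TCP connection;
    # the child shell exits non-zero if the connection cannot be made.
    timeout "$wait" bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}
```

Run from the master node, something like `check_port slave1 45816 && echo reachable || echo blocked` would show whether the port named in the exception can actually be reached, independently of what ping reports.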

     

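After resubmitting, the job now runs, but NoRouteToHostException errors still appear, although they do not stop the MapReduce program from completing. My guess is that network jitter or the limitations of the virtual machines cause the target host to be lost intermittently. If anyone knows the real reason, pointers are welcome; I will not dig further for now. The error output is as follows: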

2019-09-14 17:31:55,733 INFO mapreduce.Job: Task Id : attempt_1568452715996_0001_m_000013_2, Status : FAILED
Container launch failed for container_1568452715996_0001_01_000056 : java.net.NoRouteToHostException: No Route to Host from  slave3/192.168.212.135 to slave1:42387 failed on socket timeout exception: java.net.NoRouteToHostException: 没有到主机的路由; For more details see:  http://wiki.apache.org/hadoop/NoRouteToHost
        at sun.reflect.GeneratedConstructorAccessor39.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:782)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1515)
        at org.apache.hadoop.ipc.Client.call(Client.java:1457)
        at org.apache.hadoop.ipc.Client.call(Client.java:1367)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
        at com.sun.proxy.$Proxy84.startContainers(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:128)
        at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
        at com.sun.proxy.$Proxy85.startContainers(Unknown Source)
        at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:160)
        at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:394)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.NoRouteToHostException: 没有到主机的路由
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:690)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:794)
        at org.apache.hadoop.ipc.Client$Connection.access$3700(Client.java:411)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1572)
        at org.apache.hadoop.ipc.Client.call(Client.java:1403)
        ... 19 more
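
One detail in this trace may be worth following up: the failing connection goes from slave3 (192.168.212.135) to slave1, not from master, while the firewall rule added earlier admits only 192.168.212.132. A possible next step, offered as an untested sketch rather than a confirmed fix, is to admit every cluster node on every node. The helper below only prints the commands so they can be reviewed before running; print_allow_rules is a hypothetical name, and the addresses of slave1 and slave2 are not given in this post, so they would have to be filled in by hand.

```shell
#!/usr/bin/env bash
# print_allow_rules IP [IP...]
# Hypothetical helper: prints (does not execute) one firewall-cmd rich rule
# per peer address, followed by the reload that activates permanent rules.
print_allow_rules() {
    local ip
    for ip in "$@"; do
        printf '%s\n' "firewall-cmd --permanent --add-rich-rule='rule family=\"ipv4\" source address=\"${ip}\" accept'"
    done
    printf '%s\n' "firewall-cmd --reload"
}
```

For example, `print_allow_rules 192.168.212.132 192.168.212.135` prints the rules for master and slave3; the remaining node addresses would be appended to the argument list, and the printed commands run on each node.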

 
