Hadoop Distributed Cluster Setup
1 Resources
Extraction code: swfy
2 Environment
2.1 Planning
IP | HOST | Zookeeper | NameNode | ZKFC | JournalNode | Datanode | ResourceManager | NodeManager | JobHistory |
---|---|---|---|---|---|---|---|---|---|
104.21.51.1 | zk01 | √ | √ | √ | √ | √ | | √ | √ |
104.21.51.2 | zk02 | √ | √ | √ | √ | √ | √ | √ | |
104.21.51.3 | zk03 | √ | | | √ | √ | √ | √ | |
2.2 Component Overview
- Zookeeper: distributed coordination service; based on the Fast Paxos algorithm, it provides synchronization, configuration maintenance, and naming for distributed applications
- NameNode: manages HDFS metadata and monitors the Datanodes
- ZKFC: monitors NameNode health and writes state changes to Zookeeper; performs the failover when the active NameNode fails
- JournalNode: stores the NameNode edit log (metadata changes)
- Datanode: storage node; blocks are kept in multiple replicas
- ResourceManager: schedules resources across the NodeManagers
- NodeManager: manages the resources (containers) of its own node
- JobHistory: keeps run logs of finished MapReduce jobs
3 Install the Zookeeper cluster
4 Deploy the Hadoop cluster
4.1 Extract the installation package
mkdir -p /hadoop
tar -zxvf hadoop-2.7.7.tar.gz -C /hadoop
4.2 Set environment variables and apply them
vi /etc/bashrc
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-2.b14.el7.x86_64/jre
export HADOOP_HOME=/hadoop/hadoop-2.7.7
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
source /etc/bashrc
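To confirm the variables took effect, a quick sanity check can be run in the shell (a sketch; it assumes the exports above, with the bin and sbin directories on PATH, which the hadoop commands and start scripts need):

```shell
# Sanity check for the environment set up above (paths assumed from this guide)
export HADOOP_HOME=/hadoop/hadoop-2.7.7
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) path_ok=yes ;;
  *) path_ok=no ;;
esac
echo "HADOOP_HOME=$HADOOP_HOME, bin on PATH: $path_ok"
```

On a real node, `hadoop version` is the definitive check.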
4.3 Create data directories
mkdir -p $HADOOP_HOME/data/ha/jn
mkdir -p $HADOOP_HOME/data/ha/tmp
4.4 Configure the cluster
cd $HADOOP_HOME/etc/hadoop
4.4.1 core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<!-- Assemble the two NameNode addresses into one nameservice, mycluster -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<!-- Directory for files generated at Hadoop runtime -->
<property>
<name>hadoop.tmp.dir</name>
<value>/hadoop/hadoop-2.7.7/data/ha/tmp</value>
</property>
<!-- Zookeeper quorum used by ZKFC for automatic failover -->
<property>
<name>ha.zookeeper.quorum</name>
<value>zk01:2181,zk02:2181,zk03:2181</value>
</property>
</configuration>
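A misspelled property name in these files fails silently (Hadoop just falls back to the default), so it is worth grepping for the keys after editing. The sketch below runs against a demo copy under /tmp so it is self-contained; on a real node, point `conf` at $HADOOP_HOME/etc/hadoop/core-site.xml instead.

```shell
# Check that the three HA-related keys are present in core-site.xml.
conf=/tmp/core-site-demo.xml           # demo copy; use the real file on a node
cat > "$conf" <<'EOF'
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://mycluster</value></property>
  <property><name>hadoop.tmp.dir</name><value>/hadoop/hadoop-2.7.7/data/ha/tmp</value></property>
  <property><name>ha.zookeeper.quorum</name><value>zk01:2181,zk02:2181,zk03:2181</value></property>
</configuration>
EOF
missing=0
for key in fs.defaultFS hadoop.tmp.dir ha.zookeeper.quorum; do
  grep -q "<name>$key</name>" "$conf" || { echo "missing: $key"; missing=1; }
done
echo "missing-count: $missing"
```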
4.4.2 hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<!-- HDFS replication factor (default 3) -->
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<!-- Name of the HA nameservice -->
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<!-- NameNodes that make up the cluster -->
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<!-- RPC address of nn1 -->
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>zk01:8020</value>
</property>
<!-- RPC address of nn2 -->
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>zk02:8020</value>
</property>
<!-- HTTP address of nn1 -->
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>zk01:50070</value>
</property>
<!-- HTTP address of nn2 -->
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>zk02:50070</value>
</property>
<!-- Where the NameNode shared edits are stored on the JournalNodes -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://zk01:8485;zk02:8485;zk03:8485/mycluster</value>
</property>
<!-- Fencing method: ensures only one NameNode serves clients at a time -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<!-- sshfence requires passwordless SSH; path to the private key -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<!-- Local storage directory for JournalNode edits -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/hadoop/hadoop-2.7.7/data/ha/jn</value>
</property>
<!-- Disable HDFS permission checking -->
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
<!-- Proxy provider used by HDFS clients to locate the active NameNode -->
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Enable automatic failover -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- Enable the WebHDFS service -->
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<!-- https://issues.apache.org/jira/browse/HDFS-9274 -->
<property>
<name>dfs.datanode.directoryscan.throttle.limit.ms.per.sec</name>
<value>1000</value>
</property>
<!-- Maximum number of threads for data transfer -->
<property>
<name>dfs.datanode.max.transfer.threads</name>
<value>8192</value>
</property>
</configuration>
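The sshfence method configured above fails silently if the private key is absent or unreadable, leaving automatic failover stuck. A small pre-flight check (sketch; the key path is taken from the config above, and the hypothetical FENCING_KEY variable lets you override it for testing):

```shell
# Pre-flight check for the sshfence private key configured above.
key=${FENCING_KEY:-/root/.ssh/id_rsa}
if [ -r "$key" ]; then status=readable; else status=missing; fi
echo "fencing key ($key): $status"
```

Run this on both NameNode hosts; each must be able to SSH to the other without a password for fencing to work.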
4.4.3 mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<!-- Run MapReduce on YARN -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<!-- JobHistory server host and port -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>zk01:10020</value>
</property>
<!-- JobHistory server web UI host and port -->
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>zk01:19888</value>
</property>
<!-- Show at most 20000 finished jobs in the JobHistory web UI -->
<property>
<name>mapreduce.jobhistory.joblist.cache.size</name>
<value>20000</value>
</property>
<!-- Staging directory used when MR jobs are submitted -->
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/hadoop/hadoop-2.7.7/hadoop-yarn/staging</value>
</property>
<!-- Directory for logs of completed jobs -->
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>${yarn.app.mapreduce.am.staging-dir}/history/done</value>
</property>
<!-- Directory for logs of in-progress MapReduce jobs -->
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate</value>
</property>
</configuration>
4.4.4 yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Site specific YARN configuration properties -->
<configuration>
<!-- Shuffle service reducers use to fetch data -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Enable ResourceManager HA -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- Declare the two ResourceManagers -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>rmCluster</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>zk02</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>zk03</value>
</property>
<!-- Zookeeper cluster addresses -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>zk01:2181,zk02:2181,zk03:2181</value>
</property>
<!-- Enable automatic recovery -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<!-- Store ResourceManager state in the Zookeeper cluster -->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
</configuration>
4.4.5 slaves
zk01
zk02
zk03
4.5 Copy the Hadoop directory to the other nodes
scp -r $HADOOP_HOME root@zk02:/hadoop
scp -r $HADOOP_HOME root@zk03:/hadoop
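The same copy can be written as a loop, which scales better if more nodes are added later (dry-run sketch: the `echo` prints each command instead of running it; remove the `echo` to actually copy, which assumes passwordless SSH to the other nodes is already set up):

```shell
# Loop form of the scp distribution above (dry run).
HADOOP_HOME=${HADOOP_HOME:-/hadoop/hadoop-2.7.7}
for node in zk02 zk03; do
  echo scp -r "$HADOOP_HOME" "root@$node:/hadoop"
done
```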
5 Initialize the cluster
5.1 Format the NameNode
Prerequisite: the JournalNodes are running, since the shared edits directory lives on them. Format only nn1 (zk01); nn2 (zk02) should afterwards be synchronized with hdfs namenode -bootstrapStandby rather than formatted.
hdfs namenode -format
5.2 Format the ZKFC state in Zookeeper
Prerequisite: the Zookeeper cluster is running.
hdfs zkfc -formatZK
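Because the JournalNodes must exist before formatting and nn2 copies its metadata from nn1, the initialization steps have a required order. The dry-run sketch below prints the full sequence (hostnames and paths follow this guide; set `run=` instead of `run=echo` to actually execute on a real cluster):

```shell
# HA initialization order (dry run: `echo` prints each command).
run=echo
sbindir=/hadoop/hadoop-2.7.7/sbin
bindir=/hadoop/hadoop-2.7.7/bin
for node in zk01 zk02 zk03; do                           # 1. start all JournalNodes
  $run ssh "$node" "$sbindir/hadoop-daemon.sh start journalnode"
done
$run ssh zk01 "$bindir/hdfs namenode -format"            # 2. format nn1 only
$run ssh zk01 "$sbindir/hadoop-daemon.sh start namenode"
$run ssh zk02 "$bindir/hdfs namenode -bootstrapStandby"  # 3. sync nn2 from nn1
$run ssh zk01 "$bindir/hdfs zkfc -formatZK"              # 4. Zookeeper must be running
```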
6 Cluster Management
6.1 Create a cluster management script
vi $HADOOP_HOME/bin/auto-hdp.sh
#!/bin/bash
sbindir=/hadoop/hadoop-2.7.7/sbin
nodelist=(zk01 zk02 zk03)
case $1 in
(start)
ssh zk01 $sbindir/$1-dfs.sh
ssh zk02 $sbindir/$1-yarn.sh
ssh zk03 $sbindir/yarn-daemon.sh start resourcemanager
ssh zk01 "$sbindir/mr-jobhistory-daemon.sh $1 historyserver"
;;
(stop)
ssh zk01 "$sbindir/mr-jobhistory-daemon.sh $1 historyserver"
ssh zk03 "$sbindir/yarn-daemon.sh $1 resourcemanager"
ssh zk02 $sbindir/$1-yarn.sh
ssh zk01 $sbindir/$1-dfs.sh
;;
(status)
for node in ${nodelist[@]}; do
echo -e "\nHost $node:"
ssh $node "jps|sort|grep -v Jps"
done
;;
esac
6.2 Start the cluster
Prerequisite: the Zookeeper cluster is running.
auto-hdp.sh start
Starting namenodes on [zk01 zk02]
zk01: starting namenode, logging to /hadoop/hadoop-2.7.7/logs/hadoop-root-namenode-zk01.out
zk02: starting namenode, logging to /hadoop/hadoop-2.7.7/logs/hadoop-root-namenode-zk02.out
zk01: starting datanode, logging to /hadoop/hadoop-2.7.7/logs/hadoop-root-datanode-zk01.out
zk02: starting datanode, logging to /hadoop/hadoop-2.7.7/logs/hadoop-root-datanode-zk02.out
zk03: starting datanode, logging to /hadoop/hadoop-2.7.7/logs/hadoop-root-datanode-zk03.out
Starting journal nodes [zk01 zk02 zk03]
zk01: starting journalnode, logging to /hadoop/hadoop-2.7.7/logs/hadoop-root-journalnode-zk01.out
zk02: starting journalnode, logging to /hadoop/hadoop-2.7.7/logs/hadoop-root-journalnode-zk02.out
zk03: starting journalnode, logging to /hadoop/hadoop-2.7.7/logs/hadoop-root-journalnode-zk03.out
Starting ZK Failover Controllers on NN hosts [zk01 zk02]
zk02: starting zkfc, logging to /hadoop/hadoop-2.7.7/logs/hadoop-root-zkfc-zk02.out
zk01: starting zkfc, logging to /hadoop/hadoop-2.7.7/logs/hadoop-root-zkfc-zk01.out
starting yarn daemons
starting resourcemanager, logging to /hadoop/hadoop-2.7.7/logs/yarn-root-resourcemanager-zk02.out
zk02: starting nodemanager, logging to /hadoop/hadoop-2.7.7/logs/yarn-root-nodemanager-zk02.out
zk03: starting nodemanager, logging to /hadoop/hadoop-2.7.7/logs/yarn-root-nodemanager-zk03.out
zk01: starting nodemanager, logging to /hadoop/hadoop-2.7.7/logs/yarn-root-nodemanager-zk01.out
starting resourcemanager, logging to /hadoop/hadoop-2.7.7/logs/yarn-root-resourcemanager-zk03.out
starting historyserver, logging to /hadoop/hadoop-2.7.7/logs/mapred-root-historyserver-zk01.out
6.3 Check cluster processes
auto-hdp.sh status
The following processes should be shown, with stable PIDs:
Host zk01:
59950 NameNode
60090 Datanode
60359 JournalNode
60607 DFSZKFailoverController
60718 NodeManager
60932 JobHistoryServer
8707 QuorumPeerMain
Host zk02:
109295 NameNode
109395 Datanode
109519 JournalNode
109680 DFSZKFailoverController
109830 ResourceManager
109974 NodeManager
39495 QuorumPeerMain
Host zk03:
10108 Datanode
10227 JournalNode
10356 NodeManager
10518 ResourceManager
83090 QuorumPeerMain
6.4 Stop the cluster
auto-hdp.sh stop
stopping historyserver
stopping yarn daemons
stopping resourcemanager
zk02: stopping nodemanager
zk03: stopping nodemanager
zk01: stopping nodemanager
no proxyserver to stop
Stopping namenodes on [zk01 zk02]
zk02: stopping namenode
zk01: stopping namenode
zk03: stopping datanode
zk01: stopping datanode
zk02: stopping datanode
Stopping journal nodes [zk01 zk02 zk03]
zk02: stopping journalnode
zk03: stopping journalnode
zk01: stopping journalnode
Stopping ZK Failover Controllers on NN hosts [zk01 zk02]
zk02: stopping zkfc
zk01: stopping zkfc
6.5 Stop the cluster by killing processes
This script is not recommended; run it only when the cluster cannot be stopped normally.
vi $HADOOP_HOME/bin/cls.sh
#!/bin/bash
nodelist=(zk01 zk02 zk03)
for node in ${nodelist[@]}; do
ssh $node "jps|grep -Ev 'Jps|QuorumPeerMain'|awk '{print \$1}'|xargs kill -9"
done
7 Hadoop Web UIs
7.1 NameNode Information
http://104.21.51.1:50070/dfshealth.html#tab-overview
7.2 All Applications
7.2.1 Check ResourceManager state
$HADOOP_HOME/bin/yarn rmadmin -getServiceState rm1
standby
$HADOOP_HOME/bin/yarn rmadmin -getServiceState rm2
active
7.2.2 All Applications
Access it via the IP of the host running the active ResourceManager.
http://104.21.51.3:8088/cluster
7.3 JobHistory
http://104.21.51.1:19888/jobhistory