This article walks through the steps for setting up a Hadoop 2.6.0 high-availability cluster, coordinated by ZooKeeper, on three machines.
Hadoop cluster
First, disable SELinux. Open its config file:
vim /etc/selinux/config
and set:
SELINUX=disabled
Then stop the firewall and keep it from starting at boot:
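The manual edit above can also be scripted. The following is a minimal sketch, assuming GNU sed; the function name set_selinux_disabled is hypothetical, and it takes the file path as an argument so that it can be tried on a copy first.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original article): rewrite the
# SELINUX= line of an SELinux config file so that it reads SELINUX=disabled.
set_selinux_disabled() {
    local cfg="$1"
    # Matches only the SELINUX=<value> line; SELINUXTYPE= is left alone.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
}

# Real usage (as root):
#   set_selinux_disabled /etc/selinux/config
#   setenforce 0   # switch to permissive mode for the current session
```

The config file change only takes effect after a reboot, which is why setenforce 0 is shown for the running session.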
systemctl stop firewalld
systemctl disable firewalld
1. On the master and every slave, edit /etc/hosts
Add:
192.168.1.129 hadoop1
192.168.1.130 hadoop2
192.168.1.132 hadoop3
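Because the same three entries have to be added on every node, it can help to add them in a way that is safe to repeat. A minimal sketch follows; add_host_entry is a hypothetical helper, and the duplicate check is deliberately simple (it looks for the hostname preceded by whitespace anywhere in the file, including in commented lines).

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "ip name" to a hosts file unless a line
# that already mentions that hostname is present.
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    if ! grep -qE "[[:space:]]${name}([[:space:]]|\$)" "$file"; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Real usage (as root) on every node:
#   add_host_entry /etc/hosts 192.168.1.129 hadoop1
#   add_host_entry /etc/hosts 192.168.1.130 hadoop2
#   add_host_entry /etc/hosts 192.168.1.132 hadoop3
```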
2. Passwordless SSH login
On the master (hadoop1),
change to /root/.ssh and generate a key pair:
ssh-keygen -t rsa
Press Enter at every prompt.
This generates id_rsa and id_rsa.pub. Copy the public key into a file named master:
cat id_rsa.pub >> master
Send the master file to each slave machine:
scp master hadoop2:/root/.ssh/
scp master hadoop3:/root/.ssh/
Log in to each slave (hadoop2, hadoop3) and, in /root/.ssh, append master to authorized_keys:
cat master >> authorized_keys
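Appending with >> adds a duplicate line every time the step is repeated, and sshd ignores authorized_keys when its permissions are too open. The sketch below covers both points under those assumptions; install_pubkey is a hypothetical helper name, not something the original article uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a public key to <ssh_dir>/authorized_keys
# only if it is not already there, then set the permissions sshd expects.
install_pubkey() {
    local pubkey_file="$1" ssh_dir="$2"
    mkdir -p "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    # -F fixed string, -x whole-line match, -f read the pattern from the key file
    if ! grep -qxFf "$pubkey_file" "$ssh_dir/authorized_keys"; then
        cat "$pubkey_file" >> "$ssh_dir/authorized_keys"
    fi
    chmod 700 "$ssh_dir"
    chmod 600 "$ssh_dir/authorized_keys"
}

# Real usage on each slave, after the scp step:
#   install_pubkey /root/.ssh/master /root/.ssh
```

OpenSSH also ships ssh-copy-id, which performs the copy and append in one step when run from the master (for example ssh-copy-id root@hadoop2).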
3. Configuration
Extract hadoop-2.6.0.tar.gz into /usr/lib/:
tar -zxvf hadoop-2.6.0.tar.gz -C /usr/lib/
cd /usr/lib/hadoop-2.6.0/etc/hadoop
The configuration files live in this directory.
4. Install ZooKeeper
Set the environment variables (in /etc/profile):
export JAVA_HOME=/usr/lib/jdk1.7.0_79
export MAVEN_HOME=/usr/lib/apache-maven-3.3.3
export LD_LIBRARY_PATH=/usr/lib/protobuf
export ANT_HOME=/usr/lib/apache-ant-1.9.4
export ZOOKEEPER_HOME=/usr/lib/zookeeper-3.4.6
export PATH=$JAVA_HOME/bin:$MAVEN_HOME/bin:$LD_LIBRARY_PATH/bin:$ANT_HOME/bin:$ZOOKEEPER_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$ZOOKEEPER_HOME/lib
4.1 Configure ZooKeeper: in zookeeper/conf/, copy zoo_sample.cfg to zoo.cfg
cp zoo_sample.cfg zoo.cfg
Change:
dataDir=/usr/lib/zookeeper-3.4.6/datas
Add:
server.1=hadoop1:2888:3888
server.2=hadoop2:2888:3888
server.3=hadoop3:2888:3888
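Create /usr/lib/zookeeper-3.4.6/datas and, inside it, a file named myid that contains this machine's number, that is, the N from its server.N line (1 on hadoop1, 2 on hadoop2, 3 on hadoop3).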
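The myid value has to agree with the server.N line for the same host, so one option is to derive it from zoo.cfg instead of typing it by hand. A minimal sketch, assuming zoo.cfg contains a dataDir= line and server.N=host:port:port lines as shown above; write_myid is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: look up this host's server.N line in zoo.cfg and
# write N into <dataDir>/myid.
write_myid() {
    local zoo_cfg="$1" host="$2" data_dir id
    # dataDir=<path>  ->  <path>
    data_dir=$(sed -n 's/^dataDir=//p' "$zoo_cfg")
    # server.N=<host>:2888:3888  ->  N
    id=$(sed -n "s/^server\.\([0-9][0-9]*\)=${host}:.*/\1/p" "$zoo_cfg")
    if [ -z "$data_dir" ] || [ -z "$id" ]; then
        echo "no dataDir or server entry for ${host} in ${zoo_cfg}" >&2
        return 1
    fi
    mkdir -p "$data_dir"
    echo "$id" > "$data_dir/myid"
}

# Real usage on each node (assumes `hostname` prints hadoop1, hadoop2 or hadoop3):
#   write_myid /usr/lib/zookeeper-3.4.6/conf/zoo.cfg "$(hostname)"
```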
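Copy the zookeeper-3.4.6 directory and /etc/profile to hadoop2 and hadoop3, then set myid on each of them to that machine's own number.
Start ZooKeeper
On hadoop1, hadoop2, and hadoop3, run: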
zkServer.sh start
Check the status:
zkServer.sh status
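If one node reports Mode: leader and the others report Mode: follower, the ensemble is running correctly.
5. Install Hadoop
On the master (hadoop1):
Extract the previously compiled hadoop-2.6.0.tar.gz into /usr/lib/.
Set the environment variables: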
export JAVA_HOME=/usr/lib/jdk1.7.0_79
export MAVEN_HOME=/usr/lib/apache-maven-3.3.3
export LD_LIBRARY_PATH=/usr/lib/protobuf
export ANT_HOME=/usr/lib/apache-ant-1.9.4
export ZOOKEEPER_HOME=/usr/lib/zookeeper-3.4.6
export HADOOP_HOME=/usr/lib/hadoop-2.6.0
export PATH=$JAVA_HOME/bin:$MAVEN_HOME/bin:$LD_LIBRARY_PATH/bin:$ANT_HOME/bin:$ZOOKEEPER_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
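(hadoop2 and hadoop3 do not need Maven and the other build tools; those were configured only for compiling Hadoop.)
5.1 Edit the configuration files
cd hadoop-2.6.0/etc/hadoop
Configuration files: hadoop-env.sh, core-site.xml, hdfs-site.xml, yarn-site.xml, mapred-site.xml, slaves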
5.1.1 hadoop-env.sh
export JAVA_HOME=/usr/lib/jdk1.7.0_79
5.1.2 core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://cluster1</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/lib/hadoop-2.6.0/tmp</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
</property>
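Hadoop's *-site.xml files hold their properties inside a single <configuration> root element, which the fragments in this article leave out. As an illustration only, the sketch below writes a complete core-site.xml with the three properties above; write_core_site is a hypothetical helper, and the target directory is passed in so the result can be inspected before it replaces a real file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a complete core-site.xml into the given
# configuration directory, using the three properties from this section.
write_core_site() {
    local conf_dir="$1"
    cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://cluster1</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/lib/hadoop-2.6.0/tmp</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
  </property>
</configuration>
EOF
}

# Real usage on hadoop1:
#   write_core_site /usr/lib/hadoop-2.6.0/etc/hadoop
```

The same wrapping applies to hdfs-site.xml, yarn-site.xml, and mapred-site.xml.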
5.1.3 hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>cluster1</value>
</property>
<property>
<name>dfs.ha.namenodes.cluster1</name>
<value>hadoop101,hadoop102</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster1.hadoop101</name>
<value>hadoop1:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster1.hadoop101</name>
<value>hadoop1:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster1.hadoop102</name>
<value>hadoop2:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster1.hadoop102</name>
<value>hadoop2:50070</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled.cluster1</name>
<value>true</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop2:8485;hadoop3:8485/cluster1</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/usr/lib/hadoop-2.6.0/tmp/journal</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.cluster1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
5.1.4 yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop1</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
5.1.5 mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
5.1.6 slaves
hadoop2
hadoop3
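6. Start the cluster (the commands below are run from /usr/lib/hadoop-2.6.0)
6.1 Format the ZooKeeper cluster, which creates the znode used for automatic failover
On hadoop1, run: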
bin/hdfs zkfc -formatZK
6.2 Start the JournalNode cluster. On hadoop2 and hadoop3, run:
sbin/hadoop-daemon.sh start journalnode
6.3 Format and start the NameNodes
On hadoop1, run:
bin/hdfs namenode -format
sbin/hadoop-daemon.sh start namenode
On hadoop2, run:
bin/hdfs namenode -bootstrapStandby
sbin/hadoop-daemon.sh start namenode
Start the DataNodes. This only needs to be run on hadoop1:
sbin/hadoop-daemons.sh start datanode
Start zkfc. This process must run on every machine that runs a NameNode.
On hadoop1 and hadoop2, run:
sbin/hadoop-daemon.sh start zkfc
Start YARN (the ResourceManager and the NodeManagers). On hadoop1, run:
sbin/start-yarn.sh
In a browser, open the following pages; each should show the heading listed below it:
http://192.168.1.129:50070
Overview 'hadoop1:9000' (active)
http://192.168.1.130:50070/
Overview 'hadoop2:9000' (standby)
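The same active/standby information is available from the command line through hdfs haadmin -getServiceState, which takes a NameNode ID (hadoop101 and hadoop102 in the hdfs-site.xml above). A minimal sketch; print_nn_states and the HDFS_CMD variable are assumptions of this example, the latter added only so the function can be tried without a running cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<id>: <state>" for each NameNode ID given.
# HDFS_CMD is an assumption of this sketch; it defaults to the real hdfs CLI.
print_nn_states() {
    local id state
    for id in "$@"; do
        state=$("${HDFS_CMD:-hdfs}" haadmin -getServiceState "$id")
        echo "${id}: ${state}"
    done
}

# Real usage on a cluster node:
#   print_nn_states hadoop101 hadoop102
```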
List the HDFS root directory:
hadoop fs -ls /