This article walks through a basic Hadoop cluster configuration, from passwordless SSH through formatting HDFS and verifying that the daemons start.
1. Set up passwordless SSH login
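The setup itself is the standard key exchange; a minimal sketch, assuming you run as root and the nodes are named backup01 and backup02 as in the configuration files below:
ssh-keygen -t rsa            # accept the defaults; empty passphrase
ssh-copy-id root@backup01    # authorize the key on every node, the master included
ssh-copy-id root@backup02
ssh root@backup02            # should now log in without a password prompt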
Note: if you are still prompted for a password afterwards, the permissions on .ssh are probably wrong; try the following commands:
chown root /root/.ssh
chown root /root/.ssh/*
chmod 700 /root/.ssh
chmod 600 /root/.ssh/*
2. Edit the configuration files under the etc/hadoop and sbin directories
core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://backup01:8020</value>
    <description>For namenode listening</description>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/local/hadoop/tmp</value>
  </property>
</configuration>
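As a quick sanity check that the file is being picked up, hdfs getconf can echo a key back; for example:
hdfs getconf -confKey fs.defaultFS   # should print hdfs://backup01:8020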
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
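The name, data, and tmp directories referenced above should exist and be writable by the user that runs the daemons; assuming the paths used in this article:
mkdir -p /usr/local/hadoop/tmp /usr/local/hadoop/name /usr/local/hadoop/data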
yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>backup01:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>backup01:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>backup01:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>backup01:8033</value>
  </property>
</configuration>
mapred-site.xml (on YARN the framework is selected with mapreduce.framework.name; the Hadoop 1.x property mapred.job.tracker is ignored by 2.x and later)
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
hadoop-env.sh
Add the Java path at the top of the file:
export JAVA_HOME=/usr/local/jdk
export HADOOP_PID_DIR=/usr/local/hadoop/tmp
yarn-env.sh
Add the Java path at the top of the file:
export JAVA_HOME=/usr/local/jdk
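Before moving on, it is worth confirming that the JDK really lives at that path, since a wrong JAVA_HOME is a common reason for the daemons failing to start:
/usr/local/jdk/bin/java -version   # should print the JDK version, not "No such file or directory"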
master (note: 3.x.x does not need this file)
Use backup01 as the secondary namenode:
backup01
slaves (note: in 3.x.x this file is named workers)
backup02
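All of these configuration files must be identical on every node. Assuming Hadoop is installed at the same path on backup02, the whole configuration directory can be pushed over with scp:
scp -r /usr/local/hadoop/etc/hadoop/* root@backup02:/usr/local/hadoop/etc/hadoop/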
sbin/yarn-daemon.sh
Add the following line at the top:
export YARN_PID_DIR=/usr/local/hadoop/tmp
Extra steps required for Hadoop 3.x.x
Four files under sbin need to be modified: start-dfs.sh, stop-dfs.sh, start-yarn.sh, and stop-yarn.sh; otherwise running Hadoop aborts with the following error:
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
In start-dfs.sh and stop-dfs.sh, add the following variables on the line right after the shebang at the top:
#!/usr/bin/env bash
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=root
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
Likewise, add the following at the top of start-yarn.sh and stop-yarn.sh:
#!/usr/bin/env bash
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=root
YARN_NODEMANAGER_USER=root
3. Format HDFS with the following command (only needed once, on first setup):
hdfs namenode -format
4. Start Hadoop (the start scripts live under sbin, as noted above):
$ ./sbin/start-dfs.sh
$ ./sbin/start-yarn.sh
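Once both scripts return, yarn node -list is a quick way to confirm that the NodeManager on backup02 has registered with the ResourceManager:
$ yarn node -list    # backup02 should appear with state RUNNING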
5. Verify that Hadoop started successfully by creating and listing an HDFS directory:
hadoop fs -mkdir /in
hadoop fs -ls /
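If /in appears in the listing, HDFS is accepting writes. Two further checks, assuming the single-master layout above: jps shows which daemons are running on each node, and the bundled examples jar (its exact name varies by release, hence the wildcard) exercises YARN end to end:
jps    # expect NameNode, SecondaryNameNode, ResourceManager on backup01;
       # DataNode, NodeManager on backup02
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 2 5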