Pseudo-Distributed Mode

Official documentation

https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation

Passwordless SSH login

Without passwordless login

When you log in to node0 with an SSH client, you are prompted for a password:

[zhangsan@node0 ~]$ ssh node0
The authenticity of host 'node0 (192.168.179.100)' can't be established.
ECDSA key fingerprint is SHA256:1+3DDeEwkWu0zRO1RoxISbQoKTSgZ56QO3Rl4XXteTw.
ECDSA key fingerprint is MD5:92:c9:cd:4a:b8:07:29:ff:3d:25:1c:45:db:8b:5f:dc.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node0,192.168.179.100' (ECDSA) to the list of known hosts.
zhangsan@node0's password:
Last login: Wed Mar 30 13:32:51 2022 from localhost

Next, configure passwordless login.

Go to the home directory
[zhangsan@node0 usr]$ cd ~
Generate a key pair
# This command creates a directory named .ssh under your home directory (~)
# and generates the public key file (id_rsa.pub) and private key file (id_rsa).
# It prompts for a few things, such as where to store the key pair; just press Enter
# to accept the defaults, and the keys are saved under your home directory.
[zhangsan@node0 ~]$ ssh-keygen
Distribute the public key
# This step asks you to confirm the connection and enter zhangsan's password on node0
[zhangsan@node0 ~]$ ssh-copy-id -i node0
Test
# Once the public key has been copied to node0, logging in to node0 no longer requires a password.
[zhangsan@node0 ~]$ ssh node0
Last login: Wed Mar 30 13:35:51 2022 from node0
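What ssh-keygen and ssh-copy-id do above can be sketched non-interactively. The sketch below works in a scratch directory instead of the real ~/.ssh, so it is safe to try anywhere (it only assumes OpenSSH is installed):

```shell
# Work in a scratch directory so the real ~/.ssh is untouched.
tmp=$(mktemp -d)

# -N "" sets an empty passphrase, -f the output path, -q suppresses output;
# together they replace the interactive prompts of plain `ssh-keygen`.
ssh-keygen -t rsa -N "" -f "$tmp/id_rsa" -q

# ssh-copy-id essentially appends the public key to the remote user's
# ~/.ssh/authorized_keys; on a single node the manual equivalent is:
cat "$tmp/id_rsa.pub" >> "$tmp/authorized_keys"
chmod 600 "$tmp/authorized_keys"
```

In a real setup the `authorized_keys` manipulation happens on the target host (here, node0 itself), which is exactly why ssh-copy-id asks for the password once.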

Configuration files

Hadoop configuration files

Hadoop's configuration files are stored under the $HADOOP_HOME/etc/hadoop directory.

[zhangsan@node0 ~]$ cd /opt/bigdata/hadoop/default/etc/hadoop/
hadoop-env.sh
JAVA_HOME=/usr/java/default
core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://node0:9000</value>
    </property>

    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/bigdata/hadoop/default/tmp</value>
    </property>
</configuration>
hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
yarn-site.xml
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>node0</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <description>The runtime framework for executing MapReduce jobs. Can be one of local, classic or yarn.</description>
    </property>
</configuration>
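After editing the four *-site.xml files, a quick consistency check helps catch a missing <name> or <value> before starting the daemons. A rough grep-based sketch (shown against an inline sample fragment so it runs without the Hadoop tree; point the greps at the real files under $HADOOP_HOME/etc/hadoop in practice):

```shell
# Sample fragment standing in for one of the *-site.xml files.
xml='<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://node0:9000</value>
</property>
</configuration>'

# Every <property> block should contain exactly one <name> and one <value>,
# so the three counts must match.
props=$(printf '%s\n' "$xml" | grep -c '<property>')
names=$(printf '%s\n' "$xml" | grep -c '<name>')
values=$(printf '%s\n' "$xml" | grep -c '<value>')
[ "$props" -eq "$names" ] && [ "$props" -eq "$values" ] && echo "properties look consistent"
```

This is only a line-count heuristic, not real XML validation, but it catches the most common copy-paste mistakes in these files.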

Environment variables

.bashrc

This file is in the user's HOME directory (~).

[zhangsan@node0 ~]$ vim .bashrc 
export HADOOP_HOME=/opt/bigdata/hadoop/default
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

Run source ~/.bashrc to make the environment variables take effect.

Run hadoop version to verify that the environment variables are configured correctly.
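If you script this setup, it is worth making the .bashrc edit idempotent so re-running the script doesn't append duplicate exports. A sketch, using a scratch file in place of the real ~/.bashrc:

```shell
rc=$(mktemp)   # stand-in for ~/.bashrc

add_line() {
  # Append the line only if an identical line is not already present
  # (-x: whole-line match, -F: fixed string, no regex interpretation).
  grep -qxF "$1" "$rc" || printf '%s\n' "$1" >> "$rc"
}

add_line 'export HADOOP_HOME=/opt/bigdata/hadoop/default'
add_line 'export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH'
add_line 'export HADOOP_HOME=/opt/bigdata/hadoop/default'   # duplicate, skipped
```

After the three calls the file contains only two lines, because the duplicate is detected and skipped.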

Format

Format the HDFS file system with the hdfs namenode -format command.

[zhangsan@node0 ~]$ hdfs namenode -format
22/02/14 23:17:31 INFO util.ExitUtil: Exiting with status 0
22/02/14 23:17:31 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at node1/192.168.179.100
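In a setup script, the "Exiting with status 0" line corresponds to the command's exit code, which is more robust to check than scraping the log text. A sketch, with `true` standing in for `hdfs namenode -format` so it runs without a cluster:

```shell
format_cmd="true"   # stand-in; on the real node this would be: hdfs namenode -format

# A zero exit status is what the "Exiting with status 0" log line reports.
if $format_cmd; then
  echo "namenode formatted successfully"
else
  echo "format failed with status $?" >&2
fi
```

Note that re-formatting an HDFS that already holds data wipes its metadata, so scripts usually guard this step rather than run it unconditionally.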

Startup

Start HDFS
[zhangsan@node0 ~]$ start-dfs.sh
# three daemons running in the background
[zhangsan@node0 hadoop]$ jps
8096 NameNode
8402 SecondaryNameNode
8230 DataNode
HDFS Web UI

http://node0:50070 (NameNode)

http://node0:50090 (Secondary NameNode)

Start YARN
[zhangsan@node0 ~]$ start-yarn.sh

# two daemons running in the background
[zhangsan@node0 hadoop]$ jps
8558 ResourceManager
8712 NodeManager
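Scripted health checks usually parse jps output rather than eyeball it. A sketch that checks all five daemon names from the two jps listings above; sample output is inlined here so the logic runs without a cluster (pipe real `jps` output in instead):

```shell
# Sample jps output: each line is "<pid> <ClassName>".
jps_output='8096 NameNode
8402 SecondaryNameNode
8230 DataNode
8558 ResourceManager
8712 NodeManager'

missing=0
for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
  # Anchor the match (space before, end-of-line after) so that e.g.
  # "NameNode" does not also match "SecondaryNameNode".
  printf '%s\n' "$jps_output" | grep -q " $d\$" || { echo "$d is NOT running" >&2; missing=1; }
done
[ "$missing" -eq 0 ] && echo "all daemons up"
```

The anchoring matters: an unanchored grep for "NameNode" would report success even if only the SecondaryNameNode were running.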
YARN Web UI

http://node0:8088

Remember to add the hostname-to-IP mapping to C:\Windows\System32\drivers\etc\hosts on your Windows machine so that the Web UI addresses above resolve:

192.168.179.100 node0
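A small sketch of validating such a hosts-format "IP hostname" line before adding it (the same format works for /etc/hosts on Linux):

```shell
entry='192.168.179.100 node0'

ip=${entry%% *}    # text before the first space
host=${entry##* }  # text after the last space

# Crude check: the IP field may contain only digits and dots.
case $ip in
  *[!0-9.]*|'') echo "bad IP field: $ip" >&2 ;;
  *)            echo "$host maps to $ip" ;;
esac
```

This only checks the character set, not that the address is a well-formed IPv4 quad, but it catches swapped fields and stray characters.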