Hadoop: setting up a single-node cluster (pseudo-distributed operation)
Installation tarball location: /opt/hadoop-2.7.2.tar.gz
Unpack Hadoop: tar -zxvf hadoop-2.7.2.tar.gz
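Assuming the archive is unpacked in place under /opt, the result is /opt/hadoop-2.7.2, which is the directory all paths below refer to; the remaining commands are run from inside it:
cd /opt
tar -zxvf hadoop-2.7.2.tar.gz
cd hadoop-2.7.2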
Configuration files
1. etc/hadoop/hadoop-env.sh
export JAVA_HOME=/opt/jdk1.8
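The /opt/jdk1.8 location is an assumption of this setup; point JAVA_HOME at wherever the JDK actually lives. A quick sanity check is to run the java binary from that path:
$ /opt/jdk1.8/bin/java -version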
2. etc/hadoop/core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/opt/hadoop-2.7.2/tmp</value>
    </property>
</configuration>
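Since hadoop.tmp.dir points at a local directory, it does no harm to create it up front (path taken from the configuration above):
$ mkdir -p /opt/hadoop-2.7.2/tmp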
3. etc/hadoop/hdfs-site.xml
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/opt/hadoop-2.7.2/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/opt/hadoop-2.7.2/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>
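Likewise, the NameNode and DataNode storage directories referenced above can be created before formatting:
$ mkdir -p /opt/hadoop-2.7.2/dfs/name /opt/hadoop-2.7.2/dfs/data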
Set up passwordless SSH login
$ ssh localhost
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
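After installing the key, ssh to localhost once more; it should log in without asking for a password, which confirms the setup worked:
$ ssh localhost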
Format the filesystem
$ bin/hdfs namenode -format
Start the NameNode and DataNode daemons
$ sbin/start-dfs.sh
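If the daemons came up correctly, jps (shipped with the JDK) should list NameNode, DataNode and SecondaryNameNode processes:
$ jps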
NameNode web interface:
http://localhost:50070
YARN on a single node
Configuration files
etc/hadoop/mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
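Note that the 2.7.x distribution ships only a template for this file; if etc/hadoop/mapred-site.xml does not exist yet, create it from the template before editing:
$ cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml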
etc/hadoop/yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
Start the ResourceManager and NodeManager daemons
$ sbin/start-yarn.sh
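To confirm the NodeManager has registered with the ResourceManager, the node list can be queried:
$ bin/yarn node -list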
ResourceManager web interface:
http://localhost:8088/
Run the wordcount example
List the HDFS root directory:
bin/hdfs dfs -ls /
Create the input directory and an input file:
bin/hdfs dfs -mkdir -p /test/input
touch wc.input
vi wc.input (enter some text)
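Any text works as word-count input; as a minimal alternative to vi, the file can be filled from the shell (the words below are just placeholders):
echo "hello hadoop hello yarn" > wc.input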
bin/hdfs dfs -put ./wc.input /test/input/ (upload wc.input into the input directory)
bin/hdfs dfs -text /test/input/wc.input (view the file contents)
bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /test/input/ /test/output/
bin/hdfs dfs -ls /test/output/
bin/hdfs dfs -text /test/output/part-r-00000 (show the job results)
Start the JobHistory server:
sbin/mr-jobhistory-daemon.sh start historyserver
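With the history server running, finished jobs (including the wordcount run above) can be browsed at its default web port:
http://localhost:19888/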