[Hadoop&HBase] Hadoop Quick Installation

论坛徽章:
0
跳转到指定楼层
1 [收藏(0)] [报告]
Posted on 2011-12-22 08:52
Skipping the introduction to Hadoop itself, here are the installation steps; follow them and you can clone this setup into a working instance.

Role list:
namenode & jobtracker    192.168.237.13
datanode & tasktracker   192.168.237.74
datanode & tasktracker   192.168.239.128

#useradd hadoop
Download hadoop-0.20.2.tar.gz from http://mirror.bjtu.edu.cn/apache/hadoop/core/hadoop-0.20.2/
#mkdir /data/hadoop
#tar -zxvf hadoop-0.20.2.tar.gz
#chown -R hadoop:hadoop hadoop-0.20.2 hadoop

Set up passwordless SSH login (adjust the user and paths in the script as needed):
#./ssh_nopasswd.sh client && ./ssh_nopasswd.sh server
Attachment: ssh_nopasswd.zip  http://blog.chinaunix.net/attachment/attach/22/27/07/73222707737ff3e4021253d530696d46f60467d238.zip
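The script itself is in the attachment; as a rough sketch (an assumption about what it automates, not its exact contents), passwordless login for the hadoop user boils down to generating a key on the namenode and pushing the public half to each slave:

$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa     # empty-passphrase key pair
$ ssh-copy-id hadoop@192.168.237.74            # append the public key on each datanode;
$ ssh-copy-id hadoop@192.168.239.128           # without ssh-copy-id, append ~/.ssh/id_rsa.pub to ~/.ssh/authorized_keys by hand
$ ssh hadoop@192.168.237.74 hostname           # should run without a password prompt

start-all.sh drives the slaves over SSH, so run this on the namenode as the hadoop user.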
----------------------------
The configuration files below only need to be edited on one machine; copy them to the other nodes afterwards to avoid repeating the edits (see the sketch after this section).

core-site.xml holds the basic namenode/jobtracker settings. The main properties:
fs.default.name: URI of the NameNode
mapred.job.tracker: jobtracker IP and port
hadoop.tmp.dir: Hadoop's temporary directory
dfs.name.dir: where the name table is stored
dfs.data.dir: where datanodes store their blocks
dfs.replication: number of replicas
(Strictly speaking, the dfs.* properties belong in hdfs-site.xml and mapred.job.tracker in mapred-site.xml, but Hadoop merges the site files, so keeping them all in core-site.xml also works.)

PS: my /etc/hosts contains:
192.168.237.13   hadoop-237-13.pconline.ctc    hadoop-237-13
192.168.237.74   hadoop-237-74.pconline.ctc    hadoop-237-74
192.168.239.128  hadoop-239-128.pconline.ctc   hadoop-239-128

Examples:
<property>
<name>fs.default.name</name>
<value>hdfs://hadoop-237-13:9000</value>
<description>The name of the default file system. Either the literal string "local" or a host:port for DFS.</description>
</property>

<property>
<name>mapred.job.tracker</name>
<value>192.168.237.13:9001</value>
<description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
</property>

<property>
<name>hadoop.tmp.dir</name>
<value>/data/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>

<property>
<name>dfs.name.dir</name>
<value>/data/hadoop/filesystem/name</value>
<description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description>
</property>

<property>
<name>dfs.data.dir</name>
<value>/data/hadoop/filesystem/data</value>
<description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
</property>

<property>
<name>dfs.replication</name>
<value>2</value>
<description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified at create time.</description>
</property>

mapred-site.xml configures the map/reduce details; the descriptions tell you what each property does, so just set the values to suit your cluster.

<property>
<name>mapred.job.tracker</name>
<value>192.168.237.13:9001</value>
<description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
</property>

<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>2</value>
<description>The maximum number of map tasks that will be run simultaneously by a task tracker.</description>
</property>

<property>
<name>mapred.tasktracker.reduce.tasks.maximum</name>
<value>2</value>
<description>The maximum number of reduce tasks that will be run simultaneously by a task tracker.</description>
</property>

<property>
<name>mapred.map.tasks</name>
<value>2</value>
<description>The default number of map tasks per job. Ignored when mapred.job.tracker is "local".</description>
</property>

<property>
<name>mapred.reduce.tasks</name>
<value>2</value>
<description>The default number of reduce tasks per job. Typically set to 99% of the cluster's reduce capacity, so that if a node fails the reduces can still be executed in a single wave. Ignored when mapred.job.tracker is "local".</description>
</property>

<property>
<name>mapred.userlog.retain.hours</name>
<value>2</value>
<description>The maximum time, in hours, for which the user-logs are to be retained.</description>
</property>

<property>
<name>mapred.child.java.opts</name>
<value>-Xmx700M -server</value>
</property>

<property>
<name>mapred.map.max.attempts</name>
<value>800</value>
<description>Expert: The maximum number of attempts per map task. In other words, the framework will try to execute a map task this many times before giving up on it.</description>
</property>

<property>
<name>mapred.reduce.max.attempts</name>
<value>800</value>
<description>Expert: The maximum number of attempts per reduce task. In other words, the framework will try to execute a reduce task this many times before giving up on it.</description>
</property>

<property>
<name>mapred.max.tracker.failures</name>
<value>800</value>
<description>The number of task-failures on a tasktracker of a given job after which new tasks of that job aren't assigned to it.</description>
</property>

<property>
<name>mapred.task.timeout</name>
<value>60000000</value>
<description>The number of milliseconds before a task will be terminated if it neither reads an input, writes an output, nor updates its status string.</description>
</property>

masters (the secondarynamenode; for this test it simply runs on the namenode host itself):
192.168.237.13

slaves:
192.168.237.74
192.168.239.128

JAVA_HOME is not set in my machines' environment, so set it in hadoop-env.sh:
export JAVA_HOME=/usr/java/jdk1.6.0_22
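With everything edited on the namenode, push the whole set to the other nodes rather than re-editing it there. A minimal sketch, assuming the installation lives at /data/hadoop on every node (as the commands below suggest):

$ cd /data/hadoop
$ for h in 192.168.237.74 192.168.239.128; do
    scp conf/core-site.xml conf/mapred-site.xml conf/masters conf/slaves conf/hadoop-env.sh hadoop@$h:/data/hadoop/conf/
  done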
----------------------------
#cd /data/hadoop && su hadoop
$bin/hadoop namenode -format
$bin/start-all.sh
$bin/hadoop dfsadmin -report

Output like the following means the cluster is up:
Configured Capacity: 107981234176 (100.57 GB)
Present Capacity: 101694681088 (94.71 GB)
DFS Remaining: 101694607360 (94.71 GB)
DFS Used: 73728 (72 KB)
DFS Used%: 0%
Under replicated blocks: 1
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 2 (2 total, 0 dead)

Name: 192.168.239.128:50010
Decommission Status : Normal
Configured Capacity: 53558603776 (49.88 GB)
DFS Used: 36864 (36 KB)
Non DFS Used: 3143274496 (2.93 GB)
DFS Remaining: 50415292416 (46.95 GB)
DFS Used%: 0%
DFS Remaining%: 94.13%
Last contact: Fri Aug 05 12:19:33 CST 2011

Name: 192.168.237.74:50010
Decommission Status : Normal
Configured Capacity: 54422630400 (50.69 GB)
DFS Used: 36864 (36 KB)
Non DFS Used: 3143278592 (2.93 GB)
DFS Remaining: 51279314944 (47.76 GB)
DFS Used%: 0%
DFS Remaining%: 94.22%
Last contact: Fri Aug 05 12:19:33 CST 2011

Errors hit during installation:
1. /data/hadoop had not been chown'ed to the hadoop user, so startup failed with permission errors.
2. The tmp/data directories under /data/hadoop had been created by hand, which produced:
2011-08-05 09:40:34,559 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: hdfs://hadoop-237-13:9000/data/hadoop/tmp/mapred/system
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /data/hadoop/tmp/mapred/system. Name node is in safe mode.

When you hit an error, the logs under hadoop_home/logs are the first place to look.

Reference:
http://hadoop.apache.org/common/docs/r0.21.0/cluster_setup.html
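PS: the SafeModeException above usually just means the namenode is still in its startup safe mode; it leaves safe mode on its own once enough blocks have reported in, or it can be forced out with bin/hadoop dfsadmin -safemode leave. Once dfsadmin -report looks healthy, a quick end-to-end smoke test (the HDFS paths here are arbitrary) exercises both HDFS and MapReduce using the examples jar shipped with the release; run it as the hadoop user from /data/hadoop:

$ bin/hadoop fs -put conf/core-site.xml /smoke-in                            # write a file into HDFS
$ bin/hadoop jar hadoop-0.20.2-examples.jar wordcount /smoke-in /smoke-out   # run the bundled wordcount job
$ bin/hadoop fs -cat /smoke-out/part-r-00000 | head                          # word counts should print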