[Hadoop&HBase] A Hadoop-Based Large-Scale Data Sorting Algorithm: Han Xuhong's Group, Second Report

Posted on 2011-12-23 02:39
<DIV>
<P style="TEXT-ALIGN: center; mso-line-height-alt: 15.6pt" align=center><FONT face=宋体><B style="mso-bidi-font-weight: normal"><SPAN style="FONT-SIZE: 22pt">A Hadoop-Based Large-Scale Data Sorting Algorithm</SPAN></B></FONT></P>
<P style="TEXT-ALIGN: center; mso-line-height-alt: 15.6pt" align=center><FONT face=宋体><B style="mso-bidi-font-weight: normal"><SPAN style="FONT-SIZE: 18pt">(Second Report)</SPAN></B></FONT></P>
<P style="mso-line-height-alt: 15.6pt"><FONT face=宋体><B style="mso-bidi-font-weight: normal"><SPAN style="FONT-SIZE: 22pt" lang=EN-US><SPAN style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </SPAN></SPAN></B><B style="mso-bidi-font-weight: normal"><SPAN style="FONT-SIZE: 15pt" lang=EN-US><SPAN style="mso-spacerun: yes">&nbsp;&nbsp;</SPAN></SPAN></B><B style="mso-bidi-font-weight: normal"><SPAN style="FONT-SIZE: 13.5pt" lang=EN-US>-------2011.9.24</SPAN></B></FONT></P>
<P style="LINE-HEIGHT: 15.6pt"><FONT size=3><FONT face=宋体><SPAN lang=EN-US><SPAN style="mso-spacerun: yes">&nbsp; </SPAN></SPAN>Group members:</FONT></FONT></P>
<P style="LINE-HEIGHT: 15.6pt"><FONT size=3><FONT face=宋体><SPAN lang=EN-US><SPAN style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </SPAN></SPAN>Group leader: Han Xuhong (韩旭红) 1091000161</FONT></FONT></P>
<P style="LINE-HEIGHT: 15.6pt"><FONT size=3><FONT face=宋体><SPAN lang=EN-US><SPAN style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </SPAN></SPAN>Members: Li Wei (李巍) 1091000167, Li Yue (李越) 1091000169, Yan Yue (闫悦) 1091000178</FONT></FONT></P>
<P style="LINE-HEIGHT: 15.6pt"><SPAN style="FONT-SIZE: 10.5pt" lang=EN-US><FONT face=宋体>&nbsp;</FONT></SPAN></P>
<P style="TEXT-INDENT: -17.1pt; MARGIN-LEFT: 53.2pt; mso-line-height-alt: 15.6pt; mso-para-margin-left: 3.44gd; mso-char-indent-count: -.95"><SPAN style="FONT-FAMILY: 幼圆; FONT-SIZE: 18pt; mso-bidi-font-weight: bold; mso-bidi-font-family: 幼圆">This week's work:</SPAN></P>
<P style="TEXT-INDENT: 27pt; MARGIN-LEFT: 53.25pt; mso-line-height-alt: 15.6pt; mso-para-margin-left: 5.07gd; mso-char-indent-count: 1.5"><SPAN style="FONT-FAMILY: 幼圆; FONT-SIZE: 18pt; mso-bidi-font-weight: bold; mso-bidi-font-family: 幼圆">This week we studied Hadoop MapReduce, gained a basic understanding of it, and made some plans for our future work.</SPAN></P>
<P style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-outline-level: 2" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 幼圆; FONT-SIZE: 18pt; mso-bidi-font-weight: bold; mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt">1. Overview</SPAN></P>
<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-char-indent-count: 2.0" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt; mso-fareast-font-family: 宋体; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt">Hadoop Map/Reduce is an easy-to-use software framework. Applications written on it can run on large clusters of thousands of commodity machines and process multi-terabyte data sets in parallel in a reliable, fault-tolerant manner.</SPAN></P>
<P style="LINE-HEIGHT: 15.6pt; TEXT-INDENT: -53.25pt; MARGIN-LEFT: 53.25pt"><SPAN style="FONT-SIZE: 10.5pt" lang=EN-US><FONT face=宋体>&nbsp;</FONT></SPAN></P>
<P style="TEXT-INDENT: -53.25pt; MARGIN-LEFT: 53.25pt; mso-line-height-alt: 15.6pt"><SPAN style="FONT-FAMILY: 幼圆; FONT-SIZE: 18pt; mso-bidi-font-weight: bold; mso-bidi-font-family: 幼圆">2. How It Works</SPAN></P>
<P style="TEXT-INDENT: -53.25pt; mso-line-height-alt: 15.6pt"><SPAN style="FONT-FAMILY: 幼圆; FONT-SIZE: 18pt; mso-bidi-font-weight: bold" lang=EN-US><SPAN style="mso-spacerun: yes">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </SPAN></SPAN><FONT size=3 face=宋体>A Map/Reduce <I>job</I> usually splits the input data set into independent chunks, which are processed by <I>map tasks</I> in a completely parallel manner. The framework sorts the outputs of the maps and then feeds them to the <I>reduce tasks</I>. Typically both the input and the output of a job are stored in a file system. The framework takes care of scheduling and monitoring tasks, and it re-executes tasks that have failed.</FONT></P>
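<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt">The split, map, sort, reduce flow described above can be imitated in a few lines of plain Python. This is only an in-memory sketch, not Hadoop code: the function names (run_job, count_map, count_reduce) and the num_splits parameter are made up for illustration, and everything runs sequentially in one process.</SPAN></P>

```python
from itertools import groupby
from operator import itemgetter


def run_job(records, map_fn, reduce_fn, num_splits=3):
    """Simulate one Map/Reduce job in memory (no Hadoop involved)."""
    # 1. Split the input into independent chunks, one per map task.
    splits = [records[i::num_splits] for i in range(num_splits)]

    # 2. Run the map function on every record of every split.
    map_output = []
    for split in splits:
        for record in split:
            map_output.extend(map_fn(record))

    # 3. The framework sorts the map output by key before the reduce phase.
    map_output.sort(key=itemgetter(0))

    # 4. Each reduce call receives one key and all values emitted for it.
    results = []
    for key, group in groupby(map_output, key=itemgetter(0)):
        values = [value for _, value in group]
        results.extend(reduce_fn(key, values))
    return results


def count_map(number):
    # Emit (number, 1) so that equal numbers meet in the same reduce call.
    yield (number, 1)


def count_reduce(number, ones):
    # Emit each distinct number once, together with how often it occurred.
    yield (number, sum(ones))
```

<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt">Calling run_job([5, 3, 9, 3, 1, 5, 3], count_map, count_reduce) returns the distinct numbers in ascending order with their counts. The ordering comes entirely from the sort step between map and reduce, which is the property a sorting job can build on.</SPAN></P>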
<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-char-indent-count: 2.0" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt; mso-fareast-font-family: 宋体; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt">Typically the Map/Reduce framework and the distributed file system run on the same set of nodes; that is, the compute nodes and the storage nodes are usually the same machines. This configuration lets the framework schedule tasks on the nodes where the data already resides, which uses the cluster's aggregate network bandwidth very efficiently.</SPAN></P>
<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-char-indent-count: 2.0" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt; mso-fareast-font-family: 宋体; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt">The Map/Reduce framework consists of a single master JobTracker and one slave TaskTracker per cluster node. The master schedules all the tasks that make up a job onto the slaves, monitors their execution, and re-executes tasks that have failed. The slaves only execute the tasks assigned to them by the master.</SPAN></P>
<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-char-indent-count: 2.0" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt; mso-fareast-font-family: 宋体; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt">At a minimum, an application specifies the input/output locations (paths) and supplies the map and reduce functions by implementing the appropriate interfaces or abstract classes. These, together with the other job parameters, make up the <I>job configuration</I>. The Hadoop <I>job client</I> then submits the job (jar, executable, etc.) and the configuration to the JobTracker, which distributes the software and configuration to the slaves, schedules the tasks and monitors them, and provides status and diagnostic information to the job client.</SPAN></P>
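<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt">The master's habit of re-executing failed tasks can be pictured with a toy loop. The sketch below is not how the JobTracker is implemented; run_with_retries, the max_attempts parameter, and the round-robin choice of worker are all assumptions chosen to keep the example short.</SPAN></P>

```python
def run_with_retries(tasks, workers, max_attempts=4):
    """Toy master loop: hand tasks to workers and re-run the failed ones.

    `tasks` maps a task id to a callable that takes a worker name.
    """
    results = {}
    attempts = {task_id: 0 for task_id in tasks}
    pending = list(tasks)
    while pending:
        task_id = pending.pop(0)
        # Round-robin over workers so a retry lands on a different one.
        worker = workers[attempts[task_id] % len(workers)]
        attempts[task_id] += 1
        try:
            results[task_id] = tasks[task_id](worker)
        except Exception:
            if attempts[task_id] == max_attempts:
                raise RuntimeError(f"task {task_id} failed {max_attempts} times")
            pending.append(task_id)  # schedule a re-execution
    return results, attempts


def flaky_task(worker):
    # Hypothetical task that always fails on one particular node.
    if worker == "node-a":
        raise IOError("simulated failure on node-a")
    return f"done on {worker}"
```

<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt">With run_with_retries({"map-0": flaky_task}, ["node-a", "node-b"]), the first attempt fails on node-a and the second attempt succeeds on node-b, so the caller still gets a result without handling the failure itself.</SPAN></P>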
<P style="LINE-HEIGHT: 18pt; MARGIN: 0cm 0cm 0pt; BACKGROUND: white" class=MsoNormal><SPAN style="FONT-FAMILY: 宋体; mso-ascii-font-family: Calibri; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 宋体; mso-fareast-theme-font: minor-fareast; mso-hansi-font-family: Calibri; mso-hansi-theme-font: minor-latin">Hadoop is made up of several components. At the bottom is HDFS, which stores files across all the storage nodes in the Hadoop cluster. The layer above HDFS is the MapReduce engine, which consists of the JobTracker and the TaskTrackers.</SPAN></P>
<P style="TEXT-INDENT: -53.25pt; MARGIN-LEFT: 53.25pt; mso-line-height-alt: 15.6pt"><SPAN style="FONT-FAMILY: 幼圆; FONT-SIZE: 18pt; mso-bidi-font-weight: bold; mso-bidi-font-family: 幼圆">3. Runtime Environment</SPAN></P>
<UL type=disc>
<LI style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; tab-stops: list 36.0pt; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-list: l0 level1 lfo1" class=MsoNormal><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt; mso-fareast-font-family: 宋体; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt">Hadoop Streaming is a utility for running jobs that lets users create and run jobs with any executable (for example, shell utilities) as the mapper and the reducer.</SPAN></LI>
<LI style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; tab-stops: list 36.0pt; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-list: l0 level1 lfo1" class=MsoNormal><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt; mso-fareast-font-family: 宋体; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt">Hadoop Pipes is a SWIG-compatible C++ API (not based on JNI<SUP>TM</SUP>) that can also be used to implement Map/Reduce applications.</SPAN></LI></UL>
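<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt">Because Hadoop Streaming accepts any executable, a mapper and reducer can be ordinary scripts that read lines from standard input and write tab-separated key/value lines to standard output. The following is a minimal word-count sketch under that assumption. The single-file layout and the "map"/"reduce" command-line argument are our own convention, not something Streaming requires.</SPAN></P>

```python
import sys
from itertools import groupby


def mapper(lines):
    """Emit one 'word TAB 1' line for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_lines):
    """Sum the counts per word; the input must already be sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


# Hypothetical convention: run as "script.py map" or "script.py reduce".
if __name__ == "__main__" and len(sys.argv) == 2:
    step = mapper if sys.argv[1] == "map" else reducer
    for out in step(sys.stdin):
        print(out)
```

<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt">Such a script would be passed to the streaming jar through its -mapper and -reducer options, alongside -input and -output; the location of the jar depends on the installation. The reducer relies on the framework having sorted the mapper output, so all lines for one word arrive together.</SPAN></P>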
<P style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-outline-level: 2" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 幼圆; FONT-SIZE: 18pt; mso-bidi-font-weight: bold; mso-hansi-font-family: 宋体; mso-bidi-font-family: 幼圆; mso-font-kerning: 0pt">4. Inputs and Outputs</SPAN></P>
<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-char-indent-count: 2.0" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt; mso-fareast-font-family: 宋体; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt">The Map/Reduce framework operates on &lt;key, value&gt; pairs: it views the input to a job as a set of &lt;key, value&gt; pairs and produces a set of &lt;key, value&gt; pairs as the job's output. The types of the two sets may differ.</SPAN></P>
<P style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt; mso-fareast-font-family: 宋体; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt">The framework needs to serialize the key and value classes, so these classes must implement the Writable interface. In addition, key classes must implement the WritableComparable interface so that the framework can sort by key.</SPAN></P>
<P style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt; mso-fareast-font-family: 宋体; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt">The input and output types of a Map/Reduce job are as follows:</SPAN></P>
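<P style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt">To make the two requirements concrete, here is a Python stand-in for a key type that can be both serialized and compared. It only mirrors the roles that write, readFields and compareTo play in the Java interfaces; the class name IntKey and the 4-byte big-endian encoding are assumptions made for this sketch.</SPAN></P>

```python
import struct
from functools import total_ordering


@total_ordering
class IntKey:
    """Python stand-in for a WritableComparable key holding one int."""

    def __init__(self, value=0):
        self.value = value

    def write(self):
        # Serialize as a 4-byte big-endian signed integer.
        return struct.pack("!i", self.value)

    @classmethod
    def read_fields(cls, data):
        # Rebuild a key from the bytes produced by write().
        (value,) = struct.unpack("!i", data)
        return cls(value)

    # Comparison is what lets the framework sort keys between map and reduce.
    def __eq__(self, other):
        return self.value == other.value

    def __lt__(self, other):
        return self.value < other.value
```

<P style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt">A key written with write() and restored with read_fields() compares exactly like the original, so a list of restored keys can be handed straight to sorted().</SPAN></P>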
<P style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt; mso-fareast-font-family: 宋体; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt" lang=EN-US>(input) &lt;k1, v1&gt; -&gt; <B>map</B> -&gt; &lt;k2, v2&gt; -&gt; <B>combine</B> -&gt; &lt;k2, v2&gt; -&gt; <B>reduce</B> -&gt; &lt;k3, v3&gt; (output) </SPAN></P>
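<P style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt">The map, combine, reduce chain above can be traced with a small word count. In this sketch the combiner runs once per map task and reuses the reduce function, which is valid here only because summing is associative and commutative. All function names are illustrative and nothing below calls Hadoop.</SPAN></P>

```python
from collections import defaultdict


def map_words(line):
    # In Hadoop (k1, v1) would be (offset, line); only the line is used here.
    return [(word, 1) for word in line.split()]


def sum_counts(word, counts):
    # Serves as both combiner and reducer: (k2, [v2, ...]) to (k3, v3).
    return (word, sum(counts))


def group_by_key(pairs):
    # Collect the values of each key and return the groups in key order.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def word_count(splits):
    shuffled = []
    for split in splits:  # one map task per split
        mapped = [pair for line in split for pair in map_words(line)]
        # Combine locally so that fewer pairs have to cross the network.
        shuffled.extend(sum_counts(k, vs) for k, vs in group_by_key(mapped))
    return [sum_counts(k, vs) for k, vs in group_by_key(shuffled)]
```

<P style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt">For word_count([["a b a"], ["b a"]]) the first map task emits three pairs but forwards only two after combining, and the final result is [("a", 3), ("b", 2)].</SPAN></P>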
<P style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-outline-level: 2" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 幼圆; FONT-SIZE: 18pt; mso-bidi-font-weight: bold">5. Map/Reduce - User Interfaces</SPAN></P>
<P style="mso-line-height-alt: 15.6pt"><SPAN style="FONT-FAMILY: 幼圆; FONT-SIZE: 18pt; mso-bidi-font-weight: bold" lang=EN-US>&nbsp;</SPAN></P>
<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-char-indent-count: 2.0" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt; mso-fareast-font-family: 宋体; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt">This section provides a reasonable amount of detail on every user-facing aspect of the Map/Reduce framework. It should help users implement, configure, and tune their jobs in a fine-grained manner. Note, however, that the javadoc for each class/interface remains the most comprehensive documentation.</SPAN></P>
<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-char-indent-count: 2.0" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt; mso-fareast-font-family: 宋体; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt">We first look at the Mapper and Reducer interfaces. Applications typically implement them by providing the map and reduce methods.</SPAN></P>
<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-char-indent-count: 2.0" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt; mso-fareast-font-family: 宋体; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt">Then we discuss the other core interfaces, including JobConf, JobClient, Partitioner, OutputCollector, Reporter, InputFormat, OutputFormat, and others.</SPAN></P>
<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-char-indent-count: 2.0" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt; mso-fareast-font-family: 宋体; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt">Finally, we wrap up by discussing some useful features of the framework, such as DistributedCache and IsolationRunner.</SPAN></P>
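<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt">Of the interfaces listed, the Partitioner matters most for a sorting project, because it decides which reduce task receives each key. The sketch below contrasts a hash-based choice with a range-based one. It is a plain-Python illustration with made-up names (hash_partition, range_partition, total_order_sort) and hand-picked split points, not the Hadoop Partitioner API.</SPAN></P>

```python
import zlib
from bisect import bisect_right


def hash_partition(key, num_reduces):
    # Deterministic hash; Python's built-in hash() of str varies per process.
    return zlib.crc32(key.encode("utf-8")) % num_reduces


def range_partition(key, split_points):
    # Keys below split_points[0] go to reducer 0, the next range to 1, etc.
    return bisect_right(split_points, key)


def total_order_sort(keys, split_points):
    # One bucket per simulated reduce task.
    buckets = [[] for _ in range(len(split_points) + 1)]
    for key in keys:
        buckets[range_partition(key, split_points)].append(key)
    # Each "reducer" sorts its own bucket; the concatenation is globally sorted.
    return [key for bucket in buckets for key in sorted(bucket)]
```

<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt">Hash partitioning spreads keys evenly but leaves each reducer with an arbitrary slice of the key space, so the output files are sorted only individually. Range partitioning keeps the reducers' key ranges disjoint and ordered, which is why concatenating their outputs yields a total order. How well the load balances then depends on how the split points are chosen.</SPAN></P>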
<P style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-outline-level: 3" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 幼圆; FONT-SIZE: 18pt; mso-bidi-font-weight: bold">6. Next Steps</SPAN></P>
<P style="TEXT-ALIGN: left; TEXT-INDENT: 24pt; MARGIN: 0cm 0cm 0pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-char-indent-count: 2.0" class=MsoNormal align=left><SPAN style="FONT-FAMILY: 宋体; FONT-SIZE: 12pt; mso-fareast-font-family: 宋体; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt">Next week we will explore the core functionality of MapReduce and learn more about job configuration, task execution, and the task environment.</SPAN></P>
<P style="MARGIN: 0cm 0cm 0pt" class=MsoNormal><SPAN lang=EN-US><FONT face=Calibri>&nbsp;</FONT></SPAN></P></DIV>