这个作业的作用是计算单词在输入文件中出现的次数。
Recall that the purpose of the job is to count how many times each word occurs in the input files.
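As a rough illustration of the word-count job described in the sentence above, here is a minimal single-process sketch in Python. The function name `count_words` and the tokenization rule (lowercasing, then splitting on non-word characters) are assumptions made for this example; the source sentence does not say how words are delimited.

```python
import re
from collections import Counter


def count_words(texts):
    """Count how often each word occurs across several input texts.

    Assumption: a "word" is a run of word characters, compared
    case-insensitively. The original job may tokenize differently.
    """
    counts = Counter()
    for text in texts:
        # Lowercase so "Word" and "word" are counted together.
        counts.update(re.findall(r"\w+", text.lower()))
    return counts


if __name__ == "__main__":
    sample_inputs = ["the job counts words", "The words in the input files"]
    for word, n in sorted(count_words(sample_inputs).items()):
        print(word, n)
```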
在以前的作业中,输入、解析和输出用于将多个XML文件解析为关系记录。
In the previous job, the input, parser, and output steps were used to parse multiple XML files into relational records. The following steps describe how to create the assembly.
在提供输入数据时(进入Hadoop文件系统[hdfs]),首先分段,然后分配给map工作线程(通过作业跟踪器)。
When input data is provided (into the Hadoop file system [HDFS]), it is first partitioned, and then distributed to map workers (via the job tracker).
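The partition-then-distribute flow in the sentence above can be sketched as a toy model in Python. This is not Hadoop's API: `split_input`, `assign_splits`, the fixed split size, and the round-robin assignment are all hypothetical names and simplifications for illustration. A real scheduler would typically also take into account where the data is stored.

```python
def split_input(records, split_size):
    """Partition the input into contiguous splits of at most split_size records."""
    if split_size <= 0:
        raise ValueError("split_size must be positive")
    return [records[i:i + split_size] for i in range(0, len(records), split_size)]


def assign_splits(splits, workers):
    """Distribute splits to map workers in round-robin order.

    Assumption: round-robin stands in for the job tracker's scheduling
    decision; it is a simplification, not the actual policy.
    """
    assignment = {worker: [] for worker in workers}
    for index, split in enumerate(splits):
        assignment[workers[index % len(workers)]].append(split)
    return assignment


if __name__ == "__main__":
    lines = ["line1", "line2", "line3", "line4", "line5"]
    plan = assign_splits(split_input(lines, 2), ["map-0", "map-1"])
    for worker, worker_splits in plan.items():
        print(worker, worker_splits)
```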
如果作业完成,但输出文件没有包含经过音译的输出,请检查输入文件是否保存为utf编码之外的其他格式,比如ucs。
If the job completes but the output file does not contain the transliterated output, check whether the input file was saved in an encoding other than UTF, such as UCS.
然而,一个输出文件带不能被多个作业使用也不能作为立即输入文件。
However, an output tape file may not be used by more than one job, nor may it be used as an immediate input file.
一种以磁盘或磁带存储器为单位的文件,既可存放作业的输入数据,又可存放结果。
A single file (unit of magnetic disc or magnetic tape storage) that holds the input data for a job and to which the results are written.