Put these logs into HDFS.
It provides metadata services within HDFS.
HDFS applications need a write-once-read-many access model for files.
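A minimal command-line sketch of that model, assuming a hypothetical log file and HDFS path: the file is written into HDFS once and can then be read any number of times, but the same path cannot be written again.

    hadoop fs -put access.log /logs/access.log    # write once
    hadoop fs -cat /logs/access.log               # read as many times as you like
    hadoop fs -put access.log /logs/access.log    # fails: target already exists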
Their focus is now on developing Hadoop's distributed file system, HDFS.
HDFS also lacks an append operation (see HADOOP-1700).
Breaking up storage into a set of volumes (using multiple HDFS clusters).
In this example we assume that there is a directory in HDFS for every day.
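For instance, a sketch of such a layout, assuming a hypothetical /logs tree with one directory per date:

    hadoop fs -mkdir /logs/2009-04-01             # one directory per day
    hadoop fs -put day.log /logs/2009-04-01/      # that day's data goes inside it
    hadoop fs -ls /logs                           # lists the per-day directories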
But first, format your Hadoop file system (HDFS) using the hadoop command.
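On the 0.20-era releases this document describes, that command is (later versions use hdfs namenode -format instead):

    hadoop namenode -format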
However, the HDFS architecture does not preclude implementing these features.
From the perspective of an end user, HDFS appears as a traditional file system.
Federated storage across HDFS instances will be released in the next major Hadoop release.
In order to use Hadoop to analyze your data, you need to get the data into HDFS.
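A minimal sketch of loading data, assuming a hypothetical local file data.txt and a relative HDFS directory input under your user's home directory:

    hadoop fs -mkdir input                        # created under /user/<you>/
    hadoop fs -put data.txt input/                # copy the local file into HDFS
    hadoop fs -ls input                           # verify the upload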
You can also extract the file from HDFS using the hadoop-0.20 utility (see Listing 8).
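For example, assuming the hadoop-0.20 binary name used by that packaging and a hypothetical MapReduce output file:

    hadoop-0.20 fs -get output/part-00000 results.txt    # copy out of HDFS
    cat results.txt                                       # inspect it locally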
The HDFS client software implements checksum checking on the contents of HDFS files.
Recall that at the top of the Hadoop cluster is the namenode, which manages the HDFS.
HDFS is designed to reliably store very large files across machines in a large cluster.
The complete introduction to HDFS and how to operate on it is beyond the scope of this article.
Hadoop itself provides the ability to copy files from the file system into the HDFS and vice-versa.
Hadoop本身也提供将文件从文件系统复制到 HDFS 的功能,反之也可以。
The previous two commands will generate two directories under HDFS, one "input" and one "output."
When an application is submitted, the input and output directories in HDFS should be provided.
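For example, a sketch with the example jar that ships with Hadoop (the jar name varies by release; input and output are HDFS paths, and output must not exist before the job runs):

    hadoop jar hadoop-*-examples.jar wordcount input output
    hadoop fs -ls output                          # part-* files hold the results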
You do this easily with the get utility (analogous to the put you executed earlier to write files into HDFS).
Is there any additional work being done to simplify the management of a cluster, the HDFS, MapReduce processes, etc.?
There are a large number of variables in core-default.xml, hdfs-default.xml, and mapred-default.xml that you can override in core-site.xml, hdfs-site.xml, and mapred-site.xml.
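For example, a hypothetical override placed in mapred-site.xml (the property name is from the 0.20-era defaults):

    <property>
      <name>mapred.reduce.tasks</name>   <!-- default is 1 in mapred-default.xml -->
      <value>4</value>
    </property>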
If you mess something up, you can format the HDFS, clear the temp directory specified in hadoop-site.xml, and start again.
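A sketch of that reset, assuming the default hadoop.tmp.dir of /tmp/hadoop-<username> (adjust the path to whatever your hadoop-site.xml specifies; the scripts live in Hadoop's bin/ directory):

    stop-all.sh                    # stop the daemons first
    rm -rf /tmp/hadoop-$USER       # clear the temp directory
    hadoop namenode -format        # reformat HDFS
    start-all.sh                   # bring the cluster back up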
HBase is a database representation built over Hadoop's HDFS, permitting MapReduce to operate on database tables rather than on simple files.
For more complex scenarios, you can take advantage of the likes of Sqoop (see Resources), a SQL-to-HDFS database import tool.
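For instance, a hypothetical Sqoop import (the connection string, table, and user name are placeholders; -P prompts for the password):

    sqoop import --connect jdbc:mysql://dbhost/mydb \
        --table employees --username dbuser -P    # pulls the table into HDFS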