Put these logs into HDFS.
It provides metadata services within HDFS.
HDFS applications need a write-once-read-many access model for files.
Their focus is now on developing Hadoop's distributed file system, HDFS.
Breaking up storage into a set of volumes (uses multiple HDFS clusters).
But first, format your Hadoop file system (HDFS) using the hadoop command.
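A minimal sketch of that step, assuming a Hadoop 0.20-era single-node setup with bin/hadoop on the PATH:

  $ hadoop namenode -format

This initializes the namenode's storage directory; run it only on a fresh cluster, since it erases any existing HDFS metadata.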
However, the HDFS architecture does not preclude implementing these features.
Federated storage across HDFS instances will be released in the next major Hadoop release.
In order to use Hadoop to analyze your data, you need to get the data into HDFS.
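For example, one common way to do that (the local file name access.log here is hypothetical):

  $ hadoop fs -mkdir input
  $ hadoop fs -put access.log input/

hadoop fs -put copies a local file into the user's HDFS home directory under input/.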
You can also extract the file from HDFS using the hadoop-0.20 utility (see Listing 8).
The HDFS client software implements checksum checking on the contents of HDFS files.
Recall that at the top of the Hadoop cluster is the namenode, which manages the HDFS.
HDFS is designed to reliably store very large files across machines in a large cluster.
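That reliability comes from block replication; as a small illustration (the HDFS path below is hypothetical):

  $ hadoop fs -setrep -w 3 /user/hadoop/data.log

This asks HDFS to keep three replicas of the file's blocks and, with -w, waits until replication completes.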
The complete introduction to HDFS and how to operate on it is beyond the scope of this article.
Hadoop itself provides the ability to copy files from the file system into the HDFS and vice-versa.
The previous two commands will generate two directories under HDFS, one "input" and one "output."
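You can confirm this with a listing (output layout sketched; details vary by version and user):

  $ hadoop fs -ls

which should show both input and output under the current user's HDFS home directory.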
When an application is submitted, input and output directories contained in the HDFS should be provided.
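A sketch of such a submission, using the examples jar that ships with a 0.20-era release (the exact jar name varies by version):

  $ hadoop jar hadoop-0.20.2-examples.jar wordcount input output

Here input and output are HDFS paths relative to the user's home directory; the job fails if output already exists.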
You do this easily with the get utility (analogous to the put you executed earlier to write files into HDFS).
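For instance (the job output file name below is the conventional one, but hypothetical here):

  $ hadoop fs -get output/part-00000 ./results.txt

hadoop fs -get copies a file out of HDFS to the local file system, mirroring put in the opposite direction.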
Is there any additional work being done to simplify the management of a cluster, the HDFS, MapReduce processes, etc.?
There are a large number of variables in core/hdfs/mapred-default.xml that you can override in core/hdfs/mapred-site.xml.
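As an illustration, overriding one such variable in mapred-site.xml (property names are version-dependent; mapred.reduce.tasks is the 0.20-era name):

  <configuration>
    <property>
      <name>mapred.reduce.tasks</name>
      <value>2</value>
    </property>
  </configuration>

Values set in the *-site.xml files take precedence over the shipped *-default.xml files.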
One usage of the snapshot feature may be to roll back a corrupted HDFS instance to a previously known good point in time.
If you mess up something, you can format the HDFS and clear the temp directory specified in hadoop-site.xml and start again.
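A sketch of that reset, assuming the stock control scripts and the common default temp location /tmp/hadoop-<user> (check hadoop.tmp.dir in your hadoop-site.xml first):

  $ bin/stop-all.sh
  $ rm -rf /tmp/hadoop-$USER
  $ bin/hadoop namenode -format
  $ bin/start-all.sh

This wipes all HDFS data, so treat it as a last resort on a development cluster.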
HBase is a database representation over Hadoop's HDFS, permitting MapReduce to operate on database tables layered over simple files.
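A tiny illustration via the HBase shell (the table and column-family names here are hypothetical):

  $ hbase shell
  hbase> create 'logs', 'cf'
  hbase> put 'logs', 'row1', 'cf:msg', 'hello'
  hbase> scan 'logs'

Behind the scenes, HBase persists this table as files in HDFS.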
For more complex scenarios, you can take advantage of the likes of Sqoop (see Resources), a SQL-to-HDFS database import tool.
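A hedged sketch of a Sqoop import (the connection string, table, and target directory are all hypothetical):

  $ sqoop import --connect jdbc:mysql://dbhost/corp \
      --table employees --target-dir /user/hadoop/employees

This pulls the table's rows into HDFS files that MapReduce jobs can then consume.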
JasperSoft has added support for reporting against files in HDFS or directly against HBase, and also against various NoSQL flavors.
To minimize global bandwidth consumption and read latency, HDFS tries to satisfy a read request from a replica that is closest to the reader.