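管理集群管理员交流。
Managing communication among cluster administrators.
重申一下,懒惰的Linux集群管理员知道。
To reiterate, the lazy Linux cluster admin knows.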
我们发现最懒惰的Linux集群管理员依赖于命令行。
We find that our laziest Linux cluster admins will live and die by the command line.
在本文中,我们讨论最懒惰的Linux集群管理员的一些秘诀。
In this article we discuss some of the best secrets of the laziest Linux cluster admins.
懒惰的Linux集群管理员不怕承认自己不知道某个东西。
The lazy Linux cluster admin is not afraid of saying he doesn't know something.
出于培训目的,正常集群操作中的一些内容可能还与集群管理员有关。
Some parts might also be relevant to cluster administrators for educational purposes and during normal cluster operation.
懒惰的Linux集群管理员不会做那些让他们的脑子变得迟钝的工作。
The lazy Linux cluster admin does not accept work that turns his brain into fluff.
任何有经验的横向扩展集群管理员都会告诉您,TFTP不可靠而且不可伸缩。
As any experienced scale-out-oriented admin can tell you, TFTP is unreliable and does not scale.
我们认识的最成功的Linux集群管理员都非常了解当前的开放源码项目。
The most successful Linux cluster admins that we have worked with have a vast knowledge of current open source projects.
收集系统的相关信息之后,应该把这些信息存储在某个方便的位置,让其他集群管理员可以轻松地访问。
Once you collect information about a system, the information should be stored some place useful so that the rest of the cluster staff can access it easily.
最懒惰的集群管理员的另一个特点是,他们对开放源码运动都非常热心,愿意在自己的工作中使用开放源码软件。
One other aspect of the laziest cluster admins is that they are quite passionate about open source and use it in their own personal pursuits.
另外,AIX 7 将提供有助于缩短故障发现时间的特性以及统一的设备命名,从而帮助系统管理员简化集群管理。
Furthermore, AIX 7 will have features that will help reduce the time to discover failures, along with common device naming, to help systems administrators simplify cluster administration.
现在,部署管理员已经知道可用的节点,于是可以创建应用服务器并将其添加到集群中。
Now that the deployment manager is aware of the available nodes, application servers can be created and added to the cluster.
然后,服务器管理员则必须处理由此导致的“内存溢出”错误,集群环境中的服务器相似性,以及服务器重启时的串行化异常。
Server administrators then have to deal with the fallout in the form of "out of memory" errors, server affinity in clustered environments, and serialization exceptions on server restarts.
用户节点——理想情况下,集群的计算机节点不应该接受外部连接,只应当由管理员通过管理服务器访问。
User nodes — Ideally, the compute nodes of a cluster should not accept external connections and should only be accessible to system administrators through the management server.
系统管理员需要确保匹配动态集群成员策略的所有节点都安装了这些资源。
The system administrator has to ensure that all nodes that match the dynamic cluster membership policy have those resources installed.
当动态集群在自动模式下运行时,应用程序位置控制器可以在管理员不直接参与的情况下停止一个集群成员。
When dynamic clusters are running in automatic mode, the application placement controller can bring down a cluster member without the direct involvement of an administrator.
发生故障转移的情况有两种:一是系统管理员指示集群中的节点执行故障转移;二是出现灾难性应用程序或服务器故障的情况迫使资源组转移。
Failover can occur when a systems administrator instructs the nodes in the cluster to do so or when circumstances like a catastrophic application or server failure force the resource groups to move.
它向节点管理器询问资源信息,做出高级计划决策,并通过向集群控制器发出请求来实现。
It queries node managers for information about resources, makes high-level scheduling decisions, and implements them by making requests to the cluster controller.
反之,对迁移过程缺乏信心的管理员也许希望在迁移之前将队列管理器从集群中移除。
Conversely, administrators who are less confident in the migration process may want to remove the queue manager from the cluster before migrating.
这让AIX管理员可以通过OS把一组 AIX节点组成集群并利用集群功能。
This will allow AIX administrators to use the OS to cluster a set of AIX nodes and take advantage of the clustering capabilities.
这一技术允许系统管理员创建一种利用多个节点(物理机)和集群(跨越物理资源的应用程序进程)的部署拓扑。
This technology allows system administrators to create a deployment topology that leverages multiple nodes (physical machines) and clusters (application processes that span physical resources).
幸运的是,可以在 WebSphere Application Server V5 中集群 WSGW,但是系统管理员还需要做一些额外的装配工作。
Fortunately, the WSGW can be clustered in WebSphere Application Server V5, but some additional assembly is required on the part of the system administrator.
尽管软件可以使用很多工具和向导来组建服务器集群,但专家数据库管理员(DataBase Administrator,DBA)可以帮助处理各种意外问题。
Although many tools and wizards are available with the software to cluster servers, an expert database administrator (DBA) can help with unexpected problems.
管理员的透视图肯定是不同的,考虑到很多服务器会包含存储系统这一潜在因素(要查看更多创建Ceph集群的信息,见参考资料部分)。
The administrator's perspective will certainly differ, given the potential for many servers encompassing the storage system (see the Resources section for information on creating a Ceph cluster).
在当时,可用的集群管理工具并不多,因此大多数管理员用自己开发的工具集来部署、监视和管理他们的集群。
Back then, there were not many cluster management tools available and as such, most admins created home-grown suites of tools that deployed, monitored, and managed their clusters.
通过结合使用这两种可定制性很强的工具,管理员可以很好地了解集群中发生的大多数情况。
Between these two very customizable tools an admin can get great insight into the multitudes of things happening on the cluster.
这种“集群的集群”方式使懒惰的管理员设计的系统能够扩展到超出任何预算的规模,支持集中的管理控制,同时不必担心大规模操作失败。
This cluster-of-clusters approach will allow the lazy admin to design systems that can scale beyond any budget and allow the same admin central control without fear that massive operations will fail.
令我们惊讶的是,许多管理员在设计集群时并没有考虑到“无人值守”方面。
We are surprised at the number of admins who don't think in terms of "lights out" when designing their clusters.
简单地说,这个 wiki 包含的信息应该足够全面,如果有人询问关于此集群的问题,管理员只需回答“您自己去查看 wiki”。
In short: The wiki should have enough information so that if someone asks a question about the cluster the admin need only say "Check the Wiki."