This sounds like a great LDAP problem looking for a solution. LDAP is designed for exactly this kind of thing: a catalog of items that is optimized for searches and retrieval (but not necessarily writes). There are many LDAP servers to choose from (OpenLDAP, Sun's OpenDS, and Microsoft Active Directory, to name a few), and I've seen LDAP used to catalog servers before. LDAP is very standardized, and a "database" of information that is usually searched or read, but infrequently updated, is its strong suit.
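For a sense of what reads against such a catalog might look like, here is a minimal sketch using Python's ldap3 library (the server name, base DN, and attribute names are hypothetical; they would depend on the schema you choose):

    from ldap3 import ALL, Connection, Server

    # Hypothetical directory server holding the machine catalog.
    server = Server('ldap.example.com', get_info=ALL)
    conn = Connection(server, auto_bind=True)  # anonymous read, the common case

    # Fetch a few inventory attributes for every cataloged machine.
    conn.search('ou=servers,dc=example,dc=com', '(cn=*)',
                attributes=['cn', 'description', 'ipHostNumber'])
    for entry in conn.entries:
        print(entry.cn, entry.description)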
My team has been dumping all our systems into RDF for a month or two now. We have the systems implementation people create the initial data in Excel, which is then transformed to N3 (RDF) using Perl.
We view the data in Gruff (http://www.franz.com/downloads.lhtml) and keep the resulting RDF in Allegro (a triple store from the same people who make Gruff).
It's incredibly simple and flexible: having no schema means we simply augment the data on the fly, and with a wide variety of RDF viewers and reasoning engines available, the presentation options are endless.
The best part for me? No coding. Just create triples, throw them in the store, and view them as graphs.
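To give a flavor of the approach, here is a minimal sketch of the same idea in Python with rdflib (our pipeline actually uses Perl; the namespace and property names below are invented for illustration):

    from rdflib import Graph, Literal, Namespace, RDF

    # Invented vocabulary for describing infrastructure.
    INFRA = Namespace('http://example.com/infra#')

    g = Graph()
    web01 = INFRA['web01']

    # No schema: just assert whatever facts we know as triples.
    g.add((web01, RDF.type, INFRA.Server))
    g.add((web01, INFRA.ram, Literal('16GB')))
    g.add((web01, INFRA.hosts, INFRA['billing-app']))

    # Serialize as N3 for a store or viewer like Gruff to consume.
    print(g.serialize(format='n3'))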
Collecting detailed machine information is a very frustrating problem (many vendors want to keep it this way). Even if you can spend a large amount of money, you probably will not find a simple solution. IBM and HP offer products that do what you are seeking, but they are very, very expensive, and they will leave a bad taste in your mouth once you realize that you probably only needed 40-50% of the functionality they offer.

You say that you need to monitor *nix servers. Most (if not all) Unix variants support RFC 1514, the Host Resources MIB (Windows also supports this RFC as of Windows 2000). The Host MIB support defined by RFC 1514 has its drawbacks, however. Since it is SNMP-based, it requires SNMP to be enabled on the machine, which is typically not the default for Unix or Windows machines. SNMP was created before the entire world was using the Internet, so the old, crusty nature of its security is a concern, and in many environments it may not be acceptable for that reason. However, if you are only dealing with machines behind the firewall, this might not be an issue (I suspect this is true in your case).

Several years ago, I was working on a product that monitored hundreds of Unix and Windows machines. At the time, I did extensive research into how to acquire detailed information from each machine, such as disk info, running processes, installed software, uptime, memory pressure, and CPU and I/O load (including network), without running a custom client on each machine. This info can be collected in a centralized fashion. As of three or four years ago, the RFC 1514 Host MIB spec was the only "standard" for acquiring detailed real-time machine info without resorting to OS-specific software. Sun and Microsoft announced a web-service-based initiative many years ago to address some of this, but I suspect it never gained any traction, since at the moment I cannot even remember its marketing name.
I should mention that RFC 1514 is certainly no panacea. Unless you have the luxury of deploying a custom info-collecting client to each machine, you are at the mercy of the OS-provided SNMP service. The RFC 1514 spec makes several parameters optional, and if your target OS does not implement one you need, then you are back to writing custom code to provide that information.
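For illustration, here is roughly what polling a single RFC 1514 value looks like with Python's pysnmp library (the target address and community string are placeholders, and as noted above this only works where SNMP is enabled):

    from pysnmp.hlapi import (CommunityData, ContextData, ObjectIdentity,
                              ObjectType, SnmpEngine, UdpTransportTarget,
                              getCmd)

    # Query hrSystemUptime from the Host Resources MIB (RFC 1514).
    # '192.0.2.10' and 'public' are placeholders for a real host/community.
    error_indication, error_status, error_index, var_binds = next(getCmd(
        SnmpEngine(),
        CommunityData('public', mpModel=1),            # SNMPv2c
        UdpTransportTarget(('192.0.2.10', 161)),
        ContextData(),
        ObjectType(ObjectIdentity('HOST-RESOURCES-MIB', 'hrSystemUptime', 0))))

    if error_indication:
        print(error_indication)  # e.g. a timeout when SNMP is not enabled
    else:
        for name, value in var_binds:
            print(name, '=', value)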
I'm contemplating how to go about this myself, and I think this is one of the key pieces of infrastructure whose absence keeps us in the dark ages. Hopefully this will be a popular question on serverfault.com. :)
It's not just that you could install a single tool to collect this data (that isn't possible cheaply); ideally you want everything from the hardware up to the applications on the network feeding into this thing.
I think the only approach that makes sense is a modular one. The range of devices and types of information is too disparate to come under a single tool. Also the collection of data needs to be as passive and asynchronous as possible - the reality of running infrastructure means that there will be interruptions and you can't rely on being able to get the data at all times.
I think the tools you've pointed out form something of an ecosystem that could work together: Cobbler can install from bare metal and hand over to Puppet, which has support for generating Nagios configs and for storing configs in a database. For me, only Cacti is a bit opaque in terms of programmatically inserting new devices, templates, etc., but I know this is possible.
Ultimately you have to sit down and work out which pieces of information are important for the business you work for, and design a db schema around that. Then work out how to get the information you need into the db, whether it's from Facter, Nagios, Cacti, or direct SNMP calls.
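As a starting point only, here is a schema sketch of that shape (SQLite through Python; every table and column name is just one guess at what a given business might care about):

    import sqlite3

    conn = sqlite3.connect('inventory.db')
    conn.executescript("""
    -- One row per machine: the slow-changing identity data.
    CREATE TABLE IF NOT EXISTS machine (
        id       INTEGER PRIMARY KEY,
        hostname TEXT UNIQUE NOT NULL,
        vendor   TEXT,            -- e.g. Dell, HP
        os       TEXT
    );

    -- Facts collected over time, tagged with where they came from.
    CREATE TABLE IF NOT EXISTS observation (
        machine_id  INTEGER REFERENCES machine(id),
        source      TEXT,         -- Facter, Nagios, Cacti, SNMP, ...
        name        TEXT,         -- e.g. 'disk_free_gb'
        value       TEXT,
        observed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );
    """)
    conn.commit()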
Since you asked about collection of data: if you have quite disparate kit (Dell, HP, etc.), then I think it makes sense to create a library that abstracts away as much as possible of the differences between them, so your scripts just make standard calls such as "checkdiskhealth". When you add new hardware, you can add to the library rather than having to write a completely new script.
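Here is a minimal sketch of that library idea in Python. The checkdiskhealth call is the one named above; the vendor CLIs (omreport for Dell, hpacucli for HP) are real tools, but the exact commands and output checks here are assumptions you would verify against your own kit:

    import subprocess
    from abc import ABC, abstractmethod

    class Vendor(ABC):
        """Common interface so calling scripts never see vendor differences."""

        @abstractmethod
        def checkdiskhealth(self, host: str) -> bool: ...

    class Dell(Vendor):
        def checkdiskhealth(self, host: str) -> bool:
            # Assumed invocation of Dell's OpenManage CLI over ssh.
            out = subprocess.run(['ssh', host, 'omreport', 'storage', 'pdisk'],
                                 capture_output=True, text=True).stdout
            return 'Critical' not in out

    class HP(Vendor):
        def checkdiskhealth(self, host: str) -> bool:
            # Assumed invocation of HP's array CLI over ssh.
            out = subprocess.run(['ssh', host, 'hpacucli', 'ctrl', 'all', 'show', 'status'],
                                 capture_output=True, text=True).stdout
            return 'Failed' not in out

    VENDORS = {'dell': Dell(), 'hp': HP()}

    def checkdiskhealth(host: str, vendor: str) -> bool:
        # New hardware means one new entry in VENDORS, not new scripts.
        return VENDORS[vendor].checkdiskhealth(host)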
Sounds like a common problem that larger organizations would have. I know our (50-person company) sysadmin has a little Access database of information about every server, license, and piece of hardware installed. He's very meticulous, and when it comes time to replace or repair hardware, he knows everything about it from his little db.
You and your organization could sponsor an open source project to get you what you need, and give back to the community so that additional features (that you may not need now) can be developed at no cost to you.
Maybe a simple web service? Just something that accepts a machine name or IP address. When the service gets input, it sticks it in a queue and kicks off a task to collect the data from the machine that notified it. The nature of the task (SNMP interrogation, remote call to a Perl script, whatever) could be stored as part of the machine information in the database. If the task fails, the machine ID stays in the queue and the machine is periodically re-polled until the information is collected. Of course, you also have to have some kind of monitor running on your servers to notice that something has changed and send the notification; hopefully this is easily accomplished with whatever server monitoring software you've already got in place.
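Here is a bare-bones sketch of that queue-and-worker shape (Python standard library only; the collector table and the retry timing are placeholders for whatever your database actually stores per machine):

    import queue
    import threading
    import time

    work = queue.Queue()

    # Placeholder collection methods, keyed the way the database might
    # store them alongside each machine record.
    COLLECTORS = {
        'snmp':   lambda host: print(f'SNMP interrogation of {host}'),
        'script': lambda host: print(f'remote Perl script against {host}'),
    }

    def notify(host, method='snmp'):
        """What the web service does when a machine reports a change."""
        work.put((host, method))

    def worker():
        while True:
            host, method = work.get()
            try:
                COLLECTORS[method](host)
            except Exception:
                work.put((host, method))  # failed: stays queued for re-polling
            time.sleep(1)                 # placeholder poll interval

    threading.Thread(target=worker, daemon=True).start()
    notify('web01.example.com')
    time.sleep(2)  # give the demo worker a moment to drain the queue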
There are some solutions from the big vendors for managing monstrous sets of machines - such as some of the Tivoli stuff from IBM. That is probably, however, overkill for mere hundreds of machines.
There are some free-software server database solutions, but I do not know if they provide hooks to update information automatically from the machines via dmidecode or SNMP. One I have heard about (but have no personal experience with, sorry) is GLPI.
I believe you are looking for Zabbix. It's open source, and easy to install and use. I installed it for a client a few years ago, and if I remember right it has a client application that connects to the Zabbix server to update it with the requested information. I really recommend it: http://www.zabbix.com
Check out Machdb. It's an open-source solution to the problem you are describing.