Distributed logging with Flume
I have a mobile service distributed over 7 servers, each of which performs a specific task. I want to log information from them and later derive business intelligence from those logs. I have narrowed my choice down to Flume. How can I use it to gather this information?
My system is written in PHP. Does Flume work with PHP?
Answers (2)
It depends on your needs and what your server environment is like. One thing I can tell you is that Flume has no direct integration with PHP. However, there are other ways around this.
I run servers hosted on Amazon EC2 that use a combination of rsyslog and Flume. In my setup, I collect web logs from Linux servers running nginx. The nginx servers emit web request logs as syslog messages into rsyslog; rsyslog forwards them over TCP to my central Flume collector; the Flume collector listens for these messages with the syslogTcp source; and the collector forwards the messages to Amazon S3. I analyze the log files with Amazon EMR at some later point.
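As a rough illustration of that pipeline, here is a minimal configuration sketch. It rests on several assumptions: the collector hostname `flume-collector.example.com`, port `5140`, and the `local0` facility are hypothetical, and the Flume half uses the properties format from the user guide linked below, which may not match the Flume version described in this answer. The `logger` sink is only a placeholder for whatever storage sink you actually use.

```
# /etc/rsyslog.d/60-flume.conf on each application server (hypothetical path)
# "@@" forwards over TCP; a single "@" would use UDP.
local0.*  @@flume-collector.example.com:5140

# flume.conf on the collector (agent name "a1" is arbitrary)
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Receive syslog messages over TCP on the port rsyslog forwards to
a1.sources.r1.type = syslogtcp
a1.sources.r1.host = 0.0.0.0
a1.sources.r1.port = 5140
a1.sources.r1.channels = c1

# In-memory buffer between source and sink
a1.channels.c1.type = memory

# Placeholder sink that just logs events; replace with your storage sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
```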
In your situation, PHP can also write to syslog (http://php.net/manual/en/function.syslog.php), so you can use a similar setup and have syslog forward the logs to a central Flume collector node.
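A minimal sketch of the PHP side, assuming a Linux host whose syslog daemon forwards the chosen facility to the collector. The ident `mobile-service`, the `LOG_LOCAL0` facility, the JSON layout, and the helper `format_event` are all hypothetical choices for illustration, not anything Flume requires.

```php
<?php
// Hypothetical helper: build one log line as JSON so it is easy to parse later.
function format_event(string $event, array $fields = []): string
{
    // The "event" key comes first; extra fields are appended after it.
    return json_encode(['event' => $event] + $fields);
}

// Open a connection to the local syslog daemon.
// LOG_PID adds the process ID to each message; LOG_ODELAY delays opening
// the connection until the first message is logged.
openlog('mobile-service', LOG_PID | LOG_ODELAY, LOG_LOCAL0);

// Write one informational message; syslog then decides where it goes.
syslog(LOG_INFO, format_event('order_created', [
    'server'   => gethostname(),
    'order_id' => 42,
]));

closelog();
```

Logging structured lines like this keeps the PHP code unaware of Flume: routing is decided entirely by the syslog configuration on each server.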
If you don't want to rely on syslog messages, you can also run Flume clients on your servers. The Flume clients can be configured to tail a local log file with Flume's tail source, or to tail all log files in a specified directory with Flume's tailDir source, and stream them to a Flume collector.
A nice benefit of Flume is that you can configure delivery reliability per flow: important messages can be sent with a very high probability of reaching their destination, while other messages can be sent with lower delivery requirements.
The Flume user guide is the best place to get more detailed information:
http://flume.apache.org/FlumeUserGuide.html
Another good place to ask is the #flume channel on freenode.
Flume agents can run on various operating systems, including Windows and Linux.
So in short, if you're hosting on either of these operating systems, there is no reason why you can't use Flume to aggregate logs from multiple boxes.