What's more efficient - storing logs in an SQL database or in files?
I have a few scripts loaded by cron quite often. Right now I don't store any logs, so if any script fails to load, I won't know it until I see the results - and even when I notice that the results are not correct, I can't do anything, since I don't know which script failed.
I've decided to start storing logs, but I am still not sure how to do it. So, my question is: what's more efficient - storing logs in an SQL database or in files?
I can create a 'logs' table in my MySQL database and store each log in a separate row, or I can just use PHP's file_put_contents or fopen/fwrite to store logs in separate files.
My scripts would add approximately 5 log entries (in total) per minute while working. I've done a few tests to determine which is faster - fopen/fwrite or MySQL's insert. I looped an "insert" statement 3000 times to create 3000 rows, and looped fopen/fwrite 3000 times to create 3000 files with sample text. fwrite executed 4-5 times faster than SQL's insert. I then ran a second test - I looped a 'select' statement and assigned the result to a string 3000 times, and I also opened 3000 files using 'fopen' and assigned their contents to a string. The result was the same - fopen/fwrite finished the task 4-5 times faster.
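Roughly, the test looked like this (a simplified sketch - the 'logs' table, the file path and the connection details here are placeholders):

```php
<?php
// Simplified benchmark sketch: 3000 INSERTs into a 'logs' table vs. 3000
// small files written with fopen/fwrite. Assumes the table already exists.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO logs (message) VALUES (:msg)');

$start = microtime(true);
for ($i = 0; $i < 3000; $i++) {
    $stmt->execute(array(':msg' => "sample text $i"));
}
echo 'mysql insert: ', microtime(true) - $start, " s\n";

@mkdir('/tmp/logtest', 0777, true);
$start = microtime(true);
for ($i = 0; $i < 3000; $i++) {
    $fh = fopen("/tmp/logtest/$i.log", 'w');
    fwrite($fh, "sample text $i");
    fclose($fh);
}
echo 'fopen/fwrite: ', microtime(true) - $start, " s\n";
```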
So, to all experienced programmers - what's your experience with storing logs? Any advice?
// 04.09.2011 EDIT -
Thank you all for your answers, they helped me a lot. Each post was valuable, so it was quite hard to accept only one answer ;-)
9 Answers
Logging to files is more efficient; however, logs stored in the database are easier to read, even remotely (you can write a web frontend if required, for example).
Note, however, that connecting to and inserting rows into the database is error-prone (database server down, wrong password, out of resources), so where would you log those errors if you decided to use the database?
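One way to handle that (just a sketch - the table, the fallback path and the function name are made up, and it assumes a PDO connection configured to throw exceptions):

```php
<?php
// Sketch: try to log to the database, fall back to a plain file if the
// insert fails (assumes $pdo was created with PDO::ERRMODE_EXCEPTION).
function log_message(PDO $pdo, $message)
{
    try {
        $stmt = $pdo->prepare('INSERT INTO logs (logged_at, message) VALUES (NOW(), :msg)');
        $stmt->execute(array(':msg' => $message));
    } catch (PDOException $e) {
        // Database down, wrong password, out of resources... keep a trace on disk.
        $line = date('c') . ' ' . $message . ' [db error: ' . $e->getMessage() . "]\n";
        file_put_contents('/var/log/phplogs/db-fallback.log', $line, FILE_APPEND | LOCK_EX);
    }
}
```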
You can use a component such as Zend_Log, which natively supports the concept of multiple writers attached to the same log instance. That way you can log the same message to one or more different places without changing your logging code, and you can always change your code to replace the logging system or add a new one in a simple way.
For your question, I think that logging to files is simpler and more appropriate if you (the developer) are the only one who needs to read the log messages.
Log to the database instead if other people need to read the logs in a web interface, or if you need the ability to search through the logs. As someone else has pointed out, concurrency also matters: if you have a lot of users, logging to the database could scale better.
Finally, a log frequency of 5 messages per minute requires almost no CPU from your application, so you don't need to worry about performance. In your case I'd start with log files and then change (or add more writers) if your requirements change.
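For example, with Zend Framework 1 something along these lines should work (only a sketch - the file path, table name and column map are placeholders):

```php
<?php
// Sketch: one Zend_Log instance with two writers, so every message goes
// to a file and to a MySQL table at the same time.
require_once 'Zend/Log.php';
require_once 'Zend/Log/Writer/Stream.php';
require_once 'Zend/Log/Writer/Db.php';
require_once 'Zend/Db.php';

$logger = new Zend_Log();

// Writer 1: append to a plain log file.
$logger->addWriter(new Zend_Log_Writer_Stream('/var/log/phplogs/cron.log'));

// Writer 2: also store the same events in a database table.
$db = Zend_Db::factory('Pdo_Mysql', array(
    'host'     => 'localhost',
    'username' => 'user',
    'password' => 'pass',
    'dbname'   => 'myapp',
));
$columnMap = array('lvl' => 'priority', 'msg' => 'message'); // table column => event field
$logger->addWriter(new Zend_Log_Writer_Db($db, 'logs', $columnMap));

// One call, two destinations.
$logger->info('script finished successfully');
```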
Speed isn't everything. Yes, it's faster to write to files, but it's far faster for you to find what you need in the logs if they are in a database. Several years ago I converted our CMS from a file-based log to a MySQL table. The table is better.
Commenting on your findings:
Regarding writing to the file, you are probably right.
Regarding reading, you are dead wrong.
Writing to a database: you can use the memory engine, which keeps the table in RAM; transfer the data to a disk-based table when CPU load is low.
Reading from the database: this is where the database truly shines.
You can combine all sorts of information from different entries, much, much faster and easier than you can ever do from a flat file. If you have indexes on the fields used in the where clause, the result will return almost instantly; try doing that on a flat file. Never mind the fact that the table is not normalized - this would still be much slower and harder to do with a flat file.
It's a no-brainer really.
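As a rough illustration of the idea above (the table and column names are invented, and the exact schema is up to you):

```php
<?php
// Rough sketch: a MEMORY staging table for fast writes, a disk-based table
// with an index for reads, and a periodic transfer between the two.
$pdo = new PDO('mysql:host=localhost;dbname=myapp', 'user', 'pass');

// RAM-only buffer (note: the MEMORY engine does not support TEXT columns).
$pdo->exec("CREATE TABLE IF NOT EXISTS logs_buffer (
    logged_at DATETIME     NOT NULL,
    script    VARCHAR(64)  NOT NULL,
    message   VARCHAR(255) NOT NULL
) ENGINE=MEMORY");

// Disk-based table, indexed on the column used in WHERE clauses.
$pdo->exec("CREATE TABLE IF NOT EXISTS logs (
    logged_at DATETIME     NOT NULL,
    script    VARCHAR(64)  NOT NULL,
    message   VARCHAR(255) NOT NULL,
    INDEX idx_logged_at (logged_at)
) ENGINE=InnoDB");

// Run from a low-load cron job: move buffered rows to the disk table.
$pdo->exec('INSERT INTO logs SELECT * FROM logs_buffer');
$pdo->exec('DELETE FROM logs_buffer');

// Reading: thanks to the index, range queries come back almost instantly.
$stmt = $pdo->prepare('SELECT script, message FROM logs WHERE logged_at >= :since');
$stmt->execute(array(':since' => date('Y-m-d H:i:s', strtotime('-1 hour'))));
```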
It depends on the size of the logs and on the concurrency level. Because of the latter, your test is not really valid - if there are 100 users on the site and you have, let's say, 10 threads writing to the same file, fwrite won't be that much faster. One of the things an RDBMS provides is concurrency control.
It also depends on the requirements and on the kind of analysis you want to perform. Just reading records is easy, but what about aggregating some data over a defined period?
Large-scale web sites use systems like Scribe for writing their logs.
If you are talking about 5 records per minute, however, this is a really low load, so the main question is how you are going to read them. If a file is suitable for your needs, go with the file. Generally, append-only writes (the usual pattern for logs) are really fast.
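For the append-only case, something as simple as this is usually enough (the path and message are just examples):

```php
<?php
// Sketch: concurrent scripts appending to one shared log file.
// FILE_APPEND writes at the end of the file; LOCK_EX serializes writers
// so lines from different processes don't get interleaved.
$line = date('c') . " [report-script] finished OK\n";
file_put_contents('/var/log/phplogs/cron.log', $line, FILE_APPEND | LOCK_EX);
```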
I think storing logs in a database is not a good idea. The advantage of storing logs in a database rather than in files is that you can analyse your logs much more easily with the power of SQL; the drawback, however, is that you have to spend much more time on database maintenance. You'd better set up a separate database server to store your logs, or you might end up with so many log INSERTs that they degrade the performance of your production database; also, compared with files (logrotate, etc.), it's not easy to migrate and archive logs stored in a database. Nowadays you should use a dedicated, feature-rich logging system to handle your logs; for example, logstash (http://logstash.net/) has a log collector and filters, and it can store logs in external systems such as elasticsearch, combined with a nice frontend for visualizing and analyzing them.
Writing to the filesystem should always be faster.
That, however, shouldn't be your concern. Both a simple insert and a write to the filesystem are quick operations. What you need to worry about is what happens when your database goes down. I personally like to write to both, so there is always a log if anything goes wrong, but you also get the ease of searching in a database.
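A minimal sketch of the 'write to both' approach (the names and paths are examples, and it assumes a PDO connection configured to throw exceptions):

```php
<?php
// Sketch: every log line goes to a file unconditionally; the database write
// is best-effort, so a DB outage never loses the message.
function log_both(PDO $pdo, $message)
{
    $line = date('c') . ' ' . $message . "\n";
    file_put_contents('/var/log/phplogs/app.log', $line, FILE_APPEND | LOCK_EX);
    try {
        $pdo->prepare('INSERT INTO logs (logged_at, message) VALUES (NOW(), ?)')
            ->execute(array($message));
    } catch (PDOException $e) {
        // If the database is down, the file line above is still there.
    }
}
```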
Error logging is best limited to files, in my opinion, because if there is a problem with the database you can still log that. Obviously that's not an option if your error logging requires a connection to the database!
What I will also say, however, is that general logging is something I keep in the database, though this only applies if you are doing lots of logging for audit trails etc.
Personally, I prefer log files so I've created two functions:
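Something along these lines (a simplified sketch - the function names, the line format and the paths are only examples; ACTIVITY_LOG and ERROR_LOG are the constants described below):

```php
<?php
// Sketch of the two logging functions, both appending to plain files.
define('ACTIVITY_LOG', '/var/log/phplogs/myapp.log');
define('ERROR_LOG',    '/var/log/phplogs/myapp.log'); // same file on purpose

function log_activity($message)
{
    $line = date('Y-m-d H:i:s') . " [activity] $message\n";
    file_put_contents(ACTIVITY_LOG, $line, FILE_APPEND | LOCK_EX);
}

function log_error($message)
{
    $line = date('Y-m-d H:i:s') . " [error] $message\n";
    file_put_contents(ERROR_LOG, $line, FILE_APPEND | LOCK_EX);
}
```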
I define a constant or two (I use ACTIVITY_LOG and ERROR_LOG, both set to the same file, so you don't need to refer to two files side by side to get an overall view of the run) and call the functions as appropriate. I've also created a dedicated folder (/var/log/phplogs), and each application that I write has its own log file. Finally, I rotate the logs so that I have some history to refer back to for customers.
Liberal use of the above functions means that I can trace the execution of apps fairly easily.