平面文件和 MySQL RAM 数据库哪个更快?
我需要一种简单的方法让多个正在运行的 PHP 脚本共享数据。
我应该创建一个带有 RAM 存储引擎的 MySQL 数据库,并通过它共享数据(多个脚本可以同时连接到同一个数据库吗?)
或者每行包含一条数据的平面文件会更好吗?
I need a simple way for multiple running PHP scripts to share data.
Should I create a MySQL DB with a RAM storage engine, and share data via that (can multiple scripts connect to the same DB simultaneously?)
Or would flat files with one piece of data per line be better?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(8)
平面文件?不……
使用好的数据库引擎(MySQL、SQLite 等)。然后,为了获得最佳性能,请使用 memcached 来缓存内容。
In this way, you have the ease and reliability of sharing data between processes using proven server software that handles concurrency, etc... But you get the speed of having your data cached.
请记住以下几点:
Flat files? Nooooooo...
Use a good DB engine (MySQL, SQLite, etc). Then, for maximum performance, use memcached to cache content.
In this way, you have the ease and reliability of sharing data between processes using proven server software that handles concurrency, etc... But you get the speed of having your data cached.
Keep in mind a couple things:
为了维护者的理智,请不要使用平面文件。
如果您只是希望尽快共享数据,并且可以将其全部保存在 RAM 中,那么 memcached 是完美的解决方案。
如果您想要持久保存数据,请使用 DBMS,例如 MySQL。
Please don't use flat files, for the sanity of the maintainers.
If you're just looking to have shared data, as fast as possible, and you can hold it all in RAM, then memcached is the perfect solution.
If you'd like persistence of data, then use a DBMS, like MySQL.
一般来说,数据库更好,但是,如果您共享少量且大部分是静态的数据,则使用平面文件可能会带来性能优势(并且简单)。
然而,除了琐碎的数据共享之外,我会选择数据库。
Generally, a DB is better, however, if you are sharing a small, mostly static amount of data, there might be performance benefits (and simplicity) of doing it with flat files.
Anything other than trivial data sharing and I would pick a DB however.
1- 平面文件有用的地方:
平面文件可能比数据库更快,但在非常特定的应用程序中。
如果从头到尾读取数据而不进行任何搜索或写入,它们的速度会更快。
如果数据不适合内存并且需要完全读取才能完成工作,那么它“可以”比数据库更快。此外,如果写入量多于读取量,平面文件也会发挥作用,大多数默认数据库设置将需要使读取查询等待写入完成,以便维护索引和外键。使写入查询通常比简单读取慢。
TD/LR 版本:
将平面文件用于基于作业的系统(也称为简单日志解析),而不是用于网络搜索查询。
2- 平面文件陷阱:
如果您使用平面文件,则需要在文件更改时使用自定义锁定机制同步脚本。如果出现错误,这可能会导致速度减慢、损坏甚至死锁。
3- 基于 RAM 的数据库?
大多数数据库都在内存中缓存查询结果和搜索索引,这使得平面文件很难击败它们。由于它们缓存在内存中,因此使其完全从内存运行在大多数情况下是无效且危险的。最好正确调整数据库配置。
如果您希望使用 ram 优化性能,我会首先考虑从 ram 驱动器运行您的 php 脚本、html 页面和小图像。其中缓存机制更可能是粗糙的,并且系统地命中硬盘驱动器以获取不变的静态数据。
使用负载平衡器可以达到更好的结果,通过背板连接到基于 RAM 的 SAN 阵列进行集群。但这是另一个话题了。
5-多个脚本可以同时连接到同一个数据库吗?
是的,这就是所谓的连接池。在 php (客户端)中,它是打开连接的函数 mysql-pconnect( http://php.net/manual/en/function.mysql-pconnect.php)。
我认为您可以在 php.ini 中配置最大打开连接数。 mysql 服务器端的类似设置定义了 /etc/mysql/my.cnf 中的最大并发客户端连接数。
您必须这样做才能利用 cpu 的并行处理并避免 php 脚本等待彼此的查询完成。它大大提高了重负载下的性能。
Apache 配置中还为常规 Web 客户端提供了一个连接池/线程池。请参阅httpd.conf。
抱歉,文字墙,很无聊。
路易斯.
1- Where the flat file can be usefull:
Flat file can be faster than a database, but in very specific applications.
They are faster if the data is read from start to finish without any search or write.
If the data dont fit in memory and need to be read fully to get the job done, It 'can' be faster than a database. Also if there is lot more write than read, flat file also shine, most default databases setups will need to make the read queries wait for the write to finish in order maintain indexes and foreign keys. Making the write queries usually slower than simple reads.
TD/LR vesion:
Use flat files for jobs based system(Aka, simple logs parsing), not for web searches queries.
2- Flat files pit falls:
If your going with a flat file, you will need to synchronize your scripts when the file change using custom lock mechanism. Which can lead to slowdown, corruption up to dead lock if you have a bug.
3- Ram based Database ?
Most databases have in memory cache for query results, search indexes, making them very hard to beat with a flat file. Because they cache in memory, making it run entirely from memory is most of the time ineffective and dangerous. Better to properly tune the database configuration.
If your looking to optimize performance using ram, I would first look at running your php scrips, html pages, and small images from a ram drive. Where the cache mechanism is more likely to be crude and hit the hard drive systematically for non changing static data.
Better result can be reach with a load balancer, clustering with a back plane connections up to ram based SAN array. But that's a whole other topic.
5- can multiple scripts connect to the same DB simultaneously?
Yes, its called connection pooling. In php (client side) its the function to open a connection its mysql-pconnect(http://php.net/manual/en/function.mysql-pconnect.php).
You can configure the maximum open connection in php.ini I think. Similar setting on mysql server side define the maximum of concurrent client connections in /etc/mysql/my.cnf.
You must do this in order to take advantage of parrallel processessing of the cpu and avoid php script to wait the query of each other finish. It greatly increase performance under heavy load.
There is also one connection pool/thread pool in Apache configuration for regular web clients. See httpd.conf.
Sorry for the wall of text, was bored.
Louis.
如果您在多个服务器上运行它们,基于文件系统的方法将无法解决问题(除非您拥有一致的共享文件系统,但这不太可能并且可能不可扩展)。
因此,无论如何,您都需要一个基于服务器的数据库,以允许在 Web 服务器之间共享数据。如果您认真对待性能或可用性,您的应用程序将支持多个 Web 服务器。
If you're running them on multiple servers, a filesystem-based approach will not cut it (unless you've got a consistent shared filesystem, which is unlikely and may not be scalable).
Therefore you'll need a server-based database anyway to allow the sharing of data between web servers. If you're serious about either performance or availability, your application will support multiple web servers.
我想说 MySql DB 将是更好的选择,除非您有某种机制来处理平面文件上的锁(以及某种控制访问的方法)。在这种情况下,DB 层(无论特定的 DBMS)充当间接层,让您不必担心。
由于OP没有指定Web服务器(并且PHP实际上可以从命令行运行),所以我不确定缓存技术是他们在这里所追求的。 OP 可能希望进行某种非网站驱动的飞行数据转换。谁知道呢。
I would say that the MySql DB would be better choice unless you have some mechanism in place to deal with locks on the flat files (and some way to control access). In this case the DB layer (regardless of specific DBMS) is acting as an indirection layer, letting you not worry about it.
Since the OP doesn't specify a web server (and PHP actually can run from a commandline) then I'm not certain that the caching technologies are what they're after here. The OP could be looking to do some sort of flying data transform that isn't website driven. Who knows.
如果您的系统有 PHP 缓存(将已编译的 PHP 代码缓存在内存中,如 APC),请尝试将数据作为 PHP 代码放入 PHP 文件中。如果必须写入数据,就会存在一些安全问题。
If your system has a PHP cache (that caches compiled PHP code in memory, like APC), try putting your data into a PHP file, as PHP code. If you have to write data, there are some security issues.
APC 和 memcached 都是不错的选择,具体取决于上下文。 共享内存也可能是一种选择。
这也是一个不错的选择,但可能不如 APC 或 memcached 快。
如果这是只读数据,则有可能 - 但可能比上述任何选项都要慢。特别是当数据很大时。然而,不要编写自定义解析代码,而是考虑简单地构建一个 PHP 数组,并 include() 该文件。
如果这是一个可由多个编写器同时访问的数据存储,请务必不要使用平面文件!从多个进程写入平面文件可能会导致文件损坏。您可以锁定文件,但会面临锁争用问题和较长的锁等待时间的风险。
处理并发写入是 mysql 和 memcached 等应用程序存在的原因。
APC, and memcached are both good options depending on context. shared memory may also be an option.
That's also a decent option, but will probably not be as fast as APC or memcached.
If this is read-only data, that's a possibility -- but may be slower than any of the options above. Especially if the data is large. Rather than writing custom parsing code, however, consider simply building a PHP array, and include() the file.
If this is a datastore that may be accessed by several writers simultaneously, by all means do NOT use a flat file! Writing to a flat file from multiple processes is likely to lead to file corruption. You can lock the file, but you risk lock contention issues, and long lock wait times.
Handling concurrent writes is the reason applications like mysql and memcached exist.