There is a 1-on-1 live chat. Two solutions:
1) I store every message in the database and, with jQuery's help, check every second whether there is a new message in the database. Of course I use a cache as well. If there is a new message, we show it.
2) I store every message in one HTML file, and every second jQuery reloads and displays that file.
Which is better? Or is there a third option? And in general, which is better for this kind of project, MySQL or a file?
Thank you very much.
P.S. The most important question is: which is more efficient, and which approach will use fewer resources?
Edit: Is it, nowadays, very bad for many chats (let's say 2,500 chats, which means 5,000 users) to use long polling and check every second through JavaScript when the file was last edited? I use methods very similar to this chat: http://css-tricks.com/jquery-php-chat/ Will it kill my hosting?
I think storing the data in the database is better. Please refer to the following link:
Script Tutorials Chat
Everyone has given a wide range of opinions but I don't think anyone has really hit the nail on the head.
When it comes down to storing data, the amount of data, the rate at which it is accessed, and several other factors all determine what the best storage platform is.
Some people have suggested using memcached. Now although this is a valid answer (you can use it), I don't think that this is a good idea, solely based on the fact that memcached stores data within your server's memory.
Your memory is not for data storage, it's for use of the actual applications, operating system, shared libraries, etc.
Storing data within the memory can cause a lot of issues with other applications currently running. If you store too much data in your RAM your applications would not be able to complete operations assigned to them.
Although this is faster than a disk-based storage platform such as MySQL, it's not as reliable.
I would personally use MySQL as your storage engine server-side. This would reduce the number of problems you come across and also makes the data very manageable.
To speed up the responses to your clients I would look at running Node on your server.
This is because it's event driven and non-blocking.
What does that mean?
Well, when Client A requests some data that is stored on the hard drive, traditionally PHP might say to C++, 'fetch me this chunk of data stored on this sector of the hard drive'. C++ would say 'OK, no problem', and while it goes off to get the information, PHP would sit and wait for the data to be read and returned before it continues its operations, blocking all other clients in the meantime.
With Node, it's slightly different. Node will say to the kernel, 'fetch me this chunk of information and when you're done, give me a call', and then it continues to take requests from other clients that may not need disk access.
So suddenly because we have assigned a callback to the kernel, we do not have to wait :), happy days.
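To make the blocking versus non-blocking difference concrete, here is a minimal sketch. It uses Python's asyncio rather than Node (an assumption made purely for illustration; the event-loop idea is the same), and `asyncio.sleep` stands in for a slow disk read. The names `handle_client` and `serve` are hypothetical.

```python
import asyncio


async def handle_client(name, read_seconds, finished):
    # asyncio.sleep stands in for a slow disk or database read (assumption).
    # While this client is waiting, the event loop is free to serve others.
    await asyncio.sleep(read_seconds)
    finished.append(name)


async def serve():
    finished = []
    # Client A needs a slow read; client B needs no disk access at all.
    await asyncio.gather(
        handle_client("A", 0.05, finished),
        handle_client("B", 0.0, finished),
    )
    return finished


# The order in which the two clients were answered.
order = asyncio.run(serve())
print(order)
```

Client B is answered first even though client A asked first, because A's wait does not hold up the loop. In a blocking model B would have had to wait for A's read to finish.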
Take a look at this image:
This really could be the answer you're looking for. Please see the following for more descriptive and detailed information on how Node could be the right choice for you:
A fourth option, probably not what you want if you already have PHP code you want to use, but maybe the most efficient, is to use a JavaScript-based server instead of PHP.
Node.js is easily capable of being a chat server and can store all the recent messages in a JavaScript variable.
You can use long polling or other comet techniques so that you do not have to wait a second for messages to update.
Also, the event based architecture of a Javascript server means that there is no overhead for idling around waiting for messages.
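As a rough sketch of that idea (recent messages kept in a server-side variable, with long polling instead of a once-a-second check), the following uses Python's asyncio in place of Node, which is an assumption made for illustration only. `ChatRoom`, `post`, and `poll` are hypothetical names, and a real server would expose `poll` through an HTTP handler.

```python
import asyncio


class ChatRoom:
    def __init__(self):
        self.messages = []                   # recent messages kept in memory
        self._changed = asyncio.Condition()  # wakes up waiting pollers

    async def post(self, text):
        async with self._changed:
            self.messages.append(text)
            self._changed.notify_all()

    async def poll(self, since, timeout=30.0):
        """Long poll: return messages after index `since`, waiting up to
        `timeout` seconds for one to arrive instead of answering at once."""
        async with self._changed:
            try:
                await asyncio.wait_for(
                    self._changed.wait_for(lambda: len(self.messages) > since),
                    timeout,
                )
            except asyncio.TimeoutError:
                pass  # nothing new; the client simply polls again
            return self.messages[since:]


async def demo():
    room = ChatRoom()
    waiting = asyncio.create_task(room.poll(since=0, timeout=1.0))
    await asyncio.sleep(0.01)        # the poller is now parked, costing nothing
    await room.post("alice: hello")  # wakes the parked poller immediately
    delivered = await waiting
    nothing_new = await room.poll(since=1, timeout=0.01)
    return delivered, nothing_new


delivered, nothing_new = asyncio.run(demo())
print(delivered, nothing_new)
```

The parked request is answered as soon as a message is posted, so the client sees it without waiting for the next one-second tick, and an idle chat generates one request per timeout rather than one per second.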
It depends on the number of chats at the same time. If it's for support and you expect an average load of 1 to 5 chat sessions at a time, then you don't need to worry too much. Just make sure that when there is no activity for some time, you stop refreshing and show a message for the user to click to resume the chat session.
If the visitors will chat with each other and you expect a larger number of sessions (10-50 at the same time), you can still use PHP + a database. Just make sure you don't make redundant queries and that your queries are cached correctly. To reduce load you can also exclude the chat script from the web server's access log.
Edit:
You can use a delay scheme. For example, if you query 2 times with a 1-second delay and get no data, you can increase the delay to 2 seconds. If you reach 10 queries with no response, increase the delay to 5 seconds. After 10 minutes you can pause the conversation, requiring users to click a button to resume the chat. That, combined with the advice above, should keep the load low enough to support many concurrent chats.
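The delay scheme above can be sketched as a small function. The thresholds (2 empty polls, 10 empty polls, 10 minutes) come from the description; the function name `next_poll_delay` and the choice to return `None` for "paused" are assumptions for illustration.

```python
def next_poll_delay(empty_polls, idle_seconds):
    """Return the number of seconds to wait before the next poll,
    or None when the chat should pause until the user clicks resume.

    empty_polls  -- consecutive polls that returned no new messages
    idle_seconds -- time since the last message was seen
    """
    if idle_seconds >= 10 * 60:   # 10 minutes of silence: pause the chat
        return None
    if empty_polls >= 10:         # long quiet stretch: poll every 5 seconds
        return 5
    if empty_polls >= 2:          # two empty polls in a row: slow to 2 seconds
        return 2
    return 1                      # active conversation: poll every second


# Example: a conversation that has just gone quiet.
print(next_poll_delay(empty_polls=3, idle_seconds=4))
```

The caller would reset `empty_polls` and `idle_seconds` to zero whenever a poll returns a new message, so an active conversation goes straight back to one-second polling.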
Edit2:
I suggest you find a Flash or Java solution and buy it. With 5,000-10,000 users you have to be a genius to make it work on a VPS, especially if there is not much RAM. Not that it's impossible, but you could rent a cheaper VPS and use the rest of the money to buy a solution in Java or Flash (I don't know if Flash supports two-way connections; I'm not a Flash expert).
A note about the number of users: if you have 10,000 users, my guess is that you'll have no more than 100 chats at the same time. Go and look at dating sites: they have no more than 10% of their users online, and most of those are probably doing something other than chatting.
Third option: use Memcache. Reads and writes are much faster because they are served from memory. Perfect for your application.
Store the chat messages in the database but use Memcached as a caching layer for the database reads. So the most popular reads (e.g. the last 20 messages in the chat room) will always be served straight out of memory.
This gives you the benefit of speed for the most frequent operations and persistent storage for all of the messages.
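A minimal sketch of this read-through caching pattern follows. To keep it self-contained it makes two loud assumptions: a plain dict stands in for Memcached, and an in-memory SQLite database stands in for MySQL. The table layout and the function names are hypothetical.

```python
import sqlite3

RECENT_LIMIT = 20   # "the last 20 messages in the chat room"

# Stand-in for MySQL (assumption): an in-memory SQLite database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, room TEXT, body TEXT)")

# Stand-in for Memcached (assumption): a dict mapping key -> cached value.
cache = {}


def post_message(room, body):
    # Every message goes to the database for persistent storage.
    db.execute("INSERT INTO messages (room, body) VALUES (?, ?)", (room, body))
    db.commit()
    # Drop the cached "recent" list so the next read rebuilds it.
    cache.pop("recent:" + room, None)


def recent_messages(room):
    key = "recent:" + room
    if key in cache:                 # cache hit: no database query at all
        return cache[key]
    rows = db.execute(
        "SELECT body FROM messages WHERE room = ? ORDER BY id DESC LIMIT ?",
        (room, RECENT_LIMIT),
    ).fetchall()
    result = [body for (body,) in reversed(rows)]   # oldest first
    cache[key] = result              # cache miss: store for later polls
    return result


post_message("room1", "alice: hello")
post_message("room1", "bob: hi")
first_read = recent_messages("room1")    # reads the database, fills the cache
second_read = recent_messages("room1")   # served from the cache
print(first_read)
```

With polling clients, almost every request is a read that finds nothing new, so those reads never reach the database; only a new message causes one write and one fresh query.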
Just to throw in another option... flat files could provide a less resource-hungry alternative.
Every chat is assigned a unique ID and a flat file is stored for it. Every chat message adds a line to this file. Each client machine then uses jQuery to check ONLY the modified date of the file, to see if the chat has been updated.
While I would never normally recommend flat files over a database, I have a sneaky feeling that checking the modified date on a flat file would scale up better than the MySQL alternative.
I was intrigued so I did some tests and here are the results:
With an existing db connection, the number of "SELECT field FROM table LIMIT 0,1" that could be run in 1 second: ~ 4,000
Opening and closing a db connection, but running the same query: ~ 1,800
Checking the modified date on various different files: ~225,000
So to check if a conversation has been updated, storing the conversations in flat files and checking for the last modified date would easily be faster than doing anything with a database.
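The server-side half of that check might look like the sketch below. It is written in Python rather than PHP purely for illustration, and the file name and function names are hypothetical; the point is that the check is a single `stat` call and never opens the file.

```python
import os
import tempfile


def append_message(path, line):
    # Each chat message adds one line to the conversation's flat file.
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")


def chat_updated_since(path, last_seen_mtime):
    """Return (changed, current_mtime) using only the file's modified time.
    The file contents are not read."""
    current = os.stat(path).st_mtime
    return current > last_seen_mtime, current


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "chat_42.txt")   # one file per chat ID
    append_message(path, "alice: hello")
    # A client that has seen nothing yet passes 0 as its last-seen time.
    changed, seen = chat_updated_since(path, 0.0)
    # Polling again with the returned timestamp reports no change.
    unchanged, _ = chat_updated_since(path, seen)

print(changed, unchanged)
```

One caveat worth keeping in mind: modified times have limited resolution on some filesystems, so two messages written in very quick succession can share a timestamp. Comparing the file size as well is one way to guard against that.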
In general, HTTP connections are not very useful when it comes to pushing data to the client. Polling every x seconds tends to be a resource hog on any server once you have significant traffic.
You should try XMPP combined with BOSH. Luckily, most of the heavy work is already done for you. You can implement a pure jQuery (or other JS framework) based solution very quickly. Read this tutorial, it will help you a lot - not only by solving your specific problem, but also by giving you a broader view of how to implement push technologies over good ole' HTTP.
Unless it's a small-audience script, between a database and the file system it's better to use a database.
P.S. Flash also makes a great platform for chat servers; you might want to look into that as well.
If you define a conversation as only two people, then a request every second is going to look like one read request per second per user, and one write request every time somebody writes something (say every 10 seconds). So each conversation generates about 2.2 requests per second: 2 reads plus 0.2 writes.
For 50 conversations, that's 100 users and about 110 requests per second. That's a lot of load on a server for such a small number of conversations. Writing the conversation to JSON or XML would probably provide a more scalable solution.
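The load estimate can be worked through with a few lines of code. The defaults below are the assumptions stated above (two users per conversation, one poll per user per second, one write per user every 10 seconds); the function name is hypothetical.

```python
def requests_per_second(conversations,
                        users_per_conversation=2,
                        polls_per_user_per_second=1.0,
                        seconds_between_writes=10.0):
    # Each user issues its polls plus, on average, one write every
    # `seconds_between_writes` seconds.
    per_user = polls_per_user_per_second + 1.0 / seconds_between_writes
    return conversations * users_per_conversation * per_user


per_conversation = requests_per_second(1)
for_2500_chats = requests_per_second(2500)   # the 2,500 chats from the question
print(round(per_conversation, 1), round(for_2500_chats))
```

Since the total grows linearly with the number of conversations, the 2,500 chats mentioned in the question would mean roughly 5,500 requests per second under these assumptions, almost all of them polls that find nothing new.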
This article discusses the architecture of Meebo - long-polling, comet.
As an afterthought, have you considered installing an IM server like Jabber rather than starting from scratch?
You could always get the right tool for the job: an XMPP-compliant bit of software. For as poor as the documentation is, ejabberd is pretty alright. Because it closely follows the XMPP standard, you can use any XMPP client (for example http://code.google.com/p/ijab/). You can store all of it in an RDBMS if you like and provide functionality similar to what Gmail / Google Talk offers. My $0.02.
A really fast alternative could be a NoSQL database like MongoDB.
I don't use it, but you could try Photon, a very high-speed framework based on Mongrel.
On the author's blog (in French) there is an example: 30 lines of code for a real-time chat server, with a video demonstration.