How can I implement a semaphore in PHP without PHP Semaphore?
Question:
How can I implement a shared memory variable in PHP without the semaphore package (http://php.net/manual/en/function.shm-get-var.php)?
Context
- I have a simple web application (actually a plugin for WordPress)
- it gets a URL
- it then checks the database to see whether that URL already exists
- if not, it goes out and does some operations
- and then writes the record to the database with the URL as a unique entry
What happens in reality is that 4, 5, 6 or more sessions request the URL at the same time, and I get up to 9 duplicate entries for that URL in the database (possibly 9 because the processing time and database write of the first entry take just long enough to let 9 other requests fall through). After that, all requests correctly see that the record already exists, so that part is fine.
Since it is a WordPress plugin, there will be many users on all kinds of shared hosting platforms with varying compiles/settings of PHP.
So I'm looking for a more generic solution. I can't use database or text file writes since these will be too slow: while I write to the DB, the next session will already have passed the existence check.
fyi: the database code: http://plugins.svn.wordpress.org/wp-favicons/trunk/includes/server/plugins/metadata_favicon/inc/class-favicon.php
Update
Using a unique key on a new md5 hash of the URI, together with try/catch blocks around the insert, seems to work.
I found 1 duplicate entry with
SELECT uri, COUNT( uri ) AS NumOccurrences
FROM edl40_21_wpfavicons_1
GROUP BY uri
HAVING (
COUNT( uri ) >1
)
LIMIT 0 , 30
So I thought it did not work but this was because they were:
http://en.wikipedia.org/wiki/Book_of_the_dead
http://en.wikipedia.org/wiki/Book_of_the_Dead
(they differ only in capitalization, grin)
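To illustrate the point above: md5 operates on the raw bytes of the string, so the two spellings get different hashes and both pass the unique key, while the GROUP BY query presumably ran under a case-insensitive collation and counted them as one URI (the collation is an assumption; it is not stated here). A minimal sketch, where the helper name `uri_key` is hypothetical:

```php
<?php
// Hypothetical helper: fixed-length key to put under the unique index.
function uri_key(string $uri): string
{
    return md5($uri); // 32 hex characters; sensitive to the case of the input
}

$a = 'http://en.wikipedia.org/wiki/Book_of_the_dead';
$b = 'http://en.wikipedia.org/wiki/Book_of_the_Dead';

var_dump(uri_key($a) === uri_key($b)); // bool(false): different hashes
var_dump(strcasecmp($a, $b) === 0);    // bool(true): equal when case is ignored
```

If the two spellings should count as the same entry, the URI would have to be normalized before hashing; whether that is correct depends on the target site, since URL paths can be case-sensitive.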
Comments (2)
This could be achieved with MySQL.
You could do it explicitly by locking the table against read access. This blocks all reads on the entire table, though, so it may not be preferable. http://dev.mysql.com/doc/refman/5.5/en/lock-tables.html
Otherwise, if the field in the table is declared unique, then when the next session tries to write the same URL to the table it will get an error. You can catch that error and continue, as there's no need to do anything if the entry is already there. The only time wasted is when two or more sessions process the same URL at once; the result is still one record, as the database won't add the same unique URL again.
As discussed in the comments, a URL can be very long, and a fixed-length unique hash can help overcome that issue.
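A minimal sketch of the unique-key approach, under stated assumptions: an in-memory SQLite database accessed through PDO stands in for the plugin's MySQL table so the example runs on its own, and the table name `favicons`, the column `uri_hash`, and the function `insert_uri_once` are hypothetical names, not taken from the plugin.

```php
<?php
// Sketch only: in-memory SQLite stands in for the real MySQL table.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// uri_hash is a hypothetical fixed-length column carrying the unique constraint.
$pdo->exec('CREATE TABLE favicons (uri_hash CHAR(32) NOT NULL UNIQUE, uri TEXT NOT NULL)');

/**
 * Try to insert the URI. Returns true if this request created the row,
 * false if the row was already there (another request got in first).
 */
function insert_uri_once(PDO $pdo, string $uri): bool
{
    $stmt = $pdo->prepare('INSERT INTO favicons (uri_hash, uri) VALUES (?, ?)');
    try {
        $stmt->execute([md5($uri), $uri]);
        return true;
    } catch (PDOException $e) {
        // SQLSTATE class 23 means an integrity constraint violation,
        // which here is the duplicate unique key.
        if (strpos((string) $e->getCode(), '23') === 0) {
            return false;
        }
        throw $e; // anything else is a real error
    }
}

$first  = insert_uri_once($pdo, 'http://example.com/'); // creates the row
$second = insert_uri_once($pdo, 'http://example.com/'); // rejected by the unique key
$count  = (int) $pdo->query('SELECT COUNT(*) FROM favicons')->fetchColumn();
```

The database enforces uniqueness atomically, so no separate "check, then insert" step is needed; the insert itself is the check.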
There are other shared memory modules in PHP (shmop or APC, for example), but I think what you are saying is that there is an issue with relying on non-standard/not pre-installed libraries.
My suggestion is that before you go and do the "other operations", you make an entry in the database, perhaps with a status of "compiling" (or something similar), so you know it is still not available. This way you don't run into issues with getting multiple entries. I would also make sure you are using transactions when they are available so your commits are atomic.
Then, when your "other operations" are done, update the database entry to "available" and do whatever else it is you need to do.
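The placeholder-row idea could look roughly like the sketch below. It rests on assumptions not stated above: the `uri` column is given a unique constraint so that only one request can create the "compiling" row, an in-memory SQLite database via PDO stands in for the real one, and the names `entries`, `claim_uri`, and `finish_uri` are hypothetical.

```php
<?php
// Sketch only: in-memory SQLite stands in for the real database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE entries (
    uri    TEXT NOT NULL UNIQUE,
    status TEXT NOT NULL,
    data   TEXT
)");

/** Claim the URI by inserting a 'compiling' row; false if it is already claimed. */
function claim_uri(PDO $pdo, string $uri): bool
{
    try {
        $pdo->prepare("INSERT INTO entries (uri, status) VALUES (?, 'compiling')")
            ->execute([$uri]);
        return true;
    } catch (PDOException $e) {
        if (strpos((string) $e->getCode(), '23') === 0) {
            return false; // duplicate key: another request owns this URI
        }
        throw $e;
    }
}

/** Store the result and flip the row to 'available'. */
function finish_uri(PDO $pdo, string $uri, string $data): void
{
    $pdo->prepare("UPDATE entries SET status = 'available', data = ? WHERE uri = ?")
        ->execute([$data, $uri]);
}

$uri = 'http://example.com/';
$claimed      = claim_uri($pdo, $uri); // this request wins the claim
$claimedAgain = claim_uri($pdo, $uri); // a concurrent request would lose it
if ($claimed) {
    // ... the slow "other operations" would run here ...
    finish_uri($pdo, $uri, 'result');
}
$status = $pdo->query('SELECT status FROM entries')->fetchColumn();
```

A request that fails to claim the URI can either skip the work or wait until the status reads "available". A row left in "compiling" by a crashed request would need some cleanup rule, which this sketch does not cover.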