Generating a unique id for a given string with PHP

Posted on 2024-09-06 13:09:24

I'm using Zend_Cache_Core with Zend_Cache_Backend_File to cache results of queries executed for a model class that accesses the database.

Basically, the queries themselves should form the id by which the obtained results are cached; the only problem is that they are too long. Zend_Cache_Backend_File doesn't throw an exception and PHP doesn't complain, but the cache file isn't created.

I've come up with a solution that is not efficient at all, storing any executed query along with an autoincrementing id in a separate file like so:

0->>SELECT * FROM table
1->>SELECT * FROM table1,table2
2->>SELECT * FROM table WHERE foo = bar

You get the idea; this way I have a unique id for every query. I clean out the cache whenever an insert, delete, or update is done.

Now I'm sure you see the potential bottleneck here: for any test, save, or fetch from the cache, two requests (or three, when a new id has to be added) are made to the file system. This may even defeat the purpose of caching altogether. So is there a way I can generate a unique id, i.e. a much shorter representation, of the queries in PHP without having to store them on the file system or in a database?

Comments (2)

甜味超标? 2024-09-13 13:09:24

Strings are arbitrarily long, so obviously it's impossible to create a fixed-size identifier that can represent any arbitrary input string without duplication. However, for the purposes of caching, you can usually get away with a simple, "good enough" solution that reduces collisions to an acceptable level.

For example, you can simply use MD5: its output is 128 bits long, so two different queries will only collide by accident in about 1 in 2^128 cases. If you're still worried about collisions (and you probably should be, just to be safe), you can store both the query and the result in the "value" of the cache, and check when you get the value back that it actually belongs to the query you were looking for.

As a quick example (my PHP is kind of rusty, but hopefully you get the idea):

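$query = "SELECT * FROM ...";

// PHP concatenates strings with ".", not "+". Zend_Cache ids may only
// contain [a-zA-Z0-9_], so the prefix uses an underscore, not a hyphen.
$key = "hash_" . hash("md5", $query);
$result = $cache->load($key); // returns false on a cache miss
if ($result === false || $result[0] !== $query) {
    // object wasn't in cache (or the key collided with another query),
    // do the real fetch and store it
    $result = $db->execute($query); // etc

    // saving an array assumes the automatic_serialization option is enabled
    $result = array($query, $result);
    $cache->save($result, $key);
}

// the result is now in $result[1] (the original query is in $result[0])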
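
To see why comparing the stored query matters, here is a minimal, self-contained sketch. Everything in it is an assumption made for illustration: ArrayCache is a hypothetical in-memory stand-in (not the Zend_Cache API), weakKey() deliberately keeps only one hex digit of the MD5 so that collisions are guaranteed, and the $fetch closure is a placeholder for the real database call.

```php
<?php
// Hypothetical stand-in for the cache backend: a plain array keyed by id.
// It only mimics load() returning false on a miss.
final class ArrayCache
{
    private $items = array();

    public function load($id)
    {
        return array_key_exists($id, $this->items) ? $this->items[$id] : false;
    }

    public function save($data, $id)
    {
        $this->items[$id] = $data;
    }
}

// Deliberately weak key: only the first hex digit of the MD5, so there are
// just 16 possible ids and collisions are easy to provoke.
function weakKey($query)
{
    return 'hash_' . substr(md5($query), 0, 1);
}

// Load-or-compute, comparing the stored query on every hit.
function cachedQuery(ArrayCache $cache, $query, callable $fetch)
{
    $key = weakKey($query);
    $entry = $cache->load($key);
    if ($entry === false || $entry[0] !== $query) {
        // Miss, or a different query that happens to share the id.
        $entry = array($query, $fetch($query));
        $cache->save($entry, $key);
    }
    return $entry[1];
}

$cache = new ArrayCache();
$fetch = function ($query) {
    return 'rows for: ' . $query; // placeholder for the real database call
};

// 17 distinct queries into 16 possible ids: at least two must share an id.
$results = array();
for ($i = 0; $i < 17; $i++) {
    $query = "SELECT * FROM t WHERE id = $i";
    $results[$query] = cachedQuery($cache, $query, $fetch);
}
```

Even with colliding ids, every query gets its own rows back, because a colliding entry is treated as a miss and overwritten. With a full MD5 key the check almost never fires, so its cost is one string comparison per hit.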
后知后觉 2024-09-13 13:09:24

MD5!!

md5() generates a string of length 32, which seems to be working fine: the cache files are created (with filenames of about length 47), so the operating system doesn't seem to reject them.

//returns id for a given query
function getCacheId($query) {
    return md5($query);
}

And that's it! But there's still the issue of collisions, and I think salting the md5 hash (maybe with the name of the table) should make it more robust.

//returns id for a given query
function getCacheId($query, $table) {
    return md5($table . $query);
}
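
As a quick sanity check of the function above, the following sketch confirms the properties the cache relies on. The query strings and table name are only sample inputs, and the variable names are made up for illustration.

```php
<?php
// Same function as in the answer above.
function getCacheId($query, $table)
{
    return md5($table . $query);
}

$id = getCacheId('SELECT * FROM table WHERE foo = bar', 'table');

// md5() returns 32 lowercase hexadecimal characters regardless of input length.
$isShortAndSafe = (bool) preg_match('/^[0-9a-f]{32}$/', $id);

// Deterministic: the same inputs give the same id on every request,
// so nothing has to be stored to map a query to its id.
$isStable = ($id === getCacheId('SELECT * FROM table WHERE foo = bar', 'table'));

// The id depends on the exact text: an extra space gives a different id here,
// so the same logical query written two ways is cached twice.
$differsOnWhitespace = ($id !== getCacheId('SELECT *  FROM table WHERE foo = bar', 'table'));
```

All three flags come out true, which is why the id can be used as a cache file name without the lookup file described in the question.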

If anyone wants the full code for how I've implemented the results caching, just leave a comment and I'll be happy to post it.
