Memcache key generation strategy

Posted on 2024-08-29 03:57:53

Given a function f1 that receives n String arguments, which random key generation strategy for memcache would be considered better in terms of runtime performance?

Our Memcache client does internal md5sum hashing on the keys it gets:

   public class MemcacheClient {  
       public Object get(String key) {
            String md5 = Md5sum.md5(key);
            // Talk to memcached to get the Serialization... 
            return memcached(md5);
       }
   }

My usage scenarios are:

First option

    public static String f1(String s1, String s2, String s3, String s4) {
         String key = s1 +  s2 + s3 + s4;
         return get(key);
    }

Second option

    /**
     * Calculate hash from Strings
     *
     * @param strings vararg list of objects, each converted with toString()
     *
     * @return calculated md5sum hash
     */
    public static String stringHash(Object... strings) {
        if(strings == null) 
            throw new NullPointerException("D'oh! Can't calculate hash for null");

        MD5 md5sum = new MD5();

//      if(prevHash != null)
//          md5sum.Update(prevHash);

        for(int i = 0; i < strings.length; i++) {
            if(strings[i] != null) {
                md5sum.Update("_"); 
                md5sum.Update(strings[i].toString()); // Convert to String...
                md5sum.Update("_");

            } else {
                // If the object is null, allow minimum entropy by hashing its position
                md5sum.Update("_");
                md5sum.Update(i);
                md5sum.Update("_");
            }
        }

        return md5sum.asHex();
    }


    public static String f1(String s1, String s2, String s3, String s4) {
         String key = stringHash(s1, s2, s3, s4);
         return get(key);
    }

Note that a possible problem with the second option is that we compute a second md5sum (in the memcache client) on a value that is already an md5sum digest.

Thanks for reading,
Maxim.

-- Edit
Used MD5 utility source

Comments (2)

北恋 2024-09-05 03:57:53

"Better" in what sense? Why would you think the second option is "better"? It does more string concatenations and more MD5 hashes, and just generally seems much less efficient than the first...

不一样的天空 2024-09-05 03:57:53

Just nitpicking, but you probably don't want random key generation: key generation should be deterministic, but should produce a uniform distribution over the key space.

If you consider only accidental collisions, then the first approach is almost fine. You should prefix each string with its length so that you don't get collisions when a substring moves from one parameter to another (for example, "ab" + "c" and "a" + "bc" both concatenate to "abc"). Given md5's pretty good avalanche properties, that will ensure that accidental collisions are rare enough to be ignored.
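As a rough illustration of the length-prefix idea, here is a minimal sketch in Java. The class name `LengthPrefixedKey`, the method `build`, the `:` separator, and the `-1:` marker for null arguments are all assumptions made for this example; none of them come from the question's code. The resulting string would then be passed to the client's `get`, which still applies its own MD5.

```java
// Hypothetical helper (not part of the original code): builds a cache key
// in which every argument is prefixed with its length, so moving characters
// between adjacent arguments always changes the key.
final class LengthPrefixedKey {

    private LengthPrefixedKey() {
        // static utility, not meant to be instantiated
    }

    static String build(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            if (part == null) {
                // Assumed convention: null gets its own marker, distinct from "".
                sb.append("-1:");
            } else {
                // "<length>:<value>", e.g. "ab" becomes "2:ab"
                sb.append(part.length()).append(':').append(part);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Plain concatenation would give "abc" for both argument lists.
        System.out.println(build("ab", "c")); // prints 2:ab1:c
        System.out.println(build("a", "bc")); // prints 1:a2:bc
    }
}
```

Because each value is preceded by its exact length, two different argument lists cannot produce the same string, so only the hash itself can still collide.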

But be careful with MD5 if you process user input: it has known collision attacks. If an untrusted user can pick arbitrary bytes for the function parameters, and returning a wrong result can have security implications, then you have a security hole. For instance, if you use this to cache authorization info, an attacker could work out two sets of parameters that hash to a single value: one accesses something public and the other accesses a protected service. The attacker then requests authorization with the first set, gets that authorization cached, and accesses the protected service with the second set, receiving a green light from the cached authorization.
