How to query UUIDs stored in binary form in the database (JPA/Hibernate/MySQL)
I have a Java/JPA/Hibernate/MySQL-based app. I want to use UUIDs for object identity, but I want to ensure database performance does not suffer.
I found this great blog post, JPA and UUID Primary Keys, which gets me some of the way there. Notice how the storage of the UUID is optimized by storing it in binary form (versus the string representation).
It solves part of the problem, because now we can insert objects efficiently into the database.
However, I now have an issue when I want to query the database using EntityManager.createQuery. Is it possible/desirable to query against binary data? Or should I store the String UUID alongside the binary version to facilitate querying?
Comments (3)
Tested with Hibernate 4.1.2 and MySQL-Connector-J 5.1.18, you can define a UUID field:
...and query with a UUID instance:
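A minimal sketch of what such a mapping and query could look like, assuming a hypothetical entity named `MyEntity` with a `BINARY(16)` primary-key column. The class, field, and helper names are illustrative assumptions, not taken from the answer, and the code needs a JPA provider on the classpath and a configured persistence unit to run:

```java
import java.util.UUID;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;

// Hypothetical entity: the names and the column definition are assumptions.
@Entity
class MyEntity {
    @Id
    @Column(columnDefinition = "BINARY(16)")
    private UUID id;

    protected MyEntity() { }              // no-arg constructor required by JPA

    MyEntity(UUID id) { this.id = id; }

    UUID getId() { return id; }
}

// Hypothetical helper showing a JPQL query that takes a UUID instance as a parameter.
final class MyEntityQueries {
    static MyEntity findById(EntityManager em, UUID id) {
        return em.createQuery("SELECT e FROM MyEntity e WHERE e.id = :id", MyEntity.class)
                 .setParameter("id", id)  // bound as a java.util.UUID, not as a string
                 .getSingleResult();
    }
}
```

The point of the sketch is that the query is written against the `UUID`-typed attribute, so the application code never handles the raw bytes itself.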
As long as you already have the ID in binary format, querying it is simple:
Actually if you are just looking up by primary key you can use
Getting the ID in binary format can be a bit of a pain, especially if you need to pass it around in URLs etc. I recommend Base64 encoding/decoding it; Apache Commons Codec has helper methods for going from byte[] to a URL-safe string and then back to byte[].
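The round trip described here can be sketched as follows. Note that this uses the JDK's own `java.util.Base64` (Java 8+) rather than Apache Commons Codec, purely to keep the example dependency-free; the class and method names (`UuidUrlCodec`, `toUrlToken`, `fromUrlToken`) are hypothetical:

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

// Hypothetical helper: UUID <-> 16-byte array <-> URL-safe Base64 token.
final class UuidUrlCodec {

    // Pack the two 64-bit halves of the UUID into 16 bytes (big-endian).
    static byte[] toBytes(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    // Rebuild the UUID from exactly 16 bytes.
    static UUID fromBytes(byte[] bytes) {
        if (bytes.length != 16) {
            throw new IllegalArgumentException("expected 16 bytes, got " + bytes.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long most = buf.getLong();
        long least = buf.getLong();
        return new UUID(most, least);
    }

    // Encode with the URL-safe alphabet ('-' and '_') and no '=' padding.
    static String toUrlToken(UUID uuid) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(toBytes(uuid));
    }

    // Decode a token produced by toUrlToken back into a UUID.
    static UUID fromUrlToken(String token) {
        return fromBytes(Base64.getUrlDecoder().decode(token));
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        String token = toUrlToken(id);
        System.out.println(id + " -> " + token + " -> " + fromUrlToken(token));
    }
}
```

A 16-byte ID becomes a 22-character token this way, which is shorter than the 36-character canonical string form and safe to place in a URL path or query parameter.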
A 16-byte overhead on 1 billion records is roughly 15 GB. If you do have that much data, you will have more serious scalability problems to solve, and those 15 GB at 10 cents/GB or less will not really be a big deal. Many-to-many relationships can grow to that size more quickly, but it will still not be much to worry about.
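The arithmetic behind that estimate can be checked with a short sketch. It assumes the string form is stored as 32 hex characters in a single-byte charset (16 bytes more than the binary form); with the dashes kept (36 characters) the overhead would be 20 bytes per row instead. The class and method names are hypothetical:

```java
import java.util.Locale;

// Hypothetical helper for estimating the extra storage of string UUIDs over binary ones.
final class UuidStorageOverhead {

    // Total extra bytes for a given row count and per-row overhead.
    static long overheadBytes(long rows, int perRowOverheadBytes) {
        return rows * perRowOverheadBytes;
    }

    // Convert bytes to binary gigabytes (2^30 bytes).
    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long rows = 1_000_000_000L;
        double hexGiB = toGiB(overheadBytes(rows, 32 - 16));    // 32 hex chars vs 16 bytes
        double dashedGiB = toGiB(overheadBytes(rows, 36 - 16)); // 36 chars with dashes
        System.out.println(String.format(Locale.ROOT,
                "hex: %.1f GiB, dashed: %.1f GiB, cost of hex at $0.10/GiB: $%.2f",
                hexGiB, dashedGiB, hexGiB * 0.10));
    }
}
```

This counts a single column only; every index or foreign key that repeats the ID adds the same per-row overhead again.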
To summarize, just go with the string representation. It will save you a lot of effort in dealing with the database at a fairly small price.
P.S. My personal preference is to use numeric IDs, but that's a separate discussion.