MySQL key/value store problem
I'm trying to implement a key/value store with MySQL.
I have a user table with 2 columns: one for the global ID and one for the serialized data.
Now the problem is that every time any bit of the user's data changes, I have to retrieve the serialized data from the db, alter the data, then reserialize it and throw it back into the db. I have to repeat these steps even if there is a very small change to any of the user's data (since there's no way to update that cell within the db itself).
Basically, I'm looking for the solutions people normally use when faced with this problem.
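For concreteness, here is a minimal sketch of the setup described above, with assumed table and column names (users, id, data); the question does not give the actual schema.

    -- Sketch of the current schema (names are assumptions).
    CREATE TABLE users (
        id   BIGINT UNSIGNED NOT NULL PRIMARY KEY,  -- global ID
        data MEDIUMTEXT      NOT NULL               -- serialized (e.g. JSON) user data
    );

    -- Every change, however small, means: SELECT the blob, deserialize,
    -- modify it in application code, reserialize, then rewrite the whole cell.
    SELECT data FROM users WHERE id = 42;
    UPDATE users SET data = '...entire reserialized blob...' WHERE id = 42;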
5 Answers
Maybe you should preprocess your JSON data and insert it as proper MySQL rows, separated into fields.
Since your input is JSON, you have various options for converting the data.
You mentioned that many small changes happen in your case. Where do they occur? Do they happen in a member of a list? A top-level attribute?
If updates occur mainly in list members in one part of your JSON data, then perhaps every member should in fact be represented as a separate row in a different table.
If updates occur in an attribute, then represent it as a field.
I think the cost of preprocessing won't hurt in your case.
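A minimal sketch of that suggestion, using assumed field names (an email attribute and an addresses list, neither of which appears in the question): top-level attributes become columns, list members become rows in a child table, and each small change is then a single-column or single-row UPDATE.

    -- Assumed example fields: 'email' as a top-level attribute,
    -- 'addresses' as a list inside the serialized JSON.
    CREATE TABLE users (
        id    BIGINT UNSIGNED NOT NULL PRIMARY KEY,
        email VARCHAR(255)    NOT NULL
    );

    CREATE TABLE user_addresses (
        id      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        user_id BIGINT UNSIGNED NOT NULL,
        street  VARCHAR(255),
        city    VARCHAR(100),
        FOREIGN KEY (user_id) REFERENCES users (id)
    );

    -- A small change now touches only the affected column or row.
    UPDATE users          SET email = 'new@example.com' WHERE id = 42;
    UPDATE user_addresses SET city  = 'Berlin'          WHERE id = 7 AND user_id = 42;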
When this is a problem, people do not use key/value stores; they design a normalized relational database schema to store the data in separate, single-valued columns that can be updated individually.
To be honest, your solution is using a database as a glorified file system - I would not recommend this approach for data that is core to your application.
The best way to use a relational database, in my opinion, is to store relational data - tables, columns, primary and foreign keys, data types. There are situations where this doesn't work - for instance, if your data is really a document, or if the data structures aren't known in advance. For those situations, you can either extend the relational model, or migrate to a document or object database.
In your case, I'd first see whether the serialized data could be modeled as relational data, and whether you even need a database. If so, move to a relational model. If you need a database but can't model the data as a relational set, you could go for a key/value model where you extract your serialized data into individual key/value pairs; this at least means that you can update or add individual data fields rather than modifying the entire document. Key/value is not a natural fit for RDBMSes, but it may be a smaller jump from your current architecture.
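A sketch of that key/value-pairs variant, assuming a single table of (user, attribute name, attribute value) rows; the names are illustrative, not from the question. Each field of the serialized document becomes its own row, so one field can be written without touching the rest.

    -- One row per (user, attribute); names are illustrative assumptions.
    CREATE TABLE user_attributes (
        user_id    BIGINT UNSIGNED NOT NULL,
        attr_name  VARCHAR(64)     NOT NULL,
        attr_value TEXT,
        PRIMARY KEY (user_id, attr_name)
    );

    -- Update (or create) a single field without rewriting the whole document.
    INSERT INTO user_attributes (user_id, attr_name, attr_value)
    VALUES (42, 'address', '123 Main St')
    ON DUPLICATE KEY UPDATE attr_value = VALUES(attr_value);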
When you have a key/value store, assuming your serialized data is JSON, it is effective only when you have memcached along with it, because you don't update the database on the fly every time; instead you update the memcache and then push that to your database in the background. So you definitely have to update the entire value, not an individual field of your JSON data (such as the address alone) in the database. You can update and retrieve data quickly from memcached, and since there are no complex relations in the database, it will be fast to push and pull data between the database and the memcache.
I would continue with what you are doing and create separate tables for the indexable data. This allows you to treat your database as a single data store which is managed easily through most operation groups, including updates, backups, restores, clustering, etc.
The only thing you may want to consider is adding ElasticSearch to the mix if you need to perform anything like a LIKE query, just for improved search performance. If space is not an issue for you, I would even make it an insert-only database, so any change adds a new record; that way you can keep the history. Of course you may want to remove the older records, but you can have a background job that deletes the superseded records in batches in the background. (Mind you, what I described is basically Kafka.)
There are many alternatives out there now that beat an RDBMS in terms of performance. However, they all add extra operational overhead, since each is yet another piece of middleware to maintain.
The way around that, if you have a microservices architecture, is to keep the middleware as part of your microservice stack. However, you then have to deal with transmitting the data across the microservices, so you'd still end up with a switch to Kafka underneath it all.