Documents with a tag model in IndexedDB
I have songs and tags. A tag can be of a type like "recording location" or "last recording date".
In a relational model I would have a join table that holds song_id and tag_id. But in a document-based DB like IndexedDB I would store the tags and their info directly in the song document. I wonder whether that would lead to DB bloat in the long run, given that I do not have many unique tags.
If another song needs a tag that is already used on another song, I end up with a duplicate tag.
I could of course go with a join store here too, but that would then mean manual fetches across 2 object stores.
I have several questions about the model:
- Should I have a songs and a tags store?
- How are bulk updates of the tags that are attached to each song performed?
- What indices would I probably need to make this fast?
My main use case is searching by tag value (filtered by type).
Comments (2)
The key to using IDB and other NoSQL stores well is not getting caught up in join ids, and instead trying to make each object store useful in its own right. (And using indexes well!) Remember that this is just how SQL-ish databases work under the hood, but here you do specialized joins when you need them rather than in the general case.
Bulk updates are more of a challenge, but this is about optimizing for the most common case (showing and looking up songs/tags) rather than for rarer things (bulk-changing tag names).
The most basic schema you're looking at is storing songs as something like:
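As a rough sketch, a song record under this scheme might look like the following. The field names and values are illustrative assumptions, not a fixed schema; the part that matters is that `tags` is a plain array of tag names.

```javascript
// Hypothetical song record: every field name and value here is illustrative.
const song = {
  title: 'Example Song',
  artist: 'Example Artist',
  // Tag names stored directly on the song, with no numeric tag_id in between.
  tags: ['studio-a', '2012-05-01'],
};
```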
And tags can simply be id'd by their name:
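A tag record keyed by its name could then be as small as this. The `type` field and the sample values are assumptions, chosen to match the tag types mentioned in the question.

```javascript
// Hypothetical tag records: `name` doubles as the unique key.
const tagRecords = [
  { name: 'studio-a', type: 'recording location' },
  { name: '2012-05-01', type: 'last recording date' },
];
```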
First create your song objectStore:
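A minimal sketch of that step, assuming a store named `songs` with an auto-generated `id` key (both names are illustrative):

```javascript
// Call this from the `upgradeneeded` handler of indexedDB.open(...):
// object stores can only be created during a version upgrade.
// `db` is the IDBDatabase from that event (event.target.result).
function createSongStore(db) {
  // 'songs' and 'id' are assumed names, not taken from the original answer.
  return db.createObjectStore('songs', { keyPath: 'id', autoIncrement: true });
}
```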
Then create a multiEntry index:
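Sketched under the same assumptions (an index called `tags` over the song's `tags` array), this could look like:

```javascript
// `songStore` is the IDBObjectStore returned by createObjectStore for songs;
// like the store itself, the index must be created during `upgradeneeded`.
function createTagsIndex(songStore) {
  // With multiEntry: true, each element of the `tags` array gets its own
  // index entry, so a song can be found under every one of its tags.
  return songStore.createIndex('tags', 'tags', { multiEntry: true });
}
```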
Then create a tag objectStore:
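One possible version, assuming the store is named `tags` and keyed by the tag's `name`. The extra `type` index is not part of the steps above; it is an assumption added here because the question asks about filtering by type.

```javascript
// Call during `upgradeneeded`; `db` is the IDBDatabase being upgraded.
function createTagStore(db) {
  // The tag name itself is the primary key, so no separate tag_id is needed.
  const tagStore = db.createObjectStore('tags', { keyPath: 'name' });
  // Assumed, optional: lets you list all tags of a given type.
  tagStore.createIndex('type', 'type');
  return tagStore;
}
```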
Now you can do the joins yourself if you really need to, but sometimes it's just some one-off stuff.
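A hand-rolled join might look roughly like this: load one song, then fetch the full tag record for each of its tag names inside the same read-only transaction. The function name, the store names `songs` and `tags`, and the Node-style `callback(error, result)` signature are all assumptions made for illustration.

```javascript
// Loads one song plus the tag records for each of its tag names.
// Assumes object stores 'songs' and 'tags' (the latter keyed by tag name).
function getSongWithTags(db, songId, callback) {
  const tx = db.transaction(['songs', 'tags'], 'readonly');
  // Request errors bubble up to the transaction, so one handler covers them.
  tx.onerror = () => callback(tx.error);

  const songRequest = tx.objectStore('songs').get(songId);
  songRequest.onsuccess = () => {
    const song = songRequest.result;
    if (!song) return callback(null, undefined); // no song with that key

    const tagNames = song.tags || [];
    const tags = [];
    let pending = tagNames.length;
    if (pending === 0) return callback(null, { song, tags });

    tagNames.forEach((name, i) => {
      const tagRequest = tx.objectStore('tags').get(name);
      tagRequest.onsuccess = () => {
        tags[i] = tagRequest.result; // undefined if the tag has no record
        pending -= 1;
        if (pending === 0) callback(null, { song, tags });
      };
    });
  };
}
```

Both stores are read in a single transaction here, so the lookup costs one transaction and one `get` per tag rather than a generic join.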
Thanks to the 'tags' index on songs, the reverse is pretty easy too. Note that we're using tag names directly, not messing around with some intermediate numeric tag_id.
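The reverse lookup can be sketched as a single index query. This assumes the multiEntry index on songs is named `tags`; the function name and callback signature are illustrative.

```javascript
// Finds every song carrying the given tag name via the multiEntry index.
function getSongsByTag(db, tagName, callback) {
  const tx = db.transaction('songs', 'readonly');
  const request = tx.objectStore('songs').index('tags').getAll(tagName);
  request.onerror = () => callback(request.error);
  // request.result is an array of all matching song records.
  request.onsuccess = () => callback(null, request.result);
}
```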
You're basically facing the same problem people face when they have to choose between a relational database like MySQL and a document-oriented one like MongoDB. Duplicating data (not to mention the keys themselves) takes up space, and it's definitely more "third normal form"-ish to store one copy and use foreign keys instead.
That said, I've been through both sides of this in my IndexedDB work. The storage-efficiency concern you raise needs to be weighed against the way you're actually accessing the data. When you want to achieve a foreign key-like schema in IndexedDB, it necessarily requires two or more object stores, stored as something like two separate files on the underlying filesystem. That means that for each query where you want foreign key data (here, tags) you need at least two object store hits, probably two transactions and, I'm assuming, the extra I/O overhead and such associated with those.
I'd go with the document-oriented approach and try to make the storage hit less severe with tricks like key shorthand (e.g. "n" instead of "name").
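To illustrate the key-shorthand idea, here is a minimal sketch that renames fields just before storing and restores them after reading. The field names and their one-letter codes are hypothetical; any stable mapping would do.

```javascript
// Hypothetical mapping from readable field names to short stored keys.
const SHORT_KEYS = new Map([['title', 't'], ['artist', 'a'], ['tags', 'g']]);
// The inverse mapping, used when reading records back.
const LONG_KEYS = new Map([...SHORT_KEYS].map(([longKey, shortKey]) => [shortKey, longKey]));

// Returns a shallow copy of `record` with top-level keys renamed via `mapping`;
// keys that are not in the mapping are kept unchanged.
function renameKeys(record, mapping) {
  return Object.fromEntries(
    Object.entries(record).map(([key, value]) => [
      mapping.has(key) ? mapping.get(key) : key,
      value,
    ])
  );
}

const compact = (song) => renameKeys(song, SHORT_KEYS);  // before put()
const expand = (stored) => renameKeys(stored, LONG_KEYS); // after get()
```

Note that any `keyPath` or index definition then has to use the short names as well (for example an index on `g` rather than `tags`).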