Which object databases allow indexing on everything in the database?

Posted 2024-10-19 11:37:34

Currently, db4o does not allow indexing on the contents of collections. What object databases do allow indexing of any individual field in the database?

Example:

using System.Collections;

class RootClass
{
   string thisIsIndexed; // Field can be indexed for quick searching.
   IList contentsNotIndexed = new ArrayList(); // Holds SubClass instances; creates a 1-to-many relationship.
}

class SubClass
{
   string thisIsNotIndexed; // Field cannot be indexed.
}

For db4o to search by the field "thisIsNotIndexed", it would have to load each complete object into memory and then scan the field with LINQ-to-Objects. This is slow, because a search could mean loading the entire database into RAM. The workaround is to put every field you want to search by in the root object, but that seems like an artificial limitation.

Are there any object databases that do not suffer from this limitation, and allow indexing of any string in a sub-object?

Update

Answer #1:

I found a method which gives the best of both worlds: ease of use (with a hierarchical structure), and blindingly fast native queries using full indexing on the whole tree. It involves a bit of a trick, and a method that caches the contents of parent nodes:

  1. Create the nested hierarchy as normal.
  2. For each sub-node, create a reverse reference to the node's parent.
  3. You can now query the leaf nodes. We are halfway there: we can query, but it's slow, because searching by some parameter in a parent node requires a join to navigate up the tree.
  4. To speed it up, create a "cache" parameter which caches the search terms from the parent node. It's a method whose value is initially null; the first time it's called it does the expensive join and mirrors the field, and from that point on the search is extremely quick.
  5. This works well for data that never changes, e.g. temperature samples over time. If the data is going to change, then you need some way of clearing the cached values when the value in the root node changes, perhaps by setting a "dirty" flag in each leaf node.
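
The steps above can be sketched in plain C# roughly as follows. This is only an in-memory illustration of the back-reference and cached-mirror idea, not db4o code: the names Station, Sample, StationName and Invalidate are made up for the example, and in db4o the mirrored value would presumably have to be a stored, indexed field for the query to benefit from it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical names throughout; none of this is db4o API.
class Station                                   // parent (root) node
{
    public string Name;
    public List<Sample> Samples = new List<Sample>();

    public Station(string name) { Name = name; }

    public Sample Add(double celsius)
    {
        var sample = new Sample(this, celsius); // step 2: child keeps a back-reference
        Samples.Add(sample);
        return sample;
    }

    public void Rename(string newName)
    {
        Name = newName;
        foreach (var s in Samples) s.Invalidate(); // step 5: drop stale mirrored values
    }
}

class Sample                                    // leaf node
{
    public Station Parent;                      // reverse reference to the parent
    public double Celsius;
    private string cachedStationName;           // mirror of Parent.Name; null = not cached yet

    public Sample(Station parent, double celsius)
    {
        Parent = parent;
        Celsius = celsius;
    }

    // Step 4: the first call walks up to the parent (the "join"),
    // later calls reuse the mirrored value.
    public string StationName
    {
        get { return cachedStationName ?? (cachedStationName = Parent.Name); }
    }

    public void Invalidate() { cachedStationName = null; }
}

static class Demo
{
    static void Main()
    {
        var oslo = new Station("Oslo");
        oslo.Add(-3.5);
        oslo.Add(1.0);
        List<Sample> leaves = oslo.Samples;

        // Step 3: query the leaves directly by a value that lives on the parent.
        Console.WriteLine(leaves.Count(s => s.StationName == "Oslo"));   // prints 2

        oslo.Rename("Bergen");                  // clears the mirrored values
        Console.WriteLine(leaves.Count(s => s.StationName == "Bergen")); // prints 2
    }
}
```

Clearing the mirror eagerly in Rename is one way to realise the "dirty" flag from step 5; a flag checked lazily inside the getter would work as well.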

Answer #2:

If you use an array instead of a List, you can descend into the child node using SODA. If you use a List, SODA doesn't support it, so you simply can't query with SODA (or anything else that depends on SODA, such as LINQ, QBE, native queries, etc.).
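
As a rough sketch of what Answer #2 describes, a SODA query that descends through an array field might look like the code below. It assumes the db4o .NET assemblies (Db4objects.Db4o) are referenced and that descending into the array behaves as the answer says. The field name "children", the constructors and the file name "sample.db4o" are placeholders invented for the example.

```csharp
using System;
using Db4objects.Db4o;
using Db4objects.Db4o.Query;

class RootClass
{
    string thisIsIndexed;
    SubClass[] children;   // array instead of a list, as Answer #2 suggests (hypothetical field name)

    public RootClass(string name, params SubClass[] kids)
    {
        thisIsIndexed = name;
        children = kids;
    }
}

class SubClass
{
    string thisIsNotIndexed;

    public SubClass(string value) { thisIsNotIndexed = value; }
}

static class SodaDemo
{
    static void Main()
    {
        // "sample.db4o" is a placeholder file name.
        using (IObjectContainer db = Db4oEmbedded.OpenFile("sample.db4o"))
        {
            db.Store(new RootClass("root-1", new SubClass("needle"), new SubClass("hay")));

            // SODA: start from RootClass, walk into the array field,
            // then constrain the field on the child objects.
            IQuery query = db.Query();
            query.Constrain(typeof(RootClass));
            query.Descend("children").Descend("thisIsNotIndexed").Constrain("needle");

            IObjectSet matches = query.Execute();
            Console.WriteLine(matches.Count);
        }
    }
}
```

Whether such a query actually uses an index on SubClass.thisIsNotIndexed depends on the db4o version and configuration, so it is worth measuring rather than assuming.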


Comments (2)

平定天下 2024-10-26 11:37:34

Well, you can index SubClass.thisIsNotIndexed in your example, and that lets you find the SubClass instances quickly.

But of course you are right that you cannot index collections. By that I mean it is not possible to run efficient queries on whether a collection contains certain elements. For example, a query for all RootClass instances that contain a certain SubClass will be slow, because there is no proper collection indexing.

In db4o you have to work around this issue. One example is to add a field on the SubClass that holds a reference to the parent. Then you can run the query efficiently.

Another small thing: you can set an index on a collection field, but that is just an index on the reference to the collection object. It lets you find the object that references a certain collection instance, which is usually pretty useless.

I guess larger object databases do support indexing of collections and the queries that go with it.

眉目亦如画i 2024-10-26 11:37:34

I'm basing this on my experience with db4o under Scala and Java, but hopefully it is still valid: the field 'contentsNotIndexed' holds ArrayList instances, so indexing that field should only assist you in querying those ArrayList instances. If you want to query the contents of those lists efficiently, you would have to define an index on the objects you expect to find inside the lists and descend your query into the ArrayList under the 'contentsNotIndexed' field. I don't know the internals of ArrayList well enough to suggest where that descent would go, though.

Depending on your needs, you can also design your class to use an array instead of an ArrayList in some cases to achieve the effect you want.
