How do you avoid duplicating data in a document database such as RavenDB?

Posted on 2024-09-04 21:12:56

Given that document databases, such as RavenDB, are non-relational, how do you avoid duplicating data that multiple documents have in common? How do you maintain that data if it's okay to duplicate it?

Comments (2)

や莫失莫忘 2024-09-11 21:12:56

With a document database you have to duplicate data to some degree. How much depends on your system and its use cases.

For example, if we have simple Blog and User aggregates, we could set them up as:

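  public class User
  {
    public string Id { get; set; }
    public string Name { get; set; }
    public string Username { get; set; }
    public string Password { get; set; }
  }

  // Copy of the User fields that a Blog needs for display.
  // Declared outside Blog because C# does not allow a nested type and a
  // property of the same class to share the name BlogUser.
  public class BlogUser
  {
    public string Id { get; set; }
    public string Name { get; set; }
  }

  public class Blog
  {
    public string Id { get; set; }
    public string Title { get; set; }

    // Embedded in the Blog document, so a single read loads it.
    public BlogUser BlogUser { get; set; }
  }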

In this example I have embedded a BlogUser inside the Blog class, carrying the Id and Name properties of the User aggregate associated with the Blog. I have included only these fields because they are the only ones the Blog class is interested in; it doesn't need to know the user's username or password when the blog is being displayed.

These embedded classes will depend on your system's use cases, so you have to design them carefully. The general idea is to design aggregates that can be loaded from the database with a single read and that contain all the data required to display or manipulate them.
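
As a rough in-memory sketch of that idea (no RavenDB calls involved), the snippet below builds a Blog that carries its own copy of the user's Id and Name, so rendering it never touches the User. The types are redeclared so the sketch compiles on its own; `BlogUser.From`, `Blog.Render`, and `Demo.Run` are hypothetical helpers added for illustration, not part of the original answer or of any library.

```csharp
using System;

public class User
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string Username { get; set; }
    public string Password { get; set; }
}

public class BlogUser
{
    public string Id { get; set; }
    public string Name { get; set; }

    // Hypothetical helper: copy only the fields a Blog needs from a User.
    public static BlogUser From(User user) =>
        new BlogUser { Id = user.Id, Name = user.Name };
}

public class Blog
{
    public string Id { get; set; }
    public string Title { get; set; }
    public BlogUser BlogUser { get; set; }

    // Hypothetical helper: everything needed for display is already
    // inside the Blog, so no second lookup of the User is required.
    public string Render() => $"{Title} by {BlogUser.Name}";
}

public static class Demo
{
    public static string Run()
    {
        var user = new User
        {
            Id = "users/1", Name = "Ada", Username = "ada", Password = "secret"
        };
        var blog = new Blog
        {
            Id = "blogs/1", Title = "Notes", BlogUser = BlogUser.From(user)
        };
        return blog.Render();
    }
}
```

Calling `Demo.Run()` returns "Notes by Ada". The username and password never leave the User, which is the point of keeping the embedded copy small.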

This then leads to the question of what happens when the User.Name gets updated.

With most document databases you would have to load all the instances of Blog that belong to the updated User, update the Blog.BlogUser.Name field on each, and save them all back to the database.

Raven is slightly different as it supports set-based updates, so you can run a single update against RavenDB that updates the BlogUser.Name property of the user's blogs without having to load and update each one individually.

The code for doing the update within RavenDB (the manual way) for all the blogs would be:

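  // Requires: using System.Linq;
  public void UpdateBlogUser(User user)
  {
    // Load every blog whose embedded user matches the updated user.
    // The session field and the "blogsByUserId" index are assumed to exist.
    var blogs = session.Query<Blog>("blogsByUserId")
                  .Where(b => b.BlogUser.Id == user.Id)
                  .ToList();

    // Copy the new name into each embedded BlogUser.
    foreach (var blog in blogs)
      blog.BlogUser.Name = user.Name;

    session.SaveChanges();
  }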

I've added the SaveChanges call just as an example. The RavenDB client uses the Unit of Work pattern, so this should really happen somewhere outside of this method.

沙沙粒小 2024-09-11 21:12:56

There's no one "right" answer to your question IMHO. It truly depends on how mutable the data you're duplicating is.

Take a look at the RavenDB documentation for lots of answers about document DB design vs. relational design, and specifically check out the "Associations Management" section of the Document Structure Design Considerations document. In short, document DBs use references by ID when they don't want to embed shared data in a document. These IDs are not like foreign keys: the database does not enforce them, so it is entirely up to the application to ensure their integrity and to resolve them.
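
To make the reference-by-ID trade-off concrete, here is a minimal in-memory sketch. It does not use the RavenDB client: the dictionary is a hypothetical stand-in for a store keyed by document ID, and `UserId`, `AuthorName`, and `ReferenceDemo` are names invented for illustration.

```csharp
using System;
using System.Collections.Generic;

public class User
{
    public string Id { get; set; }
    public string Name { get; set; }
}

// The Blog stores only the user's ID instead of a copy of the name.
public class Blog
{
    public string Id { get; set; }
    public string Title { get; set; }
    public string UserId { get; set; }
}

public static class ReferenceDemo
{
    // The application, not the database, resolves the reference and
    // decides what to do when the ID points at nothing.
    public static string AuthorName(Blog blog, IReadOnlyDictionary<string, User> users) =>
        users.TryGetValue(blog.UserId, out var user) ? user.Name : "(unknown user)";

    public static string[] Run()
    {
        // Hypothetical stand-in for a document store keyed by document ID.
        var users = new Dictionary<string, User>
        {
            ["users/1"] = new User { Id = "users/1", Name = "Ada" }
        };
        var blog = new Blog { Id = "blogs/1", Title = "Notes", UserId = "users/1" };

        var before = AuthorName(blog, users);

        // Renaming the user touches one document; no blog is rewritten.
        users["users/1"].Name = "Ada L.";
        var after = AuthorName(blog, users);

        // Nothing prevents deleting the user, which leaves a dangling ID.
        users.Remove("users/1");
        var dangling = AuthorName(blog, users);

        return new[] { before, after, dangling };
    }
}
```

`ReferenceDemo.Run()` returns "Ada", "Ada L.", and "(unknown user)". A rename needs no fan-out, but every read pays for a second lookup and has to handle a missing target. That is why the choice depends on how often the shared data changes.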
