Windows Azure Table Services - Extended Properties and Table Schema

Posted on 2024-09-05 23:30:54

I have an entity that, in addition to a few common properties, contains a list of extended properties stored as (Name, Value) pairs of strings within a collection. I should probably mention that these extended properties vary widely from instance to instance, and that they only need to be listed for each instance (there won't be any queries over the extended properties, for example finding all instances with a particular (Name, Value) pair). I'm exploring how I might persist this entity using Windows Azure Table Services. With the particular approach I'm testing now, I'm concerned that performance may degrade over time as the application encounters more distinct extended property names.

If I were storing this entity in a typical relational database, I'd probably have two tables to support this schema: the first would contain the entity identifier and its common properties, and the second would reference the entity identifier and use EAV style row-modeling to store the extended (Name, Value) pairs, one to each row.

Since tables in Windows Azure already use an EAV model, I'm considering custom serialization of my entity so that the extended properties are stored as though they were declared at compile time for the entity. I can use the Reading- and Writing-Entity events provided by DataServiceContext to accomplish this.

private void OnReadingEntity(object sender, ReadingWritingEntityEventArgs e)
{
    MyEntity Entry = e.Entity as MyEntity;

    if (Entry != null)
    {
        XElement Properties = e.Data
            .Element(Atom + "content")
            .Element(Meta + "properties");

        //select metadata from the extended properties
        Entry.ExtendedProperties = (from p in Properties.Elements()
                          where p.Name.Namespace == Data && !IsReservedPropertyName(p.Name.LocalName) && !string.IsNullOrEmpty(p.Value)
                          select new Property(p.Name.LocalName, p.Value)).ToArray();
    }
}

private void OnWritingEntity(object sender, ReadingWritingEntityEventArgs e)
{
    MyEntity Entry = e.Entity as MyEntity;

    if (Entry != null)
    {
        XElement Properties = e.Data
            .Element(Atom + "content")
            .Element(Meta + "properties");

        //add extended properties from the metadata
        foreach (Property p in (from p in Entry.ExtendedProperties 
                                where !IsReservedPropertyName(p.Name) && !string.IsNullOrEmpty(p.Value)
                                select p))
        {
            Properties.Add(new XElement(Data + p.Name, p.Value));
        }
    }
}
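
For readers who want to try the filter outside of a DataServiceContext, here is a minimal sketch of the pieces the handlers above rely on but do not show. The three namespace URIs are the standard ones for Atom payloads in ADO.NET Data Services; everything else is an assumption made for illustration: the class name, the SampleEntry payload, and in particular the body of IsReservedPropertyName, which the question never defines. The sample also assumes that properties belonging only to other instances arrive as empty elements marked m:null="true".

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class AtomEntrySketch
{
    // Namespaces used by ADO.NET Data Services Atom payloads.
    public static readonly XNamespace Atom = "http://www.w3.org/2005/Atom";
    public static readonly XNamespace Meta = "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata";
    public static readonly XNamespace Data = "http://schemas.microsoft.com/ado/2007/08/dataservices";

    // Hypothetical stand-in: the real helper would also exclude the
    // compile-time properties declared on MyEntity.
    public static bool IsReservedPropertyName(string name)
    {
        return name == "PartitionKey" || name == "RowKey" || name == "Timestamp";
    }

    // Same filter as OnReadingEntity, applied to a bare <entry> element.
    public static string[] ExtendedPropertyNames(XElement entry)
    {
        XElement properties = entry
            .Element(Atom + "content")
            .Element(Meta + "properties");

        return (from p in properties.Elements()
                where p.Name.Namespace == Data
                      && !IsReservedPropertyName(p.Name.LocalName)
                      && !string.IsNullOrEmpty(p.Value)
                select p.Name.LocalName).ToArray();
    }

    // Hand-built payload (an assumption, not captured from the service).
    public static XElement SampleEntry()
    {
        return new XElement(Atom + "entry",
            new XElement(Atom + "content",
                new XAttribute("type", "application/xml"),
                new XElement(Meta + "properties",
                    new XElement(Data + "PartitionKey", "p1"),
                    new XElement(Data + "RowKey", "r1"),
                    new XElement(Data + "Color", "red"),
                    // Stand-in for a property stored only on another instance.
                    new XElement(Data + "Weight",
                        new XAttribute(Meta + "null", "true")))));
    }
}
```

In a real client the two handlers would be attached with `context.ReadingEntity += OnReadingEntity;` and `context.WritingEntity += OnWritingEntity;`. Note that the filter still has to visit every element in the payload, so it hides the empty properties from the entity without reducing the size of the XML that is transferred and parsed.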

This works, and since I can define requirements for extended property names and values, I can ensure that they conform to all the standard requirements for entity properties within a Windows Azure Table.
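
As an illustration of what such requirements might look like, the following sketch validates a candidate extended property name before it is written. It is not the code from my application: the class and method names are hypothetical, and the pattern is a deliberately conservative ASCII subset of C# identifier syntax (the table service asks that property names follow C# identifier rules and stay within 255 characters, and the real identifier rules admit more characters than this pattern does).

```csharp
using System;
using System.Text.RegularExpressions;

public static class ExtendedPropertyNameRules
{
    // Conservative assumption: ASCII letters, digits and underscore only,
    // not starting with a digit. \z anchors at the true end of the string.
    private static readonly Regex IdentifierPattern =
        new Regex(@"^[A-Za-z_][A-Za-z0-9_]*\z");

    // System properties present on every table entity.
    private static readonly string[] SystemNames =
        { "PartitionKey", "RowKey", "Timestamp" };

    // Hypothetical version of the helper used in the handlers above;
    // a real one would also list the entity's compile-time properties.
    public static bool IsReservedPropertyName(string name)
    {
        return Array.IndexOf(SystemNames, name) >= 0;
    }

    public static bool IsValidExtendedPropertyName(string name)
    {
        return !string.IsNullOrEmpty(name)
            && name.Length <= 255
            && IdentifierPattern.IsMatch(name)
            && !IsReservedPropertyName(name);
    }
}
```

Rejecting a name at this point, rather than inside OnWritingEntity, would keep invalid pairs from being silently dropped during serialization.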

So what happens over time as the application encounters thousands of different extended property names?

Here's what I've observed within the development storage environment:

- The table container schema grows with each new name. I'm not sure exactly how this schema is used (probably for the next point), but obviously this xml document could grow quite large over time.
- Whenever an instance is read, the xml passed to OnReadingEntity contains elements for every property name ever stored for any other instance (not just the ones stored for the particular instance being read). This means that retrieval of an entity will become slower over time.

Should I expect these behaviors in the production storage environment? I can see how these behaviors would be acceptable for most tables, as the schema would be mostly static over time. Perhaps Windows Azure Tables were not designed to be used like this? If so, I will certainly need to change my approach. I'm also open to suggestions on alternate approaches.
