Azure - updating an existing XML file in BLOB storage

Posted 2024-12-08 07:51:38


I have XML files stored in BLOB storage, and I am trying to figure out the most efficient way to update them (and/or add some elements to them). In a WebRole, I came up with this:

using (MemoryStream ms = new MemoryStream())
{
      var blob = container.GetBlobReference("file.xml");
      blob.DownloadToStream(ms);
      ms.Seek(0, SeekOrigin.Begin); // rewind before parsing
      XDocument xDoc = XDocument.Load(ms);

      // Do some updates/inserts using LINQ to XML.

      blob.Delete(); // Details about this later on.

      using (MemoryStream msNew = new MemoryStream())
      {
           xDoc.Save(msNew);
           msNew.Seek(0, SeekOrigin.Begin);
           blob.UploadFromStream(msNew);
      }
}

I am looking at these parameters considering the efficiency:

  1. BLOB Transactions.
  2. Bandwidth. (Not sure if it's counted, because the code runs in the data-center)
  3. Memory consumption on the instance.

Some things to mention:

  • My xml files are around 150-200 KB.

  • I am aware of the fact that XDocument loads the whole file into
    memory, and working with streams (XmlWriter and XmlReader) could
    solve this. But I assume this would require working with BlobStream,
    which could be less efficient transaction-wise (I think).

  • About blob.Delete(): without it, the uploaded xml in the blob storage
    seems to be missing some closing tags at the end. I assumed
    this is caused by a collision with the old data. I could be
    completely wrong here, but using the delete solved it (costing one
    more transaction though).

Is the code I provided good practice, or does a more efficient way exist, considering the parameters I mentioned?
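For what it's worth, the save/rewind round trip can be exercised on its own, with no storage calls at all. This stand-alone sketch (the document content and element names are made up for illustration) shows why the MemoryStream must be rewound after XDocument.Save before anything reads from it for upload:

```csharp
using System;
using System.IO;
using System.Xml.Linq;

class RoundTrip
{
    static void Main()
    {
        // Hypothetical document standing in for the blob's content.
        var xDoc = XDocument.Parse("<items><item id=\"1\">old</item></items>");

        // Do some updates/inserts using LINQ to XML.
        xDoc.Root.Element("item").Value = "new";

        using (var ms = new MemoryStream())
        {
            // Save leaves the stream positioned at the end of the data.
            xDoc.Save(ms);

            // Without this rewind, a consumer reading from the current
            // position would see zero bytes.
            ms.Seek(0, SeekOrigin.Begin);

            var reloaded = XDocument.Load(ms);
            Console.WriteLine(reloaded.Root.Element("item").Value); // prints "new"
        }
    }
}
```

The same pattern applies whether the stream is handed to UploadFromStream or anything else that reads from the current position.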

Answers (2)

沫尐诺 2024-12-15 07:51:38


I believe the problem with the stream based method is that the storage client doesn't know how long the stream is before it starts to send the data. This is probably causing the content-length to not be updated, giving the appearance of missing data at the end of the file.

Working with the content of the blob in text format will help. You can download the blob contents as text and then upload as text. Doing this, you should be able to both avoid the delete (saving you 1/3rd the transactions) and have simpler code.

var blob = container.GetBlobReference("file.xml");
var xml = blob.DownloadText(); // transaction 1
var xDoc = XDocument.Parse(xml);

// Do some updates/inserts using LINQ to XML.

blob.UploadText(xDoc.ToString()); // transaction 2

Additionally, if you can recreate the file without downloading it in the first place (we can do this sometimes), then you can just upload it and overwrite the old one using one storage transaction.

var blob = container.GetBlobReference("file.xml");
var xDoc = new XDocument(/* generate file */);

blob.UploadText(xDoc.ToString()); // transaction 1

无悔心 2024-12-15 07:51:38


I am aware of the fact that XDocument loads the whole file into memory, and working in streams ( XmlWriter and XmlReader ) could solve this.

Not sure it would solve too much. Think about it: how do you add Kool-Aid to the water while it is flowing through the hose? That is what a stream is. Better to wait until it is in a container.

Outside of that, what is the reason for the focus on efficiency (a technical problem) rather than editing (the business problem)? Are the documents changed often enough to warrant a serious look at performance? Or are you just falling prey to the normal developer tendency to do more than what is necessary? (NOTE: I am often guilty in this area too)

Without a concept of a Flush(), the Delete is an acceptable option, at first glance. I am not sure whether moving to the async methods might achieve the same end with less overhead.
