使用 .NET XmlTextWriter 检查输出大小

发布于 2024-08-22 19:13:21 字数 97 浏览 8 评论 0原文

我需要生成一个 XML 文件,并且需要将尽可能多的数据放入其中,但文件大小有限制。所以我需要继续插入数据,直到不再有任何内容为止。如何计算 XML 文件的大小而不将其重复写入文件?

I need to generate an XML file and i need to stick as much data into it as possible BUT there is a filesize limit. So i need to keep inserting data until something says no more. How do i figure out the XML file size without repeatably writing it to file?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

执妄 2024-08-29 19:13:21

我同意约翰·桑德斯的观点。下面的一些代码基本上可以完成他所说的操作,但作为 XmlSerializer(除了 FileStream)并使用 MemoryStream 作为中间存储。不过,延长流量可能更有效。

public class PartitionedXmlSerializer<TObj>
{
    private readonly int _fileSizeLimit;

    public PartitionedXmlSerializer(int fileSizeLimit)
    {
        _fileSizeLimit = fileSizeLimit;
    }

    public void Serialize(string filenameBase, TObj obj)
    {
        using (var memoryStream = new MemoryStream())
        {
            // serialize the object in the memory stream
            using (var xmlWriter = XmlWriter.Create(memoryStream))
                new XmlSerializer(typeof(TObj))
                    .Serialize(xmlWriter, obj);

            memoryStream.Seek(0, SeekOrigin.Begin);

            var extensionFormat = GetExtensionFormat(memoryStream.Length);

            var buffer = new char[_fileSizeLimit];

            var i = 0;
            // split the stream into files
            using (var streamReader = new StreamReader(memoryStream))
            {
                int readLength;
                while ((readLength = streamReader.Read(buffer, 0, _fileSizeLimit)) > 0)
                {
                    var filename 
                        = Path.ChangeExtension(filenameBase, 
                            string.Format(extensionFormat, i++));
                    using (var fileStream = new StreamWriter(filename))
                        fileStream.Write(buffer, 0, readLength);
                }
            }
        }
    }

    /// <summary>
    /// Gets the a file extension formatter based on the 
    /// <param name="fileLength">length of the file</param> 
    /// and the max file length
    /// </summary>
    private string GetExtensionFormat(long fileLength)
    {
        var numFiles = fileLength / _fileSizeLimit;
        var extensionLength = Math.Ceiling(Math.Log10(numFiles));
        var zeros = string.Empty;
        for (var j = 0; j < extensionLength; j++)
        {
            zeros += "0";
        }
        return string.Format("xml.part{{0:{0}}}", zeros);
    }
}

要使用它,您可以使用最大文件长度对其进行初始化,然后使用基本文件路径和对象进行序列化。

public class MyType
{
    public int MyInt;
    public string MyString;
}

public void Test()
{
    var myObj = new MyType { MyInt = 42, 
                             MyString = "hello there this is my string" };
    new PartitionedXmlSerializer<MyType>(2)
        .Serialize("myFilename", myObj);
}

这个特定的示例将生成一个 xml 文件,分为

myFilename.xml.part001
myFilename.xml.part002
myFilename.xml.part003
...
myFilename.xml.part110

I agree with John Saunders. Here's some code that will basically do what he's talking about but as an XmlSerializer except as a FileStream and uses a MemoryStream as intermediate storage. It may be more effective to extend stream though.

public class PartitionedXmlSerializer<TObj>
{
    private readonly int _fileSizeLimit;

    public PartitionedXmlSerializer(int fileSizeLimit)
    {
        _fileSizeLimit = fileSizeLimit;
    }

    public void Serialize(string filenameBase, TObj obj)
    {
        using (var memoryStream = new MemoryStream())
        {
            // serialize the object in the memory stream
            using (var xmlWriter = XmlWriter.Create(memoryStream))
                new XmlSerializer(typeof(TObj))
                    .Serialize(xmlWriter, obj);

            memoryStream.Seek(0, SeekOrigin.Begin);

            var extensionFormat = GetExtensionFormat(memoryStream.Length);

            var buffer = new char[_fileSizeLimit];

            var i = 0;
            // split the stream into files
            using (var streamReader = new StreamReader(memoryStream))
            {
                int readLength;
                while ((readLength = streamReader.Read(buffer, 0, _fileSizeLimit)) > 0)
                {
                    var filename 
                        = Path.ChangeExtension(filenameBase, 
                            string.Format(extensionFormat, i++));
                    using (var fileStream = new StreamWriter(filename))
                        fileStream.Write(buffer, 0, readLength);
                }
            }
        }
    }

    /// <summary>
    /// Gets the a file extension formatter based on the 
    /// <param name="fileLength">length of the file</param> 
    /// and the max file length
    /// </summary>
    private string GetExtensionFormat(long fileLength)
    {
        var numFiles = fileLength / _fileSizeLimit;
        var extensionLength = Math.Ceiling(Math.Log10(numFiles));
        var zeros = string.Empty;
        for (var j = 0; j < extensionLength; j++)
        {
            zeros += "0";
        }
        return string.Format("xml.part{{0:{0}}}", zeros);
    }
}

To use it, you'd initialize it with the max file length and then serialize using the base file path and then the object.

public class MyType
{
    public int MyInt;
    public string MyString;
}

public void Test()
{
    var myObj = new MyType { MyInt = 42, 
                             MyString = "hello there this is my string" };
    new PartitionedXmlSerializer<MyType>(2)
        .Serialize("myFilename", myObj);
}

This particular example will generate an xml file partitioned into

myFilename.xml.part001
myFilename.xml.part002
myFilename.xml.part003
...
myFilename.xml.part110
何以笙箫默 2024-08-29 19:13:21

一般来说,即使关闭所有打开的标记,也不能在任意位置破坏 XML 文档。

但是,如果您需要将 XML 文档拆分为多个文件,每个文件不超过一定的大小,那么您应该创建您自己的 Stream 类的子类型。这个“PartitionedFileStream”类可以写入特定文件,直至达到大小限制,然后创建一个新文件,并写入该文件,直至达到大小限制,等等。

这将为您留下多个文件连接起来构成一个有效的 XML 文档。


一般情况下,结束标签不起作用。考虑一种 XML 格式,该格式必须包含一个元素 A,后跟一个元素 B。如果在写入元素 A 后关闭标签,则您没有有效的文档 - 您需要写入元素 B。

但是,在以下特定情况下一个简单的站点地图文件,可能只需关闭标签即可。

In general, you cannot break XML documents at arbitrary locations, even if you close all open tags.

However, if what you need is to split an XML document over multiple files, each of no more than a certain size, then you should create your own subtype of the Stream class. This "PartitionedFileStream" class could write to a particular file, up to the size limit, then create a new file, and write to that file, up to the size limit, etc.

This would leave you with multiple files which, when concatenated, make up a valid XML document.


In the general case, closing tags will not work. Consider an XML format that must contain one element A followed by one element B. If you closed the tags after writing element A, then you do not have a valid document - you need to have written element B.

However, in the specific case of a simple site map file, it may be possible to just close the tags.

执妄 2024-08-29 19:13:21

您可以向XmlTextWriter询问它的BaseStream,并检查它的Position
正如其他人所指出的,您可能需要保留一些空间才能正确关闭 Xml。

You can ask the XmlTextWriter for it's BaseStream, and check it's Position.
As the other's pointed out, you may need to reserve some headroom to properly close the Xml.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文