Asynchronous XmlReader in .NET?

Posted on 2024-08-21 11:55:28

Is there a way to access an XmlReader asynchronously? The XML is coming in off the network from many different clients, as in XMPP; it is a constant stream of <action>...</action> tags.

What I'm after is a BeginRead/EndRead-like interface. The best solution I've managed to come up with is to do an asynchronous read for 0 bytes on the underlying network stream and then, when some data arrives, call Read on the XmlReader. This will, however, block until all of the data for the node becomes available. That solution looks roughly like this:

private Stream syncstream;
private NetworkStream ns;
private XmlReader reader;

// This code runs first.
public void Init()
{
    syncstream = Stream.Synchronized(ns);
    reader = XmlReader.Create(syncstream);
    byte[] x = new byte[1];
    // A 0-byte read is used only as a "data has arrived" signal.
    syncstream.BeginRead(x, 0, 0, new AsyncCallback(ReadCallback), null);
}

private void ReadCallback(IAsyncResult ar)
{
    syncstream.EndRead(ar);
    reader.Read(); // this will block for a while, until the entire node is available
    // do something with the XML node
    byte[] x = new byte[1];
    // Re-arm the 0-byte read to wait for the next batch of data.
    syncstream.BeginRead(x, 0, 0, new AsyncCallback(ReadCallback), null);
}

EDIT: Is this a possible algorithm for working out whether a string contains a complete XML node?

Func<string, bool> nodeChecker = currentBuffer =>
{
    // if there is nothing, definitely no tag
    if (currentBuffer == "") return false;
    // if we have <![CDATA[ and not ]]>, hold on, else pass it on
    if (currentBuffer.Contains("<![CDATA[") && !currentBuffer.Contains("]]>")) return false;
    if (currentBuffer.Contains("<![CDATA[") && currentBuffer.Contains("]]>")) return true;
    // these tag-related checks will also catch <? ?> processing instructions
    // if there is a < but no >, we still have an open tag
    if (currentBuffer.Contains("<") && !currentBuffer.Contains(">")) return false;
    // if there is a <...>, we have a complete element
    // >...< will never happen because we pass the buffer on to the parser when we get to >
    if (currentBuffer.Contains("<") && currentBuffer.Contains(">")) return true;
    // if there is no < and no >, we have a complete text node
    if (!currentBuffer.Contains("<") && !currentBuffer.Contains(">")) return true;
    // > with no < will never happen; we pass the buffer on to the parser when we get to >
    // by default, don't block
    return false;
};

Comments (6)

天邊彩虹 2024-08-28 11:55:28

XmlReader in .NET 4.5 has async versions of most of the methods that involve I/O.

Check the sample code here.
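
As a rough illustration of what that answer describes (not the sample it linked to), the sketch below uses XmlReaderSettings.Async and XmlReader.ReadAsync, both available from .NET 4.5. The class and method names are hypothetical, and ConformanceLevel.Fragment is an assumption made so that a run of top-level <action> elements parses without a single root. ReadAsync still completes only once a whole node is available, but no thread is blocked while it waits.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using System.Xml;

public static class AsyncXmlSketch
{
    // Hypothetical helper: reads nodes one at a time without blocking a
    // thread while waiting for I/O, and returns the number of elements seen.
    public static async Task<int> CountElementsAsync(Stream input)
    {
        var settings = new XmlReaderSettings
        {
            Async = true,                                 // required before calling the *Async methods
            ConformanceLevel = ConformanceLevel.Fragment  // assumption: several top-level elements
        };

        int elements = 0;
        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            while (await reader.ReadAsync())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    elements++;
                    Console.WriteLine("element: " + reader.Name);
                }
            }
        }
        return elements;
    }
}
```

In the question's setting, the NetworkStream would be passed as input and the call awaited from the connection-handling code.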

江挽川 2024-08-28 11:55:28

XmlReader buffers in 4kB chunks, if I remember correctly from when I looked into this a couple of years ago. You could pad your inbound data to 4kB (ick!), or use a better parser. I fixed this by porting James Clark's XP (Java) to C# as part of Jabber-Net, here:

http://code.google.com/p/jabber-net/source/browse/#svn/trunk/xpnet

It's LGPL, only handles UTF8, isn't packaged for use, and has almost no documentation, so I wouldn't recommend using it. :)

猫腻 2024-08-28 11:55:28

The easiest thing to do is just put it on another thread, perhaps a ThreadPool thread, depending on how long it stays active. (Don't use thread pool threads for truly long-running tasks.)
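
A minimal sketch of that idea, assuming the caller wants element names handed over through a queue: the blocking Read loop runs in a task started with TaskCreationOptions.LongRunning, which hints to the scheduler that a dedicated thread may be appropriate rather than a thread pool thread. The names BackgroundXmlSketch and StartReader are hypothetical, and ConformanceLevel.Fragment is an assumption about the input.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;
using System.Xml;

public static class BackgroundXmlSketch
{
    // Hypothetical helper: runs the blocking XmlReader loop on its own thread
    // and passes element names to the consumer through a thread-safe queue.
    public static Task StartReader(Stream input, BlockingCollection<string> output)
    {
        return Task.Factory.StartNew(() =>
        {
            var settings = new XmlReaderSettings
            {
                ConformanceLevel = ConformanceLevel.Fragment // assumption: several top-level elements
            };
            try
            {
                using (XmlReader reader = XmlReader.Create(input, settings))
                {
                    while (reader.Read()) // blocks only this background thread
                    {
                        if (reader.NodeType == XmlNodeType.Element)
                            output.Add(reader.Name);
                    }
                }
            }
            finally
            {
                output.CompleteAdding(); // lets consumers leave their consuming loop
            }
        }, TaskCreationOptions.LongRunning);
    }
}
```

A consumer can iterate output.GetConsumingEnumerable() on whichever thread handles the parsed nodes; the loop ends once the reader reaches the end of the stream or fails.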

五里雾 2024-08-28 11:55:28

It looks like .NET 4.5 has a bool Async property for XmlReader (set through XmlReaderSettings) that is not there in 3.5. Maybe that will work for you?

叹倦 2024-08-28 11:55:28

This is really tricky, because XmlReader doesn't provide any asynchronous interface.

I'm not really sure how asynchronously BeginRead behaves when you ask it to read 0 bytes. It might invoke the callback immediately and then block when you call Read. That would be the same as calling Read directly and then scheduling the next Read on the thread pool, for example using QueueUserWorkItem.

It may be better to use BeginRead on the network stream to read data in, for example, 10kB chunks (while the system waits for the data, you wouldn't be blocking any thread). When you receive a chunk, you would copy it into a local MemoryStream, and your XmlReader would read from this MemoryStream.

This still has a problem though: after copying 10kB of data and calling Read several times, the last call would block. Then you would probably need to copy smaller chunks of data to unblock the pending call to Read. Once that's done, you could again start a new BeginRead call to read a larger portion of data asynchronously.

Honestly, this sounds pretty complicated, so I'm quite interested if anybody comes up with a better answer. However, it gives you at least some guaranteed asynchronous operations that take some time and do not block any threads in the meantime (which is the essential goal of asynchronous programming).

(Side note: You could try using F# asynchronous workflows to write this, because they make asynchronous code a lot simpler. The technique I described would still be tricky even in F#, though.)
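
The chunked-read idea above can be sketched roughly as follows. This is a simplified variant under stated assumptions, not the exact scheme described: it uses the Task-based Stream.ReadAsync (.NET 4.5) in place of BeginRead/EndRead, buffers decoded text in a StringBuilder rather than a MemoryStream, assumes UTF-8 input, and sidesteps the "last Read blocks" problem by re-parsing the whole buffer after each chunk and treating an XmlException as "not complete yet". All names are hypothetical. Re-parsing is wasteful for large buffers, a text node split across chunks at the top level may be reported in pieces, and any incomplete data left when the stream ends is dropped.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Threading.Tasks;
using System.Xml;

public static class ChunkedXmlSketch
{
    // Reads chunks asynchronously, then parses only data that is already in
    // memory, so XmlReader.Read never waits on the network.
    public static async Task<List<string>> ReadElementNamesAsync(Stream network, int chunkSize = 10 * 1024)
    {
        var names = new List<string>();
        var pending = new StringBuilder();
        var buffer = new byte[chunkSize];
        Decoder decoder = Encoding.UTF8.GetDecoder(); // carries split multi-byte characters across chunks
        var chars = new char[Encoding.UTF8.GetMaxCharCount(chunkSize)];

        int read;
        while ((read = await network.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            int count = decoder.GetChars(buffer, 0, read, chars, 0);
            pending.Append(chars, 0, count);

            // Keep the text buffered until it parses as well-formed fragments.
            if (TryParse(pending.ToString(), names))
                pending.Clear();
        }
        return names;
    }

    private static bool TryParse(string text, List<string> names)
    {
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        var found = new List<string>();
        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(text), settings))
            {
                while (reader.Read())
                {
                    if (reader.NodeType == XmlNodeType.Element)
                        found.Add(reader.Name);
                }
            }
        }
        catch (XmlException)
        {
            return false; // truncated markup: wait for the next chunk
        }
        names.AddRange(found);
        return true;
    }
}
```

The small chunkSize parameter exists mainly so the behaviour with partial data can be exercised; with a real NetworkStream each ReadAsync returns whatever has arrived, up to the buffer size.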

幸福%小乖 2024-08-28 11:55:28

Are you looking for something like the XamlReader.LoadAsync method?

An asynchronous XAML load operation will initially return an object that is purely the root object. Asynchronously, XAML parsing then continues, and any child objects are filled in under the root.
