Can an XSLT processor be multi-threaded?
I'm fishing for approaches to a problem with XSLT processing.
Is it possible to use parallel processing to speed up an XSLT processor? Or are XSLT processors inherently serial?
My hunch is that XML can be partitioned into chunks that could be processed by different threads, but since I'm not finding any documentation of such a feat, I'm getting skeptical. Is it possible to use StAX to chunk XML for concurrent processing?
It seems that most XSLT processors are implemented in Java or C/C++, but I really don't have a target language. I just want to know if a multi-threaded XSLT processor is conceivable.
What are your thoughts?
Comments (3)
A late answer, for people who hit this thread as a result of a search. At the time this question was asked, multithreading in XSLT was a theoretical possibility but wasn't actually realised in any production XSLT processors. Today multithreading is available "out-of-the-box" in Saxon-EE. A paper describing how this works was published at XML Prague 2015: see http://www.saxonica.com/papers/xmlprague-2015mhk.pdf
As in most programming languages, looping is inherently parallelizable as long as you follow a couple of rules; this is known as data parallelism.
Any looping construct could be parallelized in XSLT fairly easily, because XSLT variables are immutable and one iteration of xsl:for-each cannot change state seen by another.
With similar rules against mutation and dependencies, you really could parallelize most of an XSLT transformation as a kind of task-based parallelism.
First, fragment the whole document into tasks, segmented at XSLT command and text node boundaries; each task should be assigned a sequential index according to its position in the document (top to bottom).
Next, scatter the tasks to distinct XSLT processing functions, each running on a different thread; these processors will all need to be initialized with the same global state (variables, constants, etc.).
Finally, once all the transformations are complete, the controlling thread should gather the results (transformed strings) in index order and assemble them into the finished document.
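The three steps above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not how any particular processor works internally: the class and method names (`ParallelXsltSketch`, `splitIntoChunks`, `transformAll`) and the sample stylesheet are made up for the example, each child of the root element is assumed to be transformable on its own (no templates that look at siblings or ancestors), and the input is assumed not to rely on namespace declarations on the root. It uses only the JDK's built-in StAX and JAXP APIs. The compiled `Templates` object is shared between threads, while each task creates its own `Transformer`.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.events.XMLEvent;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: chunk with StAX, scatter to threads, gather in order.
class ParallelXsltSketch {

    // Step 1: one serial StAX pass that copies each child of the root
    // element into its own standalone XML string, in document order.
    static List<String> splitIntoChunks(String xml) throws Exception {
        XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));
        XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();
        List<String> chunks = new ArrayList<>();
        int depth = 0;
        StringWriter buffer = null;
        XMLEventWriter writer = null;
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement() && ++depth == 2) {
                buffer = new StringWriter();          // a new chunk begins
                writer = outputFactory.createXMLEventWriter(buffer);
            }
            if (writer != null) {
                writer.add(event);                    // copy events inside the chunk
            }
            if (event.isEndElement() && depth-- == 2) {
                writer.flush();
                writer.close();
                chunks.add(buffer.toString());        // the chunk is complete
                writer = null;
            }
        }
        return chunks;
    }

    // Steps 2 and 3: transform every chunk on a thread pool, then
    // concatenate the results in the order the chunks were submitted.
    static String transformAll(String xml, String xslt, int threads) throws Exception {
        // The stylesheet is compiled once; Templates may be shared across threads.
        Templates templates = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xslt)));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String piece : splitIntoChunks(xml)) {
                futures.add(pool.submit(() -> {
                    // A Transformer is not shared: each task creates its own.
                    Transformer transformer = templates.newTransformer();
                    StringWriter out = new StringWriter();
                    transformer.transform(new StreamSource(new StringReader(piece)),
                            new StreamResult(out));
                    return out.toString();
                }));
            }
            StringBuilder result = new StringBuilder();
            for (Future<String> future : futures) {
                result.append(future.get());          // list order == document order
            }
            return result.toString();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><item>a</item><item>b</item><item>c</item></root>";
        String xslt = "<xsl:stylesheet version=\"1.0\""
                + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:output method=\"text\"/>"
                + "<xsl:template match=\"item\">[<xsl:value-of select=\".\"/>]</xsl:template>"
                + "</xsl:stylesheet>";
        System.out.println(transformAll(xml, xslt, 3));
    }
}
```

Note that the chunking pass itself stays serial; only the per-chunk transformations run concurrently. Whether that pays off depends on how expensive each chunk's transformation is compared with the cost of parsing and copying it.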
"Saxon: Anatomy of an XSLT Processor" is an excellent article about XSLT processors, and Saxon in particular. It covers multithreading.
Saxon, by the way, is available for both .NET and Java and is one of the best processors available.