XSLT 1.0 HTML word count
I want to call a template that trims a field down to 30 words. However, the field contains HTML, and the HTML should not be counted as words.
Comments (2)
Try this, although admittedly the translate call's a bit ugly:
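The snippet itself is missing here, so what follows is only a sketch of what the description below implies, not necessarily the answerer's exact code. The element name `field` and the variable name `non-space` are made up for illustration, and the character list is an assumption you would need to extend to cover your own data.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Every non-space character expected in the field (assumed list; extend as needed). -->
  <xsl:variable name="non-space"
    select="'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.,;:!?-()'"/>

  <!-- "field" is a hypothetical element name. -->
  <xsl:template match="field">
    <!-- Delete everything except spaces, then count the spaces and add one. -->
    <xsl:value-of
      select="string-length(translate(normalize-space(.), $non-space, '')) + 1"/>
  </xsl:template>
</xsl:stylesheet>
```

Note that an empty field still yields 1 with this expression, and any character missing from the list is counted as if it were a space.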
This of course requires that the string in the translate call includes every character that could appear in the field, other than spaces. It works by first calling `normalize-space(.)` to strip out both double spaces and everything but the text content. It then removes everything except spaces, takes the length of the resulting string and adds one. It does mean that `<p>My<b>text</b> test</p>` will count as 2, as it will consider `Mytext` to be one word.

If you need a more robust solution, it's a little more convoluted:
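The code for this step did not survive either. A minimal sketch of a recursive named template matching the description that follows might look like this; the template name `count-words` and the element name `field` are assumptions, while the `text` and `count` parameters come from the description.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- "field" is a hypothetical element name. -->
  <xsl:template match="field">
    <xsl:call-template name="count-words">
      <xsl:with-param name="text" select="normalize-space(.)"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="count-words">
    <xsl:param name="text"/>
    <xsl:param name="count" select="0"/>
    <xsl:choose>
      <!-- At least one more word follows: drop the first word and recurse. -->
      <xsl:when test="contains($text, ' ')">
        <xsl:call-template name="count-words">
          <xsl:with-param name="text" select="substring-after($text, ' ')"/>
          <xsl:with-param name="count" select="$count + 1"/>
        </xsl:call-template>
      </xsl:when>
      <!-- No space left: $text is the last word. -->
      <xsl:otherwise>
        <xsl:value-of select="$count + 1"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```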
This passes the result of `normalize-space(.)` into a recursive named template that calls itself whenever there's a space in `$text`, incrementing its `count` parameter and chopping off the first word each time with the `substring-after($text,' ')` call. If there's no space, it treats `$text` as a single word and just returns `$count + 1` (+1 for the current word).

Bear in mind that this will include ALL text content within the field, including text inside inner elements.

EDIT: Note to self: read the question properly. I just noticed you needed more than a word count. That's significantly more complicated to do if you want to include any XML tags, but a slight modification of the above is all it takes to spit out each word rather than simply count them:
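Again the original code is missing, so this is a hedged reconstruction from the surrounding text rather than the answerer's own stylesheet. The names `first-words`, `limit` and `field` are assumptions; the extra `xsl:when` that stops at 30 and the leading space before every word but the first follow the description.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- "field" is a hypothetical element name. -->
  <xsl:template match="field">
    <xsl:call-template name="first-words">
      <xsl:with-param name="text" select="normalize-space(.)"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="first-words">
    <xsl:param name="text"/>
    <xsl:param name="count" select="0"/>
    <xsl:param name="limit" select="30"/>
    <xsl:choose>
      <!-- Enough words emitted: stop recursing. -->
      <xsl:when test="$count &gt;= $limit"/>
      <!-- More words follow: emit the first one and recurse on the rest. -->
      <xsl:when test="contains($text, ' ')">
        <xsl:if test="$count &gt; 0"><xsl:text> </xsl:text></xsl:if>
        <xsl:value-of select="substring-before($text, ' ')"/>
        <xsl:call-template name="first-words">
          <xsl:with-param name="text" select="substring-after($text, ' ')"/>
          <xsl:with-param name="count" select="$count + 1"/>
          <xsl:with-param name="limit" select="$limit"/>
        </xsl:call-template>
      </xsl:when>
      <!-- Last word of the text. -->
      <xsl:otherwise>
        <xsl:if test="$count &gt; 0"><xsl:text> </xsl:text></xsl:if>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Like the counting version, this works on the string value of the field, so any markup inside it is dropped from the output.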
There's an extra `<xsl:when>` clause to simply stop recursing when `count` hits 30, and the recursive clause outputs the text, after adding a space at the beginning if it wasn't the first word.

EDIT: OK, here's a solution that keeps the escaped XML content:
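The answerer's solution is not preserved, and the text does not say exactly how it worked. One possible reading, sketched here under stated assumptions, is that the field stores its HTML as escaped text (for example `&lt;b&gt;text&lt;/b&gt;`), so its string value contains literal tags. The sketch copies tags through without counting them and stops emitting words once the limit is reached. The names `truncate`, `limit` and `field` are hypothetical, and tags are assumed to contain no `>` inside attribute values.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- "field" is a hypothetical element holding escaped HTML as text. -->
  <xsl:template match="field">
    <xsl:call-template name="truncate">
      <xsl:with-param name="text" select="normalize-space(.)"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="truncate">
    <xsl:param name="text"/>
    <xsl:param name="count" select="0"/>
    <xsl:param name="limit" select="30"/>
    <!-- Text before the next tag, and the first word within that text. -->
    <xsl:variable name="head" select="substring-before(concat($text, '&lt;'), '&lt;')"/>
    <xsl:variable name="word" select="substring-before(concat($head, ' '), ' ')"/>
    <xsl:choose>
      <xsl:when test="$text = ''"/>
      <!-- A tag: copy it up to and including '>' without counting it. -->
      <xsl:when test="starts-with($text, '&lt;')">
        <xsl:value-of select="concat(substring-before($text, '&gt;'), '&gt;')"/>
        <xsl:call-template name="truncate">
          <xsl:with-param name="text" select="substring-after($text, '&gt;')"/>
          <xsl:with-param name="count" select="$count"/>
          <xsl:with-param name="limit" select="$limit"/>
        </xsl:call-template>
      </xsl:when>
      <!-- Limit reached: drop the remaining words but keep copying tags. -->
      <xsl:when test="$count &gt;= $limit">
        <xsl:if test="contains($text, '&lt;')">
          <xsl:call-template name="truncate">
            <xsl:with-param name="text" select="concat('&lt;', substring-after($text, '&lt;'))"/>
            <xsl:with-param name="count" select="$count"/>
            <xsl:with-param name="limit" select="$limit"/>
          </xsl:call-template>
        </xsl:if>
      </xsl:when>
      <!-- A space between words: copy it, count unchanged. -->
      <xsl:when test="$word = ''">
        <xsl:text> </xsl:text>
        <xsl:call-template name="truncate">
          <xsl:with-param name="text" select="substring($text, 2)"/>
          <xsl:with-param name="count" select="$count"/>
          <xsl:with-param name="limit" select="$limit"/>
        </xsl:call-template>
      </xsl:when>
      <!-- A word: emit it and count it. -->
      <xsl:otherwise>
        <xsl:value-of select="$word"/>
        <xsl:call-template name="truncate">
          <xsl:with-param name="text" select="substring($text, string-length($word) + 1)"/>
          <xsl:with-param name="count" select="$count + 1"/>
          <xsl:with-param name="limit" select="$limit"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Unlike the `translate` approach, this treats `My<b>text</b>` as two words, because a tag ends the current word. Closing tags after the cut-off are still copied, so the truncated markup stays balanced.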
If you need any of it explained better, let me know; I'd rather not go into detail unless you need it!
Here's a slightly different approach:
If you can clean your input so that you get a normalised string of the text you want to word count, you can compare the string-length of the string with spaces to the string-length of the string with spaces removed. The difference is the number of spaces, so adding one gives your word count.
The word count function (template) will look something like this:
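The template itself is missing from this copy of the answer. A minimal sketch of the idea, assuming the names `word-count`, `text` and `field` (only `sep` is named in the answer), could be the following. It adds one to the length difference because n words in a normalised string are separated by n - 1 spaces, and it assumes `sep` holds at most 16 characters, since `translate` deletes any character that has no counterpart in its third argument.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template name="word-count">
    <xsl:param name="text"/>
    <!-- Extra separator characters besides whitespace (assumed: 16 at most). -->
    <xsl:param name="sep" select="''"/>
    <!-- Turn each separator into a space, then collapse runs of whitespace. -->
    <xsl:variable name="norm"
      select="normalize-space(translate($text, $sep, '                '))"/>
    <xsl:choose>
      <xsl:when test="$norm = ''">0</xsl:when>
      <xsl:otherwise>
        <!-- Length with spaces minus length without spaces, plus one. -->
        <xsl:value-of
          select="string-length($norm) - string-length(translate($norm, ' ', '')) + 1"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- "field" is a hypothetical element name. -->
  <xsl:template match="field">
    <xsl:call-template name="word-count">
      <xsl:with-param name="text" select="string(.)"/>
      <xsl:with-param name="sep" select="',;'"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```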
The $sep parameter allows you to define a list of any character(s) (as well as white-space) that you want to count as a word separator.
You can then use a sequence constructor when you call the template to build the string you want (I'll leave that as an exercise for the reader):
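As a rough illustration of that call (a fragment, not a full stylesheet), the parameter can be built from the body of `xsl:with-param` instead of a `select` attribute. This assumes it sits inside a template whose context node is the field, and that the stylesheet defines a template named `word-count` taking `text` and `sep` parameters as described above; the template and parameter names other than `sep` are assumptions.

```xml
<xsl:call-template name="word-count">
  <xsl:with-param name="text">
    <!-- Join every descendant text node, with a space after each one. -->
    <xsl:for-each select=".//text()">
      <xsl:value-of select="."/>
      <xsl:text> </xsl:text>
    </xsl:for-each>
  </xsl:with-param>
  <xsl:with-param name="sep" select="',;'"/>
</xsl:call-template>
```

Putting a space after each text node means `My<b>text</b>` is counted as two words here; leave out the `xsl:text` if adjacent text nodes should run together.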