
Regex in JavaScript to detect MathML in a textarea

Posted on 2024-11-19 04:25:04 · 824 characters · 13 views · 0 comments


I'm using CodeMirror and trying to create my own version of the mode-changing demo. I have a <textarea> on which I listen for changes and when there is a change, I want to look at the value in the <textarea> and determine if it is in the form of MathML.

So I just need a very crude/hackish way to detect if the value in the <textarea> is MathML; it doesn't have to be perfect. I'm thinking that I can run a regex when the <textarea> changes and look for any of the following tags:

<mfrac>
<msup>
<msub>
<msqrt>
<mroot>
<mfenced>
<msubsup>
<munderover>
<munder>
<mtable>
<mtr>
<mtd>
<mrow>
<mi>
<mo>

I need to take the string from the <textarea> and check whether any of these tags is a substring. How would I write this regex?


顾忌 2024-11-26 04:25:04


/<m(?:frac|sup|sub|sqrt|root|fenced|subsup|underover|under|table|tr|td|row|i|o)\b/

will identify the start of any such tag.
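As a minimal sketch of how this pattern might be used, the snippet below wraps it in a small helper. The name `looksLikeMathML` is made up here for illustration, and in practice the string would be the current value of the <textarea>.

```javascript
// Pattern from the answer: "<m", then one of the listed names,
// then a word boundary so that longer names are not matched by accident.
const MATHML_TAG_START =
  /<m(?:frac|sup|sub|sqrt|root|fenced|subsup|underover|under|table|tr|td|row|i|o)\b/;

// Hypothetical helper: true if the text contains the start of any listed tag.
function looksLikeMathML(text) {
  return MATHML_TAG_START.test(text);
}

console.log(looksLikeMathML("<mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow>")); // true
console.log(looksLikeMathML("<p>plain HTML</p>"));                           // false
console.log(looksLikeMathML("<main>not MathML</main>"));                     // false
```

The regex has no `g` flag, so calling `test` repeatedly (for example on every change event) does not carry any state between calls.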

To find a whole well-formed tag, you need to allow for attributes before the closing >, which is tougher. Something like

/<m(?:frac|sup|sub|sqrt|root|fenced|subsup|underover|under|table|tr|td|row|i|o)\b[^>]*>/

is not guaranteed to match a whole tag, but will make sure there is a > after the start of the tag.
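To make the "not guaranteed" caveat concrete, here is a small sketch with made-up sample strings. It shows the pattern accepting a tag with attributes, rejecting a tag start that has no closing >, and stopping early when an attribute value itself contains a >.

```javascript
const MATHML_OPEN_TAG =
  /<m(?:frac|sup|sub|sqrt|root|fenced|subsup|underover|under|table|tr|td|row|i|o)\b[^>]*>/;

// Attributes between the tag name and ">" are consumed by [^>]*.
const withAttrs = '<mfenced open="[" close="]"><mi>x</mi></mfenced>';
console.log(withAttrs.match(MATHML_OPEN_TAG)[0]); // <mfenced open="[" close="]">

// A tag start that is never closed does not match at all.
console.log(MATHML_OPEN_TAG.test("<mfrac linethickness=")); // false

// A ">" inside an attribute value ends the match early,
// so the match is not the whole tag.
const tricky = '<mi title="a>b">x</mi>';
console.log(tricky.match(MATHML_OPEN_TAG)[0]); // <mi title="a>
```

For a crude "is this probably MathML" check the early stop does not matter, since only the yes/no result is used.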

EDIT:

... what is /<m(?:frac|sup|sub|sqrt|root|fenced|subsup|underover|under|table|tr|td|row|i|o)\b[^>]*>/?

The regular expression has to be inside /.../ the same way a string has to be inside quotes because that is how the JavaScript interpreter tells a regular expression literal from a string or a number or any other kind of token.
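A short sketch of that distinction, for illustration only: the same pattern written as a literal, the same characters inside quotes (which is just a string), and a regex built from a string with the `RegExp` constructor. The `names` array is an assumption introduced here to show how the tag list could be joined into the pattern.

```javascript
const literal = /<mi\b/;     // regex literal: delimited by slashes
const justText = "/<mi\\b/"; // a string that merely contains slashes

console.log(literal instanceof RegExp); // true
console.log(typeof justText);           // string

// Building the same regex from a string: the backslash must be doubled,
// because the string literal would otherwise consume it.
const built = new RegExp("<mi\\b");
console.log(built.source === literal.source); // true

// Hypothetical: assembling the full pattern from the list of tag names.
const names = ["frac", "sup", "sub", "sqrt", "root", "fenced", "subsup",
               "underover", "under", "table", "tr", "td", "row", "i", "o"];
const fromList = new RegExp("<m(?:" + names.join("|") + ")\\b");
console.log(fromList.test("<mtable>")); // true
```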

The <m matches the first two characters of any mathml tag. The (?: and ) form a non-capturing group. It's the same as parentheses in an arithmetic expression. In the same way you have to use parentheses in (a + b) * (c + d) I use parentheses above to distinguish <m(?:frac|sup) from <mfrac|sup. The latter would match both "<mfrac" and "sup" without a <m before it.
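The difference between the grouped and ungrouped forms can be checked directly. The sketch below uses a shortened two-name version of the pattern and also shows what "non-capturing" means for the match result.

```javascript
const grouped = /<m(?:frac|sup)/; // "<m" followed by "frac" or "sup"
const ungrouped = /<mfrac|sup/;   // either "<mfrac" or just "sup"

console.log(grouped.test("<msup>"));   // true
console.log(grouped.test("super"));    // false: no "<m" in front
console.log(ungrouped.test("super"));  // true: bare "sup" is enough

// (?: ... ) groups without capturing, so the match array holds only
// the whole match; a plain ( ... ) adds the captured name as well.
console.log("<msup>".match(/<m(?:frac|sup)/).length); // 1
console.log("<msup>".match(/<m(frac|sup)/).length);   // 2
console.log("<msup>".match(/<m(frac|sup)/)[1]);       // sup
```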

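The \b at the end is a word boundary. It says that there shouldn't be another word character (a letter, digit, or underscore) right after the name. So <msub\b matches "<msub" when it is followed by ">" or a space, but not the "<msub" at the start of "<msubmarine".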
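A quick way to see the effect of \b, using sample strings made up for this sketch:

```javascript
const withBoundary = /<msub\b/;
const withoutBoundary = /<msub/;

console.log(withBoundary.test("<msub>"));                 // true: ">" is not a word character
console.log(withBoundary.test("<msub class='big'>"));     // true: a space is not a word character
console.log(withBoundary.test("<msubmarine>"));           // false: "m" continues the word
console.log(withBoundary.test("<msubsup>"));              // false: this is a different tag
console.log(withoutBoundary.test("<msubsup>"));           // true: prefix match only

// Caveat: "-" is not a word character either, so a hyphenated
// name still passes the boundary check.
console.log(withBoundary.test("<msub-custom>"));          // true
```

This is why the full pattern can list both sub and subsup: for "<msubsup>" the sub alternative fails at the boundary and the regex falls through to subsup.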

The [^>]* bit matches any number of characters other than '>'. The [...] is a character set, so [a-z] matches any lower-case roman letter. The ^ negates it, so [^a-z] matches any character that is not a lower-case roman letter.
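The character-set behaviour described above can be tried out in isolation. This is only an illustrative sketch with made-up inputs:

```javascript
console.log(/[a-z]/.test("q"));  // true: a lower-case letter
console.log(/[a-z]/.test("Q"));  // false
console.log(/[^a-z]/.test("Q")); // true: anything that is not a lower-case letter
console.log(/[^a-z]/.test("q")); // false

// [^>]* takes as many non-">" characters as it can and stops before the first ">".
const tag = '<mi mathvariant="bold">x</mi>';
console.log(tag.match(/[^>]*/)[0]); // <mi mathvariant="bold"

// The * also allows zero characters, so this matches the empty string.
console.log(">x".match(/[^>]*/)[0] === ""); // true
```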


About the author: 卷耳
