Regular expression to detect MathML in a textarea in JavaScript

Posted on 2024-11-19 04:25:04

I'm using CodeMirror and trying to create my own version of the mode-changing demo. I have a <textarea> on which I listen for changes and when there is a change, I want to look at the value in the <textarea> and determine if it is in the form of MathML.

So I just need a very crude/hackish way to detect whether the value in the <textarea> is MathML; it doesn't have to be perfect. I'm thinking that I can run a regex when the <textarea> changes and look for any of the following tags:

<mfrac>
<msup>
<msub>
<msqrt>
<mroot>
<mfenced>
<msubsup>
<munderover>
<munder>
<mtable>
<mtr>
<mtd>
<mrow>
<mi>
<mo>

I need to take the string from the <textarea> and check whether any of these tags appears in it as a substring. How would I write this regex?

Comments (1)

顾忌 2024-11-26 04:25:04
/<m(?:frac|sup|sub|sqrt|root|fenced|subsup|underover|under|table|tr|td|row|i|o)\b/

will identify the start of any such tag.
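
As a rough sketch of how this pattern might be applied to the textarea's contents: the helper name `looksLikeMathML` is hypothetical, and the DOM wiring is shown only as a comment, assuming a plain `input` listener on the <textarea>.

```javascript
// Pattern from the answer: matches the start of any of the listed MathML tags.
const MATHML_TAG_START =
  /<m(?:frac|sup|sub|sqrt|root|fenced|subsup|underover|under|table|tr|td|row|i|o)\b/;

// Hypothetical helper: true if the text contains at least one such tag start.
function looksLikeMathML(text) {
  return MATHML_TAG_START.test(text);
}

// In a browser this could be wired up roughly like so (assumes a <textarea> element):
// textarea.addEventListener('input', () => {
//   const isMathML = looksLikeMathML(textarea.value);
//   // switch the editor mode here based on isMathML
// });

console.log(looksLikeMathML('<mrow><mi>x</mi></mrow>')); // true
console.log(looksLikeMathML('<p>plain HTML</p>'));       // false
```

The regex has no `g` flag, so repeated calls to `test` do not carry state between calls.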

To find a whole well-formed tag, you also need to allow for attributes before the closing >, which is tougher. Something like

/<m(?:frac|sup|sub|sqrt|root|fenced|subsup|underover|under|table|tr|td|row|i|o)\b[^>]*>/

is not guaranteed to match a whole tag, but will make sure there is a > after the start of the tag.
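
A small illustration of what this second pattern does and does not guarantee. The sample strings are made up for the example; the last case shows one way the "not guaranteed" caveat can bite, namely a `>` inside an attribute value.

```javascript
// Pattern from the answer: tag start, then anything up to the first '>'.
const MATHML_OPEN_TAG =
  /<m(?:frac|sup|sub|sqrt|root|fenced|subsup|underover|under|table|tr|td|row|i|o)\b[^>]*>/;

// Attributes before the closing '>' are included in the match.
const withAttrs = '<mfenced open="[" close="]"><mi>x</mi></mfenced>';
console.log(withAttrs.match(MATHML_OPEN_TAG)[0]); // <mfenced open="[" close="]">

// A tag start with no '>' anywhere after it does not match.
console.log(MATHML_OPEN_TAG.test('<mfrac')); // false

// A '>' inside an attribute value ends the match early,
// so the match is not always the whole tag.
const tricky = '<mi title="a>b">x</mi>';
console.log(tricky.match(MATHML_OPEN_TAG)[0]); // <mi title="a>
```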

EDIT:

... what is /]*>/?

The regular expression has to be inside /.../ the same way a string has to be inside quotes because that is how the JavaScript interpreter tells a regular expression literal from a string or a number or any other kind of token.

The <m matches the first two characters of any MathML tag. The (?: and ) form a non-capturing group, which works like parentheses in an arithmetic expression. Just as you need parentheses in (a + b) * (c + d), I use them above to distinguish <m(?:frac|sup) from <mfrac|sup. The latter would match "<mfrac", but it would also match a bare "sup" with no <m before it.
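
The difference between the grouped and ungrouped forms can be checked directly. This is only a sketch with shortened patterns and made-up test strings:

```javascript
const grouped = /<m(?:frac|sup)/; // "<m" followed by "frac" or "sup"
const ungrouped = /<mfrac|sup/;   // "<mfrac", or "sup" anywhere in the text

console.log(grouped.test('superscript'));   // false
console.log(ungrouped.test('superscript')); // true

// (?:...) groups without capturing, so the match array holds only the full match.
console.log('<msup>'.match(/<m(?:frac|sup)/).length); // 1
// A plain (...) group would also capture the alternative that matched.
console.log('<msup>'.match(/<m(frac|sup)/)[1]);       // sup
```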

The \b at the end is a word boundary. Here it asserts that the name is not followed by another word character (a letter, digit, or underscore). So <msub\b matches "<msub" but not "<msubmarine".
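
A quick check of the \b behavior, using a shortened pattern for illustration. It also shows why `subsup` has to be listed as its own alternative in the full regex:

```javascript
const subOnly = /<msub\b/;

console.log(subOnly.test('<msub>'));       // true: '>' is not a word character
console.log(subOnly.test('<msub'));        // true: end of input also counts as a boundary
console.log(subOnly.test('<msubmarine>')); // false: 'm' continues the word
console.log(subOnly.test('<msubsup>'));    // false: needs its own 'subsup' alternative
```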

The [^>]* bit matches any number of characters other than '>', including none. The [...] is a character set, so [a-z] matches any lowercase Latin letter from a to z. The ^ negates it, so [^a-z] matches any character that is not one of those letters.
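
The character-set behavior described above can be tried in isolation. The input strings are invented for the example:

```javascript
// [^>]* consumes everything up to (but not including) the first '>'.
console.log('<mi mathvariant="bold">x'.match(/[^>]*/)[0]); // <mi mathvariant="bold"

// Because of the *, it can also match zero characters.
console.log('>abc'.match(/[^>]*/)[0] === ''); // true

// [a-z] versus its negation [^a-z].
console.log(/[a-z]/.test('A'));    // false
console.log(/[^a-z]/.test('A'));   // true
console.log(/[^a-z]/.test('abc')); // false
```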
