I'm using CodeMirror and trying to create my own version of the mode-changing demo. I have a <textarea> on which I listen for changes, and when there is a change, I want to look at the value in the <textarea> and determine whether it is in the form of MathML.
So I just need a very crude/hackish way to detect whether the value in the <textarea> is MathML; it doesn't have to be perfect. I'm thinking that I can run a regex when the <textarea> changes and look for any of the following tags:
<mfrac>
<msup>
<msub>
<msqrt>
<mroot>
<mfenced>
<msubsup>
<munderover>
<munder>
<mtable>
<mtr>
<mtd>
<mrow>
<mi>
<mo>
I need to take the string from the <textarea> and check whether any of these tags occurs as a substring. How would I write this regex?
A regex like /<m(?:frac|sup|sub|sqrt|root|fenced|subsup|underover|under|table|tr|td|row|i|o)\b/ will identify the start of any such tag.
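A minimal sketch of this kind of start-of-tag check, assuming a pattern built from the tag list in the question. The helper name looksLikeMathML is made up for illustration, and hooking it up to the <textarea> change event or to CodeMirror is left out:

```javascript
// Tag names from the question, with the shared leading "m" factored out.
const MATHML_START = /<m(?:frac|sup|sub|sqrt|root|fenced|subsup|underover|under|table|tr|td|row|i|o)\b/;

// Hypothetical helper: true if the text contains the start of any listed tag.
function looksLikeMathML(text) {
  return MATHML_START.test(text);
}

console.log(looksLikeMathML("<mrow><mi>x</mi><mo>+</mo><mi>y</mi></mrow>")); // true
console.log(looksLikeMathML("<p>just some HTML</p>"));                       // false
console.log(looksLikeMathML("<menu>not MathML</menu>"));                     // false
```

The pattern deliberately has no g flag, so calling test repeatedly on the same regex object does not carry state between calls.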
To find a whole well-formed tag, you need to look for attributes before the closing >, which is tougher. Something like /<m(?:frac|sup|sub|sqrt|root|fenced|subsup|underover|under|table|tr|td|row|i|o)\b[^>]*>/ is not guaranteed to match a whole tag, but will make sure there is a > after the start of the tag.
EDIT:
The regular expression has to be inside /.../ the same way a string has to be inside quotes, because that is how the JavaScript interpreter tells a regular expression literal from a string, a number, or any other kind of token.
The <m matches the first two characters of any MathML tag.
The (?: and ) form a non-capturing group. It's the same as parentheses in an arithmetic expression. In the same way you have to use parentheses in (a + b) * (c + d), I use parentheses above to distinguish <m(?:frac|sup) from <mfrac|sup. The latter would match both "<mfrac" and "sup" without a <m before it.
The \b at the end is a word break (word boundary). It says that there shouldn't be another word character after the name. So <msub\b matches "<msub" but not "<msubmarine".
The [^>]* bit matches any number of characters other than '>'. The [...] is a character set, so [a-z] matches any lower-case roman letter. The ^ negates it, so [^a-z] matches any character that is not a lower-case roman letter.
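The three pieces explained above can be tried in isolation. This is only an illustrative sketch; the variable names and sample strings are made up, and the patterns are cut down to two tag names to keep the output easy to follow:

```javascript
// Grouping: with (?:...) the "<m" prefix applies to every alternative.
const grouped = /<m(?:frac|sup)/;
// Without the group, the alternatives are "<mfrac" and a bare "sup".
const ungrouped = /<mfrac|sup/;

console.log(grouped.test("<msup>"));    // true
console.log(grouped.test("superb"));    // false
console.log(ungrouped.test("superb"));  // true, "sup" matches on its own

// \b: no further word character may follow the tag name.
console.log(/<msub\b/.test("<msub>"));        // true
console.log(/<msub\b/.test("<msubmarine>"));  // false

// [^>]*> : skip anything that is not ">" and then require a ">".
const wholeTag = /<m(?:frac|sup)\b[^>]*>/;
console.log(wholeTag.test('<mfrac linethickness="2">')); // true
console.log(wholeTag.test("<mfrac linethickness"));      // false, no closing ">"
```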