How to understand the definition of lookahead in ECMA-262

Posted 2022-09-12 02:23:27 · 979 characters · 23 views · 0 comments

Background

While reading section 11.8.4 of the ECMAScript® 2019 Language Specification, I found the StringLiteral rule described as follows:

StringLiteral ::    
    " DoubleStringCharactersopt "
    ' SingleStringCharactersopt '
DoubleStringCharacters ::  
    DoubleStringCharacter DoubleStringCharactersopt
DoubleStringCharacter ::  
    SourceCharacter but not one of " or \ or LineTerminator
    <LS>  
    <PS>  
    LineContinuation
    \ EscapeSequence 
EscapeSequence :: 
    CharacterEscapeSequence
    HexEscapeSequence
    UnicodeEscapeSequence
    0 [lookahead ∉ DecimalDigit] /* this is the alternative whose meaning confuses me */
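
The practical effect of this alternative can be observed in an engine. The sketch below is only an illustration: `evalStrictLiteral` is a hypothetical helper, and it evaluates the literal as strict-mode code, where the legacy octal escapes of Annex B are not permitted, so that only the grammar shown above applies.

```javascript
// Hypothetical helper: parse `literal` (the source text of a string literal)
// as strict-mode code and return its value, or the name of the error thrown.
function evalStrictLiteral(literal) {
  try {
    return new Function('"use strict"; return ' + literal + ";")();
  } catch (err) {
    return err.name;
  }
}

// "\0a": the 0 is followed by a non-digit, so the alternative applies
// and the escape denotes the NUL character; "a" is a separate character.
const nulThenA = evalStrictLiteral('"\\0a"');
console.log(nulThenA.length, nulThenA.charCodeAt(0));

// "\01": the 0 is followed by a DecimalDigit, so the alternative may not
// be used, and strict-mode code has no other rule that accepts it.
console.log(evalStrictLiteral('"\\01"'));
```

The character after the `0` is only inspected; it is not part of the escape sequence itself.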

For the description of lookahead, see section 5.1.5 (page 19):

“If the phrase “[lookahead ∈ set]” appears in the right-hand side of a production, it indicates that the production may only be used if the immediately following input token sequence is a member of the given set.”

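My question

What exactly is this following token? Is it any Unicode character, or a character constrained by the context of the enclosing rule? And how should I write a regular expression for this token?
Thanks for reading, and for any answers.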

谈情不如逗狗 2022-09-19 02:23:27

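I think I have figured out what "the immediately following input token sequence" refers to in a production of the form characterA[lookahead<condition>].

If the production is characterA[lookahead<condition>]characterB, then the [lookahead<condition>] restriction applies to characterB. That much is clear.

If the production is characterA[lookahead<condition>], then the condition is a lookahead assertion attached to characterA, that is, part of characterA's matching rule: it is checked against whatever input immediately follows characterA, and that input is not consumed. I think I simply misread the text.

Before this, I could not see how to write the matching rule for the token after [lookahead<condition>]: should it be any Unicode character, or a character subject to some contextual rule?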
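
The second case can be sketched with a regex negative lookahead, which inspects the next character without consuming it. This is only an illustration under my own assumptions: the names `nullEscape` and `matchNullEscape` are hypothetical, and the sticky flag is used just to make the resume position visible.

```javascript
// `\ 0 [lookahead ∉ DecimalDigit]` written as a regex.
// The sticky flag (y) makes the match start exactly at lastIndex.
const nullEscape = /\\0(?!\d)/y;

// Hypothetical helper: try to match at `index` and report what was
// consumed and where scanning would resume.
function matchNullEscape(source, index) {
  nullEscape.lastIndex = index;
  const m = nullEscape.exec(source);
  return m ? { consumed: m[0], next: nullEscape.lastIndex } : null;
}

// "\0a": matches; only the backslash and the 0 are consumed,
// and "a" is left for whichever rule comes next.
console.log(matchNullEscape("\\0a", 0));

// "\01": the lookahead fails, so there is no match at all.
console.log(matchNullEscape("\\01", 0));

// "\0" at the end of input: nothing follows, so the lookahead holds.
console.log(matchNullEscape("\\0", 0));
```

In other words, no separate matching rule has to be written for the following character; it only has to fall outside the given set.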

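Finally, to close out the question, here is a regular expression for a backslash followed by "EscapeSequence" (of CharacterEscapeSequence it covers only the SingleEscapeCharacter alternatives):

/^\\(?:['"\\bfnrtv]|0(?!\d)|x[0-9a-fA-F]{2}|u[0-9a-fA-F]{4}|u\{0*(?:10[0-9a-fA-F]{4}|[0-9a-fA-F]{1,5})\})$/u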
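
A sketch for checking a pattern like this against sample inputs, built from one fragment per grammar alternative so that each piece can be read on its own. The fragment names and `isEscape` are assumptions for illustration; NonEscapeCharacter is deliberately left out, and the code point range check accepts leading zeros up to a value of 10FFFF.

```javascript
// One regex source fragment per alternative of EscapeSequence.
const singleEscapeCharacter = String.raw`['"\\bfnrtv]`;
const nullEscape = String.raw`0(?!\d)`; // 0 [lookahead ∉ DecimalDigit]
const hexEscape = String.raw`x[0-9a-fA-F]{2}`;
const unicodeEscape =
  String.raw`u[0-9a-fA-F]{4}|u\{0*(?:10[0-9a-fA-F]{4}|[0-9a-fA-F]{1,5})\}`;

// A backslash followed by exactly one of the alternatives.
const escapeSequence = new RegExp(
  String.raw`^\\(?:` +
    [singleEscapeCharacter, nullEscape, hexEscape, unicodeEscape].join("|") +
    ")$",
  "u"
);

// Hypothetical helper: does `text` consist of exactly one escape?
function isEscape(text) {
  return escapeSequence.test(text);
}

for (const sample of ["\\n", "\\0", "\\01", "\\x41", "\\u0041", "\\u{1F600}", "\\u{110000}"]) {
  console.log(sample, isEscape(sample));
}
```

Because the pattern is anchored with `$`, the `(?!\d)` part is redundant here; it starts to matter once the fragment is embedded in a larger pattern, such as one for a whole string literal, where more input can follow the `0`.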


无敌元气妹 2022-09-19 02:23:27

I took a quick look, and it seems to be written quite clearly ...

If the phrase “[lookahead ∉ set]” appears in the right-hand side of a production, it indicates that the production may not be used if the immediately following input token sequence is a member of the given set. The set can be written as a comma separated list of one or two element terminal sequences enclosed in curly brackets. For convenience, the set can also be written as a nonterminal, in which case it represents the set of all terminals to which that nonterminal could expand. If the set consists of a single terminal the phrase “[lookahead ≠ terminal]” may be used.
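
To make the "set written as a nonterminal" case concrete, here is a minimal hand-written check, assuming the lexical grammar, where the terminals are Unicode code points. `DECIMAL_DIGIT` and `lookaheadNotIn` are hypothetical names, not anything defined by the specification.

```javascript
// DecimalDigit :: one of 0 1 2 3 4 5 6 7 8 9
// Used as a lookahead set, the nonterminal stands for every terminal
// it can expand to, i.e. these ten characters.
const DECIMAL_DIGIT = new Set("0123456789");

// Hypothetical helper: true when a production guarded by
// [lookahead ∉ set] may be used with the input continuing at `index`.
function lookaheadNotIn(source, index, set) {
  if (index >= source.length) return true; // end of input: no member of the set follows
  const next = String.fromCodePoint(source.codePointAt(index));
  return !set.has(next); // the character is inspected, never consumed
}

// Index 2 is the position right after the backslash and the 0.
console.log(lookaheadNotIn("\\0a", 2, DECIMAL_DIGIT)); // true
console.log(lookaheadNotIn("\\09", 2, DECIMAL_DIGIT)); // false
console.log(lookaheadNotIn("\\0", 2, DECIMAL_DIGIT));  // true
```

So the following "token" may be any code point at all; the restriction only rules out the members of the set.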
