Regular expression for parsing JSON-like text
I have text of the following form:
Field1:Value
Field2:Value
Field3:Value
Field1:Value
Field2:Value
Field3:Value
Field1:Value
Field2:Value
Field3:Value
Field1:Value
Field2:Value
Field3:Value
Things to the left of the colon are standard alphabetical characters ([a-zA-Z]) and the first character is always a capital letter. They can't be anything other than Field1, Field2, or Field3. The value to the right, however, can span multiple lines and can contain any character: [a-zA-Z], white space, $, %, ^, etc. I am looking for a regular expression that can match {Field1:value}{Field2:value}{Field3:value} separately in Tcl.
Comments (1)
In general, I'd work by parsing the data into lines first, then assigning each line an interpretation (e.g., start line or continuation line), then combining each start line with the continuations that follow it (forming "logical" lines). Only once that was done would I use an RE to split the key from the value. As a suggestion for the format, try treating a line as a continuation if it starts with a space. That's dead easy to implement and looks good in a file.
As code:
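A minimal sketch of that two-stage approach, assuming Tcl 8.5 or later and the rule that a line starting with whitespace is a continuation. The proc name parseFields and its variable names are illustrative and not taken from the original answer.

```tcl
# Hypothetical helper: group physical lines into "logical" lines, then
# split each logical line into a key and a value.
proc parseFields {text} {
    set logical {}   ;# logical lines collected so far
    set current ""   ;# logical line currently being assembled
    set started 0    ;# true once the first start line has been seen

    # Stage 1: classify each line as a start line or a continuation line.
    foreach line [split $text "\n"] {
        if {$started && [regexp {^\s} $line]} {
            # Assumption: leading whitespace marks a continuation line.
            append current "\n" $line
        } else {
            if {$started} { lappend logical $current }
            set current $line
            set started 1
        }
    }
    if {$started} { lappend logical $current }

    # Stage 2: only now use an RE to separate the key from the value.
    # In Tcl's default RE mode "." also matches newlines, so the value
    # keeps its continuation lines.
    set result {}
    foreach item $logical {
        if {[regexp {^([A-Z]\w+):(.*)$} $item -> key value]} {
            lappend result [list $key $value]
        }
    }
    return $result
}

# Usage: print each key with its (possibly multi-line) value.
set sample "Field1:first value\n continued here\nField2:\$100 %^\nField3:last"
foreach pair [parseFields $sample] {
    lassign $pair key value
    puts "$key => $value"
}
```

Lines that do not look like a key followed by a colon are skipped silently in this sketch; whether to report them instead is a design choice left open here.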
OK, you might have different rules for combining the lines, but by splitting things up into two stages like this, you make your life much easier. Similarly, it's possible to use a tighter test for line validity (replacing ([A-Z]\w+) with (Field[123]), for example), but I'm not convinced it's actually sensible.
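To illustrate the tighter test mentioned above, here is a small sketch that checks a logical line against the stricter pattern. The proc name isKnownField is hypothetical and used only for illustration.

```tcl
# Hypothetical stricter validity check: accept only the three known
# field names rather than any capitalised word.
proc isKnownField {logicalLine} {
    return [regexp {^(Field[123]):(.*)$} $logicalLine]
}

puts [isKnownField "Field2:some value"]
puts [isKnownField "Other:some value"]
```

The trade-off is that any field added to the format later is rejected until the pattern is updated, which may be why the looser pattern is preferred.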