Parsing and grouping text in a string with Python

Posted on 2024-10-02 02:05:09

I need to parse a series of short strings that are made up of three parts: a question and two possible answers. The strings follow a consistent format:

This is the question "answer_option_1 is in quotes" "answer_option_2 is in quotes"

I need to identify the question part and the two possible answer choices that are in single or double quotes.

Ex.:
What color is the sky today? "blue" or "grey"
Who will win the game 'Michigan' 'Ohio State'

How do I do this in Python?

Comments (4)

如痴如狂 2024-10-09 02:05:09
>>> import re
>>> s = "Who will win the game 'Michigan' 'Ohio State'"
>>> re.match(r'(.+)\s+([\'"])(.+?)\2\s+([\'"])(.+?)\4', s).groups()
('Who will win the game', "'", 'Michigan', "'", 'Ohio State')
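
As far as I can tell, the one-liner above needs nothing but whitespace between the two answers, so it would not match the asker's first example, which has "or" between them. Here is a minimal sketch of a variant that tolerates an optional "or" and uses named groups. The names `PATTERN` and `parse_line` are made up for illustration and are not part of the original answer.

```python
import re

# Assumed layout: question, whitespace, quoted answer, optional "or", quoted answer.
# Each answer must be closed by the same quote character that opened it.
PATTERN = re.compile(
    r'''(?P<question>.+?)\s+
        (?P<q1>['"])(?P<first>.+?)(?P=q1)
        \s+(?:or\s+)?
        (?P<q2>['"])(?P<second>.+?)(?P=q2)
        \s*$''',
    re.VERBOSE,
)


def parse_line(line):
    """Return (question, answer1, answer2), or None if the line does not fit."""
    m = PATTERN.match(line)
    if m is None:
        return None
    return m.group("question"), m.group("first"), m.group("second")


if __name__ == "__main__":
    print(parse_line('What color is the sky today? "blue" or "grey"'))
    print(parse_line("Who will win the game 'Michigan' 'Ohio State'"))
```

Returning None for a non-matching line avoids the AttributeError you get from calling .groups() on a failed match.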
剧终人散尽 2024-10-09 02:05:09

If your format is as simple as you say (i.e. not as in your examples), you don't need a regex. Just split the line:

>>> line = 'What color is the sky today? "blue" "grey"'.strip('"')
>>> questions, answers = line.split('"', 1)
>>> answer1, answer2 = answers.split('" "')
>>> questions
'What color is the sky today? '
>>> answer1
'blue'
>>> answer2
'grey'
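
The split above relies on double quotes and exactly one space between the answers. If the lines look more like the asker's examples (single or double quotes, an optional "or"), one possible middle ground is to take everything before the first quoted string as the question and collect every quoted string as an answer. This is only a sketch: `QUOTED` and `split_question` are hypothetical names, and an apostrophe inside the question (e.g. "What's") could be mistaken for an opening quote.

```python
import re

# A quoted string: an opening ' or ", the shortest possible body, the same quote again.
QUOTED = re.compile(r'''(["'])(.*?)\1''')


def split_question(line):
    """Split a line into (question, [answers]) at the first quoted string."""
    first = QUOTED.search(line)
    if first is None:
        raise ValueError("no quoted answers found in %r" % line)
    question = line[:first.start()].strip()
    answers = [m.group(2) for m in QUOTED.finditer(line)]
    return question, answers


if __name__ == "__main__":
    print(split_question('What color is the sky today? "blue" or "grey"'))
    print(split_question("Who will win the game 'Michigan' 'Ohio State'"))
```

Anything between the quoted strings, such as "or", is simply ignored, so the caller should still check that exactly two answers came back.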
国产ˉ祖宗 2024-10-09 02:05:09

One possibility is that you can use regex.

import re

# Question, then two quoted answers with anything (e.g. "or") in between.
robj = re.compile(r'^(.*) [\"\'](.*)[\"\'].*[\"\'](.*)[\"\']')
str1 = "Who will win the game 'Michigan' 'Ohio State'"
r1 = robj.match(str1)
print(r1.groups())
str2 = 'What color is the sky today? "blue" or "grey"'
r2 = robj.match(str2)
print(r2.groups())

Output:

('Who will win the game', 'Michigan', 'Ohio State')
('What color is the sky today?', 'blue', 'grey')
机场等船 2024-10-09 02:05:09

Pyparsing will give you a solution that will adapt to some variability in the input text:

questions = """\
What color is the sky today? "blue" or "grey"
Who will win the game 'Michigan' 'Ohio State'""".splitlines()

from pyparsing import *

quotedString.setParseAction(removeQuotes)
q_and_a = SkipTo(quotedString)("Q") + delimitedList(quotedString, Optional("or"))("A")

for qn in questions:
    print(qn)
    qa = q_and_a.parseString(qn)
    print("qa.Q", qa.Q)
    print("qa.A", qa.A)
    print()

Will print:

What color is the sky today? "blue" or "grey"
qa.Q What color is the sky today? 
qa.A ['blue', 'grey']

Who will win the game 'Michigan' 'Ohio State'
qa.Q Who will win the game 
qa.A ['Michigan', 'Ohio State']