Manipulating embedded tuples/strings in Python
Say I have tagged text as (word, tag) tuples. I want to convert it to a string in order to make some changes to the tags. My function below only sees the last sentence in the text. I guess there is some obvious mistake that I can't spot, so please help me make it work on the entire text.
>>> import nltk
>>> tpl = [[('This', 'V'), ('is', 'V'), ('one', 'NUM'), ('sentence', 'NN'), ('.', '.')], [('And', 'CNJ'), ('This', 'V'), ('is', 'V'), ('another', 'DET'), ('one', 'NUM')]]
def translate(tuple2string):
    for sent in tpl:
        t = ' '.join([nltk.tag.tuple2str(item) for item in sent])
>>> print t
'And/CNJ This/V is/V another/DET one/NUM'
P.S. For those who are interested, the tuple2str function is described here.
EDIT: Now I need to convert it back to tuples in the same format. How do I do that?
>>> [nltk.tag.str2tuple(item) for item in t.split()]
The line above converts it into one flat list of tuples, but I need the nested one (the same structure as the input, tpl).
EDIT 2: Well, it is probably worth posting the entire code:
def translate(tpl):
    t0 = [' '.join([nltk.tag.tuple2str(item) for item in sent]) for sent in tpl]
    for t in t0:
        t = re.sub(r'/NUM', '/N', t)
        t = [nltk.tag.str2tuple(item) for item in t.split()]
    print t
Comments (1)
EDIT:
If you want this to be reversible, just don't do the outer join; keep one string per sentence.
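A minimal sketch of what skipping the outer join might look like. It assumes Python 3 and uses plain string operations (`'/'.join` and `rsplit`) as stand-ins for `nltk.tag.tuple2str` and `nltk.tag.str2tuple`, so it runs without NLTK installed. The helper names `to_strings` and `to_tuples` are made up for illustration. Because each sentence stays in its own string, the sentence boundaries survive and the nested list can be rebuilt:

```python
tpl = [
    [('This', 'V'), ('is', 'V'), ('one', 'NUM'), ('sentence', 'NN'), ('.', '.')],
    [('And', 'CNJ'), ('This', 'V'), ('is', 'V'), ('another', 'DET'), ('one', 'NUM')],
]

def to_strings(tagged_sents):
    # Hypothetical helper: one "word/TAG word/TAG ..." string per sentence.
    # The sentences themselves are NOT joined together (no outer join).
    return [' '.join('/'.join(pair) for pair in sent) for sent in tagged_sents]

def to_tuples(strings):
    # Hypothetical helper: split each sentence on whitespace, then split
    # each token on its last '/' to recover the (word, tag) pair.
    return [[tuple(item.rsplit('/', 1)) for item in s.split()] for s in strings]

strings = to_strings(tpl)
nested = to_tuples(strings)
print(nested == tpl)
```

This only round-trips cleanly when words contain no whitespace and every token carries a tag.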
EDIT 2:
I thought we went over this already...
Splitting it out into the non-list-comprehension form:
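One possible explicit-loop version, as a sketch rather than the definitive form. It assumes Python 3, keeps the question's `re.sub(r'/NUM', '/N', ...)` tag edit, and uses string concatenation and `rsplit` as stand-ins for `nltk.tag.tuple2str` and `nltk.tag.str2tuple` so it runs without NLTK. The point to notice is that each sentence's result is appended to a list, so later sentences do not overwrite earlier ones:

```python
import re

def translate(tagged_sents):
    result = []
    for sent in tagged_sents:
        # Join one sentence into a "word/TAG word/TAG ..." string.
        pieces = []
        for word, tag in sent:
            pieces.append(word + '/' + tag)
        text = ' '.join(pieces)

        # Edit the tags while the sentence is in string form.
        text = re.sub(r'/NUM', '/N', text)

        # Convert this sentence back into (word, tag) tuples.
        converted = []
        for item in text.split():
            word, tag = item.rsplit('/', 1)
            converted.append((word, tag))

        # Append instead of reassigning, so every sentence is kept.
        result.append(converted)
    return result

tpl = [
    [('This', 'V'), ('is', 'V'), ('one', 'NUM'), ('sentence', 'NN'), ('.', '.')],
    [('And', 'CNJ'), ('This', 'V'), ('is', 'V'), ('another', 'DET'), ('one', 'NUM')],
]
translated = translate(tpl)
print(translated)
```

Returning the list, rather than printing inside the function, leaves the caller free to inspect or process the nested result further.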