Putting sentences into a list - python

Posted 2024-10-19 15:01:56

I understand that nltk can split text into sentences and print them out using the following code.
But how do I put the sentences into a list instead of outputting them to the screen?

import nltk.data
from nltk.tokenize import sent_tokenize
import os, sys, re, glob
cwd = './extract_en' #os.getcwd()
for infile in glob.glob(os.path.join(cwd, 'fileX.txt')):
    (PATH, FILENAME) = os.path.split(infile)
    read = open(infile)
    for line in read:
        sent_tokenize(line)

The sent_tokenize(line) call prints the sentences out. How do I put them into a list?

东风软 2024-10-26 15:01:56

Here's a simplified version that I used to test the code:

import nltk.data
from nltk.tokenize import sent_tokenize
import sys
infile = open(sys.argv[1])
slist = []
for line in infile:
    slist.append(sent_tokenize(line))
print slist
infile.close()

When called like so, it prints the following:

me@mine:~/src/ $ python nltkplay.py nltkplay.py 
[['import nltk.data\n'], ['from nltk.tokenize import sent_tokenize\n'], ['import sys\n'], ['infile = open(sys.argv[1])\n'], ['slist = []\n'], ['for line in infile:\n'], ['    slist.append(sent_tokenize(line))\n'], ['print slist\n'], ['\n']]

When doing something like this, a list comprehension is more concise and IMO more pleasant to read:

slist = [sent_tokenize(line) for line in infile]

To clarify, the above returns a list of lists of sentences, with one list of sentences for each line of the file. If you want a flat list of sentences, read the whole file at once instead, as eyquem suggests:

slist = sent_tokenize(infile.read())
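If you have already built the list of lists, it can also be flattened after the fact. The following is a minimal sketch using itertools.chain from the standard library; the nested value is made-up sample data standing in for the per-line output of sent_tokenize, not real tokenizer output.

```python
from itertools import chain

# Made-up stand-in for [sent_tokenize(line) for line in infile]:
# one inner list per line, some of which may be empty.
nested = [["One.", "Two."], [], ["Three."]]

# chain.from_iterable concatenates the inner lists in order.
flat = list(chain.from_iterable(nested))
print(flat)
```

Note that this keeps the per-line tokenization, so a sentence that spans two lines of the file still ends up as two separate entries. Reading the whole file at once avoids that.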

苹果你个爱泡泡 2024-10-26 15:01:56

Avoid using read as a variable name in your program. It is not a reserved keyword in Python, but it is easily confused with the file method read().

If you want to append to a list, you must first have a list:

reclist = []
for line in f:
    reclist.append(line)

or with a list comprehension

reclist = [ line for line in f ]

or use the file object's own method

reclist = f.readlines()

Or perhaps I didn't understand what you want.
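As a quick check that the three variants above build the same list of lines, here is a small sketch. It uses io.StringIO as an in-memory stand-in for the open file f, and the sample text is invented for illustration.

```python
import io

# Invented sample text; io.StringIO behaves like an open text file.
text = "first line\nsecond line\n"

# Variant 1: explicit loop with append.
looped = []
for line in io.StringIO(text):
    looped.append(line)

# Variant 2: list comprehension.
comprehended = [line for line in io.StringIO(text)]

# Variant 3: readlines().
read_all = io.StringIO(text).readlines()

print(looped == comprehended == read_all)
```

Each variant gives a list of raw lines with their trailing newlines, not a list of sentences.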

EDIT:

Well, considering Jochen Ritzel's remark, you want

f = open(infile)
reclist = sent_tokenize(f.read())
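Putting this together with the glob loop from the question, one possible sketch collects the sentences of every matching file into a single flat list. The helper names collect_sentences and naive_split are hypothetical. naive_split is only a crude regex stand-in so that the example runs without NLTK or its tokenizer data; in real use you would pass nltk's sent_tokenize as the tokenize argument.

```python
import glob
import os
import re
import tempfile


def naive_split(text):
    # Hypothetical stand-in for nltk.tokenize.sent_tokenize:
    # splits on whitespace that follows '.', '!' or '?'.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def collect_sentences(pattern, tokenize=naive_split):
    # Read each matching file whole and extend one flat list.
    sentences = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as handle:
            sentences.extend(tokenize(handle.read()))
    return sentences


# Demo on a temporary file with invented content.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "fileX.txt"), "w", encoding="utf-8") as out:
        out.write("First sentence. Second one\nspans two lines! Third?\n")
    result = collect_sentences(os.path.join(tmp, "*.txt"))

print(result)
```

Using a with block closes each file automatically, and extend (rather than append) is what keeps the result a flat list instead of a list of lists.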
