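Is there a simpler way to join two values of a tuple than concatenating them and appending to a list?
I wrote a program that generates sentences based on trigrams. The head of each entry should contain the first two words separated by a space, and the tail should contain the last word.
I get the words in this format:
('my', 'failure', 'to')
And I want to return them in this format:
('my failure', 'to')
I am not sure whether building a list and concatenating the words is an efficient approach, so I wanted to ask whether you can see another solution.
Here is the code where I create the list of trigrams: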
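from nltk.tokenize import WhitespaceTokenizer
from nltk.util import ngrams


def create_trigrams(f):
    # Split the file's text on whitespace and build (w1, w2, w3) trigrams.
    tokenizer = WhitespaceTokenizer()
    file_text = f.read()
    tokens = tokenizer.tokenize(file_text)
    trigrams_tuple = ngrams(tokens, 3)
    trigrams_list = []
    for x in trigrams_tuple:
        print(x)  # debug output: the raw trigram
        # Head: first two words joined by a space; tail: the third word.
        string_pair = (x[0] + " " + x[1], x[2])
        trigrams_list.append(string_pair)
    return trigrams_list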
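One possible alternative, offered as a minimal sketch rather than a definitive answer: tuple unpacking inside a list comprehension builds the same ('head', 'tail') pairs without the explicit append loop. The names make_pairs and trigrams_from_text are hypothetical helpers introduced only for this illustration, and trigrams_from_text uses plain str.split() and zip as an assumed, roughly equivalent stand-in for WhitespaceTokenizer and ngrams(tokens, 3), so that the example runs without NLTK.

```python
def make_pairs(trigrams):
    """Turn (w1, w2, w3) tuples into ('w1 w2', 'w3') pairs."""
    # Tuple unpacking names each word; the comprehension builds the list directly.
    return [(f"{first} {second}", third) for first, second, third in trigrams]


def trigrams_from_text(text):
    """Hypothetical stand-in for WhitespaceTokenizer + ngrams(tokens, 3)."""
    tokens = text.split()
    # Zipping three staggered slices yields consecutive word triples.
    return zip(tokens, tokens[1:], tokens[2:])


pairs = make_pairs(trigrams_from_text("my failure to communicate"))
print(pairs)  # [('my failure', 'to'), ('failure to', 'communicate')]
```

The string concatenation in the original loop is unlikely to be a bottleneck in itself; the comprehension mainly makes the intent shorter and easier to read. With NLTK, make_pairs could presumably be applied directly to the result of ngrams(tokens, 3), since it accepts any iterable of three-element tuples.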