Getting a machine learning model saved with pickle to predict text values correctly

Posted 2025-01-28 02:08:01 · 1457 characters · 4 views · 0 comments

I currently have a machine learning model that predicts which part of speech each word belongs to:

penn_results = penn_crf.predict_single(features)

I then wrote code that prints the result in (word, POS) form:

penn_tups = [(sent.split()[idx], penn_results[idx]) for idx in range(len(sent.split()))]

When I run this, it gives me this output:

[('The', 'DT'), ('quick', 'JJ'), ('brown', 'NN'), ('fox', 'NN'), ('jumps', 'NNS'), ('over', 'IN')] [('The', 'DET'), ('quick', 'NOUN'), ('brown', 'ADJ'), ('fox', 'NOUN'), ('jumps', 'NOUN'), ('over', 'ADP')]

So I saved the model using

penn_filename = 'ptcp.sav'
pickle.dump(penn_crf, open(penn_filename, 'wb'))

I then tried to run the model by loading the saved pickle file like this:

test = "The quick brown fox jumps over the head"
pickled_model = pickle.load(open('penn_treebank_crf_postagger.sav', 'rb'))
pickled_model.predict(test)
print(pickled_model.predict(test))

It prints this:
[['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP']]

How can I make it print accurate predictions like this?
[('The', 'DT'), ('quick', 'JJ'), ('brown', 'NN'), ('fox', 'NN'), ('jumps', 'NNS'), ('over', 'IN')] [('The', 'DET'), ('quick', 'NOUN'), ('brown', 'ADJ'), ('fox', 'NOUN'), ('jumps', 'NOUN'), ('over', 'ADP')]

Comments (2)

命硬 2025-02-04 02:08:01

Caution: this code was not tested.

Replace the last line

print(pickled_model.predict(test))

with something like this:

tokens_test = test.split()
predictions_test = pickled_model.predict(test)
pairs_test = [(tokens_test[idx], predictions_test[idx]) for idx in range(len(tokens_test))]
print(pairs_test)
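
Note that `predict` is still given the raw string here, so the pairs may still carry the wrong tags. A likely reason for the long list of single-label lists in the question is that iterating over a Python string yields characters rather than words, so the model would see one tiny "sequence" per character. The short sketch below only illustrates that difference on the test sentence; it does not call the model, and the variable names `as_characters` and `as_tokens` are illustrative.

```python
test = "The quick brown fox jumps over the head"

# Iterating over a string yields single characters, including spaces.
as_characters = list(test)

# Splitting on whitespace yields the word tokens the tagger is meant to label.
as_tokens = test.split()

print(len(as_characters), as_characters[:5])
print(len(as_tokens), as_tokens[:3])
```

The token list, not the raw string, is what should be turned into features before prediction.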
假面具 2025-02-04 02:08:01

You need to include the feature function, as in your original code:

penn_results = penn_crf.predict_single(features)
penn_tups = [(sent.split()[idx], penn_results[idx]) for idx in range(len(sent.split()))]

In your current code, build the features from the tokens before predicting:

tokens_test = test.split()
# features = <feature function used for training>(tokens_test)
predictions_test = pickled_model.predict_single(features)
pairs_test = [(tokens_test[idx], predictions_test[idx]) for idx in range(len(tokens_test))]
print(pairs_test)

That way the loaded model receives the same kind of input as the model you saved, and it should predict the same tags.
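
As a rough illustration of the idea, here is a minimal sketch that tokenizes the sentence, builds one feature dict per token, calls `predict_single`, and pairs tokens with tags. Everything named here is an assumption: `word2features`, `sent2features`, and `tag_sentence` are hypothetical helpers, the features are made up and must be replaced by whatever feature function was used in training, and `StubTagger` is only a stand-in with a `predict_single` method so that the sketch runs without the real model file.

```python
def word2features(tokens, idx):
    """Hypothetical feature function; use the one from training instead."""
    word = tokens[idx]
    return {
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "suffix3": word[-3:],
        "is_first": idx == 0,
    }


def sent2features(tokens):
    """Build one feature dict per token."""
    return [word2features(tokens, idx) for idx in range(len(tokens))]


def tag_sentence(model, sent):
    """Tokenize, featurize, predict, and return (word, tag) pairs."""
    tokens = sent.split()
    features = sent2features(tokens)
    labels = model.predict_single(features)
    return list(zip(tokens, labels))


class StubTagger:
    """Stand-in for the trained model (assumption, not the real CRF)."""

    def predict_single(self, features):
        # Toy rule: tag "the" as DT and everything else as NN.
        return ["DT" if f["word.lower"] == "the" else "NN" for f in features]


pairs = tag_sentence(StubTagger(), "The quick brown fox")
print(pairs)
```

With the real model, `StubTagger()` would be replaced by the object returned from `pickle.load`. It is also worth checking the file name: the question saves to 'ptcp.sav' but loads 'penn_treebank_crf_postagger.sav', so the loaded file may not be the model that was just saved.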
