Getting a machine learning model saved with pickle to predict text values correctly

Posted on 2025-01-28 02:08:01

I currently have a machine learning model that predicts which part of speech each word belongs to:

penn_results = penn_crf.predict_single(features)

I then wrote code that pairs each word with its predicted tag, so the result prints in (word, POS) style:

penn_tups = [(sent.split()[idx], penn_results[idx]) for idx in range(len(sent.split()))]

When I run this, it gives me this output:

[('The', 'DT'), ('quick', 'JJ'), ('brown', 'NN'), ('fox', 'NN'), ('jumps', 'NNS'), ('over', 'IN')] [('The', 'DET'), ('quick', 'NOUN'), ('brown', 'ADJ'), ('fox', 'NOUN'), ('jumps', 'NOUN'), ('over', 'ADP')]

So I saved the model using:

penn_filename = 'ptcp.sav'
pickle.dump(penn_crf, open(penn_filename, 'wb'))

When I try to run the model by loading the saved pickle file with this:

test = "The quick brown fox jumps over the head"
pickled_model = pickle.load(open('penn_treebank_crf_postagger.sav', 'rb'))
pickled_model.predict(test)
print(pickled_model.predict(test))

It prints this:
[['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP']]

How can I make it print accurate predicted values like this?
[('The', 'DT'), ('quick', 'JJ'), ('brown', 'NN'), ('fox', 'NN'), ('jumps', 'NNS'), ('over', 'IN')] [('The', 'DET'), ('quick', 'NOUN'), ('brown', 'ADJ'), ('fox', 'NOUN'), ('jumps', 'NOUN'), ('over', 'ADP')]

命硬 2025-02-04 02:08:01

Caution: this code was not tested.

Replace the last line

print(pickled_model.predict(test))

with something like this:

tokens_test = test.split()
predictions_test = pickled_model.predict(test)
pairs_test = [(tokens_test[idx], predictions_test[idx]) for idx in range(len(tokens_test))]
print(pairs_test)
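
One thing worth checking before pairing: the output in the question contains one single-element list per entry, which is what you might see if the model walked over the raw string character by character instead of receiving one feature set per token. That is an assumption about the model's behavior, not something the question confirms. A minimal sketch of the pairing step with a length check, using a hypothetical helper named pair_tokens_with_tags:

```python
def pair_tokens_with_tags(tokens, tags):
    """Pair each token with its tag; fail loudly if the lengths differ.

    Hypothetical helper for illustration, not part of any library.
    """
    if len(tokens) != len(tags):
        raise ValueError(
            f"{len(tokens)} tokens but {len(tags)} tags; "
            "the model probably did not receive one feature set per token"
        )
    return list(zip(tokens, tags))


test = "The quick brown fox jumps over the head"
tokens_test = test.split()

# A string iterates character by character, so it is much longer
# than the token list the tags are supposed to line up with.
print(len(tokens_test))  # 8 tokens
print(len(test))         # 39 characters
```

If the length check fails, the input handed to the model is the first place to look.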

假面具 2025-02-04 02:08:01

You need to include the feature function, as in your original code:

penn_results = penn_crf.predict_single(features)
penn_tups = [(sent.split()[idx], penn_results[idx]) for idx in range(len(sent.split()))]

In your current code, that means:

tokens_test = test.split()
# Build `features` from tokens_test here, using the same feature function as in training.
predictions_test = pickled_model.predict_single(features)
pairs_test = [(tokens_test[idx], predictions_test[idx]) for idx in range(len(tokens_test))]
print(pairs_test)

That way the pre-saved model receives the same kind of input it was trained on and predicts the same way as before.
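
Putting the pieces together, the flow might look like the sketch below. The names word2features, sent2features, tag_sentence and load_model are hypothetical, and the features shown are only placeholders: they must be replaced with the exact feature function used when the model was trained. The only model method assumed is predict_single, which the question already uses.

```python
import pickle


def word2features(tokens, idx):
    """Hypothetical per-token features; swap in the training-time function."""
    word = tokens[idx]
    return {
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "suffix3": word[-3:],
        "is_first": idx == 0,
        "is_last": idx == len(tokens) - 1,
    }


def sent2features(tokens):
    """Build one feature dict per token in the sentence."""
    return [word2features(tokens, idx) for idx in range(len(tokens))]


def tag_sentence(model, sentence):
    """Split the sentence, featurize it, and pair each token with its tag."""
    tokens = sentence.split()
    tags = model.predict_single(sent2features(tokens))
    return list(zip(tokens, tags))


def load_model(path):
    """Load a pickled model, closing the file afterwards."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Usage, assuming the file name used when saving in the question:
# pickled_model = load_model("ptcp.sav")
# print(tag_sentence(pickled_model, "The quick brown fox jumps over the head"))
```

Note that the question saves the model to 'ptcp.sav' but loads 'penn_treebank_crf_postagger.sav', so it is also worth confirming that the file being loaded is the model that was just saved.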
