How do I create a confusion matrix for a spaCy NER model?
I want to build a confusion matrix for my model, but I'm not sure how to go about it or which variables to use. Since my model has two functions, one for training and the other for testing, I'm also not sure whether I should build the confusion matrix for both sets of results or only for the test results.
Here is the function that does the training:
# Imports used across the snippets below
import random
import mlflow
import mlflow.spacy
from spacy.training import Example
from spacy.util import minibatch
from tqdm import tqdm

def training(training_data, nlp, batch_size, iteration_index):
    # Batch up the examples using spaCy's minibatch
    losses = {}
    batches = minibatch(training_data, size=batch_size)
    for batch in batches:
        for text, annotations in batch:
            doc = nlp.make_doc(text)
            example = Example.from_dict(doc, annotations)
            # Update the model
            nlp.update([example], losses=losses, drop=0.3)
    # Log the accumulated losses for this iteration
    mlflow.log_metrics(losses, step=iteration_index)
    return nlp
This function allows me to test the model:
def testing(testing_data, nlp, iteration_index):
    testing_examples = []
    for text, annotations in testing_data:
        doc = nlp.make_doc(text)
        testing_examples.append(Example.from_dict(doc, annotations))
    scorer_example = nlp.evaluate(testing_examples)
    # "ents_per_type" is a nested dict, which mlflow.log_metrics cannot handle
    del scorer_example["ents_per_type"]
    mlflow.log_metrics(scorer_example, step=iteration_index)
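Since this testing function is the one place where I have both the gold annotations (as Example objects) and the trained pipeline, my current guess is that the labels for a confusion matrix should come from here: compare, token by token, the gold entity type against what the model predicts on the same text. Below is a rough sketch of what I mean; collect_token_labels is just a name I made up, and the "O" label for non-entity tokens and scikit-learn's confusion_matrix are assumptions on my part, so I'm not sure this is the right approach.

from sklearn.metrics import confusion_matrix

def collect_token_labels(testing_data, nlp):
    # Gather gold and predicted entity labels per token ("O" = not part of an entity)
    gold_labels, pred_labels = [], []
    for text, annotations in testing_data:
        example = Example.from_dict(nlp.make_doc(text), annotations)
        pred_doc = nlp(text)  # run the trained pipeline to get predictions
        # Both docs come from the same tokenizer on the same text, so tokens line up
        gold_labels.extend(tok.ent_type_ or "O" for tok in example.reference)
        pred_labels.extend(tok.ent_type_ or "O" for tok in pred_doc)
    return gold_labels, pred_labels

gold_labels, pred_labels = collect_token_labels(test_data, nlp)
labels = sorted(set(gold_labels) | set(pred_labels))
cm = confusion_matrix(gold_labels, pred_labels, labels=labels)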
And here is the loop that ties it all together:
with nlp.disable_pipes(*unaffected_pipes):
    with mlflow.start_run(experiment_id=experiment_id, run_name=model_name):
        mlflow.set_tag("model_flavor", model_name)
        mlflow.log_param("BATCH_SIZE", BATCH_SIZE)
        mlflow.log_param("TRAINING_ITERATION", TRAINING_ITERATION)
        mlflow.log_param("LABELS", CLASSES)
        mlflow.log_param("PIPE_NAMES", nlp.pipe_names)
        # Train for many iterations
        for iteration in tqdm(range(TRAINING_ITERATION)):
            # Shuffle the examples before every iteration
            random.shuffle(train_data)
            nlp = training(train_data, nlp, BATCH_SIZE, iteration)
            testing(test_data, nlp, iteration)
        # Save the results in MLflow
        mlflow.log_artifact(local_path='./ner.ipynb')
        mlflow.spacy.log_model(spacy_model=nlp, artifact_path=str(train_name))
Preferably I would like to use Plotly, because I have already had the chance to use it in another project.
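For the plotting itself I was picturing something along these lines with plotly.graph_objects, where cm and labels would come from the sketch above (again, only a rough idea, not tested against my real data):

import plotly.graph_objects as go

fig = go.Figure(
    data=go.Heatmap(
        z=cm,          # rows = gold labels, columns = predicted labels
        x=labels,      # predicted label on the x axis
        y=labels,      # gold label on the y axis
        colorscale="Blues",
    )
)
fig.update_layout(
    title="NER confusion matrix (token level)",
    xaxis_title="Predicted label",
    yaxis_title="Gold label",
)
fig.show()

I went with token-level labels here because a confusion matrix needs one predicted class per gold item, and tokens give a straightforward one-to-one pairing, but I'm open to other approaches.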