spaCy custom entity training returns nothing

Posted 2025-01-31 09:07:06, 1303 characters, 3 views, 0 comments

I have descriptions from which I want to extract colours, so I thought I would use spaCy's NER. I have 8000 lines of data like this:

import spacy
nlp=spacy.load('en_core_web_sm')

# Getting the pipeline component
ner=nlp.get_pipe("ner")

Train_data = [
("Bermuda shorts anthracite/black",{"entities" : [(15,31,"COL")]}),
("Acrylic antique white",{"entities" : [(8,22,"COL")]}),
("Pincer black",{"entities" : [(8,13,"COL")]}),
("Cable tie black",{"entities" : [(10,15,"COL")]}),
("Water pump pliers blue",{"entities" : [(18,22,"COL")]})
]

My code is:

for _, annotations in Train_data:
    for ent in annotations.get("entities"):
        ner.add_label(ent[2])

pipe_exceptions = ["ner", "trf_wordpiecer", "trf_tok2vec"]
unaffected_pipes = [pipe for pipe in nlp.pipe_names if pipe not in pipe_exceptions]


from spacy.training.example import Example

for batch in spacy.util.minibatch(Train_data, size=2):
    for text, annotations in batch:
        # create Example
        doc = nlp.make_doc(text)
        example = Example.from_dict(doc, annotations)
        # Update the model
        nlp.update([example], losses=losses, drop=0.3)

When I test the model, I get nothing.


doc = nlp("Bill Gates has a anthracite house worth 10 EUR.")
print("Entities", [(ent.text, ent.label_) for ent in doc.ents])

What am I doing wrong? Please help...

野生奥特曼 2025-02-07 09:07:06

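There are a few problems with your code.

Where are you saving your model?

Nothing in your code indicates that you saved and reloaded the model. When you train a model like this, you are not modifying the existing model on disk. If you don't save the model after training, it simply goes away, which would mean you get no colour annotations.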
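To illustrate the save-and-reload step, here is a minimal sketch rather than a definitive recipe. It assumes spaCy v3 and, so that it runs without downloading a pretrained model, uses a blank English pipeline in place of `en_core_web_sm`. It also uses only two of the question's examples and writes to a temporary directory. Note that `losses` is created as a dict before being passed to `nlp.update`.

```python
import tempfile

import spacy
from spacy.training import Example

# Assumption: a blank English pipeline stands in for en_core_web_sm so the
# sketch runs without a pretrained model being installed.
nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")
ner.add_label("COL")

# Two examples from the question whose offsets match the text.
train_data = [
    ("Cable tie black", {"entities": [(10, 15, "COL")]}),
    ("Water pump pliers blue", {"entities": [(18, 22, "COL")]}),
]
examples = [Example.from_dict(nlp.make_doc(text), ann) for text, ann in train_data]

# initialize() sets up the model weights and returns an optimizer.
optimizer = nlp.initialize(lambda: examples)

for epoch in range(10):
    losses = {}  # must exist before it is passed to update()
    nlp.update(examples, sgd=optimizer, losses=losses, drop=0.3)

# Persist the trained pipeline, then load it back as a separate object.
with tempfile.TemporaryDirectory() as model_dir:
    nlp.to_disk(model_dir)
    reloaded = spacy.load(model_dir)
    reloaded_labels = reloaded.get_pipe("ner").labels

print(reloaded_labels)
```

In a real project the pipeline would be written to a permanent directory and loaded from there in the script that runs predictions.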

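Your input doesn't look like your training data!

Your input is a full sentence, but your training data consists of isolated phrases. This will result in poor performance, because the model is not sure what to do with colours and verbs. (You would probably still get some annotations, though.)

I strongly recommend going through the spaCy course, which covers training your own NER model. I also strongly recommend using the v3 config-based training rather than writing your own training loop.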
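Config-based training reads binary `.spacy` files rather than Python tuples, so the data has to be converted first. The following is a sketch of that conversion with `DocBin`, under a few assumptions: spaCy v3, a blank English pipeline used only for tokenization, three of the question's examples, and a hypothetical file name `train.spacy` inside a temporary directory. Spans that do not align with token boundaries come back as `None` from `doc.char_span` and are skipped here.

```python
import tempfile
from pathlib import Path

import spacy
from spacy.tokens import DocBin

# Assumption: only the tokenizer is needed, so a blank pipeline is enough.
nlp = spacy.blank("en")

train_data = [
    ("Bermuda shorts anthracite/black", {"entities": [(15, 31, "COL")]}),
    ("Cable tie black", {"entities": [(10, 15, "COL")]}),
    ("Water pump pliers blue", {"entities": [(18, 22, "COL")]}),
]

doc_bin = DocBin()
skipped = 0
for text, annotations in train_data:
    doc = nlp.make_doc(text)
    spans = []
    for start, end, label in annotations["entities"]:
        span = doc.char_span(start, end, label=label)
        if span is None:
            skipped += 1  # offsets do not line up with token boundaries
        else:
            spans.append(span)
    doc.ents = spans
    doc_bin.add(doc)

# "train.spacy" is a hypothetical file name; a temp dir keeps the sketch tidy.
with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "train.spacy"
    doc_bin.to_disk(out_path)
    restored = list(DocBin().from_disk(out_path).get_docs(nlp.vocab))

restored_entities = [(ent.text, ent.label_) for doc in restored for ent in doc.ents]
print(restored_entities, "skipped:", skipped)
```

With the file written to a permanent location, a config can be generated with `python -m spacy init config config.cfg --lang en --pipeline ner` and training started with `python -m spacy train config.cfg --paths.train ./train.spacy --paths.dev ./dev.spacy`; the paths in these commands are placeholders, and a separate dev file would need to be created the same way.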

浅黛梨妆こ 2025-02-07 09:07:06

When I try to run your example:

for batch in spacy.util.minibatch(Train_data, size=2):
    for text, annotations in batch:
        # create Example
        doc = nlp.make_doc(text)
        example = Example.from_dict(doc, annotations)
        # Update the model
        nlp.update([example], drop=0.3)     # I took off losses from the code

When I run the code below, I get:

doc = nlp("Bill Gates has a anthracite house worth 10 EUR.")
print("Entities", [(ent.text, ent.label_) for ent in doc.ents])

#output

Entities [('Bill Gates', 'PERSON'), ('10', 'CARDINAL'), ('EUR', 'ORG')]
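One further thing worth checking, not raised in the answers, is whether the character offsets in the question's `Train_data` actually match the text. spaCy v3 generally warns about and ignores entity spans that do not align with token boundaries, so bad offsets can silently reduce the training signal. The sketch below uses plain Python only; `find_bad_offsets` is a hypothetical helper, and its whitespace test is a rough heuristic rather than spaCy's own tokenization.

```python
# Training data copied from the question.
TRAIN_DATA = [
    ("Bermuda shorts anthracite/black", {"entities": [(15, 31, "COL")]}),
    ("Acrylic antique white", {"entities": [(8, 22, "COL")]}),
    ("Pincer black", {"entities": [(8, 13, "COL")]}),
    ("Cable tie black", {"entities": [(10, 15, "COL")]}),
    ("Water pump pliers blue", {"entities": [(18, 22, "COL")]}),
]


def find_bad_offsets(data):
    """Return (text, start, end, reason) for spans that look misaligned."""
    problems = []
    for text, annotations in data:
        for start, end, _label in annotations["entities"]:
            if not 0 <= start < end <= len(text):
                problems.append((text, start, end, "out of range"))
            elif start > 0 and not text[start - 1].isspace():
                problems.append((text, start, end, "starts mid-word"))
            elif end < len(text) and not text[end].isspace():
                problems.append((text, start, end, "ends mid-word"))
    return problems


bad_offsets = find_bad_offsets(TRAIN_DATA)
for text, start, end, reason in bad_offsets:
    print(f"{text!r} [{start}:{end}] -> {text[start:end]!r} ({reason})")
```

This flags "Acrylic antique white" and "Pincer black": both spans end one character past the end of the string, and the second also starts inside the word. Offsets of (8, 21) and (7, 12) respectively would cover "antique white" and "black".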

About the author: 嘿咻
