Mallet: Getting confidence values from the Maxent algorithm
I am using the maxent algorithm in Mallet for label classification. I was wondering whether it is possible to get some kind of confidence value for the label predicted by the maxent classifier. What I basically need is the top K predictions that the classifier is most confident about (not per token, but across the entire data set), so that I can use those instances for bootstrapping. Is there any way to do this?
Comments (1)
Look at the Labeling object you get back when you call any of Mallet's classify methods (the call returns a Classification, and getLabeling() on it gives you the Labeling). This object holds the computed score for every label; the highest-scoring label becomes the answer, and that is the one the getBestLabel() method returns. The rub is that classifier scores are not usually probabilities. I'm not familiar with Mallet's maxent classifier, so you will need to look at the code and determine whether the returned scores can be "cast" to probabilities somehow, or whether they are already in the proper form. At any rate, it sounds like what you want to do is assign meaning to the relative difference between the top K scores. Again, this depends on the exact details of the maxent classifier. So look at the Labeling instances actually returned on your data set and use your best judgement.
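To make the "cast to probabilities, then take the top K" idea concrete, here is a minimal sketch in plain Java. It does not call Mallet at all: the class TopKConfident, the Prediction holder and the method names are all made up for illustration, and how you read the per-label scores out of the Labeling is left to you. It assumes the scores are raw (unnormalized) log-linear scores, in which case a softmax gives the maxent probabilities; if the scores you see already sum to 1 for each instance, skip the softmax and pass them to best() directly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Mallet.
final class TopKConfident {

    /** One prediction: the instance, its winning label, and the confidence of that label. */
    static final class Prediction {
        final String instanceId;
        final String label;
        final double confidence;

        Prediction(String instanceId, String label, double confidence) {
            this.instanceId = instanceId;
            this.label = label;
            this.confidence = confidence;
        }
    }

    /** Softmax: maps arbitrary real-valued scores to probabilities that sum to 1. */
    static double[] softmax(double[] scores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double sum = 0.0;
        double[] probs = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            probs[i] = Math.exp(scores[i] - max); // subtract the max for numerical stability
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    /** Picks the most probable label for one instance; probs must already be probabilities. */
    static Prediction best(String instanceId, String[] labels, double[] probs) {
        int arg = 0;
        for (int i = 1; i < probs.length; i++) {
            if (probs[i] > probs[arg]) {
                arg = i;
            }
        }
        return new Prediction(instanceId, labels[arg], probs[arg]);
    }

    /** Returns the k most confident predictions across the whole data set, best first. */
    static List<Prediction> topK(List<Prediction> all, int k) {
        List<Prediction> sorted = new ArrayList<>(all);
        sorted.sort((a, b) -> Double.compare(b.confidence, a.confidence));
        return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
    }

    public static void main(String[] args) {
        String[] labels = {"pos", "neg"};
        List<Prediction> all = new ArrayList<>();
        // Made-up raw scores standing in for what the classifier would return.
        all.add(best("doc1", labels, softmax(new double[] {2.0, 0.0})));
        all.add(best("doc2", labels, softmax(new double[] {0.1, 0.0})));
        all.add(best("doc3", labels, softmax(new double[] {-1.0, 3.0})));

        for (Prediction p : topK(all, 2)) {
            System.out.println(p.instanceId + " " + p.label + " " + p.confidence);
        }
    }
}
```

For bootstrapping you would add the instances returned by topK to your training set with their predicted labels. A fixed probability threshold instead of a fixed K is the other obvious choice; which one works better depends on your data.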
Hope this helps!