Mallet HMM training problem

Posted on 2024-12-12 13:45:38

I am struggling at the moment with Mallet's very sparse documentation on HMMs. I have managed to import my data into instances (adapted from the ImportExample.java snippet), and I would like to know how to use them to train an HMM model.
I started by creating an HMM instance, but I wasn't sure whether to go for:

    HMM hmm = new HMM(instances.getDataAlphabet(), instances.getTargetAlphabet());

or to use the same data alphabet twice, like so:

    HMM hmm = new HMM(instances.getDataAlphabet(), instances.getDataAlphabet());

Either way, when I get to

    hmm.train(instances);

I get the following error:

    cc.mallet.types.FeatureVector cannot be cast to cc.mallet.types.FeatureVectorSequence

I would be grateful for any help you can provide.

Cheers

Comments (1)

月下凄凉 2024-12-19 13:45:38

I managed to solve this particular problem and thought the solution might be useful to others who run into it. There is a working example in the examples package in Mallet: http://hg-iesl.cs.umass.edu/hg/mallet/file/83adf71b0824/src/cc/mallet/examples/TrainHMM.java

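The main problem was related to how the data was imported through the pipe. Judging by the error message, the pipe adapted from ImportExample.java produced a single FeatureVector per instance, whereas HMM training expects each instance to hold a sequence. Also, from what I can tell, it helps if your data is in this format: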

    TOKEN  TAG
    TOKEN  TAG

I assume you can have features in between the TOKEN and the TAG, but I am not 100% sure. If anyone knows of any good examples or documentation about using HMMs within Mallet, please let me know.
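To make the data layout above concrete, here is a minimal sketch in plain Java (it does not use Mallet) that reads TOKEN ... TAG lines into sequences. It assumes the first column is the token, the last column is the tag, anything in between is an extra feature, and a blank line ends a sequence. That matches the convention described for Mallet's SimpleTagger input, but whether the TrainHMM example treats the middle columns the same way is not confirmed here. The names `TokenTagReader`, `Row`, and `parse` are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Mallet: parses "TOKEN [FEATURE ...] TAG" lines.
class TokenTagReader {

    /** One input line: the token, any extra feature columns, and the tag. */
    static final class Row {
        final String token;
        final List<String> features;
        final String tag;

        Row(String token, List<String> features, String tag) {
            this.token = token;
            this.features = features;
            this.tag = tag;
        }

        @Override
        public String toString() {
            return token + "/" + tag + features;
        }
    }

    /** Splits the text into sequences; a blank line ends the current sequence. */
    static List<List<Row>> parse(String text) {
        List<List<Row>> sequences = new ArrayList<>();
        List<Row> current = new ArrayList<>();
        for (String line : text.split("\\R", -1)) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                // Blank line: close the sequence collected so far, if any.
                if (!current.isEmpty()) {
                    sequences.add(current);
                    current = new ArrayList<>();
                }
                continue;
            }
            String[] cols = trimmed.split("\\s+");
            if (cols.length < 2) {
                throw new IllegalArgumentException("Expected TOKEN ... TAG, got: " + line);
            }
            // Assumption: first column is the token, last is the tag, the rest are features.
            List<String> features =
                new ArrayList<>(Arrays.asList(cols).subList(1, cols.length - 1));
            current.add(new Row(cols[0], features, cols[cols.length - 1]));
        }
        if (!current.isEmpty()) {
            sequences.add(current);
        }
        return sequences;
    }

    public static void main(String[] args) {
        String sample = "the DET\ndog NOUN\nbarks VERB\n\nbirds NOUN\nsing VERB\n";
        for (List<Row> sequence : parse(sample)) {
            System.out.println(sequence);
        }
    }
}
```

The point of the sketch is that each training instance is a whole sequence of (token, tag) rows rather than a single bag of features, which is the shape a sequence model such as an HMM needs.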
