Reusing Weka code to parse ARFF files
Has anyone done this? Is there any documentation on how to use this parser module? I've looked through the code, but it's not clear to me how to actually use the data after it's been parsed.
The file src\main\java\weka\core\converters\ArffLoader.java (which I assume is where the Arff parsing happens) has these instructions:
Typical code for batch usage:
    BufferedReader reader = new BufferedReader(new FileReader("/some/where/file.arff"));
    ArffReader arff = new ArffReader(reader);
    Instances data = arff.getData();
    data.setClassIndex(data.numAttributes() - 1);
But what else can I do with 'data'? How do I access each row and the values in each row?
(By the way, I'm new to Java. If I run this code, is there some kind of introspection I could do on data to see what it offers? That's what I would do in Python.)
(I'm also open to suggestions for a simpler open source Arff parser to use in my project if one exists.)
Comments (6)
It looks to me like your answer lies in the Instances class; that is where the data is stored. I would find the API of the Instances class, either by locating or generating its javadoc, or by simply perusing its source. The methods of this class should allow you to manipulate the data that has been loaded from the ARFF file.
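On the introspection part of the question: Java's standard reflection API can list an object's public methods at runtime, which is roughly what dir() gives you in Python. The following is a minimal sketch, not Weka code. MethodLister and publicMethodNames are hypothetical names made up for illustration, and ArrayList stands in for the object being inspected so the example runs without Weka on the classpath.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper (not part of Weka): a rough Java counterpart to Python's dir().
class MethodLister {

    // Returns the sorted, de-duplicated names of all public methods of a class,
    // including the ones it inherits.
    static List<String> publicMethodNames(Class<?> type) {
        TreeSet<String> names = new TreeSet<>();
        for (Method m : type.getMethods()) {
            names.add(m.getName());
        }
        return new ArrayList<>(names);
    }

    public static void main(String[] args) {
        // ArrayList is only a stand-in here. With Weka available, the assumption is
        // that you would pass data.getClass() for the loaded Instances object instead.
        Object target = new ArrayList<String>();
        for (String name : publicMethodNames(target.getClass())) {
            System.out.println(name);
        }
    }
}
```

This only prints names. For parameter and return types, the javadoc or an IDE's auto-completion is usually the more readable route.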
You can use Weka from Python, and get introspection. I've been successfully using Weka from JRuby to do the same thing. Google "Weka documentation" to find the page that links to the API for the stable and development version. I don't have enough reputation to put a second link in my answer :)
The Weka parser is closely tied to Weka's internal data model, Instances. The ARFF format is not that hard to parse, so you might be better off writing a custom parser that directly produces your desired data representation.
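To give a feel for how small such a custom parser can be, here is a minimal sketch in plain Java. It is not Weka's implementation, and SimpleArff, attributeNames and rows are hypothetical names chosen for illustration. It assumes the dense ARFF format with unquoted, comma-separated values. Quoted names or values, sparse rows, and attribute type handling are deliberately left out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical minimal ARFF reader (not Weka code). Handles only the dense
// format with unquoted, comma-separated values.
class SimpleArff {
    final List<String> attributeNames = new ArrayList<>();
    final List<String[]> rows = new ArrayList<>();

    static SimpleArff parse(Reader source) {
        SimpleArff result = new SimpleArff();
        boolean inData = false;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty() || line.startsWith("%")) {
                    continue; // skip blank lines and % comments
                }
                String lower = line.toLowerCase(Locale.ROOT);
                if (inData) {
                    // One data row: split on commas and keep every value as a string.
                    String[] values = line.split(",", -1);
                    for (int i = 0; i < values.length; i++) {
                        values[i] = values[i].trim();
                    }
                    result.rows.add(values);
                } else if (lower.startsWith("@attribute")) {
                    // "@attribute <name> <type>": keep only the name.
                    String[] parts = line.split("\\s+", 3);
                    if (parts.length >= 2) {
                        result.attributeNames.add(parts[1]);
                    }
                } else if (lower.startsWith("@data")) {
                    inData = true;
                }
                // @relation and any other header line is ignored in this sketch.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "% toy data set",
                "@relation weather",
                "@attribute temperature numeric",
                "@attribute play {yes,no}",
                "@data",
                "21.5,yes",
                "?,no");
        SimpleArff arff = parse(new StringReader(sample));
        for (String[] row : arff.rows) {
            for (int i = 0; i < row.length; i++) {
                System.out.println(arff.attributeNames.get(i) + " = " + row[i]);
            }
        }
    }
}
```

Values stay as strings, with "?" marking a missing value, so converting them to numbers or enums is left to whatever representation your project needs.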
After you have the Instances object data, you can use it to:
You can see all the methods at: Instances JavaDoc
I used something like this: