How do I attach a Weka Instance to an object?

Posted 2025-02-09 23:07:08

I'm able to create Instances from a list of objects, as shown below, using the Weka library (https://weka.sourceforge.io/doc.dev/overview-summary.html):

    import java.util.ArrayList;
    import java.util.List;
    import weka.core.Attribute;
    import weka.core.DenseInstance;
    import weka.core.Instance;
    import weka.core.Instances;

    public static Instances createWekaInstances(List<Ticket> tickets, String name) {
        // Create the numeric attributes "x" and "y"
        Attribute x = new Attribute("x"); // sqrt of row position
        Attribute y = new Attribute("y"); // section CV
        // Collect the attributes in a list
        ArrayList<Attribute> attributes = new ArrayList<>();
        attributes.add(x);
        attributes.add(y);
        // Create the empty dataset "ticketInstances" with the above attributes
        Instances ticketInstances = new Instances(name, attributes, 0);
        // Mark the last attribute ("y") as the class attribute
        ticketInstances.setClassIndex(ticketInstances.numAttributes() - 1);

        for (Ticket ticket : tickets) {
            // Create an empty instance with one slot per attribute
            Instance inst = new DenseInstance(ticketInstances.numAttributes());
            // Set the instance's values for the attributes "x" and "y"
            inst.setValue(x, Math.sqrt(ticket.getRowPosition()));
            inst.setValue(y, ticket.getSectionCVS());
            // Associate the instance with the dataset "ticketInstances"
            inst.setDataset(ticketInstances);
            // Add the instance to the dataset
            ticketInstances.add(inst);
        }
        return ticketInstances;
    }

I can then run a nearest-neighbor search for any instance to find its K nearest neighbors, using https://weka.sourceforge.io/doc.dev/weka/core/neighboursearch/NearestNeighbourSearch.html:

    Instances neighbors = tree.kNearestNeighbours(ticketInstances.get(indexToSearch), 2);

However, it returns a list of 2 instances, where each instance looks like {0 2.44949,1 0.4}, so there is no way for me to associate it with my object. Is there a "Weka" way of attaching an ID or something similar, so that I can tell which object in this list of instances is nearest to the target object?

UPDATE

Okay, doing this seems to work for my use case:

    BallTree bTree = new BallTree();
    try {
        EuclideanDistance euclideanDistance = new EuclideanDistance();
        euclideanDistance.setDontNormalize(true);
        // Skip the first attribute (the ID) when computing distances
        euclideanDistance.setAttributeIndices("2-last");
        euclideanDistance.setInstances(dataset);
        // Set the distance function before the instances so the tree is built with it
        bTree.setDistanceFunction(euclideanDistance);
        bTree.setInstances(dataset);
    } catch (Exception e) {
        e.printStackTrace();
    }


Comments (1)

半枫 2025-02-16 23:07:08

Weka has no concept of unique IDs for weka.core.Instance objects. Instead, you need to create an additional attribute that allows you to identify your rows (e.g., the ticket ID or a numeric attribute with unique values).

You can use the AddID filter to add a numeric attribute to your dataset that will contain such an ID, as mentioned in the Weka wiki article on Instance ID.
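As a rough illustration of the AddID route, the sketch below applies the filter to a tiny two-attribute dataset. The class name AddIdExample, the helper withId and the sample values are made up for this example; the index "first" and the name "ID" are set explicitly rather than relying on defaults.

```java
import java.util.ArrayList;

import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.AddID;

public class AddIdExample {

    /** Returns a filtered copy of the data with a numeric "ID" attribute inserted first. */
    public static Instances withId(Instances data) {
        try {
            AddID addId = new AddID();
            addId.setIDIndex("first");     // position of the new attribute
            addId.setAttributeName("ID");  // name of the new attribute
            addId.setInputFormat(data);    // call after the options are set
            return Filter.useFilter(data, addId);
        } catch (Exception e) {
            throw new IllegalStateException("AddID filter failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample data with the question's "x" and "y" attributes
        ArrayList<Attribute> attributes = new ArrayList<>();
        attributes.add(new Attribute("x"));
        attributes.add(new Attribute("y"));
        Instances data = new Instances("tickets", attributes, 0);
        data.add(new DenseInstance(1.0, new double[] {2.44949, 0.4}));
        data.add(new DenseInstance(1.0, new double[] {1.0, 0.7}));

        // Print the filtered dataset to see how the IDs are numbered
        System.out.println(withId(data));
    }
}
```

The filter numbers the rows itself, so if you need the ID to match something in your own objects (such as a ticket number), adding the attribute by hand when you build the dataset may be simpler.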

From your code, it seems that you are only running a nearest-neighbor search, with no classifier or clusterer involved. (For those, you would use the FilteredClassifier/FilteredClusterer approach to remove the ID attribute from the data used to build the model.) You therefore need to tell the DistanceFunction which attributes to use for the distance calculation, by passing an attribute range to its setAttributeIndices(String) method. If your ID attribute is the first one, the range is 2-last.
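Put together, this could look roughly like the sketch below. It is only an illustration under a few assumptions: the Ticket class here is a simplified stand-in for the one in the question, the ID is taken to be the ticket's index in the list, and LinearNNSearch is used instead of BallTree to keep the example short (both accept an EuclideanDistance via setDistanceFunction).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.EuclideanDistance;
import weka.core.Instances;
import weka.core.neighboursearch.LinearNNSearch;

public class TicketNeighbours {

    /** Simplified, hypothetical stand-in for the question's Ticket class. */
    public static final class Ticket {
        public final String label;
        public final double rowPosition;
        public final double sectionCv;

        public Ticket(String label, double rowPosition, double sectionCv) {
            this.label = label;
            this.rowPosition = rowPosition;
            this.sectionCv = sectionCv;
        }
    }

    /** Builds a dataset whose first attribute is the ticket's index in the list. */
    public static Instances toInstances(List<Ticket> tickets) {
        ArrayList<Attribute> attributes = new ArrayList<>();
        attributes.add(new Attribute("id"));
        attributes.add(new Attribute("x"));
        attributes.add(new Attribute("y"));
        Instances data = new Instances("tickets", attributes, tickets.size());
        for (int i = 0; i < tickets.size(); i++) {
            Ticket t = tickets.get(i);
            data.add(new DenseInstance(1.0,
                    new double[] {i, Math.sqrt(t.rowPosition), t.sectionCv}));
        }
        return data;
    }

    /** Returns the tickets nearest to tickets.get(target), comparing "x" and "y" only. */
    public static List<Ticket> nearest(List<Ticket> tickets, int target, int k) {
        try {
            Instances data = toInstances(tickets);

            EuclideanDistance distance = new EuclideanDistance();
            distance.setDontNormalize(true);
            distance.setAttributeIndices("2-last"); // leave the id out of the distance

            LinearNNSearch search = new LinearNNSearch();
            search.setDistanceFunction(distance);
            search.setInstances(data);

            Instances neighbours = search.kNearestNeighbours(data.instance(target), k);
            List<Ticket> result = new ArrayList<>();
            for (int j = 0; j < neighbours.numInstances(); j++) {
                // The id attribute is the index of the original Ticket
                int id = (int) neighbours.instance(j).value(0);
                result.add(tickets.get(id));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Nearest-neighbour search failed", e);
        }
    }

    public static void main(String[] args) {
        List<Ticket> tickets = Arrays.asList(
                new Ticket("A", 1.0, 0.4),
                new Ticket("B", 9.0, 0.4),
                new Ticket("C", 100.0, 0.4),
                new Ticket("D", 1.44, 0.4));
        for (Ticket t : nearest(tickets, 0, 1)) {
            System.out.println(t.label);
        }
    }
}
```

With these sample values, ticket D (x = 1.2) is closest to ticket A (x = 1.0) once the id is left out of the distance. If the id were included, B would come out closer, which is why the 2-last range matters.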
