Collecting data from inconsistent HTML pages - JSoup
I'm trying to get a lot of data from multiple pages, but it's not always consistent. Here is an example of the HTML I am working with:
I need to get something like Team | Team | Result, all into different variables or lists.
I just need some help on where to start, because the main table I'm working with isn't the same on every page.
Here's my Java so far:
try {
    Document team_page = Jsoup.connect("http://www.soccerstats.com/team.asp?league=" + league + "&teamid=" + teamNumber).get();
    // first() returns null when nothing matches, so guard before calling text()
    Element home_team = team_page.select(".homeTitle").first();
    if (home_team != null) {
        String teamName = home_team.text();
        System.out.println(teamName + "'s Latest Results: ");
    }
    Elements main_page = team_page.select(".stat");
    System.out.println(main_page);
} catch (IOException e) {
    System.out.println("unable to parse content");
}
I am getting the league and teamid from different methods of my program.
Thanks!
1 Answer
Yes. This is one of the problems with webpage scraping.
You have to figure out one or more heuristics that will extract the information that you need across all of the pages that you need to access. There's no magic bullet. Just hard work. (And you'll have to do it all over again if the site changes its page layout.)
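One simple heuristic is to try several candidate selectors in order and take the first that matches anything. A minimal sketch — the selector names and HTML here are made up for illustration, not the real soccerstats.com markup:

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SelectorFallback {
    // Try each candidate selector in turn and return the first hit, or null.
    static Element selectFirstMatch(Document doc, String... selectors) {
        for (String sel : selectors) {
            Element e = doc.select(sel).first();
            if (e != null) {
                return e;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Stand-in page: no .homeTitle here, so the fallback kicks in.
        Document doc = Jsoup.parse(
            "<html><body><h2 class='teamTitle'>Arsenal</h2></body></html>");
        Element title = selectFirstMatch(doc, ".homeTitle", ".teamTitle");
        System.out.println(title == null ? "not found" : title.text());
    }
}
```

This keeps the per-page differences in one place: when you hit a page variant that breaks, you add one more selector to the list instead of rewriting the scraper.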
A better idea is to request the information as XML or JSON using the site's RESTful API ... assuming one exists and is available to you.
(And if you continue with the web-scraping approach, check the site's Terms of Service to make sure that your activity is acceptable.)
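If you do stay with scraping, the Team | Team | Result split the question asks for usually comes down to walking the table rows and skipping any row that doesn't have the cells you expect — that guard is what absorbs most of the page-to-page inconsistency. A rough sketch against a hypothetical three-column table (the real page's classes and layout will differ):

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ResultsScraper {
    public static void main(String[] args) {
        // Stand-in HTML for illustration; not the actual soccerstats.com markup.
        String html = "<table class='stat'>"
            + "<tr><th>Home</th><th>Away</th><th>Score</th></tr>"
            + "<tr><td>Arsenal</td><td>Chelsea</td><td>2-1</td></tr>"
            + "<tr><td>Leeds</td><td>Everton</td><td>0-0</td></tr>"
            + "</table>";
        Document doc = Jsoup.parse(html);

        List<String> homeTeams = new ArrayList<>();
        List<String> awayTeams = new ArrayList<>();
        List<String> results = new ArrayList<>();

        for (Element row : doc.select("table.stat tr")) {
            // Header rows and malformed rows won't have three <td> cells;
            // skipping them keeps the three lists aligned.
            if (row.select("td").size() < 3) {
                continue;
            }
            homeTeams.add(row.select("td").get(0).text());
            awayTeams.add(row.select("td").get(1).text());
            results.add(row.select("td").get(2).text());
        }
        System.out.println(homeTeams + " " + awayTeams + " " + results);
    }
}
```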