How do I get jsoup to work?

Posted 2024-12-07 17:30:51 · 1912 characters · 2 views · 0 comments

I've been going through these jsoup docs to get some information from a div:

http://jsoup.org/cookbook/extracting-data/dom-navigation

Document doc = Jsoup.connect(path).get();
Element cat = doc.getElementById("category_1");
Elements links = cat.getElementsByTag("a");
for (Element link : links) 
{
    rstring += link.attr("href");
    rstring += link.text() + "\n";
}

The code I wrote does not work, and I've been working on this for hours.

I can get some of what I want with different jsoup functions, but I need to get the links in this particular section so I can populate an array of certain things for my Android app.

I'm attempting to parse http://android.myfewclicks.com for testing and building an app for my real site.

Any assistance at all would be wonderful. jsoup just won't cooperate.

    <table class="table_list">
        <tbody class="header" id="category_1">
            <tr>
                <td colspan="4">
                    <div class="cat_bar">
                        <h3 class="catbg">
                            <a class="collapse" href="http://android.myfewclicks.com/index.php?action=collapse;c=1;sa=collapse;c707bdb315=de9d7f201a0964cbab3d56e683507ad7#c1"><img src="http://android.myfewclicks.com/Themes/default/images/collapse.gif" alt="-" /></a>
                            <a class="unreadlink" href="http://android.myfewclicks.com/index.php?action=unread;c=1">Unread Posts</a>
                            <a id="c1"></a><a href="http://android.myfewclicks.com/index.php?action=collapse;c=1;sa=collapse;c707bdb315=de9d7f201a0964cbab3d56e683507ad7#c1">Category A</a>
                        </h3>
                    </div>
                </td>
            </tr>
        </tbody>

On my test forum there are four categories. The three links in this snippet are one set of the four. If I can figure out how to parse these out adequately, I should be able to make a big leap on my app. But jsoup isn't behaving the way I think it should, or I'm missing something crucial.

Comments (1)

黑凤梨 2024-12-14 17:30:51

You apparently need to log in first in order to get the links with an href. When I open the site in my browser while not logged in, I see:

<tbody class="header" id="category_1">
    <tr>
        <td colspan="4">
            <div class="cat_bar">
                <h3 class="catbg">
                    <a id="c1"></a>Category A
                </h3>
            </div>
        </td>
    </tr>
</tbody>

I can get the links as follows:

Document document = Jsoup.connect("http://android.myfewclicks.com/").get();
Elements category1links = document.select("#category_1 a");

for (Element category1link : category1links) {
    System.out.println(category1link);
}

Which prints

<a id="c1"></a>

Note that there's no href or text!
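The difference between the two markups can be reproduced offline. The following is a minimal sketch, assuming jsoup is on the classpath; the class name `CategoryLinks` and the helper `extract` are made up for illustration, and the HTML strings are trimmed-down versions of the snippets above.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class CategoryLinks {

    /** Returns "href text" for every anchor under #category_1 that has an href attribute. */
    public static List<String> extract(String html) {
        Document doc = Jsoup.parse(html);
        List<String> result = new ArrayList<>();
        // a[href] skips bare anchors such as <a id="c1"></a>
        for (Element link : doc.select("#category_1 a[href]")) {
            result.add(link.attr("href") + " " + link.text());
        }
        return result;
    }

    public static void main(String[] args) {
        // Markup as served to a visitor who is not logged in
        String loggedOut = "<table><tbody id=\"category_1\"><tr><td><h3>"
                + "<a id=\"c1\"></a>Category A</h3></td></tr></tbody></table>";
        // Markup as served to a logged-in user (one link kept for brevity)
        String loggedIn = "<table><tbody id=\"category_1\"><tr><td><h3>"
                + "<a class=\"unreadlink\" href=\"http://android.myfewclicks.com/index.php?action=unread;c=1\">Unread Posts</a>"
                + "<a id=\"c1\"></a></h3></td></tr></tbody></table>";

        System.out.println(extract(loggedOut));
        System.out.println(extract(loggedIn));
    }
}
```

With the logged-out markup the list comes back empty, while the logged-in markup yields the "Unread Posts" link, so the parsing code itself is not at fault.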

Jsoup does not log in for you automatically, nor does it take over the cookies of an arbitrary browser which is already installed on your machine. You need to log in and maintain the session cookie yourself. See also "Sending POST request with username and password and save session cookie" for an example.
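As a rough sketch of what that could look like with jsoup's `Connection` API: post the login form, keep the cookies from the response, and send them along with the next request. The login URL and the form field names below are placeholders, not taken from the site; check the actual login form for the real values. Many forums also expect hidden form fields to be submitted with the credentials.

```java
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class LoggedInFetch {

    /** POSTs the login form and returns the cookies the server set in its response. */
    public static Map<String, String> login(String loginUrl, Map<String, String> formFields)
            throws IOException {
        Connection.Response response = Jsoup.connect(loginUrl)
                .data(formFields)
                .method(Connection.Method.POST)
                .execute();
        return response.cookies();
    }

    /** Prepares a request that carries the session cookies; nothing is sent until get()/execute(). */
    public static Connection withSession(String url, Map<String, String> cookies) {
        return Jsoup.connect(url).cookies(cookies);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder URL and field names: replace with the values from the real login form.
        String loginUrl = "http://android.myfewclicks.com/index.php?action=login";
        Map<String, String> form = new HashMap<>();
        form.put("user", "myUsername");
        form.put("password", "myPassword");

        Map<String, String> cookies = login(loginUrl, form);
        Document doc = withSession("http://android.myfewclicks.com/", cookies).get();

        for (Element link : doc.select("#category_1 a[href]")) {
            System.out.println(link.attr("href") + " " + link.text());
        }
    }
}
```

The cookie map has to be passed to every subsequent request, since each `Jsoup.connect` call starts a fresh connection with no stored state.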
