Handling a website with POST data and cookies

Posted on 2024-09-06 09:32:30

I am trying to access an ASPX website where subsequent pages are returned based on
POST data. Unfortunately, all my attempts to fetch the following pages have failed.
Hopefully someone here has an idea where the error is!

In step one I read the session ID from the cookie, as well as the value of the
__VIEWSTATE variable in the returned HTML page. In step two I intend to send both
back to the server to get the desired page.
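
As a side note on step one: picking `__VIEWSTATE` out of a line with string offsets is fragile, and WebForms pages can carry further hidden fields (the jsoup answer further down also posts `__VIEWSTATEGENERATOR` and `__EVENTVALIDATION`). The following is only a minimal sketch of collecting every hidden input in one pass. The class name `HiddenFields` and the regular expression are assumptions made for illustration, and the pattern assumes the attributes appear in the order type, name, value, which may not hold for every page.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HiddenFields {
    // Assumption: hidden inputs are rendered as
    // <input type="hidden" name="..." id="..." value="..." />
    private static final Pattern HIDDEN = Pattern.compile(
        "<input\\s+type=\"hidden\"\\s+name=\"([^\"]+)\"[^>]*?\\svalue=\"([^\"]*)\"",
        Pattern.CASE_INSENSITIVE);

    /** Returns name -> value for every hidden input found, in document order. */
    static Map<String, String> extract(String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = HIDDEN.matcher(html);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Made-up sample markup, not copied from the real site.
        String html =
            "<input type=\"hidden\" name=\"__EVENTTARGET\" id=\"__EVENTTARGET\" value=\"\" />\n"
          + "<input type=\"hidden\" name=\"__VIEWSTATE\" id=\"__VIEWSTATE\" value=\"/wEPDsample=\" />";
        System.out.println(extract(html));
    }
}
```

A real HTML parser would be more robust than a regular expression; the point here is only that all hidden fields are gathered, not just one.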

Sniffing the request in the web browser gives:

Host=www.geocaching.com
User-Agent=Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100618 Iceweasel/3.5.9 (like Firefox/3.5.9)
Accept=text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language=en-us,en;q=0.5
Accept-Encoding=gzip,deflate
Accept-Charset=ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive=300
Connection=keep-alive
Referer=http://www.geocaching.com/seek/nearest.aspx?state_id=149
Cookie=Send2GPS=garmin; BMItemsPerPage=200; maprefreshlock=true; ASP.NET_SessionId=c4jgygfvu1e4ft55dqjapj45
Content-Type=application/x-www-form-urlencoded
Content-Length=4099
POSTDATA=__EVENTTARGET=ctl00%24ContentBody%24pgrBottom%24lbGoToPage_3&__EVENTARGUMENT=&__VIEWSTATE=%2FwEPD[...]2Xg%3D%3D&language=on&logcount=on&gpx=on
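
One thing the sniffed POSTDATA shows is that the browser percent-encodes the values: `$` arrives as `%24`, and the Base64 characters `/` and `=` of the view state arrive as `%2F` and `%3D`. A body assembled from raw strings would differ from what the browser sends, which might be one reason the server ignores the event target. The following is a minimal sketch of building the body with `java.net.URLEncoder`; the class name `FormBody` and the sample view state value are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class FormBody {
    // Percent-encodes one name or value as application/x-www-form-urlencoded.
    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    /** Joins the fields as name=value pairs separated by '&', each part encoded. */
    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("__EVENTTARGET", "ctl00$ContentBody$pgrBottom$lbGoToPage_3");
        fields.put("__EVENTARGUMENT", "");
        fields.put("__VIEWSTATE", "/wEPDsample+value=="); // placeholder, not a real view state
        System.out.println(encode(fields));
    }
}
```

Note that the browser request also carries `language=on`, `logcount=on` and `gpx=on`; whether the server needs them is not clear from the capture alone.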

Currently, my code looks like this:

import java.net.*;
import java.io.*;
import java.util.*;
import java.security.*;
import java.net.*;

public class test1 {
    public static void main(String args[]) {
        // String loginWebsite="http://geocaching.com/login/default.aspx";
        final String loginWebsite = "http://www.geocaching.com/seek/nearest.aspx?state_id=159";
        final String POST_CONTENT_TYPE = "application/x-www-form-urlencoded";

        // step 1: get session ID from cookie
        String sessionId = "";
        String viewstate = "";
        try {
            URL url = new URL(loginWebsite);

            String key = "";
            URLConnection urlConnection = url.openConnection();

            if (urlConnection != null) {
                for (int i = 1; (key = urlConnection.getHeaderFieldKey(i)) != null; i++) {
                    // get ASP.NET_SessionId from cookie
                    // System.out.println(urlConnection.getHeaderField(key));
                    if (key.equalsIgnoreCase("set-cookie")) {
                        sessionId = urlConnection.getHeaderField(key);
                        sessionId = sessionId.substring(0, sessionId.indexOf(";"));

                    }
                }

                BufferedReader in = new BufferedReader(new InputStreamReader(urlConnection.getInputStream()));

                // get the viewstate parameter
                String aLine;
                while ((aLine = in.readLine()) != null) {
                    // System.out.println(aLine);
                    if (aLine.lastIndexOf("id=\"__VIEWSTATE\"") > 0) {
                        viewstate = aLine.substring(aLine.lastIndexOf("value=\"") + 7, aLine.lastIndexOf("\" "));
                    }
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }

        System.out.println(sessionId);
        System.out.println("\n");
        System.out.println(viewstate);
        System.out.println("\n");

        // String goToPage="3";

        // step2: post data to site
        StringBuilder htmlResult = new StringBuilder();
        try {

            String encoded = "__EVENTTARGET=ctl00$ContentBody$pgrBottom$lbGoToPage_3" + "&" + "__EVENTARGUMENT=" + "&"
                + "__VIEWSTATE=" + viewstate;

            URL url = new URL(loginWebsite);
            URLConnection urlConnection = url.openConnection();
            urlConnection = url.openConnection();

            // Specifying that we intend to use this connection for input
            urlConnection.setDoInput(true);

            // Specifying that we intend to use this connection for output
            urlConnection.setDoOutput(true);

            // Specifying the content type of our post
            urlConnection.setRequestProperty("Content-Type", POST_CONTENT_TYPE);

            // urlConnection.setRequestMethod("POST");

            urlConnection.setRequestProperty("Cookie", sessionId);
            urlConnection.setRequestProperty("Content-Type", "text/html");

            DataOutputStream out = new DataOutputStream(urlConnection.getOutputStream());
            out.writeBytes(encoded);
            out.flush();
            out.close();

            BufferedReader in = new BufferedReader(new InputStreamReader(urlConnection.getInputStream()));

            String aLine;
            while ((aLine = in.readLine()) != null) {
                System.out.println(aLine);
            }

        } catch (MalformedURLException e) {
            // Print out the exception that occurred
            System.err.println("Invalid URL " + e.getMessage());
        } catch (IOException e) {
            // Print out the exception that occurred
            System.err.println("Unable to execute " + e.getMessage());
        }
    }
}
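
A detail in the step-1 loop above: every `Set-Cookie` header overwrites `sessionId`, so only the last cookie survives, and `substring(0, indexOf(";"))` throws if a cookie has no attributes. The browser capture, in contrast, sends several cookies in one `Cookie` header. The following is a minimal sketch of joining all cookies, assuming the raw values come from something like `urlConnection.getHeaderFields().get("Set-Cookie")`. The class name `CookieJoiner` and the sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CookieJoiner {
    /**
     * Turns raw Set-Cookie header values into a single Cookie request header value.
     * Only the leading name=value part of each cookie is kept; attributes such as
     * path or HttpOnly are dropped.
     */
    static String cookieHeader(List<String> setCookieValues) {
        List<String> pairs = new ArrayList<>();
        for (String value : setCookieValues) {
            int semi = value.indexOf(';');
            String pair = (semi >= 0) ? value.substring(0, semi) : value;
            pairs.add(pair.trim());
        }
        return String.join("; ", pairs);
    }

    public static void main(String[] args) {
        // Sample values for illustration only.
        List<String> raw = Arrays.asList(
            "ASP.NET_SessionId=abc123; path=/; HttpOnly",
            "BMItemsPerPage=200");
        System.out.println(cookieHeader(raw));
    }
}
```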

Any idea what's wrong? Any help is much appreciated!

Update

Thank you for the fast reply!

I switched from URLConnection to HttpURLConnection, which provides setRequestMethod(). I also corrected the minor mistakes you mentioned, e.g. I removed the redundant first setRequestProperty call.

Unfortunately, this doesn't change anything. I think I set all the relevant parameters, but I still get only the first page of the list. It seems that "__EVENTTARGET=ctl00$ContentBody$pgrBottom$lbGoToPage_3" is ignored, and I have no clue why.

Internally, the form on the website looks like this:

<form name="aspnetForm" method="post" action="nearest.aspx?state_id=159" id="aspnetForm">

It is submitted by the following JavaScript:

<script type="text/javascript"> 
//<![CDATA[
var theForm = document.forms['aspnetForm'];
if (!theForm) {
    theForm = document.aspnetForm;
}
function __doPostBack(eventTarget, eventArgument) {
    if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
        theForm.__EVENTTARGET.value = eventTarget;
        theForm.__EVENTARGUMENT.value = eventArgument;
        theForm.submit();
    }
}
//]]>
</script> 
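
Read literally, `__doPostBack` only fills in `__EVENTTARGET` and `__EVENTARGUMENT` and then submits the whole form, so every other field of the form travels with the request unchanged. A rough Java analogue is sketched here, assuming the current form fields have already been collected into a map. The class name `PostBack` and the sample values are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PostBack {
    /**
     * Mirrors what the page's __doPostBack does: keep every existing form field,
     * then set the two event fields. The input map is not modified.
     */
    static Map<String, String> doPostBack(Map<String, String> formFields,
                                          String eventTarget,
                                          String eventArgument) {
        Map<String, String> fields = new LinkedHashMap<>(formFields);
        fields.put("__EVENTTARGET", eventTarget);
        fields.put("__EVENTARGUMENT", eventArgument);
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> form = new LinkedHashMap<>();
        form.put("__EVENTTARGET", "");
        form.put("__EVENTARGUMENT", "");
        form.put("__VIEWSTATE", "/wEPDsample="); // placeholder value
        System.out.println(
            doPostBack(form, "ctl00$ContentBody$pgrBottom$lbGoToPage_3", ""));
    }
}
```

The resulting map still has to be URL-encoded before it is written to the request body.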

Hopefully this helps in finding a solution.

Greetings
maik.

Comments (2)

爱,才寂寞 2024-09-13 09:32:30

Do you actually want to GET or POST? If you want to POST, then you may need the setRequestMethod() line.

You're setting Content-Type twice. The second call replaces the first, so the request goes out as text/html; I think you need to keep just one of them.

Then, don't close the output stream before you try to read from the input stream.

Other than that, is there any more logging you can put in, or any clues you can give about how it is going wrong?
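
To make the first two points concrete, the following is a minimal sketch of a POST with `HttpURLConnection`: the method is set explicitly and `Content-Type` is set once. So that it runs offline, it talks to a throwaway local echo server built on the JDK's `com.sun.net.httpserver`. The class name `PostbackDemo`, the cookie value and the form body are assumptions made for illustration. The sketch does close the output stream before reading the response, which is the usual pattern with `HttpURLConnection`.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class PostbackDemo {
    /** Sends an already URL-encoded form body as a POST and returns the response text. */
    static String post(String url, String cookieHeader, String encodedBody) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            // Set Content-Type exactly once; a second call would replace this value.
            conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
            if (!cookieHeader.isEmpty()) {
                conn.setRequestProperty("Cookie", cookieHeader);
            }
            try (OutputStream out = conn.getOutputStream()) {
                out.write(encodedBody.getBytes(StandardCharsets.US_ASCII));
            }
            try (InputStream in = conn.getInputStream()) {
                return new String(readAll(in), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }

    /** Round-trips one request through a local server that echoes what it received. */
    static String demo() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                byte[] body = readAll(exchange.getRequestBody());
                String echo = exchange.getRequestMethod() + " "
                    + exchange.getRequestHeaders().getFirst("Content-Type") + " "
                    + new String(body, StandardCharsets.US_ASCII);
                byte[] reply = echo.getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, reply.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(reply);
                }
            });
            server.start();
            try {
                String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/";
                return post(url, "ASP.NET_SessionId=example",
                            "__EVENTTARGET=a%24b&__EVENTARGUMENT=");
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Pointing `post` at the real page would additionally require the cookies and hidden fields read in step one.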

夜唯美灬不弃 2024-09-13 09:32:30

Hey, use the following code:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

// url, eventtarget, eventarg, viewState, viewStateGenarator and viewStateValidation
// are Strings taken from the previously fetched page.
String userAgent = "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0";

Document jsoupDoc = Jsoup.connect(url)
        .timeout(15000)
        .userAgent(userAgent)
        .referrer("http://calendar.legis.ga.gov/Calendar/?chamber=House") // replace with your own page
        .ignoreContentType(true)
        .data("__EVENTTARGET", eventtarget)
        .data("__EVENTARGUMENT", eventarg)
        .data("__VIEWSTATE", viewState)
        .data("__VIEWSTATEGENERATOR", viewStateGenarator)
        .data("__EVENTVALIDATION", viewStateValidation)
        .parser(Parser.xmlParser())
        .post();