Browser doesn't support frames

Posted on 2024-11-19 18:00:00

I am trying to create a Java program that logs in to an Achievo instance. I am trying to do this with screen scraping.

I managed to log in using the following code:

@Test
public void testLogin() throws Exception {
    HashMap<String, String> data = new HashMap<String, String>();
    data.put("auth_user", "user");
    data.put("auth_pw", "password");
    doSubmit("https://someurl.com/achievo/index.php", data);
}

private void doSubmit(String url, HashMap<String, String> data) throws Exception {
    URL siteUrl = new URL(url);
    HttpsURLConnection conn = (HttpsURLConnection) siteUrl.openConnection();
    conn.setRequestMethod("POST");
    conn.setDoOutput(true);
    conn.setDoInput(true);
    //conn.setRequestProperty( "User-agent", "spider" );
    //conn.setRequestProperty("User-agent", "Opera/9.80 (X11; Linux i686; U; en) Presto/2.7.62 Version/11.01");

    conn.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; .NET CLR 1.2.30703)");

    DataOutputStream out = new DataOutputStream(conn.getOutputStream());

    Set<String> keys = data.keySet();
    Iterator<String> keyIter = keys.iterator();
    StringBuilder content = new StringBuilder("");
    for(int i=0; keyIter.hasNext(); i++) {
        Object key = keyIter.next();
        if(i!=0) {
            content.append("&");
        }
        content.append(key + "=" + URLEncoder.encode(data.get(key), "UTF-8"));
    }
    System.out.println(content.toString());

    out.writeBytes(content.toString());
    out.flush();
    out.close();
    BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
    String line = "";
    while((line=in.readLine())!=null) {
        System.out.println(line);
    }
    in.close();
}

However, when Achievo logs me in successfully, I am redirected to the main page, which contains:

<head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
    <title>Achievo</title>
  </head>
    <frameset rows="113,*" frameborder="0" border="0">
    <frame name="top" scrolling="no" noresize src="top.php?atklevel=-1&atkprevlevel=0&achievo=37b552462afdfd248a21fedbf0eebe43" marginwidth="0" marginheight="0">
    <frameset cols="210,*" frameborder="0" border="0">
      <frame name="menu" scrolling="no" noresize src="menu.php?atklevel=-1&atkprevlevel=0&achievo=37b552462afdfd248a21fedbf0eebe43" marginwidth="0" marginheight="0">
      <frame name="main" scrolling="auto" noresize src="dispatch.php?atknodetype=pim.pim&atkaction=pim&atklevel=-1&atkprevlevel=0&achievo=37b552462afdfd248a21fedbf0eebe43" marginwidth="0" marginheight="0">
    </frameset>
    <noframes>
      <body bgcolor="#CCCCCC" text="#000000">
        <p>Your browser doesnt support frames, but this is required to run Achievo</p>
      </body>
    </noframes>
  </frameset>

Obviously, I get the message "Your browser doesnt support frames, but this is required to run Achievo".

I have tried to access the dispatch.php frame directly, as this is probably what I want. However, it reports that my session has expired and that I need to log in again.

Is there some way to fake a frame? Or to somehow keep the connection, change the URL, and try to get the dispatch.php frame?


Using HtmlUnit, I have done the following:

WebClient webClient = new WebClient(BrowserVersion.FIREFOX_3);
HtmlPage page = webClient.getPage("https://someurl.com/index.php");
System.out.println(page.asXml());

List<HtmlForm> forms = page.getForms();
assertTrue(forms != null && !forms.isEmpty());

HtmlForm form = forms.get(0);
HtmlSubmitInput submit = form.getInputByName("login");
HtmlInput inputUsername = form.getInputByName("auth_user");
HtmlInput inputPw = form.getInputByName("auth_pw");

inputUsername.setValueAttribute("foo");
inputPw.setValueAttribute("bar");

HtmlPage page2 = submit.click();

CookieManager cookieManager = webClient.getCookieManager();
Set<Cookie> cookies = cookieManager.getCookies();
System.out.println("Is cookie " + cookieManager.isCookiesEnabled());

for(Cookie cookie : cookies) {
    System.out.println(cookie.toString());
}

System.out.println(page2.asXml());
webClient.closeAllWindows();

Here I get the form, submit it, and retrieve the same message. When I print the cookies, I can see that I have one. Now the question is: how do I proceed to get the dispatch.php frame using the logged-in cookie?


Comments (2)

燕归巢 2024-11-26 18:00:00

This kind of scraping is a bit complicated; there are several factors to think about.

  1. Does the Achievo app set any cookies? If so, you will need to accept them and send them with the next request. I think
  2. By the looks of things, you will need to parse that HTML page and extract the frame you wish to load. I suspect you're getting back a session-expired message because you're not sending a cookie or something like that. You need to make sure you use the exact URL provided in the FRAMESET.

I suggest using the Apache HttpClient module, which is a bit more fully featured than the standard Java URL provider and can manage things like cookies for you.
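
If you would rather stay with the standard library than switch to HttpClient, the cookie part of the points above could probably be covered by installing a java.net.CookieManager before making the login request. What follows is only a sketch under assumptions: the class and helper names (SessionCookies, remember, cookieHeaderFor) are made up for illustration, the cookie value is invented, and whether Achievo tracks the session through a cookie at all is not confirmed here. The demo simulates a login response instead of contacting a server.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class SessionCookies {
    /** Installs a JVM-wide cookie store; URLConnection requests made afterwards use it. */
    static CookieManager install() {
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);
        return manager;
    }

    /** Feeds one Set-Cookie header value into the store, as a real response would. */
    static void remember(CookieManager manager, String url, String setCookieValue) {
        try {
            manager.put(URI.create(url), Map.of("Set-Cookie", List.of(setCookieValue)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the Cookie header values the store would attach to a request for url. */
    static List<String> cookieHeaderFor(CookieManager manager, String url) {
        try {
            Map<String, List<String>> headers =
                    manager.get(URI.create(url), Collections.emptyMap());
            return headers.getOrDefault("Cookie", Collections.emptyList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CookieManager manager = install();
        // Simulated login response; the cookie name and value are hypothetical.
        remember(manager, "https://someurl.com/achievo/index.php",
                "achievo=abc123; Path=/achievo");
        // The follow-up frame request would carry the stored cookie.
        System.out.println(
                cookieHeaderFor(manager, "https://someurl.com/achievo/dispatch.php"));
    }
}
```

In the question's code, the only line that should be needed is the call to install() (that is, CookieHandler.setDefault) before doSubmit runs; HttpsURLConnection then stores cookies from the login response and resends them on the next request made in the same JVM.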

他不在意 2024-11-26 18:00:00

You'll have to extract the URL of the main frame (dispatch.php?atknodetype=pim.pim&atkaction=pim&atklevel=-1&atkprevlevel=0&achievo=37b552462afdfd248a21fedbf0eebe43) and make a second request to this URL. If cookies are used to track sessions, you'll also have to send the cookies contained in the response to your login request.

I would use a higher-level API to do this (like Apache HttpClient), or even a programmatic browser like HtmlUnit.
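
For the "extract the URL of the main frame" step, a rough standard-library sketch might look like the following. It assumes the frameset markup has the shape shown in the question (a quoted name attribute appearing before a quoted src attribute), and the names FrameUrls and frameUrl are hypothetical. A regular expression is used only to keep the example short; a real HTML parser would be more robust.

```java
import java.net.URI;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FrameUrls {
    /** Returns the absolute URL of the named frame, if the frameset HTML declares one. */
    static Optional<URI> frameUrl(String html, String frameName, URI pageUrl) {
        // Assumes name="..." comes before src="..." inside the same <frame> tag.
        Pattern pattern = Pattern.compile(
                "<frame\\b[^>]*\\bname=\"" + Pattern.quote(frameName)
                        + "\"[^>]*\\bsrc=\"([^\"]*)\"",
                Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(html);
        if (!matcher.find()) {
            return Optional.empty();
        }
        // Undo HTML escaping of ampersands, then resolve against the page URL.
        String src = matcher.group(1).replace("&amp;", "&");
        return Optional.of(pageUrl.resolve(src));
    }

    public static void main(String[] args) {
        String html = "<frameset cols=\"210,*\">"
                + "<frame name=\"menu\" src=\"menu.php?atklevel=-1\">"
                + "<frame name=\"main\" scrolling=\"auto\" noresize "
                + "src=\"dispatch.php?atknodetype=pim.pim&atkaction=pim\">"
                + "</frameset>";
        URI page = URI.create("https://someurl.com/achievo/index.php");
        System.out.println(frameUrl(html, "main", page));
    }
}
```

The resolved URL keeps the full query string from the src attribute, which matters here because the frame URLs in the question carry an achievo=... parameter. With HtmlUnit the manual parsing may be unnecessary: HtmlPage has a getFrameByName method whose result exposes getEnclosedPage(), so something like page2.getFrameByName("main").getEnclosedPage() is worth trying, although I have not verified it against Achievo.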
