Not sure how to handle JavaScript and Mechanize in this particular instance

Posted 2024-12-26 19:55:11

I'm going to be accessing a number of accounts on Amazon's KDP - http://kdp.amazon.com/

My task is to login to each account and check the account's earnings. Mechanize works great for logging in and dealing with the cookies and such but the page which displays the account earnings uses javascript to dynamically populate the page.

I did a little bit of digging and found that the JavaScript sends out the following request:

https://kdp.amazon.com/self-publishing/reports/transactionSummary?_=1326419839161&marketplaceID=ATVPDKIKX0DER

It goes out along with a cookie that contains a session ID, a token, and some random stuff. Every time I click a link to display the results, the numerical part of the GET URL above is different, even if it's the same link.

In response to the request, the browser then receives this (cut out a bunch of it so it doesn't take up the whole page):

 {"iTotalDisplayRecords":13,"iTotalRecords":13,"aaData":[["12/03/2011","<span
 title=\"Booktitle\">Hold That ...<\/span>","<span
 title=\"Author\">Amy  

 ....

 <\/span>","B004PGMHEM","1","1","0","70%","4.47","0.06","4.47","0.01","0.00",""],["","","","","","","","","","","","","<div
class='grandtotal'>Total: $ 39.53<\/div>","Junk"]]}

I think I can use Mechanize's cookie jar to extract the cookies that are part of that request, but how do I figure out what that number is and how it's generated? The JavaScript in the page source seems cryptic on the best of days. Here's one of the files:

http://kdp.amazon.com/DTPUIFramework/js/all-signin-thin.js

Is there a way to track down which JavaScript is running "behind the scenes", so to speak, after I click on something on the page, so that I can emulate that request with Mechanize?

Danke..

PS: I can't (or, rather, I don't want to) use Watir for this task, because in theory I might be handling more than just a handful of accounts, so this has gotta be pretty snappy.


メ斷腸人バ 2025-01-02 19:55:11

It's just a timestamp and it's only used for cache busting. Try this:

Time.now.to_i.to_s
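
A small addition, offered as a sketch and not as a confirmed detail of the KDP site: the 13-digit value in the question's URL looks like a Unix timestamp in milliseconds (the form jQuery appends as `_=` when caching is turned off), whereas `Time.now.to_i` yields seconds. Since the value is only a cache buster, either should work, but the version below mirrors what the browser sends. `summary_url` is a hypothetical helper name; the endpoint and `marketplaceID` are copied from the question.

```ruby
require 'uri'

# Hypothetical helper: builds the report URL with a fresh cache-busting value.
# The endpoint and the marketplaceID parameter come from the question.
def summary_url(marketplace_id, now = Time.now)
  endpoint = 'https://kdp.amazon.com/self-publishing/reports/transactionSummary'
  millis   = (now.to_f * 1000).to_i # milliseconds since the Unix epoch
  query    = URI.encode_www_form('_' => millis, 'marketplaceID' => marketplace_id)
  "#{endpoint}?#{query}"
end

puts summary_url('ATVPDKIKX0DER')

# With an already logged-in Mechanize agent (not run here), the request
# would be roughly:
#   json_text = agent.get(summary_url('ATVPDKIKX0DER')).body
```

Because the agent keeps its cookie jar between requests, the session cookies set at login should go out with this request without extra handling.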

开始看清了 2025-01-02 19:55:11

Mechanize doesn't run JavaScript that is embedded in the page. It only retrieves the HTML.

If the page contains JavaScript, Mechanize can see it, and you can use Nokogiri, which Mechanize uses internally, to retrieve the <script> tags' content. But anything that would be loaded as a result of the JavaScript being executed in a browser will not be loaded by Mechanize. Watir is the solution for that, because it drives the browser itself, which will interpret and run the JavaScript in the page.

You can step through the pages in a browser and use Firebug to look at the source and the requests, to get an idea of what is running. From that information you can get an understanding of what the JavaScript is doing, and then use Mechanize and Nokogiri to extract the information you need from a page to build up your next URLs, but it can be a lot of work.
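
In this particular case the request found in the browser returns JSON, so once the URL is known the extraction step can skip the HTML page entirely. The following is a minimal sketch under that assumption, using only Ruby's standard `json` library. The `sample` string is a trimmed stand-in shaped like the response quoted in the question, not real account data, and `grand_total` and `strip_tags` are hypothetical helper names.

```ruby
require 'json'

# Trimmed stand-in shaped like the response in the question (not real data).
sample = <<~'JSON'
  {"iTotalDisplayRecords":1,"iTotalRecords":1,"aaData":[
    ["12/03/2011","<span title=\"Booktitle\">Hold That ...<\/span>","B004PGMHEM","4.47"],
    ["","","<div class='grandtotal'>Total: $ 39.53<\/div>","Junk"]
  ]}
JSON

# Hypothetical helper: returns the grand total as a string, or nil if no
# cell in aaData contains a "Total: $ ..." marker.
def grand_total(json_text)
  rows = JSON.parse(json_text).fetch('aaData')
  rows.flatten.each do |cell|
    amount = cell.to_s[/Total:\s*\$\s*([\d,.]+)/, 1]
    return amount if amount
  end
  nil
end

# Hypothetical helper: drops the HTML wrapper around a cell's text.
def strip_tags(cell)
  cell.gsub(/<[^>]+>/, '').strip
end

puts grand_total(sample) # prints 39.53
```

The total is kept as a string so that no rounding happens before you decide how to store it; the real response may have more columns or a different total format, so treat the regular expression as a starting point.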

What you ask is similar to many others' questions regarding Mechanize and JavaScript. I'd recommend you look at these SO links to get alternate ideas:

Or search Stack Overflow for questions about Ruby, JavaScript and Mechanize.
