Get the final destination of a JavaScript redirect on a website
I am parsing a website with Python. The site uses a lot of redirects, and it performs them by calling JavaScript functions.
So when I just use urllib to fetch the pages, it doesn't help me, because I can't find the destination URL in the returned HTML.
Is there a way to access the DOM and call the right JavaScript function from my Python code?
All I need is the URL that the redirect takes me to.
Comments (2)
I looked into Selenium. Unless you are running a pure script (meaning you have no display and can't start a "normal" browser), the solution is actually quite simple:
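A minimal sketch of what this could look like, assuming Selenium with Firefox and geckodriver installed. The helper name resolve_js_redirect, its polling parameters, and the example URL are hypothetical and only for illustration. The helper loads the page in a real browser, lets the JavaScript run, and waits until the address stops changing before reading driver.current_url:

```python
import time


def resolve_js_redirect(driver, url, settle_seconds=1.0, timeout=15.0, poll=0.25):
    """Load `url` in the browser and return the URL it finally lands on.

    `driver` is any object with a `get(url)` method and a `current_url`
    attribute, such as a Selenium WebDriver. The function polls until the
    address has been unchanged for `settle_seconds`, or until `timeout`
    seconds have passed.
    """
    driver.get(url)
    last = driver.current_url
    stable_since = time.monotonic()
    deadline = stable_since + timeout
    while time.monotonic() < deadline:
        time.sleep(poll)
        current = driver.current_url
        if current != last:
            # The browser navigated again: restart the "stable" clock.
            last = current
            stable_since = time.monotonic()
        elif time.monotonic() - stable_since >= settle_seconds:
            break
    return last


def demo():
    # Imported here so the helper above stays usable without Selenium.
    from selenium import webdriver

    driver = webdriver.Firefox()  # assumption: Firefox + geckodriver available
    try:
        # Placeholder URL; substitute the page that triggers the redirect.
        print(resolve_js_redirect(driver, "http://www.example.com"))
    finally:
        driver.quit()
```

Calling demo() opens a browser window, follows whatever redirects the page's scripts trigger, and prints the address the browser ends up on. The settle time is a heuristic: a redirect that fires later than settle_seconds after the previous navigation would be missed.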
For my use case this is more than enough. Selenium can also interact with forms and send keystrokes to the website.
It doesn't sound like fun to me, but every JavaScript function is also an object, so you can just read the function's source rather than call it, and perhaps the URL is in there. Otherwise, that function may call another one, which you would then have to recurse into... Again, it doesn't sound like fun, but it might be doable.
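Reading the script text instead of executing it can be sketched in Python roughly as follows. This assumes the redirect is written as a plain assignment to location or location.href, or as a call to location.replace or location.assign, with a string literal as the target. The function name find_redirect_targets and the sample HTML are made up for illustration, and URLs that are built dynamically or hidden behind further function calls will not be found this way:

```python
import re
from urllib.parse import urljoin

# location = "..." or location.href = "..." (also matches window.location etc.)
ASSIGN_RE = re.compile(r"""\blocation(?:\.href)?\s*=\s*(["'])(.*?)\1""")
# location.replace("...") or location.assign("...")
CALL_RE = re.compile(r"""\blocation\.(?:replace|assign)\(\s*(["'])(.*?)\1\s*\)""")


def find_redirect_targets(html, base_url):
    """Return absolute URLs of literal redirect targets found in `html`.

    Assignment-style matches come first, then call-style matches.
    Relative targets are resolved against `base_url`.
    """
    targets = []
    for pattern in (ASSIGN_RE, CALL_RE):
        for match in pattern.finditer(html):
            targets.append(urljoin(base_url, match.group(2)))
    return targets


# Hypothetical page source, standing in for what urllib would return.
sample_html = """
<script>
function go() { window.location.href = "/next/page"; }
</script>
<a href="#" onclick="go()">continue</a>
"""

print(find_redirect_targets(sample_html, "http://www.example.com/start"))
```

In practice the html argument would be the decoded body fetched with urllib, and the base URL would be the address of the page it came from.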