Copying data from one page to another with PhantomJS
I am trying to copy some data from one processed web page into a new page that I want to export. The background is that I need to scrape parts of a page and build a new page from those parts.
The problem seems to be that PhantomJS's includeJs() and evaluate() methods are sandboxed, and I can't see a proper way to import DOM content from one page into another.
I have some test code that looks like this, with page being the original page and out the new one:
....
var title = page.evaluate(function() {
return document.getElementById('fooo').innerHTML;
});
console.log('page title:' + title);
//fs.write('c:/Temp/title.js', "var title = '" + title + "';", 'w');
var out = new WebPage;
out.viewportSize = page.viewportSize;
out.content = '<html><head></head><body><div id="wrapper"></div><p>done</p></body></html>';
out.includeJs('c:/Temp/title.js', function() {
var p = document.createElement('p');
p.appendChild(document.createTextNode(title));
document.getElementById('wrapper').appendChild(p);
});
...
Comments (1)
The function in your last includeJs call here won't work: as you note, it's sandboxed, which means closures won't work, so title won't be defined. A way of passing variables to page.evaluate is noted as a feature request, but isn't available as of PhantomJS v1.4.1.
The general way I get around this is by using the Function constructor, which allows you to create a function from a string, so the data can be embedded in the function body and assigned to a variable such as myVar.
Now you can evaluate a function like the one you have, referencing myVar in the sandbox, and your data will be available in the client scope.
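The code that originally accompanied this answer is not preserved here, so the following is only a minimal sketch of the Function constructor workaround under a few assumptions. The helper name makeGlobalSetter, the sample markup, and the sample value are hypothetical and not part of PhantomJS. The PhantomJS-specific part assumes that require('webpage').create() is available in your version and only runs when the phantom global exists.

```javascript
// Hypothetical helper (not a PhantomJS API): returns a zero-argument function
// whose body assigns a JSON-serialized copy of `value` to window[name].
// Because the data is baked into the source text, no closure is needed.
function makeGlobalSetter(name, value) {
  // JSON.stringify produces a JavaScript literal for JSON-safe values.
  // U+2028 and U+2029 are escaped because older engines reject them
  // when they appear raw inside string literals.
  var literal = String(JSON.stringify(value))
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
  return new Function(
    'window[' + JSON.stringify(name) + '] = ' + literal + ';'
  );
}

// PhantomJS-only usage; skipped in any other JavaScript runtime.
if (typeof phantom !== 'undefined') {
  var out = require('webpage').create();
  out.content = '<html><head></head><body><div id="wrapper"></div></body></html>';

  // Step 1: define myVar inside the sandboxed page.
  out.evaluate(makeGlobalSetter('myVar', 'Some scraped title'));

  // Step 2: an ordinary sandboxed function can now read window.myVar.
  out.evaluate(function () {
    var p = document.createElement('p');
    p.appendChild(document.createTextNode(window.myVar));
    document.getElementById('wrapper').appendChild(p);
  });

  console.log(out.content);
  phantom.exit();
}
```

Only values that survive JSON serialization (strings, numbers, plain objects, arrays) can be transferred this way; DOM nodes have to be passed as markup strings such as innerHTML. Later PhantomJS releases let you pass extra arguments to page.evaluate directly, so check the documentation for your version before relying on this workaround.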