Download a PDF file from an embedded tag using Puppeteer
I am trying to download a PDF from a website.
The website is built with the ZK framework, and it reveals a dynamic URL to the PDF for a limited window of time when an ID number is typed into an input bar. This step is easy enough, and I am able to get the PDF URL, which opens in the browser in an embedded tag.
However, it has been impossible for me to find a way to download the file to my computer. For days, I have tried and read everything from this, to this, to this.
The closest thing I have been able to get is with this code:
let [ iframe ] = await page.$x('//iframe');
let pdf_url = await page.evaluate( iframe => iframe.src, iframe)
let res = await page.evaluate( async url =>
    await fetch(url, {
        method: 'GET',
        credentials: 'same-origin', // useful when we are logged into a website and want to send cookies
        responseType: 'arraybuffer', // get response as an ArrayBuffer
    }).then(response => response.text()),
    pdf_url
)
console.log('res:', res);
//const response = await page.goto(pdf);
fs.writeFileSync('somepdf.pdf', res);
This results in a blank PDF file that is 92K in size, while the file I am trying to get is 52K.
I suspect the back end might be sending me a 'dummy' PDF file because the headers on my fetch request might not be correct.
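One possible explanation for the size mismatch, offered as a guess rather than a diagnosis: `fetch` has no `responseType` option (that one belongs to XMLHttpRequest), so it is ignored, and `response.text()` decodes the body as UTF-8. Bytes that are not valid UTF-8 are replaced with U+FFFD, which takes three bytes when the string is written back to disk. The sketch below reproduces that effect in plain Node, using `TextDecoder` as a stand-in for `response.text()`. The sample bytes are made up for illustration and are not taken from the real file.

```javascript
// Made-up sample: "%PDF" followed by four high bytes, as binary PDFs often contain.
const original = Buffer.from([0x25, 0x50, 0x44, 0x46, 0xe2, 0xe3, 0xcf, 0xd3]);

// response.text() decodes the body as UTF-8; TextDecoder does the same thing here.
// Each invalid byte becomes the replacement character U+FFFD.
const asText = new TextDecoder('utf-8').decode(original);

// fs.writeFileSync(path, string) encodes the string back to UTF-8,
// so every U+FFFD is written out as three bytes.
const roundTripped = Buffer.from(asText, 'utf8');

console.log(original.length, roundTripped.length); // prints: 8 16
console.log(roundTripped.equals(original));        // prints: false
```

Growth of this kind would be consistent with a 52K file turning into a 92K one whose binary streams no longer decode, which is why the pages render blank. It does not rule out the server returning a different file, so comparing the first few bytes of both files would help confirm it.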
What else can I try?
Here is the link to the PDF page.
You can use the random ID number I found: '1705120630'
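One thing that may be worth trying is to keep the response as raw bytes all the way to disk. The sketch below assumes the PDF URL is same-origin with the open page and that the session cookies are enough to authorize the request. `savePdf` is a hypothetical helper name, `page` is an already-open Puppeteer page, and `pdfUrl` is the URL read from the iframe. Because `page.evaluate` can only return serializable values, the bytes are base64-encoded in the browser and decoded again in Node.

```javascript
const fs = require('fs');

// Hypothetical helper: fetch the PDF inside the page and save the raw bytes.
async function savePdf(page, pdfUrl, outPath) {
  // Runs in the browser context, so the request reuses the page's session cookies.
  const base64 = await page.evaluate(async (url) => {
    const response = await fetch(url, { method: 'GET', credentials: 'same-origin' });
    // Read the body as bytes rather than text, so nothing is re-encoded.
    const bytes = new Uint8Array(await response.arrayBuffer());
    // Build a binary string one byte at a time, then base64-encode it for transport.
    let binary = '';
    for (let i = 0; i < bytes.length; i++) {
      binary += String.fromCharCode(bytes[i]);
    }
    return btoa(binary);
  }, pdfUrl);

  // Back in Node: decode the base64 string into a Buffer and write it unchanged.
  const pdf = Buffer.from(base64, 'base64');
  fs.writeFileSync(outPath, pdf);
  return pdf.length; // number of bytes written
}
```

With the variables from the code above, this would be called as `await savePdf(page, pdf_url, 'somepdf.pdf')`. If the returned length still does not match the expected 52K, the server may really be sending a different file, and the request headers would be the next thing to compare.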