Grabbing images using a fake referrer
I'm struggling with grabbing an image at the moment... sounds silly, but check out this link :P
http://manga.justcarl.co.uk/A/Oishii_Kankei/31/1
If you open the image URL directly, the image loads. Go back to the page and it looks like it's working fine, but that's just the browser loading the cached image.
The application was working fine before, so I think they've implemented some kind of Referer check on their images. I found some code and came up with the following...
$ref = 'http://www.thesite.com/';
$file = 'theimage.jpg';
$hdrs = array(
    'http' => array(
        'method' => "GET",
        'header' => "Accept-Language: en\r\n" .
            "Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5\r\n" .
            "Referer: $ref\r\n" . // Setting the HTTP Referer
            "Content-Type: image/jpeg\r\n"
    )
);
// Request the image from the server,
// sending the headers above as request headers
$context = stream_context_create($hdrs);
$fp = fopen($imgChapterPath.$file, 'rb', false, $context);
fpassthru($fp);
fclose($fp);
Essentially it's making up a false referrer. All I'm getting back, though, is a bunch of gibberish (thanks to fpassthru), so I think it is fetching the image, but I'm afraid I have no idea how to output/display the fetched image.
Comments (1)
Try calling this before your call to fpassthru:
header('Content-Type: image/jpeg');
This will tell your browser that the data coming next is a JPEG image. Otherwise PHP will tell the browser it's an HTML document (its default Content-Type is text/html), which is obviously wrong.
Note: Remember that all calls to header() must be made before anything else is output to the browser, which means you cannot have any echo or print calls before calling header(), and you cannot have anything at all before the opening <?php tag.
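
Putting the question's code and this answer together, a minimal sketch might look like the following. The function names (referer_context_options, fetch_image, send_image) are made up for illustration and are not from the original post; it also assumes the remote file really is a JPEG and that nothing has been output before send_image() runs.

```php
<?php
// Hypothetical helper: build stream-context options that add a Referer
// header to a GET request. The Accept value is only an example.
function referer_context_options(string $referer): array
{
    return [
        'http' => [
            'method' => 'GET',
            'header' => "Accept: image/*,*/*;q=0.5\r\n" .
                        "Referer: {$referer}\r\n",
        ],
    ];
}

// Hypothetical helper: fetch the raw image bytes, sending $referer as the
// Referer header. Returns null if the request fails.
function fetch_image(string $url, string $referer): ?string
{
    $context = stream_context_create(referer_context_options($referer));
    $bytes = @file_get_contents($url, false, $context);
    return $bytes === false ? null : $bytes;
}

// Hypothetical helper: send the bytes to the browser as an image.
// The Content-Type header has to go out before any other output.
function send_image(string $bytes, string $mimeType = 'image/jpeg'): void
{
    if (headers_sent()) {
        throw new RuntimeException('Output already started; cannot set Content-Type.');
    }
    header('Content-Type: ' . $mimeType);
    header('Content-Length: ' . strlen($bytes));
    echo $bytes;
}

// Example usage, with $imgChapterPath, $file and $ref as in the question:
// $bytes = fetch_image($imgChapterPath . $file, $ref);
// if ($bytes !== null) {
//     send_image($bytes);
// }
```

Reading the whole image into a string rather than using fpassthru() makes it possible to check for failure before any header is sent. For very large files, streaming with fopen() and fpassthru() after the header() call would use less memory.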