How do I force PHP's fopen() to return the current version of a web page?

Posted 2024-08-31 04:34:50

The current content of this Google Docs page is:

[screenshot: http://www.deviantsart.com/upload/i9k01q.png]

However, when reading this page with the following PHP fopen() script, I get an older, cached version:

[screenshot of the cached version; source: deviantsart.com]

I've tried two solutions proposed in this question (a random attribute and using POST) and I also tried clearstatcache() but I always get the cached version of the web page.

What do I have to change in the following script so that fopen() returns the current version of the web page?

<?php
$url = 'http://docs.google.com/View?id=dc7gj86r_32g68627ff&rand=' . getRandomDigits(10);

echo $url . '<hr/>';
echo loadFile($url);

function loadFile($sFilename) {
    clearstatcache();
    if (floatval(phpversion()) >= 4.3) {
        $sData = file_get_contents($sFilename);
    } else {
        if (!file_exists($sFilename)) return -3;

        $opts = array('http' =>
          array(
            'method'  => 'POST',
            'content'=>''
          )
        );
        $context  = stream_context_create($opts);                

        $rHandle = fopen($sFilename, 'r', $context);
        if (!$rHandle) return -2;

        $sData = '';
        while(!feof($rHandle))
            $sData .= fread($rHandle, filesize($sFilename));
        fclose($rHandle);
    }
    return $sData;
}

function getRandomDigits($numberOfDigits) {
 $r = "";
 for($i=1; $i<=$numberOfDigits; $i++) {
  $nr=rand(0,9);
  $r .=  $nr;
 }
 return $r;
}

?>

ADDED: taking out the $opts and $context gives me a cached page as well:

function loadFile($sFilename) {
    if (floatval(phpversion()) >= 4.3) {
        $sData = file_get_contents($sFilename);
    } else {
        if (!file_exists($sFilename)) return -3;              

        $rHandle = fopen($sFilename, 'r');
        if (!$rHandle) return -2;

        $sData = '';
        while(!feof($rHandle))
            $sData .= fread($rHandle, filesize($sFilename));
        fclose($rHandle);
    }
    return $sData;
}

ADDED: this curl script which sends a Firefox user agent returns the cached version as well:

<?php
$url = 'http://docs.google.com/View?id=dc7gj86r_32g68627ff';
//$user_agent = 'Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)';
$user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 (.NET CLR 3.5.30729)';
$ch = curl_init();
//curl_setopt($ch, CURLOPT_COOKIEJAR, "/tmp/cookie");
//curl_setopt($ch, CURLOPT_COOKIEFILE, "/tmp/cookie");
curl_setopt($ch, CURLOPT_URL, $url ); 
curl_setopt($ch, CURLOPT_FAILONERROR, 1); 
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1); 
curl_setopt($ch, CURLOPT_TIMEOUT, 15);
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
curl_setopt($ch, CURLOPT_VERBOSE, 0);
echo curl_exec($ch);
?>

Comments (3)

纵情客 2024-09-07 04:34:50

I have successfully reproduced this. Google does cache the page when you aren't the owner of the published web document: when I logged out, it gave me the old version.

After I unpublished it and republished it, I could no longer reproduce the issue. Ensure that you keep publishing the document under "Share as Web Page" when you update it.

Just to make sure, check in a browser that isn't logged in (or your script). If it doesn't update: unpublish and publish again. It did not change the URL for me.
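To run that check from a script rather than a browser, something like the following could work. It is only a sketch: `isCurrentVersion()` and `fetchAnonymously()` are hypothetical helper names, and the marker is any text you have just added to the document. No cookies are sent, so the request should look like a logged-out visitor.

```php
<?php
// Hypothetical helper: reports whether freshly edited text shows up in the HTML.
function isCurrentVersion(string $html, string $marker): bool
{
    // Strip tags and decode entities so markup between words does not hide the marker.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    return strpos($text, $marker) !== false;
}

// Hypothetical helper: plain GET without cookies, i.e. a logged-out request.
// Returns the page body, or false on failure.
function fetchAnonymously(string $url)
{
    $context = stream_context_create([
        'http' => ['method' => 'GET', 'timeout' => 15],
    ]);
    return @file_get_contents($url, false, $context);
}
```

Usage would be along the lines of `isCurrentVersion(fetchAnonymously($url), 'Test Two')`; if that returns false right after an edit, unpublishing and republishing is the next thing to try.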

蓝天 2024-09-07 04:34:50

I also get this:

Test One;http://docs.google.com/View?id=dc7gj86r_30dzgzbjch
Test Two;http://docs.google.com/View?id=dc7gj86r_31dbssfrzx

The "caching" must be happening at Google Docs or, more probably, it's your fault (wrong URL?).


Response headers:

Set-Cookie: ******
Content-Type: text/html; charset=UTF-8
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Pragma: no-cache
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Date: Sun, 02 May 2010 03:30:29 GMT
X-Frame-Options: ALLOWALL
Content-Encoding: gzip
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Content-Length: 3987
Server: GSE
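
Those headers already tell any HTTP cache between you and Google not to store the page. A minimal sketch of checking that in code, assuming a hypothetical helper `responseForbidsCaching()` that takes raw header lines:

```php
<?php
// Hypothetical helper: decide from raw response header lines whether the
// server told HTTP caches not to store or reuse the response.
function responseForbidsCaching(array $headerLines): bool
{
    foreach ($headerLines as $line) {
        // Split "Name: value"; status lines such as "HTTP/1.1 200 OK" have no colon.
        $parts = explode(':', $line, 2);
        if (count($parts) !== 2) {
            continue;
        }
        $name  = strtolower(trim($parts[0]));
        $value = strtolower(trim($parts[1]));

        if ($name === 'cache-control') {
            $directives = array_map('trim', explode(',', $value));
            if (in_array('no-store', $directives, true)
                || in_array('no-cache', $directives, true)) {
                return true;
            }
        }
        if ($name === 'pragma' && $value === 'no-cache') {
            return true;
        }
    }
    return false;
}
```

The array returned by `get_headers($url)` can be passed in directly. For the headers above it returns true, which suggests that any stale content is produced on Google's side rather than by an HTTP cache.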

时间海 2024-09-07 04:34:50

Try making sure your browser isn't caching the information. I'm not seeing any cache headers or anything. Your webserver might be adding something, or your browser might be assuming it's cached. Try including the time with the output so you can make sure the request was generated at the correct time.
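
A rough sketch of the timestamp idea, with `stampOutput()` and `noCacheContextOptions()` as hypothetical helper names. The request headers only ask intermediate caches to revalidate; they cannot force the origin server to serve a newer copy.

```php
<?php
// Hypothetical helper: prefix fetched content with the time of the request,
// so a stale page can be told apart from a stale request.
function stampOutput(string $body, ?int $requestedAt = null): string
{
    $requestedAt = $requestedAt ?? time();
    return '[fetched ' . gmdate('Y-m-d H:i:s', $requestedAt) . " UTC]\n" . $body;
}

// Hypothetical helper: stream context options that ask proxies and other
// intermediaries not to answer from a cached copy.
function noCacheContextOptions(): array
{
    return [
        'http' => [
            'method'  => 'GET',
            'header'  => "Cache-Control: no-cache\r\nPragma: no-cache\r\n",
            'timeout' => 15,
        ],
    ];
}
```

One way to use it: `echo stampOutput(file_get_contents($url, false, stream_context_create(noCacheContextOptions())));`. If the timestamp is current but the content is old, the browser and PHP are ruled out as the source of the stale copy.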

I used fopen years ago for data that updated quite often. Never ran into a cache problem with fopen. In fact, I would be disappointed if the PHP developers added a web cache to fopen as it would ruin most of the valid use-cases AND it isn't in their documentation. I'll go and look at the PHP source code just to make sure.

Can you update the document so that some of us may try reproducing?
