PHP: How do I load a file from a different server as a string?

Posted 2024-07-23 07:49:55

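I am trying to load an XML file from a different domain name as a string. All I want is an array of the text within the <title></title> tags of the XML file, so since I am using PHP 4, I figure the easiest way is to run a regex on it to get them. Can someone explain how to load the XML as a string? Thanks!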


羅雙樹 2024-07-30 07:49:55

You could use cURL like the example below. I should add that regex-based XML parsing is generally not a good idea, and you may be better off using a real parser, especially if it gets any more complicated.

You may also want to add some regex modifiers to make it work across multiple lines etc., but I assume the question is more about fetching the content into a string.

<?php

$curl = curl_init('http://www.example.com');

// Return the response from curl_exec() instead of printing it immediately
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);

$result = curl_exec($curl);

if ($result !== false) {
    // Non-greedy match; the "s" modifier lets "." span line breaks
    if (preg_match('|<title>(.*?)</title>|is', $result, $matches)) {
        echo "Title is '{$matches[1]}'";
    } else {
        // Did not find the title
    }
} else {
    // Request failed
    die(curl_error($curl));
}
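
Since the question asks for an array of every title rather than just the first one, here is a minimal sketch of the same regex idea using preg_match_all(). The function name extractTitles() and the sample string are made up for illustration; in practice $xml would be the string returned by curl_exec(). It assumes simple <title> tags with no attributes, and the earlier caveat about regex-based XML parsing still applies.

```php
<?php

// Hypothetical helper: returns the text of every <title>...</title> pair.
function extractTitles($xml)
{
    // "i" ignores tag case, "s" lets "." match newlines,
    // and ".*?" stops at the nearest closing tag.
    preg_match_all('|<title>(.*?)</title>|is', $xml, $matches);

    // $matches[1] holds the captured text of each match (empty if none).
    return $matches[1];
}

// Inline sample standing in for a fetched document.
$sample = "<rss>\n<title>First\nitem</title>\n<TITLE>Second</TITLE>\n</rss>";

foreach (extractTitles($sample) as $title) {
    echo $title, "\n";
}
```
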


樱花细雨 2024-07-30 07:49:55

First, use
file_get_contents('http://www.example.com/');

to fetch the file and store it in a variable.
After that, parse the XML. The link is
http://php.net/manual/en/function.xml-parse.php
There are examples in the comments.
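
As a rough sketch of the parsing step, the XML Parser extension (the same family as xml_parse()) also offers xml_parse_into_struct(), which flattens a document into an array. The helper name collectTitles() and the sample string below are assumptions for illustration; with a real feed, $xml would come from file_get_contents(), which needs allow_url_fopen enabled to read URLs. The sketch only handles <title> elements that contain plain text.

```php
<?php

// Hypothetical helper: collects the text of every <title> element.
function collectTitles($xml)
{
    $parser = xml_parser_create();
    // Keep tag names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    $values = array();
    if (!xml_parse_into_struct($parser, $xml, $values)) {
        return false; // The document could not be parsed.
    }

    $titles = array();
    foreach ($values as $node) {
        // "complete" entries are elements with no child elements.
        if ($node['tag'] === 'title' && $node['type'] === 'complete'
            && isset($node['value'])) {
            $titles[] = $node['value'];
        }
    }
    return $titles;
}

// Inline sample standing in for file_get_contents('http://www.example.com/');
$sample = '<?xml version="1.0"?><rss><channel><title>Feed</title>'
        . '<item><title>First</title></item></channel></rss>';

echo implode(', ', collectTitles($sample)), "\n";
```
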


孤蝉 2024-07-30 07:49:55

If you're loading well-formed XML, skip the character-based parsing and use the DOM functions:

$d = new DOMDocument;
$d->load("http://url/file.xml");
$titles = $d->getElementsByTagName('title');
// getElementsByTagName() always returns a node list, so check its length
if ($titles->length > 0) {
    echo $titles->item(0)->nodeValue;
}

If you can't use DOMDocument::load() because of how PHP is set up, then use cURL to grab the file and do:

$d = new DOMDocument;
$d->loadXML($grabbedfile);
...
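
One way the cURL-plus-loadXML() route could be fleshed out is sketched below. The function names fetchUrl() and titlesFromXml() are hypothetical, and the sample string stands in for a fetched file. Note that DOMDocument requires PHP 5 or later, so this does not apply to a PHP 4 installation.

```php
<?php

// Hypothetical helper: fetches a URL as a string, or false on failure.
function fetchUrl($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    return curl_exec($ch);
}

// Hypothetical helper: returns every <title> text as an array,
// or false if the input is empty or not well-formed XML.
function titlesFromXml($xml)
{
    if ($xml === false || $xml === '') {
        return false;
    }

    $doc = new DOMDocument;
    // "@" hides the warnings loadXML() emits for malformed input.
    if (!@$doc->loadXML($xml)) {
        return false;
    }

    $titles = array();
    foreach ($doc->getElementsByTagName('title') as $node) {
        $titles[] = $node->nodeValue;
    }
    return $titles;
}

// Inline sample; with a real feed this would be fetchUrl('http://url/file.xml').
$sample = '<feed><title>A</title><entry><title>B</title></entry></feed>';

echo implode(', ', titlesFromXml($sample)), "\n";
```
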


半窗疏影 2024-07-30 07:49:55

I have this function as a snippet:

function getHTML($url) {
    if (empty($url)) return false;
    $options = array(
        CURLOPT_URL            => $url,     // URL of the page
        CURLOPT_RETURNTRANSFER => true,     // return web page
        CURLOPT_HEADER         => false,    // don't return headers
        CURLOPT_FOLLOWLOCATION => true,     // follow redirects
        CURLOPT_ENCODING       => "",       // handle all encodings
        CURLOPT_USERAGENT      => "spider", // who am i
        CURLOPT_AUTOREFERER    => true,     // set referer on redirect
        CURLOPT_CONNECTTIMEOUT => 120,      // timeout on connect
        CURLOPT_TIMEOUT        => 120,      // timeout on response
        CURLOPT_MAXREDIRS      => 3,        // stop after 3 redirects
    );

    $ch      = curl_init( $url );
    curl_setopt_array( $ch, $options );
    $content = curl_exec( $ch );
    $header  = curl_getinfo( $ch );
    curl_close( $ch );

    //Ending all that cURL mess...

    // The request failed, so there is nothing to clean up
    if ($content === false) return false;

    //Removing linebreaks, multiple whitespace and tabs for easier Regexing
    $content = str_replace(array("\n", "\r", "\t", "\0", "\x0B"), '', $content);
    $content = preg_replace('/\s\s+/', ' ', $content);
    return $content;
}

That returns the HTML with no linebreaks, tabs, multiple spaces, etc, only 1 line.

So now you do this preg_match:

$html = getHTML($url);
preg_match('|<title>(.*)</title>|iUsm', $html, $matches);

and $matches[1] will have the info you need.


About the author: 半山落雨半山空
