How to use file_get_contents to extract only certain data

Posted on 2024-12-18 03:25:32

How can I use file_get_contents() to extract a certain part of the $homepage variable?

<?php
$homepage = file_get_contents('http://www.example.com/');
echo $homepage;
?> 

Comments (3)

慢慢从新开始 2024-12-25 03:25:32

Your question is not very clearly asked. However, at an abstract level I believe you are looking for string manipulation lessons :) Here are a few links I am sharing.

BTW, it all depends on what exactly you want to extract. If you could elaborate with a more detailed question, that would help us answer you spot on!

Cheers

PS: Screen scraping is a bad idea unless you are scraping your own webpage (which doesn't really make sense :) ). The reason is that you never know when www.example.com is going to change, and then your manipulation logic will no longer be useful.

活雷疯 2024-12-25 03:25:32

The best solution is probably to process the $homepage variable after it has been loaded. Have a look at String functions and regular expressions.
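As a rough illustration of the string-function route, here is a minimal sketch that pulls out the text between two markers with strpos() and substr(). The helper name extract_between and the inline sample HTML are my own assumptions for the example (the HTML stands in for whatever file_get_contents() returned); they are not from the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): returns the text found
// between the first $start marker and the next $end marker, or null if either
// marker is missing.
function extract_between(string $haystack, string $start, string $end): ?string
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);              // move past the opening marker
    $to = strpos($haystack, $end, $from); // search for the end marker after it
    if ($to === false) {
        return null;
    }
    return substr($haystack, $from, $to - $from);
}

// Stand-in for the result of file_get_contents('http://www.example.com/')
$homepage = '<html><head><title>Example Domain</title></head><body></body></html>';

echo extract_between($homepage, '<title>', '</title>'), "\n"; // Example Domain
```

This only works when the markers are fixed, known strings. For anything looser, a regular expression or an HTML parser is probably a better fit.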

file_get_contents() supports offset and maxlen options that can be used to control what parts of the file get loaded, but offset has behavior described by the documentation as "unpredictable" when used on non-local files as in your example.

That said, maxlen is probably safe, so you can use it to trim off the end if you know that what you want will be in the first N bytes of the file. So, if you are certain that you only want the first 100 bytes of the homepage, you can do something like file_get_contents('http://www.example.com/', false, NULL, 0, 100). But unless you want exactly the first 100 bytes, you'll still have to do some post-processing.

See http://php.net/manual/en/function.file-get-contents.php for more information. (Except for maxlen these are the default values.)
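To see what the offset and maxlen arguments do without depending on a remote server, here is a small sketch that runs against a local temporary file. The temp file and its 500 bytes of sample content are assumptions made purely for the demonstration; against an http:// URL only the length limit should be relied on, as noted above.

```php
<?php
// Local temporary file used as a stand-in for a remote page (assumption:
// the script is allowed to write to the system temp directory).
$path = tempnam(sys_get_temp_dir(), 'fgc');
file_put_contents($path, str_repeat('0123456789', 50)); // 500 bytes

// Arguments: filename, use_include_path, context, offset, length (maxlen)
$head = file_get_contents($path, false, null, 0, 100);
echo strlen($head), "\n"; // 100

// On a local file a non-zero offset is reliable: skip 5 bytes, read 10
$slice = file_get_contents($path, false, null, 5, 10);
echo $slice, "\n"; // 5678901234

unlink($path);
```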

屋顶上的小猫咪 2024-12-25 03:25:32

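Here is an example with regular expressions and PHP.

<?php
$f = file_get_contents("http://www.example.com");
if ($f === false) {
    exit("Could not fetch the page\n");
}

// Optional: collapse every run of whitespace (spaces, tabs, newlines) into one space
$f = preg_replace("/\s+/", " ", $f);

// Capture the contents of the first <h1>; .*? is non-greedy so it stops at the first </h1>
$data = "";
if (preg_match("/<h1[^>]*>(.*?)<\/h1>/i", $f, $res)) {
    $data = $res[1];
}

echo $data;
?>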
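
If the regex gets brittle (nested tags, odd attributes, several headings), one alternative is to let PHP's DOM extension parse the markup instead. This is a swapped-in technique, not what the answer above uses, and it is only a sketch: the helper name first_h1 and the sample HTML are hypothetical, and it assumes the dom extension is enabled and the input string is non-empty.

```php
<?php
// Hypothetical helper: returns the trimmed text of the first <h1>, or null
// when the document has none. Assumes $html is a non-empty string.
function first_h1(string $html): ?string
{
    $doc = new DOMDocument();
    // Real-world HTML is often malformed; collect parser warnings quietly
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $node = $doc->getElementsByTagName('h1')->item(0);
    return $node === null ? null : trim($node->textContent);
}

// Stand-in for the result of file_get_contents('http://www.example.com/')
$html = '<html><body><h1 class="title">Example <em>Domain</em></h1><p>Text</p></body></html>';
echo first_h1($html), "\n"; // Example Domain
```

Unlike the regex, textContent drops nested tags such as the em element here and returns just the text.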