获取“属性og”使用 PHP 从 URL 获取元标记

发布于 2025-01-05 03:36:45 字数 311 浏览 4 评论 0原文

我想创建一个类似于 Facebook 使用的发布功能(您将链接粘贴到文本框中,点击帖子,它会发布标题、描述和图像)。我意识到最好提取具有 og 属性的元标记,例如“og:title”和“og:image”,因为如果我使用普通标记,有时它们会有换行符和其他类似的东西,并且会出现错误。

有没有办法使用 PHP 获取这些标签的内容,但无需 AJAX 或其他自定义解析器?出发点是:

<?php

$url = $_POST['link'];

?>

我们通过 POST 方法从上一页获取 URL,但是剩下的怎么办呢?

I want to create a posting feature similar to the one Facebook uses (You paste a link into textbox, hit post and it posts a title, description and an image). I realized that it is best to extract the meta tags that have og properties such as "og:title" and "og:image" because if I use normal tags, sometimes they have line breaks and such other things and it comes out with errors.

Is there a way to fetch contents of these tags using PHP, but without AJAX or other custom parsers? The starting point would be:

<?php

$url = $_POST['link'];

?>

We get the URL from the previous page through POST method, but how to do the rest?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

甜嗑 2025-01-12 03:36:45

解决方案是这样的:

libxml_use_internal_errors(true);
$c = file_get_contents("http://url/here");
$d = new DomDocument();
$d->loadHTML($c);
$xp = new domxpath($d);
foreach ($xp->query("//meta[@property='og:title']") as $el) {
    echo $el->getAttribute("content");
}
foreach ($xp->query("//meta[@property='og:description']") as $el) {
    echo $el->getAttribute("content");
}

The solution is this:

libxml_use_internal_errors(true);
$c = file_get_contents("http://url/here");
$d = new DomDocument();
$d->loadHTML($c);
$xp = new domxpath($d);
foreach ($xp->query("//meta[@property='og:title']") as $el) {
    echo $el->getAttribute("content");
}
foreach ($xp->query("//meta[@property='og:description']") as $el) {
    echo $el->getAttribute("content");
}
涫野音 2025-01-12 03:36:45

使用类似下面的内容:

libxml_use_internal_errors(true); // Yeah if you are so worried about using @ with warnings
$doc = new DomDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$query = '//*/meta[starts-with(@property, \'og:\')]';
$metas = $xpath->query($query);
foreach ($metas as $meta) {
    $property = $meta->getAttribute('property');
    $content = $meta->getAttribute('content');
    $rmetas[$property] = $content;
}
var_dump($rmetas);

How上找到了这个通过 php 获取网页的开放图谱协议? - 搜索很有帮助,Google 也一样!

http://www.google.co.uk/search? q=元+属性+og+标签

Use something like the below:

libxml_use_internal_errors(true); // Yeah if you are so worried about using @ with warnings
$doc = new DomDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$query = '//*/meta[starts-with(@property, \'og:\')]';
$metas = $xpath->query($query);
foreach ($metas as $meta) {
    $property = $meta->getAttribute('property');
    $content = $meta->getAttribute('content');
    $rmetas[$property] = $content;
}
var_dump($rmetas);

Found this on How to get Open Graph Protocol of a webpage by php? - search is helpful, as is Google!

http://www.google.co.uk/search?q=meta+property+og+tags

分开我的手 2025-01-12 03:36:45

使用这个: https://github.com/baj84/MetaData

它既简单又高效。

$metaData = MetaData::fetch($url);
var_dump($metaData->tags());

Use this: https://github.com/baj84/MetaData

It's easy and efficient.

$metaData = MetaData::fetch($url);
var_dump($metaData->tags());
愛放△進行李 2025-01-12 03:36:45

我们通过 php(命令行实用程序)使用 Apache Tika,并使用 -j 表示 json :

http:// tika.apache.org/

<?php
    shell_exec( 'java -jar tika-app-1.4.jar -j http://www.guardian.co.uk/politics/2013/jul/21/tory-strategist-lynton-crosby-lobbying' );
?>

这是来自随机监护人文章的输出示例:

{
   "Content-Encoding":"UTF-8",
   "Content-Length":205599,
   "Content-Type":"text/html; charset\u003dUTF-8",
   "DC.date.issued":"2013-07-21",
   "X-UA-Compatible":"IE\u003dEdge,chrome\u003d1",
   "application-name":"The Guardian",
   "article:author":"http://www.guardian.co.uk/profile/nicholaswatt",
   "article:modified_time":"2013-07-21T22:42:21+01:00",
   "article:published_time":"2013-07-21T22:00:03+01:00",
   "article:section":"Politics",
   "article:tag":[
      "Lynton Crosby",
      "Health policy",
      "NHS",
      "Health",
      "Healthcare industry",
      "Society",
      "Public services policy",
      "Lobbying",
      "Conservatives",
      "David Cameron",
      "Politics",
      "UK news",
      "Business"
   ],
   "content-id":"/politics/2013/jul/21/tory-strategist-lynton-crosby-lobbying",
   "dc:title":"Tory strategist Lynton Crosby in new lobbying row | Politics | The Guardian",
   "description":"Exclusive: Firm he founded, Crosby Textor, advised private healthcare providers how to exploit NHS \u0027failings\u0027",
   "fb:app_id":180444840287,
   "keywords":"Lynton Crosby,Health policy,NHS,Health,Healthcare industry,Society,Public services policy,Lobbying,Conservatives,David Cameron,Politics,UK news,Business,Politics",
   "msapplication-TileColor":"#004983",
   "msapplication-TileImage":"http://static.guim.co.uk/static/a314d63c616d4a06f5ec28ab4fa878a11a692a2a/common/images/favicons/windows_tile_144_b.png",
   "news_keywords":"Lynton Crosby,Health policy,NHS,Health,Healthcare industry,Society,Public services policy,Lobbying,Conservatives,David Cameron,Politics,UK news,Business,Politics",
   "og:description":"Exclusive: Firm he founded, Crosby Textor, advised private healthcare providers how to exploit NHS \u0027failings\u0027",
   "og:image":"https://static-secure.guim.co.uk/sys-images/Guardian/Pix/pixies/2013/7/21/1374433351329/Lynton-Crosby-008.jpg",
   "og:site_name":"the Guardian",
   "og:title":"Tory strategist Lynton Crosby in new lobbying row",
   "og:type":"article",
   "og:url":"http://www.guardian.co.uk/politics/2013/jul/21/tory-strategist-lynton-crosby-lobbying",
   "resourceName":"tory-strategist-lynton-crosby-lobbying",
   "title":"Tory strategist Lynton Crosby in new lobbying row | Politics | The Guardian",
   "twitter:app:id:googleplay":"com.guardian",
   "twitter:app:id:iphone":409128287,
   "twitter:app:name:googleplay":"The Guardian",
   "twitter:app:name:iphone":"The Guardian",
   "twitter:app:url:googleplay":"guardian://www.guardian.co.uk/politics/2013/jul/21/tory-strategist-lynton-crosby-lobbying",
   "twitter:card":"summary_large_image",
   "twitter:site":"@guardian"
}

We use Apache Tika via php (command line utility) with -j for json :

http://tika.apache.org/

<?php
    shell_exec( 'java -jar tika-app-1.4.jar -j http://www.guardian.co.uk/politics/2013/jul/21/tory-strategist-lynton-crosby-lobbying' );
?>

This is a sample output from a random guardian article :

{
   "Content-Encoding":"UTF-8",
   "Content-Length":205599,
   "Content-Type":"text/html; charset\u003dUTF-8",
   "DC.date.issued":"2013-07-21",
   "X-UA-Compatible":"IE\u003dEdge,chrome\u003d1",
   "application-name":"The Guardian",
   "article:author":"http://www.guardian.co.uk/profile/nicholaswatt",
   "article:modified_time":"2013-07-21T22:42:21+01:00",
   "article:published_time":"2013-07-21T22:00:03+01:00",
   "article:section":"Politics",
   "article:tag":[
      "Lynton Crosby",
      "Health policy",
      "NHS",
      "Health",
      "Healthcare industry",
      "Society",
      "Public services policy",
      "Lobbying",
      "Conservatives",
      "David Cameron",
      "Politics",
      "UK news",
      "Business"
   ],
   "content-id":"/politics/2013/jul/21/tory-strategist-lynton-crosby-lobbying",
   "dc:title":"Tory strategist Lynton Crosby in new lobbying row | Politics | The Guardian",
   "description":"Exclusive: Firm he founded, Crosby Textor, advised private healthcare providers how to exploit NHS \u0027failings\u0027",
   "fb:app_id":180444840287,
   "keywords":"Lynton Crosby,Health policy,NHS,Health,Healthcare industry,Society,Public services policy,Lobbying,Conservatives,David Cameron,Politics,UK news,Business,Politics",
   "msapplication-TileColor":"#004983",
   "msapplication-TileImage":"http://static.guim.co.uk/static/a314d63c616d4a06f5ec28ab4fa878a11a692a2a/common/images/favicons/windows_tile_144_b.png",
   "news_keywords":"Lynton Crosby,Health policy,NHS,Health,Healthcare industry,Society,Public services policy,Lobbying,Conservatives,David Cameron,Politics,UK news,Business,Politics",
   "og:description":"Exclusive: Firm he founded, Crosby Textor, advised private healthcare providers how to exploit NHS \u0027failings\u0027",
   "og:image":"https://static-secure.guim.co.uk/sys-images/Guardian/Pix/pixies/2013/7/21/1374433351329/Lynton-Crosby-008.jpg",
   "og:site_name":"the Guardian",
   "og:title":"Tory strategist Lynton Crosby in new lobbying row",
   "og:type":"article",
   "og:url":"http://www.guardian.co.uk/politics/2013/jul/21/tory-strategist-lynton-crosby-lobbying",
   "resourceName":"tory-strategist-lynton-crosby-lobbying",
   "title":"Tory strategist Lynton Crosby in new lobbying row | Politics | The Guardian",
   "twitter:app:id:googleplay":"com.guardian",
   "twitter:app:id:iphone":409128287,
   "twitter:app:name:googleplay":"The Guardian",
   "twitter:app:name:iphone":"The Guardian",
   "twitter:app:url:googleplay":"guardian://www.guardian.co.uk/politics/2013/jul/21/tory-strategist-lynton-crosby-lobbying",
   "twitter:card":"summary_large_image",
   "twitter:site":"@guardian"
}
薄凉少年不暖心 2025-01-12 03:36:45

试试这个..
它对我有用..

foreach($linkHtml->find('head meta[property=og:url]') as $url)
{   
    echo $url->content.'</br>';
}

try this..
it worked for me..

foreach($linkHtml->find('head meta[property=og:url]') as $url)
{   
    echo $url->content.'</br>';
}
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文