Get meta tags with an "og" property from a URL using PHP

Posted on 2025-01-05 03:36:45


I want to create a posting feature similar to the one Facebook uses (you paste a link into a text box, hit Post, and it posts a title, a description and an image). I realized that it is best to extract the meta tags that have og properties, such as "og:title" and "og:image", because the normal tags sometimes contain line breaks and similar noise, and the result comes out with errors.

Is there a way to fetch the contents of these tags using PHP, but without AJAX or other custom parsers? The starting point would be:

<?php

$url = $_POST['link'];

?>

We get the URL from the previous page through the POST method, but how do I do the rest?


甜嗑 2025-01-12 03:36:45

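The solution is this:

// Collect libxml warnings internally; real-world HTML is rarely well-formed.
libxml_use_internal_errors(true);
$c = file_get_contents("http://url/here");
if ($c === false) {
    die("Could not fetch the URL");
}
$d = new DOMDocument();
$d->loadHTML($c);
$xp = new DOMXPath($d);
// Print the content attribute of each matching Open Graph tag.
foreach ($xp->query("//meta[@property='og:title']") as $el) {
    echo $el->getAttribute("content"), PHP_EOL;
}
foreach ($xp->query("//meta[@property='og:description']") as $el) {
    echo $el->getAttribute("content"), PHP_EOL;
}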
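
The same XPath approach can be wrapped in a small helper that works on an HTML string, which makes it easy to try without a network request. This is only a sketch: the function name `extractOgTag` and the sample markup are made up for illustration and are not part of the answer.

```php
<?php
// Hypothetical helper (not from the answer): returns the content of the first
// <meta property="..."> tag found in $html, or null when the tag is absent.
function extractOgTag(string $html, string $property): ?string
{
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // $property is inserted into the query, so only pass trusted names here.
    $nodes = $xpath->query("//meta[@property='" . $property . "']");
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $nodes->item(0)->getAttribute('content');
}

// Made-up sample page used in place of a real download.
$sample = '<html><head>'
    . '<meta property="og:title" content="Example title">'
    . '<meta property="og:description" content="Example description">'
    . '</head><body></body></html>';

echo extractOgTag($sample, 'og:title'), PHP_EOL;
var_dump(extractOgTag($sample, 'og:image'));
```

Returning null for a missing tag lets the caller decide on a fallback, for example the page's regular title element when og:title is not present.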


涫野音 2025-01-12 03:36:45

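Use something like the below:

libxml_use_internal_errors(true); // collects parse warnings, if you would rather not silence them with @
$html = file_get_contents($url); // $url is the link from the question's $_POST['link']
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
// Select every <meta> element whose property attribute starts with "og:".
$query = '//meta[starts-with(@property, \'og:\')]';
$metas = $xpath->query($query);
$rmetas = []; // initialize so var_dump() works even when nothing matches
foreach ($metas as $meta) {
    $property = $meta->getAttribute('property');
    $content = $meta->getAttribute('content');
    $rmetas[$property] = $content;
}
var_dump($rmetas);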

Found this in the question "How to get Open Graph Protocol of a webpage by php?". Searching the site is helpful, as is Google:

http://www.google.co.uk/search?q=meta+property+og+tags
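
One thing to watch with the loop above: a page may repeat a property (several og:image tags, for example), and a plain `$rmetas[$property] = $content` keeps only the last value. A possible variation, sketched under the assumption that you want every value, collects them in arrays. The function name `collectOgTags` and the sample HTML are invented for this example.

```php
<?php
// Hypothetical variation: maps each og:* property to a list of all its values.
function collectOgTags(string $html): array
{
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $tags = [];
    foreach ($xpath->query("//meta[starts-with(@property, 'og:')]") as $meta) {
        // Append instead of overwrite so repeated properties are preserved.
        $tags[$meta->getAttribute('property')][] = $meta->getAttribute('content');
    }
    return $tags;
}

// Made-up sample page with a repeated og:image and one non-og tag.
$sample = '<html><head>'
    . '<meta property="og:title" content="Example title">'
    . '<meta property="og:image" content="http://example.com/a.jpg">'
    . '<meta property="og:image" content="http://example.com/b.jpg">'
    . '<meta name="description" content="not an og tag">'
    . '</head><body></body></html>';

print_r(collectOgTags($sample));
```

Every property maps to an array here, even when it appears once, so the calling code can treat all tags the same way.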


分开我的手 2025-01-12 03:36:45

Use this: https://github.com/baj84/MetaData

It's easy and efficient.

$metaData = MetaData::fetch($url);
var_dump($metaData->tags());


愛放△進行李 2025-01-12 03:36:45

We use Apache Tika (the command-line utility) from PHP, with -j for JSON output:

http://tika.apache.org/

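<?php
    // shell_exec() returns the command's output as a string; keep it instead of discarding it.
    $json = shell_exec( 'java -jar tika-app-1.4.jar -j http://www.guardian.co.uk/politics/2013/jul/21/tory-strategist-lynton-crosby-lobbying' );
    echo $json;
?>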

This is sample output for a random Guardian article:

{
   "Content-Encoding":"UTF-8",
   "Content-Length":205599,
   "Content-Type":"text/html; charset\u003dUTF-8",
   "DC.date.issued":"2013-07-21",
   "X-UA-Compatible":"IE\u003dEdge,chrome\u003d1",
   "application-name":"The Guardian",
   "article:author":"http://www.guardian.co.uk/profile/nicholaswatt",
   "article:modified_time":"2013-07-21T22:42:21+01:00",
   "article:published_time":"2013-07-21T22:00:03+01:00",
   "article:section":"Politics",
   "article:tag":[
      "Lynton Crosby",
      "Health policy",
      "NHS",
      "Health",
      "Healthcare industry",
      "Society",
      "Public services policy",
      "Lobbying",
      "Conservatives",
      "David Cameron",
      "Politics",
      "UK news",
      "Business"
   ],
   "content-id":"/politics/2013/jul/21/tory-strategist-lynton-crosby-lobbying",
   "dc:title":"Tory strategist Lynton Crosby in new lobbying row | Politics | The Guardian",
   "description":"Exclusive: Firm he founded, Crosby Textor, advised private healthcare providers how to exploit NHS \u0027failings\u0027",
   "fb:app_id":180444840287,
   "keywords":"Lynton Crosby,Health policy,NHS,Health,Healthcare industry,Society,Public services policy,Lobbying,Conservatives,David Cameron,Politics,UK news,Business,Politics",
   "msapplication-TileColor":"#004983",
   "msapplication-TileImage":"http://static.guim.co.uk/static/a314d63c616d4a06f5ec28ab4fa878a11a692a2a/common/images/favicons/windows_tile_144_b.png",
   "news_keywords":"Lynton Crosby,Health policy,NHS,Health,Healthcare industry,Society,Public services policy,Lobbying,Conservatives,David Cameron,Politics,UK news,Business,Politics",
   "og:description":"Exclusive: Firm he founded, Crosby Textor, advised private healthcare providers how to exploit NHS \u0027failings\u0027",
   "og:image":"https://static-secure.guim.co.uk/sys-images/Guardian/Pix/pixies/2013/7/21/1374433351329/Lynton-Crosby-008.jpg",
   "og:site_name":"the Guardian",
   "og:title":"Tory strategist Lynton Crosby in new lobbying row",
   "og:type":"article",
   "og:url":"http://www.guardian.co.uk/politics/2013/jul/21/tory-strategist-lynton-crosby-lobbying",
   "resourceName":"tory-strategist-lynton-crosby-lobbying",
   "title":"Tory strategist Lynton Crosby in new lobbying row | Politics | The Guardian",
   "twitter:app:id:googleplay":"com.guardian",
   "twitter:app:id:iphone":409128287,
   "twitter:app:name:googleplay":"The Guardian",
   "twitter:app:name:iphone":"The Guardian",
   "twitter:app:url:googleplay":"guardian://www.guardian.co.uk/politics/2013/jul/21/tory-strategist-lynton-crosby-lobbying",
   "twitter:card":"summary_large_image",
   "twitter:site":"@guardian"
}
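
Since the output is JSON, it can be decoded and filtered down to the og:* keys. The snippet below is a sketch that uses a shortened copy of the sample output above instead of actually calling Tika; the variable names are illustrative.

```php
<?php
// Shortened copy of the sample output above, standing in for the string
// that shell_exec() would return.
$json = '{
   "og:site_name":"the Guardian",
   "og:title":"Tory strategist Lynton Crosby in new lobbying row",
   "og:type":"article",
   "title":"Tory strategist Lynton Crosby in new lobbying row | Politics | The Guardian",
   "twitter:card":"summary_large_image"
}';

// Decode into an associative array; json_decode() returns null on invalid JSON.
$meta = json_decode($json, true);
if (!is_array($meta)) {
    exit("Could not decode the Tika output\n");
}

// Keep only the entries whose key starts with "og:".
$og = array_filter(
    $meta,
    function ($key) {
        return strpos($key, 'og:') === 0;
    },
    ARRAY_FILTER_USE_KEY
);

print_r($og);
```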


薄凉少年不暖心 2025-01-12 03:36:45

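Try this; it worked for me:

// $linkHtml is presumably a parsed document from a selector-based parser such as
// PHP Simple HTML DOM Parser; the answer does not show how it is created.
foreach ($linkHtml->find('head meta[property=og:url]') as $url)
{
    echo $url->content . '<br>';
}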
