How do sites like Hubspot track inbound links?

Posted 2024-07-11 08:26:11

Are all these types of sites just illegally scraping Google or another search engine?
As far as I can tell, there is no 'legal' way to get this data for a commercial site. The Yahoo! API ( http://developer.yahoo.com/search/siteexplorer/V1/inlinkData.html ) is for noncommercial use only, Yahoo! BOSS does not allow automated queries, and so on.
Any ideas?

Comments (2)

去了角落 2024-07-18 08:26:11

For example, if you wanted to find all the links to Google's homepage, search for

link:http://www.google.com

So to find all of your inbound links, you can walk your website's page tree and build a URL for each page you find. Then query Google for:

link:URL

You'll get back the links that Google has recorded from other websites into yours.
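
As a rough sketch of the query-building step only (no requests are sent), something like the following could turn a list of page URLs into `link:` queries. The function names and the `SEARCH_ENDPOINT` value are illustrative assumptions, not part of any official API, and the list of page URLs is assumed to come from your own site crawl.

```python
from urllib.parse import urlencode

# Assumption: the ordinary web-search URL, used here only to show
# how the query string would be encoded. This is not an official API.
SEARCH_ENDPOINT = "https://www.google.com/search"


def link_query(page_url):
    """Return the `link:` query text for one page URL."""
    return f"link:{page_url}"


def search_url(page_url):
    """Return a search URL whose `q` parameter holds the link: query."""
    return f"{SEARCH_ENDPOINT}?{urlencode({'q': link_query(page_url)})}"


def link_queries(page_urls):
    """Build one link: query per distinct page URL, keeping input order."""
    return [link_query(url) for url in dict.fromkeys(page_urls)]
```

For example, `link_queries(["http://example.com/", "http://example.com/about"])` yields one `link:` query per page, which you would then submit however your chosen search service allows.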

As for the legality of such harvesting, I'm sure it's not-exactly-legal to make a profit from it, but that's never stopped anyone before, has it?

(So I wouldn't bother wondering whether they did it or not. Just assume they do.)

不喜欢何必死缠烂打 2024-07-18 08:26:11

I don't know what Hubspot does, but if you want to find out which sites link to yours and you don't have the hardware to crawl the web, one thing you can do is monitor the HTTP_REFERER of visitors to your site. This is, as far as I know, how Google Analytics tells you where your visitors are arriving from. It is not 100% reliable, since not all browsers set it, particularly in "Privacy Mode", but you only need one visitor per link to know that it exists!
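
A minimal server-side sketch of the referrer idea, assuming each request is available as a WSGI-style `environ` dict with an optional `HTTP_REFERER` key. The function names and the `own_host` parameter are made up for illustration; this is not how any particular analytics product is implemented.

```python
from urllib.parse import urlsplit


def external_referrer(environ, own_host):
    """Return the referring URL if it points at another host, else None."""
    referrer = environ.get("HTTP_REFERER", "")
    try:
        host = urlsplit(referrer).hostname
    except ValueError:
        # Malformed referrer values are ignored rather than trusted.
        return None
    if not host or host == own_host.lower():
        # Missing referrer, or a link from one of our own pages.
        return None
    return referrer


def record_inbound_links(requests, own_host):
    """Count hits per distinct external referrer across many requests."""
    seen = {}
    for environ in requests:
        referrer = external_referrer(environ, own_host)
        if referrer is not None:
            seen[referrer] = seen.get(referrer, 0) + 1
    return seen
```

Each key in the returned dict is a page elsewhere on the web that at least one visitor followed to reach your site.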

This is often accomplished by embedding a script into each of your webpages (usually in a common header or footer). For example, if you examine the source of the page you are currently reading, you will find (right down at the bottom) a script that reports information about your visit back to Google.

Now this won't tell you if there are links out there that no one has ever used to get to your site, but let's face it, they are a lot less interesting than the ones people actually use.
