Ways to prevent Google from indexing a site / content

Posted on 2024-10-04 12:44:25

I have a case on my hands where I must be absolutely sure that Google (or Yahoo / Bing, for that matter) does not index specific content, so the more redundancy, the better.

As far as I know there are 3 ways to accomplish that. I wonder if there are more (redundancy is key here):

  1. set the robots meta tag to noindex
  2. disallow the affected URL structure in robots.txt
  3. post-load the content via AJAX

If those are all the methods, fine, but it would be great if someone had an idea how to be even more sure :D

(I know that's a little bit insane, but if the content somehow shows up in Google it will get really expensive for my company :'-( )
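
A quick way to sanity-check method 2 is to run the robots.txt rules through Python's standard `urllib.robotparser` and see whether a given crawler would be allowed to fetch a URL. This is only a sketch: the `/private/` path, the `example.com` URLs and the `is_crawl_allowed` helper are made up for illustration. The `X-Robots-Tag` response header shown at the end is the HTTP-header counterpart of the noindex meta tag from method 1. One caveat worth keeping in mind: Google's documentation notes that a crawler can only see a noindex directive on pages it is allowed to fetch, so a robots.txt Disallow and a noindex on the same URL can work against each other.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the /private/ prefix is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# HTTP-header counterpart of <meta name="robots" content="noindex">.
NOINDEX_HEADERS = {"X-Robots-Tag": "noindex, nofollow"}


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/private/report.html",
                "https://example.com/public/index.html"):
        print(url, is_crawl_allowed(ROBOTS_TXT, "Googlebot", url))
```

Keep in mind that robots.txt is only a request to well-behaved crawlers; it does not restrict access to the content itself.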

Comments (3)

悸初 2024-10-11 12:44:25

Uh, there are a lot more:

a) identify Googlebot (other bots work similarly)
http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=80553
and don't show them the content

b) return these pages with an HTTP 404 / HTTP 410 status instead of HTTP 200

c) only show these pages to clients with cookies / sessions

d) render the whole content as an image (and then disallow the image)

e) render the whole content as an image data URL (then a disallow is not needed)

f) use pipes | in the URL structure (works in Google, don't know about the other search engines)

g) use dynamic URLs that only work for, let's say, 5 minutes
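
Option a) is commonly done with a reverse DNS lookup on the requesting IP, a check that the hostname ends in `googlebot.com` or `google.com`, and a forward lookup to confirm the hostname resolves back to the same IP. A rough sketch of that check is below. The function names are made up for this example, and the lookups are passed in as parameters so the logic can be tried without network access.

```python
import socket
from typing import Callable, List

# Hostname suffixes used for Googlebot verification.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse_lookup(ip: str) -> str:
    """Reverse DNS: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(host: str) -> List[str]:
    """Forward DNS: hostname -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(host)[2]


def is_verified_googlebot(
    ip: str,
    reverse: Callable[[str], str] = _reverse_lookup,
    forward: Callable[[str], List[str]] = _forward_lookup,
) -> bool:
    """Return True only if ip reverse-resolves to a Google host
    and that host resolves back to the same ip."""
    try:
        host = reverse(ip)
    except OSError:  # socket.herror / socket.gaierror are OSError subclasses
        return False
    if not host.endswith(GOOGLE_SUFFIXES):
        return False
    try:
        return ip in forward(host)
    except OSError:
        return False
```

A request that passes this check could then be answered with the 404 / 410 from option b) instead of the real page. Checking only the User-Agent string is weaker, since anyone can send a Googlebot user agent.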

And these are just a few off the top of my head ... there are probably more.
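
Option g) could look something like the sketch below: the server signs the path together with an expiry timestamp using an HMAC, and refuses any link whose signature does not match or whose timestamp has passed. The secret, the 5-minute lifetime and the function names are all placeholders chosen for this example.

```python
import hashlib
import hmac
import time
from typing import Optional, Tuple

SECRET_KEY = b"replace-with-a-real-secret"  # placeholder, keep the real one private
TTL_SECONDS = 300                           # links stay valid for 5 minutes


def _signature(path: str, expires: int) -> str:
    """HMAC-SHA256 over the path and expiry timestamp."""
    message = f"{path}|{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_token(path: str, now: Optional[float] = None) -> Tuple[int, str]:
    """Return (expires, signature) for a link to path."""
    current = time.time() if now is None else now
    expires = int(current) + TTL_SECONDS
    return expires, _signature(path, expires)


def build_url(path: str, now: Optional[float] = None) -> str:
    """Build a time-limited URL such as /report?expires=...&sig=..."""
    expires, sig = make_token(path, now)
    return f"{path}?expires={expires}&sig={sig}"


def is_link_valid(path: str, expires: int, sig: str,
                  now: Optional[float] = None) -> bool:
    """Accept the link only if it is unexpired and correctly signed."""
    current = time.time() if now is None else now
    if current > expires:
        return False
    return hmac.compare_digest(_signature(path, expires), sig)
```

Even if a crawler picks up such a URL, it stops working after the lifetime runs out, so a later crawl or a click from a search result would get an error instead of the content.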

_失温 2024-10-11 12:44:25

Well, I suppose you could require some sort of registration/authentication to see the content.

We're using the approach of post-loading content via AJAX at my work and it works pretty well. You just have to make sure that you're not returning anything if that same AJAX route is hit without the XHR header. (We're using it in conjunction with authorization, though.)

I just don't think there's any way to be completely sure without actually locking the data down behind some sort of authentication. And if it's going to be expensive for your company if it gets out there, then you might want to seriously consider it.
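
The check described above might look roughly like this, assuming the front end sends the conventional `X-Requested-With: XMLHttpRequest` header (jQuery and some other libraries add it automatically; a plain `fetch()` call does not). The function name, status codes and payload are illustrative only, not taken from any particular framework.

```python
from typing import Dict, Tuple


def ajax_content_response(headers: Dict[str, str],
                          is_authenticated: bool) -> Tuple[int, str]:
    """Return (status, body) for the AJAX content route.

    The body is only returned for authenticated XHR requests.
    """
    # Header names are case-insensitive, so normalise before looking them up.
    normalised = {name.lower(): value for name, value in headers.items()}
    is_xhr = normalised.get("x-requested-with", "") == "XMLHttpRequest"

    if not is_xhr:
        return 404, ""   # direct hits (e.g. a crawler following the URL) get nothing
    if not is_authenticated:
        return 401, ""   # XHR, but no valid session
    return 200, "<p>sensitive content</p>"  # placeholder payload
```

The header is trivial for any client to fake, so it only filters out requests that do not try; the authentication check is what actually protects the data.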

野生奥特曼 2024-10-11 12:44:25

What about blocking search engine IPs, and requests with search engine user-agents, in .htaccess?

It might need more maintenance of the list of IPs and user-agents, but it will work.
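
The same idea can also be expressed at the application level instead of in .htaccess. The sketch below is written in Python rather than Apache configuration, and both lists are placeholders: the user-agent tokens cover only a few well-known crawlers, and the network shown is a reserved documentation range, not a real search engine range. A real deployment would need the crawler IP ranges published by each search engine.

```python
import ipaddress

# Substrings found in some well-known crawler user agents (not exhaustive).
BLOCKED_AGENT_TOKENS = ("googlebot", "bingbot", "slurp")

# Placeholder: 203.0.113.0/24 is a documentation-only range (TEST-NET-3).
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def should_block(user_agent: str, remote_ip: str) -> bool:
    """Return True if the request looks like it comes from a blocked crawler."""
    agent = user_agent.lower()
    if any(token in agent for token in BLOCKED_AGENT_TOKENS):
        return True
    address = ipaddress.ip_address(remote_ip)
    return any(address in network for network in BLOCKED_NETWORKS)
```

A request for which `should_block` returns True would then be answered with a 403 or 404 instead of the content.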
