
How to noindex a single page of a website in Google

Posted 2024-10-25 21:17:58

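I would like to know how to prevent a single page of a website from being indexed by Google or any other bot. My script uses templates made up of TPL files: Index.tpl, Header.tpl, and so on. How do I tell Google not to index the page login.tpl?

Thanks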


百思不得你姐 2024-11-01 21:17:59

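That is not correct. robots.txt does not tell crawlers what should and should not be indexed. That is what the meta robots tag is for: have the page serve noindex and you are done.
See examples and further reading: http://yoast.com/x-robots-tag-play/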
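
As a rough illustration of the noindex approach, the sketch below decides whether a page should carry a noindex signal and builds it in both forms: the `X-Robots-Tag` HTTP header and the robots meta tag. The function names, the `$noindexTemplates` list, and the idea of keying on the template filename are assumptions made for this example, not part of the asker's script.

```php
<?php
// Hypothetical helper: the template list and all function names here are
// assumptions for illustration only.
function should_noindex(string $template, array $noindexTemplates): bool
{
    return in_array($template, $noindexTemplates, true);
}

// Returns the HTTP header line to send, or null when the page may be indexed.
function robots_header(string $template, array $noindexTemplates): ?string
{
    return should_noindex($template, $noindexTemplates)
        ? 'X-Robots-Tag: noindex'
        : null;
}

// Returns the equivalent <meta> tag for the page's <head>, or an empty string.
function robots_meta(string $template, array $noindexTemplates): string
{
    return should_noindex($template, $noindexTemplates)
        ? '<meta name="robots" content="noindex">'
        : '';
}
```

In the script that renders login.tpl, one could pass a non-null result of `robots_header('login.tpl', ['login.tpl'])` to PHP's `header()` before any output is sent, or print `robots_meta(...)` inside the head section of Header.tpl. Either signal is only seen if the crawler is allowed to fetch the page, so the page must not also be blocked in robots.txt.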


念﹏祤嫣 2024-11-01 21:17:59

I know I am late with this answer, but it may help others too. Below is a more precise answer.

I am assuming that your site runs on WordPress.

You can use the WordPress "Custom Fields" option.

First, add the following code to the head section of your theme's header.php template:

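<?php
    // Only act on single posts/pages; on archive views $post is just the first post in the loop.
    if (is_singular()) {
        // Read the 'noindex-page' custom field of the post or page being viewed.
        $noindex = get_post_meta(get_queried_object_id(), 'noindex-page', true);

        if ($noindex) {
            echo '<meta name="robots" content="noindex,follow" />';
        }
    }
?>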

Now add a custom field named noindex-page to the page and assign it a value. What you enter hardly matters: any non-empty value other than 0 makes the noindex-page field evaluate as true in the header code.

Keep in mind that this also works for posts.


七秒鱼° 2024-11-01 21:17:58

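If you want a specific URL (or a directory) not to be indexed by crawlers, a simple solution is to use a robots.txt file, which lets you specify what can and cannot be indexed.

For more information, see About /robots.txt

For example, if you want crawlers not to index the /my-page.php URL, you could put something like this in your robots.txt file: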

User-agent: *
Disallow: /my-page.php
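
To make the effect of such a rule concrete: a crawler that honours robots.txt compares the requested path against each Disallow value as a prefix. The sketch below is a deliberately simplified check written for this example. The function name `is_disallowed` is an assumption, and it ignores Allow lines, wildcards, per-bot groups, and groups that list several User-agent lines.

```php
<?php
// Simplified sketch: collects Disallow rules from the "User-agent: *" group
// and tests the path by prefix. Not a full robots.txt parser.
function is_disallowed(string $robotsTxt, string $path): bool
{
    $inStarGroup = false;
    foreach (preg_split('/\R/', $robotsTxt) as $line) {
        $line = trim($line);
        if (stripos($line, 'User-agent:') === 0) {
            // Track whether the rules that follow apply to every crawler.
            $inStarGroup = trim(substr($line, strlen('User-agent:'))) === '*';
        } elseif ($inStarGroup && stripos($line, 'Disallow:') === 0) {
            $prefix = trim(substr($line, strlen('Disallow:')));
            // An empty Disallow value means "nothing is disallowed".
            if ($prefix !== '' && strpos($path, $prefix) === 0) {
                return true;
            }
        }
    }
    return false;
}
```

With the two-line file above, `is_disallowed($robotsTxt, '/my-page.php')` returns true and `is_disallowed($robotsTxt, '/index.php')` returns false. Note that a rule like this stops compliant crawlers from fetching the page. It is not the same signal as a noindex directive.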

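As a side note: files that end users should not see (include files, libraries, non-interpreted templates, ...) should not be served by your web server at all. No one should be able to access them.

If you use Apache, a .htaccess file in a given folder (provided this feature is enabled) can stop Apache from serving any file in that folder:

Deny from All

Note: Apache will then serve nothing from the directory that contains a .htaccess file with that content!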
