How do I tell search engines not to index content via secondary domains?
I have a website at a.com (for example). I also have a couple of other domain names which I am not using for anything: b.com and c.com. They currently forward to a.com. I have noticed that Google is indexing content from my site using b.com/stuff and c.com/stuff, not just a.com/stuff. What is the proper way to tell Google to only index content via a.com, not b.com and c.com?
It seems as if a 301 redirect via .htaccess is the best solution, but I am not sure how to do that. There is only the one .htaccess file (each domain does not have its own).
b.com and c.com are not meant to be aliases of a.com, they are just other domain names I am reserving for possible future projects.
4 Answers
robots.txt is the way to tell spiders what to crawl and what not to crawl. If you put the following in the root of your site at /robots.txt, a well-behaved spider will not crawl any part of your site. Most large sites have a robots.txt; Google, for example, has one.
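(The snippet itself did not survive extraction; given the described effect, it was presumably the standard disallow-everything file. Note that since all three domains share one document root here, this file would be served for a.com as well, so it only fits if b.com and c.com point at a separate root.)

    User-agent: *
    Disallow: /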
You can simply create a redirect with a .htaccess file like this:
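(The rule set was lost in scraping; a typical host-based 301 for a single shared .htaccess, using the asker's placeholder domains and assuming mod_rewrite is enabled, would be:)

    RewriteEngine On
    # Requests arriving via b.com or c.com are sent to the same path on a.com
    RewriteCond %{HTTP_HOST} ^(www\.)?b\.com$ [NC,OR]
    RewriteCond %{HTTP_HOST} ^(www\.)?c\.com$ [NC]
    RewriteRule ^(.*)$ http://a.com/$1 [R=301,L]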
It pretty much depends on what you want to achieve. A 301 says that the content has moved permanently (and it is the proper way of transferring PageRank); is this what you want to achieve?

Do you just want Google to behave? Then you may use robots.txt, but keep in mind there is a downside: that file is readable from outside and always sits in the same place, so you basically give away the location of directories and files that you may want to protect. So use robots.txt only if there is nothing worth protecting.

If there is something worth protecting, then you should password-protect the directory; that would be the proper way. Google will not index password-protected directories.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93708
For the last method, it depends on whether you want to use the httpd.conf file or .htaccess. The best way is to use httpd.conf, even if .htaccess seems easier.
http://httpd.apache.org/docs/2.0/howto/auth.html
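(A minimal sketch of such directory protection, following the Apache authentication howto linked above; the directory path, realm name, and password-file location are illustrative, not from the answer:)

    # Require a valid login for everything under the protected directory
    <Directory "/usr/local/apache/htdocs/secret">
        AuthType Basic
        AuthName "Restricted Area"
        AuthUserFile "/usr/local/apache/passwd/passwords"
        Require valid-user
    </Directory>

The password file would be created with the htpasswd utility, e.g. htpasswd -c /usr/local/apache/passwd/passwords someuser.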
Have your server-side code generate a canonical reference that points to the page to be considered the "source". For example:
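(The example markup was stripped in extraction; per the referenced post, the canonical link element goes in the page's <head> and, with the asker's placeholder domain, would look like:)

    <link rel="canonical" href="http://a.com/stuff" />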
Reference:
http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html
- Update: this link tag is currently also supported by Ask.com, Microsoft Live Search, and Yahoo!.