Eliminating expensive back-end system calls and SEO
We have a website that makes expensive calls to a back-end system to display product availability. I'd like to eliminate these calls for page views that do not come from actual customers. My first thought was to filter on the user agent: if the requester is a spider / search engine crawler, display a "Call for availability" message (the same message we would display if the back-end system were down for maintenance or otherwise unavailable) rather than call the back-end system for real availability.
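A minimal sketch of such a user-agent filter, in TypeScript, might look like the following. The crawler pattern, the `Availability` type, and the function names are hypothetical and only illustrate the idea; a real crawler list would need to be maintained separately.

```typescript
// Hypothetical substrings that commonly appear in crawler User-Agent headers.
const CRAWLER_PATTERN = /bot|crawler|spider|slurp/i;

// Hypothetical set of states the availability icon can show.
type Availability = "in_stock" | "out_of_stock" | "call_for_availability";

function isLikelyCrawler(userAgent: string | undefined): boolean {
  // Assumption: a request without a User-Agent header is not a real customer.
  if (!userAgent) return true;
  return CRAWLER_PATTERN.test(userAgent);
}

function resolveAvailability(
  userAgent: string | undefined,
  fetchFromBackend: () => Availability,
): Availability {
  // Crawlers get the same fallback shown when the back end is unavailable.
  if (isLikelyCrawler(userAgent)) return "call_for_availability";
  try {
    return fetchFromBackend();
  } catch {
    // Back end down or failing: show the fallback message.
    return "call_for_availability";
  }
}
```

The expensive call is wrapped in `fetchFromBackend`, so it is only invoked for requests that do not look like crawlers.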
In discussions with colleagues, there is a lot of concern about the availability icon (a very small icon, mind you) being different when the page is crawled than when a user views or requests it. The worry is that the search engines might penalize us for cloaking.
Since the information in question is a very small image icon, and we are not serving drastically different content to search engines and live users, I really don't see cloaking as an issue, but I'd like to get some outside perspective.
Is simulating an "information not available" scenario for search engines acceptable practice when the overall content of the page does not change, or would it still qualify in some way as cloaking?
Comments (1)
Why don't you load the "information" you are displaying with JavaScript / Ajax? That way, when the page is loaded by a client that does not run JavaScript (e.g. a search engine spider), this "expensive call" isn't made.
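As a rough sketch of the Ajax approach in TypeScript: the page is served without availability data, and a script fills it in afterwards. The `/api/availability/` endpoint, the response shape, and the function names below are assumptions for illustration, not an existing API.

```typescript
// Minimal shape of the element being updated; a real HTMLElement satisfies it.
interface TextTarget {
  textContent: string | null;
}

// Hypothetical JSON shape returned by the availability endpoint.
interface AvailabilityResponse {
  available: boolean;
}

async function showAvailability(
  target: TextTarget,
  productId: string,
  fetchJson: (url: string) => Promise<AvailabilityResponse>,
): Promise<void> {
  try {
    // Hypothetical endpoint; the expensive back-end call happens behind it.
    const url = `/api/availability/${encodeURIComponent(productId)}`;
    const data = await fetchJson(url);
    target.textContent = data.available ? "In stock" : "Out of stock";
  } catch {
    // Same message the page shows when the back end is unavailable.
    target.textContent = "Call for availability";
  }
}
```

In the browser this could be wired up with something like `showAvailability(el, id, (url) => fetch(url).then((r) => r.json()))`, where `el` is the placeholder element on the page. A client that never runs the script never triggers the request.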
Alternatively, you could put this information in an IFRAME on your page and exclude the page shown in the IFRAME from indexing through robots.txt or the META robots tag.
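A sketch of the IFRAME variant, assuming the availability widget is served from a hypothetical `/availability/` path (the path and product ID are made up for illustration). On the product page:

```
<!-- The availability widget lives in its own document -->
<iframe src="/availability/12345" title="Product availability"></iframe>
```

And in robots.txt:

```
User-agent: *
Disallow: /availability/
```

Note that a `Disallow` rule keeps compliant crawlers from requesting the framed page at all, which is what avoids the expensive call. A META robots `noindex` tag only takes effect after the page has been fetched, so on its own it would keep the page out of the index but not prevent the request.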
Both approaches are completely "white hat", although I think the second is more so.