Does Googlebot keep a session while crawling?
When Googlebot crawls pages, does it have a session? For example, I store some variables in the session and use them in my site's pages. When Googlebot crawls these pages, will the session variables still be there? In my global.asax I store some variables in the session at session start. Will I have any problems with Googlebot?
Comments (4)
The answer to one of your questions is: yes, you will have problems with Googlebot.
In general, we have run into two kinds of issues with Googlebot:
It sometimes does not retain HTTP cookies between requests. Our application relies on custom cookies, and we logged plenty of Googlebot requests that carried no cookies at all.
It leaves long gaps between consecutive requests. For example, it retrieves your page and only requests its scripts some time later.
Both will cause trouble for your session. First, the exact ASP.NET_SessionId cookie has to be passed back on every request, and Googlebot will sometimes fail to do that. Second, if there is a long gap between requests, the session will time out even if the cookie is present.
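To make the two failure modes concrete, here is a small language-agnostic simulation in Python rather than ASP.NET code. Everything in it is an assumption for illustration: `SessionStore` and `handle_request` are hypothetical names, the store is a plain dictionary, and the 1200-second timeout mirrors ASP.NET's default 20-minute session timeout. It only models how a cookie-based session behaves when a client drops the cookie or comes back too late.

```python
import uuid


class SessionStore:
    """Toy in-memory session store with an idle timeout in seconds (hypothetical)."""

    def __init__(self, timeout=1200):
        self.timeout = timeout
        self._sessions = {}  # session id -> (last_seen, data)
        self.started = 0     # how many times the "session start" logic ran

    def get(self, session_id, now):
        entry = self._sessions.get(session_id)
        if entry is not None and now - entry[0] <= self.timeout:
            data = entry[1]
        else:
            # No cookie, unknown id, or expired session: start a fresh one,
            # which is roughly when Session_Start would run again.
            self._sessions.pop(session_id, None)
            session_id = uuid.uuid4().hex
            data = {"visits": 0}
            self.started += 1
        self._sessions[session_id] = (now, data)
        return session_id, data


def handle_request(store, cookie, now):
    """Serve one request; `cookie` is the session id the client sent, or None."""
    session_id, data = store.get(cookie, now)
    data["visits"] += 1
    return session_id, data["visits"]
```

A browser that sends the cookie back within the timeout keeps its data. A client that sends no cookie gets a brand-new session on every request, and a client that returns after the timeout loses its data even though it still sends the cookie. In both cases the values set at session start are back to their initial state, so pages should not assume anything set later in the session is still there.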
Generally the answer is no. Other crawlers (and there are plenty of them) may behave differently, though.
I should note that I have seen one instance of a Google crawler for AdWords (not the normal Googlebot) that DID present a session cookie.
Very unlikely, I think. It should create a new session every time it crawls your website.
Googlebot actively tries to avoid sessions and does not support cookies. Source: "First date with the Googlebot: Headers and compression" (March 2008).
I imagine most regular search engine bots are similar in this respect. Google is trying to build an index of unique URLs: the URL is the key that identifies a unique page of content. Cookies (and sessions) are not passed along when a user clicks a link in the SERPs. Google is primarily indexing pages, not sites.
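If the URL is the key, one way to stay crawler-friendly is to let a page render from the URL alone and treat session values as optional extras. The Python sketch below is only an illustration under assumptions: `resolve_settings`, the `DEFAULTS` table, and the `lang` / `currency` parameters are all made up for the example and do not come from the question.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical per-page settings and their fallback values.
DEFAULTS = {"lang": "en", "currency": "USD"}


def resolve_settings(url, session=None):
    """Pick page settings with precedence: URL query > session > defaults."""
    settings = dict(DEFAULTS)
    if session:
        # Session values are a convenience for returning visitors only.
        settings.update({k: v for k, v in session.items() if k in DEFAULTS})
    query = parse_qs(urlparse(url).query)
    for key in DEFAULTS:
        if key in query:
            # Anything that changes the content should be visible in the URL.
            settings[key] = query[key][0]
    return settings
```

With this shape, a request that arrives with an empty session (as a crawler's typically will) still gets a complete page, and two different versions of the content always have two different URLs.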