Is Java/1.6.0_24 a bot, and how can we force it to refresh its links?
We now have many sites that all use a log4net-based error-logging framework, and we receive the errors from every site, wherever they occur. We've noticed that some of these errors are triggered by bots such as Google, Bing, Yahoo, etc. But there is one thing we're not sure how to resolve. I have two questions about it:
- Is "Java/1.6.0_24" a bot? I ask because that is the user agent involved in question #2.
- "Java/1.6.0_24" keeps requesting subfolders on our site that simply do not exist. For example, if we have a page called "Page1.aspx", instead of requesting "~/Page1.aspx" it requests "~/minisite/Page1.aspx". How can I tell it that it's wrong? Is there a way to do that?
Thank you
Comments (1)
It's most likely a bot, but it could just as well be some kind of Java-based browser that sends that user-agent string. You can't trust the user agent 100%, but it does give you a rough idea of what the connecting entity is. Depending on the kind of bot, it may simply ignore your robots.txt, so I'd implement some handling for these requests somewhere in the application.
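As a rough illustration of that kind of handling, here is a minimal sketch in Python (the same idea carries over to an ASP.NET handler). As far as I know, `Java/<version>` is the default user agent sent by Java's built-in HTTP client classes, so it usually indicates a script or crawler written in Java rather than a person using a browser. The token list, the pattern, and the name `classify_user_agent` are all assumptions made up for this example; adjust them to what actually appears in your logs.

```python
import re

# Hypothetical token list: substrings that well-known crawlers put in their user agent.
KNOWN_CRAWLER_TOKENS = ("googlebot", "bingbot", "slurp")

# Hypothetical pattern: user agents that are just "<http-library>/<version>".
GENERIC_CLIENT_PATTERN = re.compile(
    r"^(java|python-urllib|python-requests|curl|wget|libwww-perl)/",
    re.IGNORECASE,
)


def classify_user_agent(user_agent):
    """Return a best-guess label for a user-agent string.

    The header is client-supplied and can be spoofed, so treat the result
    as a hint for logging or filtering, not as proof.
    """
    if not user_agent:
        return "unknown"
    ua = user_agent.strip()
    lowered = ua.lower()
    if any(token in lowered for token in KNOWN_CRAWLER_TOKENS):
        return "crawler"
    if GENERIC_CLIENT_PATTERN.match(ua):
        return "script"
    return "other"
```

With a label like this you could, for example, log errors caused by "crawler" or "script" requests at a lower severity instead of treating them like errors hit by real visitors.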
Did those folders ever exist? If so, you could use an HTTP permanent redirect (status code 301) to tell the client to stop looking there. That doesn't guarantee it will comply, though.
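The redirect idea can be sketched as a small path-mapping function, again in Python for illustration only. It assumes the stale URLs all sit under a `/minisite/` prefix, as in the question's example, and that the real pages live at the site root; `LEGACY_PREFIX` and `redirect_for` are hypothetical names, and your web framework would be responsible for actually sending the status code and `Location` header.

```python
# Assumption: every stale URL starts with this prefix (taken from the question's example).
LEGACY_PREFIX = "/minisite/"


def redirect_for(path):
    """Return (status_code, new_location) for a legacy path, or None.

    A 301 tells well-behaved clients that the move is permanent, so they
    should update their stored links; badly behaved bots may ignore it.
    """
    if path.lower().startswith(LEGACY_PREFIX):
        # Drop the legacy folder and keep the rest of the path as-is.
        return 301, "/" + path[len(LEGACY_PREFIX):]
    return None
```

For example, `redirect_for("/minisite/Page1.aspx")` maps to a 301 pointing at `/Page1.aspx`, while `redirect_for("/Page1.aspx")` returns `None` and the request is served normally.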