Googlebot causes .NET System.Web.HttpException

Posted on 2024-11-19 07:12:15

I have an ASP.NET website mixed with classic ASP (we are working on a conversion to .NET). I recently upgraded from .NET 1.1 to .NET 4.0 and switched to the integrated pipeline in IIS 7.

Since these changes, ELMAH has been reporting errors from classic ASP pages with practically no detail (and status code 404):

System.Web.HttpException (0x80004005)
   at System.Web.CachedPathData.ValidatePath(String physicalPath)
   at System.Web.HttpApplication.PipelineStepManager.ValidateHelper(HttpContext context)

But when I request the page myself, no error occurs. All these errors showing up in ELMAH are caused by the Googlebot crawler (user agent string).

Why does .NET pick up errors for classic ASP pages? Is this related to the integrated pipeline?

Any ideas why the error only happens when Google crawls the page or how I can get more details to find the underlying fault?

风渺 2024-11-26 07:12:16

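It looks like the Google crawler is following links that no longer exist, i.e. some documents on your site may still reference other documents that have since been deleted.

It doesn't look serious to me, so you might consider filtering out this exception.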
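If you do go the filtering route, ELMAH's configuration-based error filtering is one option. The fragment below is a minimal sketch, not a drop-in config: it assumes your ELMAH version ships `Elmah.ErrorFilterModule`, that the other ELMAH modules are already registered, and that you are happy to drop every 404 `HttpException` from the log (not only the ones from Googlebot). The module name `ErrorFilter` is arbitrary.

```xml
<configuration>
  <configSections>
    <sectionGroup name="elmah">
      <!-- Declares the <errorFilter> section used below -->
      <section name="errorFilter" requirePermission="false"
               type="Elmah.ErrorFilterSectionHandler, Elmah" />
    </sectionGroup>
  </configSections>

  <system.webServer>
    <modules>
      <!-- Integrated pipeline: modules are registered here, not in system.web/httpModules -->
      <add name="ErrorFilter" type="Elmah.ErrorFilterModule, Elmah" />
    </modules>
  </system.webServer>

  <elmah>
    <errorFilter>
      <test>
        <!-- Dismiss errors that are HttpExceptions with status code 404 -->
        <and>
          <equal binding="HttpStatusCode" value="404" type="Int32" />
          <is-type binding="BaseException" type="System.Web.HttpException" />
        </and>
      </test>
    </errorFilter>
  </elmah>
</configuration>
```

Keep in mind that this hides genuine broken links as well, so it may be worth checking the IIS logs for the failing URLs before turning it on.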

嘦怹 2024-11-26 07:12:16

This only applies if you are using Angular, but you'll see this error if you have

<httpRuntime relaxedUrlToFileSystemMapping="false" /> (as mentioned in the other answers)

and you use src instead of ng-src on an image or script tag, i.e.

<img src="{{SomeModelValue}}" />

should be

<img ng-src="{{SomeModelValue}}" />

This could also affect A tags where you are using href instead of ng-href.

如歌彻婉言 2024-11-26 07:12:15

Add this to your web.config file:

<httpRuntime relaxedUrlToFileSystemMapping="true" />

This disables the default check that makes sure requested URLs conform to Windows path rules.
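For context, here is a minimal sketch of where the element sits in web.config. It assumes a .NET 4.0 application; the surrounding elements are only there to show nesting, and everything else in your file stays as it is.

```xml
<configuration>
  <system.web>
    <!-- If you already have an <httpRuntime> element, add the attribute to it
         rather than adding a second element. -->
    <httpRuntime relaxedUrlToFileSystemMapping="true" />
  </system.web>
</configuration>
```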

To reproduce the problem, add %20 (URL-escaped space) to the end of the URL, e.g. http://example.org/%20. It's fairly common to see this problem from search crawlers when they encounter mis-typed links with spaces, e.g. <a href="http://example.org/ ">example</a>.

The HttpContext.Request.Url property seems to trim the trailing space, which is why logging tools like ELMAH don't reveal the actual problem.
