Application crashes when communicating with Oracle, unless the executable path contains spaces

Posted on 2024-07-08 15:08:14

We have an x-files problem with our .NET application. Or, rather, hybrid Win32 and .NET application.

When it attempts to communicate with Oracle, it just dies. Vanishes. Goes to the big black void in the sky. No event log message, no exception, no nothing.

If we simply ask the application to talk to a MS SQL Server instead, which has the effect of replacing the usage of OracleConnection and related classes with SqlConnection and related classes, it works as expected.

Today we had a breakthrough.

For some reason, a customer had figured out that by placing all the application files in a directory on his desktop, it worked as expected with Oracle as well. Moving the directory down to the root of the drive, or in C:\Temp or, well, around a bit, made the crash reappear.

Basically, it was 100% reproducible: the application worked if run from a directory on the desktop, and failed if run from a directory in the root.

Today we figured out that the difference that counted was whether there was a space in the directory name or not.

So, these directories would work:

C:\Program Files\AppDir\Executable.exe
C:\Temp Lemp\AppDir\Executable.exe
C:\Documents and Settings\someuser\Desktop\AppDir\Executable.exe

whereas these would not:

C:\CompanyName\AppDir\Executable.exe
C:\Programfiler\AppDir\Executable.exe      <-- Program Files in Norwegian
C:\Temp\AppDir\Executable.exe

I'm hoping someone reading this has seen similar behavior and has an "aha, you need to twiddle the frob on the Oracle glitz driver configuration" or similar.

Anyone?


Followup #1: OK, I've now processed the procmon output, both files captured from the moment I hit the button that attempts to open the window that triggers the cascade failure. The two logs mostly track each other: there are some smallish differences near the top of both files, and then they track a long way down.

However, at the point where one run fails, the other keeps going, and the next few lines of its log output are these:

ReadFile C:\oracle\product\10.2.0\db_1\BIN\orageneric10.dll    SUCCESS    Offset: 274 432, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O
ReadFile C:\oracle\product\10.2.0\db_1\BIN\orageneric10.dll    SUCCESS    Offset: 233 472, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O

After this, the working run continues to execute, while the other touches the mscorwks.dll file a few times before its threads shut down and the app closes. Thus, the failed run never touches the file above.
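The comparison described above can be partly automated. The following is a minimal sketch, not the procedure actually used here: it assumes both runs were exported from Process Monitor as CSV with the column names "Operation", "Path" and "Result" (an assumption about the export format), and the helper names load_events and differences are hypothetical. It aligns the two event sequences and reports only the regions where they differ, so small differences near the top do not hide the later divergence.

```python
import csv
import io
from difflib import SequenceMatcher


def load_events(csv_text):
    """Parse a Process Monitor CSV export into (operation, path, result) tuples.

    The column names used here are an assumption about the export format.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["Operation"], row["Path"], row["Result"]) for row in reader]


def differences(good, bad):
    """Return (tag, good_slice, bad_slice) for every region where the runs differ."""
    matcher = SequenceMatcher(None, good, bad, autojunk=False)
    return [
        (tag, good[i1:i2], bad[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

The last entry returned by differences is usually the interesting one: it shows what the working run did next (here, the reads of orageneric10.dll) alongside what the failing run did instead. If the export starts with a byte-order mark, opening the file with encoding "utf-8-sig" avoids a mangled first column name.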


Followup #2: Figured I'd try to upgrade the Oracle client drivers, but 10.2.0.1 is apparently the highest version available for Windows 2003 server and XP clients.


Followup #3: Well, we've ended up with a black-box solution. Basically, we found that the problem is related to XPO and Oracle. XPO has a system table it manages, called XPObjectType, with three columns: Oid, TypeName and AssemblyName. Due to how Oracle is configured in the databases we talk to, the column names were OID, TYPENAME and ASSEMBLYNAME. This would ordinarily not be a problem, except that XPO talks to the schema information directly and checks whether the table is there with the right column names. XPO doesn't handle case differences, so it sees an XPObjectType table with three unknown columns and none of those it expects.

Exactly what XPO does at that point I don't really know, but if I drop this table and recreate it with the right case, using double quotes around all the column names to get the case right, the problem doesn't crop up.
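To make the case issue concrete, here is a small sketch of the naming rule involved. Oracle folds unquoted identifiers to upper case and preserves the case of double-quoted ones. The functions stored_name and missing_columns are hypothetical helpers that only model a case-sensitive column check of the kind described above; they are not XPO code.

```python
def stored_name(identifier):
    """Name Oracle records for an identifier as it was written in the DDL."""
    if len(identifier) >= 2 and identifier.startswith('"') and identifier.endswith('"'):
        return identifier[1:-1]  # quoted: case is preserved
    return identifier.upper()    # unquoted: folded to upper case


def missing_columns(expected, ddl_identifiers):
    """Expected columns that a case-sensitive lookup would not find."""
    actual = {stored_name(name) for name in ddl_identifiers}
    return [name for name in expected if name not in actual]


EXPECTED = ["Oid", "TypeName", "AssemblyName"]

# Table created with unquoted column names: all three lookups miss.
print(missing_columns(EXPECTED, ["Oid", "TypeName", "AssemblyName"]))

# Table created with double-quoted column names: nothing is missing.
print(missing_columns(EXPECTED, ['"Oid"', '"TypeName"', '"AssemblyName"']))
```

This matches the observed fix: recreating the table with quoted column names makes the stored names identical to the ones the check expects.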

Exactly where the space in the folder name comes into this, I still have no idea, but this problem had two tiers:

  1. Stop the application from crashing at our customers, short-term solution
  2. Fix the bug, long-term solution

Now that tier 1 is solved, tier 2 goes back into the queue to be prioritized. We're facing some bigger changes to our data tier anyway, so this might not be a problem we need to solve, at least if all our Oracle customers verify that the table fix actually gets rid of the problem.

I'll accept the answer by Dave Markle. Although Process Monitor (the big brother of File Monitor) didn't actually pinpoint the problem, I was able to use it to determine that after my breakpoint in user code, where XPO had built up the query for this table, no I/O happened until the entries for the application closing down were logged. That led me to believe that this table was the culprit, or at least influenced the problem somehow.

If I manage to get to the real cause of this, I'll update the post.

Comments (4)

寂寞笑我太脆弱 2024-07-15 15:08:14

Here's what I would do. First, TRIPLE-check that you're seeing the behavior you think you're seeing. I can see this happening the other way around by not using System.IO.Path to concatenate paths, but not like you're seeing it. Triple-check that the file permissions make sense.

Next, download Filemon from MS and watch what's happening on the filesystem as your program hits these troubled spots. You can filter out specific file activity (removing your anti-virus file activity, for example) to make everything look a bit cleaner while you do this. Look for file access errors using FileMon for both the success case and the error case for your program. That should point you to what file's being accessed and causing the problem. For example, if you see a FILE_NOT_FOUND error accessing a nonsense filename, you can be assured that you or the vendor are doing something wrong, possibly leading to your problem...

半岛未凉 2024-07-15 15:08:14

You should probably see if you can reproduce the problem with a simple application that only tries to open a connection to Oracle. That way you can be 100% sure that the problem is with OracleConnection or the Oracle driver and not with your own code.

千紇 2024-07-15 15:08:14

You should get a medal for perseverance for that!

"Exactly what XPO does now I don't
really know, but if I dropped this
table, and recreated it with the right
case, using double quotes around all
the column names to get the case
right, the problem doesn't crop up.

Exactly where the space in the folder
name comes into this, I still have no
idea"

The issue I run into with spaces in names is that programs generally interpret the bit before the space as the name and the rest as a parameter. If that is the case here, then with the plain name it sees "C:\Temp", which is a directory. With the spaced name, it gets "C:\Program Files", looks for "C:\Program", and that doesn't exist. It would fail, for example, to overwrite "C:\Temp" but would succeed in writing "C:\Program".
I wonder whether it would still fail with "C:\Program Files" if there is a file or directory called "C:\Program".
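As an illustration of that hypothesis only, here is a sketch of a parser that ignores quoting and splits at the first space. The function naive_split is hypothetical; nothing in this thread confirms that the Oracle client or XPO parses paths this way.

```python
def naive_split(command_line):
    """Split at the first space, as a parser that ignores quoting would."""
    program, _, arguments = command_line.partition(" ")
    return program, arguments


# A path without spaces survives intact.
print(naive_split(r"C:\Temp\AppDir\Executable.exe"))

# A path with a space is cut at "C:\Program"; the rest looks like an argument.
print(naive_split(r"C:\Program Files\AppDir\Executable.exe"))
```

Under this model the two kinds of path take different code branches, which would be one way for a space to change behavior even when it is not the root cause.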

放低过去 2024-07-15 15:08:14

I'd suspect the Oracle client, to be honest. I had a problem that was similar in its frustrating nature.

If we installed on 64-bit machines, the client would stop at startup when connecting to Oracle, even though the app is 32-bit. We eventually tracked it down to the fact that a certain Oracle client (Ora 10) had a problem with brackets in the path, so a program that worked under Program Files would crash when run under Program Files (x86). Updating the machine to use the 11g client fixed the problem, but there were also some patches available from Metalink, which are not directly available. What's strange in your case is that you get no exception, but moving the application to a new folder fixes the issue in a similar way, so it may be related.

ORA-12154: TNS:could not resolve the connect identifier specified
or
ORA-6413: Connection not open.

Useful links http://blogs.msdn.com/debarchan/archive/2009/02/04/good-old-connectivity-issue.aspx

Details from Metalink below.

Metalink Bug 3807408 Cannot externally authenticate user with quote in username

Description
If an externally authenticated username contains a '(',')' or '='
then the user cannot be authenticated.
Additionally if a program name / path contains these characters it
may not be possible to connect.
eg:
Windows clients installed in a directory "C:\Program Files (x86)"
fail to connect with
ORA-12154: TNS:could not resolve the connect identifier specified

The hallmark of this problem is that the Net trace (level 16) shows
the problem character/s replaced by a "?" in the trace.

Workaround
For the authentication problem:
change username,
or
do not use remote OS authentication for those users

For the program / directory issue:
change the program/directory name
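A quick way to check whether an install path is exposed to the Metalink issue quoted above is to look for the three characters it names. This is a minimal sketch: the function problem_characters is a hypothetical helper, and the character set comes only from the bug description, not from any Oracle tool.

```python
# Characters named in the quoted description of Metalink Bug 3807408.
PROBLEM_CHARACTERS = set("()=")


def problem_characters(path):
    """Return the characters in the path that the bug description lists, sorted."""
    return sorted(set(path) & PROBLEM_CHARACTERS)


print(problem_characters(r"C:\Program Files (x86)\AppDir\Executable.exe"))
print(problem_characters(r"C:\Program Files\AppDir\Executable.exe"))
```

None of the paths listed in the question contain these characters, so this check would not flag them; it is only useful for ruling this particular bug in or out.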
