"A timeout was reached while waiting for the service to connect" error after reboot

Posted 2024-08-16 16:38:22

I have a custom-written Windows service that I run on a number of Hyper-V VMs. The VMs get rebooted a couple of times an hour as part of some automated tests. The service is set to start automatically, and almost all of the time it starts up fine.

However, maybe 5% of the time, with no pattern that I can discern, the service fails to start. When it fails, I get this error in Event Viewer:

A timeout was reached (30000 milliseconds) while waiting for the My Service Name service to connect.

When this occurs, I can start the service manually, or restart again, and the service will start fine.

What I can't figure out is that the 30-second timeout doesn't appear to occur in my code. The very first line of my service class's OnStart() method logs "Starting..." to its log4net log. When the service fails to start, nothing gets logged at all, which suggests to me that either log4net can't log for some reason, or the timeout occurs before my OnStart() is called.

The service runs on a variety of OSes, from XP all the way up to Windows 7 and Server 2008 R2. I know that setting the service to delayed start may solve this on Vista and later, but that seems like a hack.

I haven't been able to remote debug this, because it happens intermittently and during system startup, and I'm at a loss for other ways to figure out what's going on. Any ideas?

Comments (6)

未央 2024-08-23 16:38:22

My guess - and that's all it is - is that the disk is thrashing hard during startup, to the point where the .NET Framework itself isn't starting in the 30 seconds that Windows allocates for services to start.

A kludgy workaround may be to set the service to start manually, then write a very small stub service in unmanaged code (e.g. C++, Delphi) to start the service.

Another approach may be to start the service remotely from another machine. The sc command should do the job nicely.
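For the remote-start idea, here is a minimal Python sketch of what a controlling machine could run. It is only an illustration under stated assumptions: the function names, the retry count, and the delay are made up for this example, and it assumes `sc` exits with code 0 when the start request is accepted. Note that a successful `sc start` only means the request went through; the service may still be starting.

```python
import subprocess
import time
from typing import Callable, List, Sequence


def build_sc_start_command(host: str, service_name: str) -> List[str]:
    # sc takes the remote machine in UNC form (two leading backslashes)
    # ahead of the verb; strip any backslashes the caller already supplied.
    return ["sc", "\\\\" + host.lstrip("\\"), "start", service_name]


def run_command(argv: Sequence[str]) -> int:
    # Only meaningful on Windows, where sc.exe is on the PATH.
    return subprocess.run(list(argv), capture_output=True).returncode


def start_remote_service(
    host: str,
    service_name: str,
    attempts: int = 3,            # hypothetical default
    delay_seconds: float = 10.0,  # hypothetical default
    runner: Callable[[Sequence[str]], int] = run_command,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Ask the remote machine to start the service, retrying on failure."""
    command = build_sc_start_command(host, service_name)
    for attempt in range(1, attempts + 1):
        # Assumption: exit code 0 means the start request was accepted.
        if runner(command) == 0:
            return True
        if attempt < attempts:
            sleep(delay_seconds)
    return False
```

Calling `start_remote_service("TESTVM01", "My Service Name")` (the host name is a placeholder) would issue `sc \\TESTVM01 start "My Service Name"` up to three times. The `runner` and `sleep` parameters exist only so the retry logic can be exercised without a Windows machine.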

無處可尋 2024-08-23 16:38:22

I was seeing this error in the Event Viewer when trying to install a service with PowerShell.

The problem was that the "Service Name" and "Service Display Name" values in my PowerShell script did not match the ones I had specified in the Program.cs file of my console application.

谎言月老 2024-08-23 16:38:22

For what it's worth, I discovered that I received this message (almost immediately upon service startup) because version 4.5 of the .NET Framework was not installed on the target machine. I rolled the service back to version 4.0 (which was already installed on the target machine) and it worked as expected.

紙鸢 2024-08-23 16:38:22

I think I may have also found another factor contributing to this kind of "does not start on reboot" error.

It appears that if the Windows Event Log is set to "Overwrite events older than 7 days" with a maximum size of 512 KB, and a lot of activity occurs within that window, the log is effectively full, because it cannot overwrite the events generated inside that timeframe. If you set the event log to a much larger size, or to "Overwrite events as needed", you won't experience this issue.

哭了丶谁疼 2024-08-23 16:38:22

In my case, the same error occurred because the .NET installation on the server was not working correctly.

To figure this out:

I made a small console app with the same logic as the service, wrapped the whole thing in a try-catch, and dumped everything it caught to the console.

I'm not sure why the information didn't bubble up from the service itself, but this way we saw valuable messages about the Framework errors that we would never have seen otherwise.

放肆 2024-08-23 16:38:22

We are having the same problem on Windows Server 2016.

A fix that seems to be working is changing the account the service runs under from the Local Service account to the local Administrator (not sure what the cause is).
