Azure WCF host causes communication errors when one instance restarts

Posted 2024-09-27 12:48:25

I have a rather specific issue with a WCF host in Azure. Please bear with me as I describe the situation.

We have a WCF host running in an Azure worker role and exposed over a net.tcp binding. We run two instances of this worker role for redundancy. For reasons that are irrelevant to our problem, we force these instances to restart by changing the config settings every hour. Thanks to the upgrade domains, one instance restarts before the second, so we always have at least one instance running.

Our client code (also running on Azure, but I don't think it would matter where it was) looks very similar to this (function names changed to exaggerate the point):

public BrowseResults Browse(BrowseParameters parameters)
{
    using (Proxy client = CreateProxyWithBindingsAndEndPoints())
    {
        return client.Browse(parameters);
    }
}

private Proxy CreateProxyWithBindingsAndEndPoints()
{
    var binding = new NetTcpBinding(SecurityMode.Transport);

    binding.Security.Transport.ClientCredentialType = TcpClientCredentialType.Certificate;
    binding.Security.Transport.ProtectionLevel = ProtectionLevel.EncryptAndSign;

    // NetTcpBinding only accepts addresses with the net.tcp scheme.
    var epAddress = new EndpointAddress(
        new Uri("net.tcp://myapp.cloudapp.net:1000/myservice"),
        new DnsEndpointIdentity("my identity"),
        new AddressHeaderCollection());

    var client = new Proxy(binding, epAddress);

    client.ClientCredentials.ClientCertificate.Certificate = GetClientCertificate();

    return client;
}

My expectation from this is that we are creating a new Proxy, with a new channel and a new connection every time we call this Browse function.

The problem occurs when one of the instances restarts: we get a System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state error. We get only one of these errors for each host that restarts, but it's still an error we want to do without.

My current working hypothesis is that somewhere under the hood the WCF client is holding open a connection to the instance that is no longer there, despite the fact that everything I've read says that it shouldn't be.

Is there anything I can do to avoid this problem other than catching this particular error and retrying? Are there any patterns for retrying client calls? If I do retry, how can I ensure that this dodgy connection really has been done away with? My attempts at retries so far haven't been very successful.

Comments (1)

简单爱 2024-10-04 12:48:25

After quite a bit of investigation, the problem appears to be not with the client but with the server. The worker role was starting the WCF host in its Run method. By the time the worker role gets to Run, it has already signalled to the load balancer that it's ready to receive network traffic. Since the host hadn't actually started yet, the instance wasn't really ready.

The solution was to move the code which starts the WCF host to the OnStart method.
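
A minimal sketch of what that move might look like, assuming the classic Azure SDK's RoleEntryPoint base class. The contract IMyService, the implementation MyService, and the endpoint name "MyServiceEndpoint" are hypothetical placeholders, not taken from the original service, and the certificate configuration that the client code implies is left out:

```csharp
using System;
using System.ServiceModel;
using System.Threading;
using Microsoft.WindowsAzure.ServiceRuntime;

// Hypothetical contract and implementation standing in for the real service.
[ServiceContract]
public interface IMyService
{
    [OperationContract]
    string Ping();
}

public class MyService : IMyService
{
    public string Ping() { return "pong"; }
}

public class WorkerRole : RoleEntryPoint
{
    private ServiceHost host;

    public override bool OnStart()
    {
        // Open the host here: the instance is reported as busy, and gets
        // no load-balanced traffic, until OnStart returns.
        // "MyServiceEndpoint" is an assumed name from the service definition.
        var endpoint = RoleEnvironment.CurrentRoleInstance
            .InstanceEndpoints["MyServiceEndpoint"].IPEndpoint;
        var address = new Uri(string.Format("net.tcp://{0}/myservice", endpoint));

        host = new ServiceHost(typeof(MyService));
        // Credential settings are omitted; they must match the client binding.
        host.AddServiceEndpoint(
            typeof(IMyService), new NetTcpBinding(SecurityMode.Transport), address);
        host.Open();

        return base.OnStart();
    }

    public override void Run()
    {
        // Nothing left to start; just keep the role instance alive.
        Thread.Sleep(Timeout.Infinite);
    }

    public override void OnStop()
    {
        if (host != null)
        {
            // Close gracefully, falling back to Abort if the host has faulted.
            try { host.Close(); }
            catch (CommunicationException) { host.Abort(); }
            catch (TimeoutException) { host.Abort(); }
        }
        base.OnStop();
    }
}
```

With the host opened in OnStart, a failure to open also fails the role start instead of leaving an instance in rotation with nothing listening.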

We also created some pretty nice WCF client retry code, which we no longer seem to need.
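
For anyone who does still need retries, here is one possible shape for such a helper. It is a sketch only, not the code mentioned above: the WcfRetry class, its parameters, and the linear back-off are all assumptions. It avoids the using block because disposing a faulted proxy calls Close, which itself throws CommunicationObjectFaultedException; the helper calls Abort on failure instead and builds a fresh proxy for each attempt.

```csharp
using System;
using System.ServiceModel;
using System.Threading;

// Hypothetical helper; only safe for operations that can be repeated
// without side effects (for example, read-only calls).
public static class WcfRetry
{
    public static TResult Call<TClient, TResult>(
        Func<TClient> createClient,
        Func<TClient, TResult> operation,
        int maxAttempts = 3,
        int delayMilliseconds = 200)
        where TClient : class, ICommunicationObject
    {
        if (maxAttempts < 1)
            throw new ArgumentOutOfRangeException("maxAttempts");

        for (int attempt = 1; ; attempt++)
        {
            // A new proxy per attempt, so a faulted channel is never reused.
            TClient client = createClient();
            try
            {
                TResult result = operation(client);
                client.Close();
                return result;
            }
            catch (Exception ex)
                when (ex is CommunicationException || ex is TimeoutException)
            {
                // Abort tears the channel down without throwing.
                client.Abort();
                if (attempt >= maxAttempts) throw;
                Thread.Sleep(delayMilliseconds * attempt); // linear back-off
            }
            catch
            {
                // Unexpected failure: release the channel and do not retry.
                client.Abort();
                throw;
            }
        }
    }
}
```

With the client code from the question, the body of Browse would become something like:

```csharp
return WcfRetry.Call(
    () => CreateProxyWithBindingsAndEndPoints(),
    client => client.Browse(parameters));
```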
