Azure WCF host causes communication errors when one instance restarts

Posted on 2024-09-27 12:48:25

I have a rather specific issue with a WCF host in Azure. Please bear with me as I describe the situation.

We have a WCF host running in an Azure worker role using a net.tcp binding. We run two instances of this worker role to provide redundancy. For reasons that are irrelevant to our problem, we force these instances to restart by changing the config settings every hour. Thanks to the upgrade domains, one instance restarts before the other, meaning we always have at least one instance running.

Our client code (also running on Azure, but I don't think it would matter where it was) looks very similar to this (function names changed to exaggerate the point):

public BrowseResults Browse(BrowseParameters parameters)
{
    using (Proxy client = CreateProxyWithBindingsAndEndPoints())
    {
        return client.Browse(parameters);
    }
}

private Proxy CreateProxyWithBindingsAndEndPoints()
{
    var binding = new NetTcpBinding(SecurityMode.Transport);

    binding.Security.Transport.ClientCredentialType = TcpClientCredentialType.Certificate;
    binding.Security.Transport.ProtectionLevel = ProtectionLevel.EncryptAndSign;

    var epAddress = new EndpointAddress(
        new Uri("http://myapp.cloudapp.net:1000/myservice"),
        new DnsEndpointIdentity("my identity"),
        new AddressHeaderCollection());

    var client = new Proxy(binding, epAddress);

    client.ClientCredentials.ClientCertificate.Certificate = GetClientCertificate();

    return client;
}

My expectation from this is that we are creating a new Proxy, with a new channel and a new connection every time we call this Browse function.

Our problem occurs when one of the instances is restarted: we get a System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state. We only get one of these errors for each host that restarts, but it's still an error we want to do without.

My current working hypothesis is that somewhere under the hood the WCF client is holding open a connection to the instance that is no longer there, despite the fact that everything I've read says that it shouldn't be.
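One way this hypothesis could be tested: as far as I understand it, the net.tcp transport keeps a pool of outbound connections that new channels may reuse, and the pool's timeouts are exposed through TcpTransportBindingElement.ConnectionPoolSettings rather than on NetTcpBinding itself. The sketch below wraps the existing binding in a CustomBinding and shortens those timeouts. The helper name and the timeout values are illustrative assumptions, not something we currently run.

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Channels;

public static class BindingTuning
{
    // Hypothetical helper: copies the NetTcpBinding into a CustomBinding so the
    // transport element's connection pool settings can be adjusted.
    public static Binding WithShortLivedPool(NetTcpBinding source)
    {
        var custom = new CustomBinding(source);
        var tcp = custom.Elements.Find<TcpTransportBindingElement>();

        // Illustrative values only: drop idle pooled connections quickly and
        // cap how long any pooled connection may be reused.
        tcp.ConnectionPoolSettings.IdleTimeout = TimeSpan.FromSeconds(30);
        tcp.ConnectionPoolSettings.LeaseTimeout = TimeSpan.FromMinutes(1);

        return custom;
    }
}
```

In CreateProxyWithBindingsAndEndPoints this would mean passing BindingTuning.WithShortLivedPool(binding) to the Proxy constructor, assuming Proxy has a constructor that accepts a Binding rather than specifically a NetTcpBinding.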

Is there anything I can do to avoid this problem other than just catching this particular error and retrying? Are there any patterns for retrying client calls? If I do retry, how can I ensure that this dodgy connection really has been done away with? My attempts at retries so far haven't been very successful.
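For reference, here is a minimal sketch of the kind of retry I have in mind. It replaces the using block with explicit Close and Abort calls, since disposing a proxy calls Close, and Close on a faulted channel throws CommunicationObjectFaultedException. WcfRetry, Call and maxAttempts are hypothetical names introduced for illustration, and the sketch assumes Proxy derives from ClientBase<T> (and therefore implements ICommunicationObject).

```csharp
using System;
using System.ServiceModel;

public static class WcfRetry
{
    // Hypothetical helper, not part of WCF. Creates a fresh client for every
    // attempt and aborts it on failure so a faulted channel is never reused.
    public static TResult Call<TClient, TResult>(
        Func<TClient> createClient,
        Func<TClient, TResult> operation,
        int maxAttempts = 2)
        where TClient : ICommunicationObject
    {
        for (int attempt = 1; ; attempt++)
        {
            TClient client = createClient();
            try
            {
                TResult result = operation(client);
                client.Close();   // graceful close; may itself throw
                return result;
            }
            catch (CommunicationException)
            {
                // Includes CommunicationObjectFaultedException.
                client.Abort();   // tear the channel down without a handshake
                if (attempt >= maxAttempts) throw;
            }
            catch (TimeoutException)
            {
                client.Abort();
                if (attempt >= maxAttempts) throw;
            }
            catch
            {
                client.Abort();   // unexpected failure: clean up, do not retry
                throw;
            }
        }
    }
}
```

Browse would then become a single call: return WcfRetry.Call<Proxy, BrowseResults>(CreateProxyWithBindingsAndEndPoints, c => c.Browse(parameters)); Note that a failure in Close after a successful call also triggers a retry here, so this is only reasonable for operations that are safe to repeat.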
