Sending WCF messages is delayed under load
When sending messages from a self-hosted WCF service to many clients (about 10 or so), messages are sometimes delayed significantly longer than I'd expect (several seconds to reach a client on the local network). Does anyone have an idea why this happens and how to fix it?
Some background: the application is a stock ticker style service. It receives messages from a 3rd party server and re-publishes them to clients that connect to the service. It's very important that messages are published as quickly as possible, and in most cases the time between receiving a message and publishing it to all clients is less than 50ms (it's so quick it approaches the resolution of DateTime.Now).
Over the past few weeks, we've been monitoring occasions when messages are delayed by 2 or 3 seconds. A few days ago, we got a big spike and messages were delayed by 40-60 seconds. As far as I can tell, messages are not being dropped (unless the entire connection is dropped). The delays do not appear to be specific to any one client; they affect all clients (including ones on the local network).
I send messages to the clients by spamming the ThreadPool. As quickly as messages arrive, I call BeginInvoke() once per message per client. The theory is that if any one client is slow to receive a message (because it's on dialup and downloading updates or something), it won't impact other clients. That isn't what I'm observing though; all clients (including ones on the local network) appear to be delayed by a similar duration.
The volume of messages I'm dealing with is 100-400 per second. Messages contain a string, a GUID, a date and, depending on the message type, 10-30 integers. Observed in Wireshark, each is less than 1 kB. We have 10-20 clients connected at any one time.
The WCF server is hosted in a Windows service on a Windows 2003 Web Edition server. I'm using the NetTCP binding with SSL/TLS encryption enabled and custom username/password authentication. The server has a 4 Mbit internet connection, a dual-core CPU and 1 GB of RAM, and is dedicated to this application. The service is set to ConcurrencyMode.Multiple. The service process, even under high load, rarely exceeds 20% CPU usage.
So far, I've tweaked various WCF configuration options such as:
- serviceBehaviors/serviceThrottling/maxConcurrentSessions (currently 102)
- serviceBehaviors/serviceThrottling/maxConcurrentCalls (currently 64)
- bindings/netTcpBinding/binding/maxConnections (currently 100)
- bindings/netTcpBinding/binding/listenBacklog (currently 100)
- bindings/netTcpBinding/binding/sendTimeout (currently 45s, although I've tried it as high as 3 minutes)
It appears to me that the messages are being queued inside WCF once some threshold is reached (which is why I've been increasing the throttling limits). But to affect all clients, one or two slow clients would need to max out all outgoing connections. Does anyone know if this is how the WCF internals behave?
I can also improve efficiency by coalescing incoming messages when I send them to the client. However, I suspect there's an underlying problem and coalescing won't fix it in the long term.
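As a rough illustration of the coalescing idea, the timer handler could drain everything queued since the last tick into one batch and make a single call per client for the whole batch. The sketch below covers only the draining step. The `QueueDrain` module and `DrainAll` function are hypothetical names, and sending the batch would need a batch-accepting operation on the callback contract, which the contract in this question does not have.

```vbnet
Imports System
Imports System.Collections.Generic

Public Module QueueDrain
    ' Removes everything currently in the queue under a single lock and
    ' returns it as one batch, preserving arrival order.
    Public Function DrainAll(Of T)(ByVal queue As Queue(Of T)) As List(Of T)
        Dim batch As New List(Of T)()
        SyncLock queue
            While queue.Count > 0
                batch.Add(queue.Dequeue())
            End While
        End SyncLock
        Return batch
    End Function
End Module
```

Taking the lock once per tick rather than once per item also shortens the time the feed thread can be blocked while enqueuing.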
WCF Config (with company names changed):
<system.serviceModel>
<host>
<baseAddresses>
<add baseAddress="net.tcp://localhost:8100/Publisher"/>
</baseAddresses>
</host>
<endpoint address="ThePublisher"
binding="netTcpBinding"
bindingConfiguration="Tcp"
contract="Company.Product.Server.Publisher.IPublisher" />
</behavior>
Code used to send messages:
Private Sub HandleDataBackground(ByVal sender As Object, ByVal e As Timers.ElapsedEventArgs)
    If Me._FeedDataQueue.Count > 0 Then
        ' Dequeue any items received in last 50ms.
        While True
            Dim dataAndReceivedTime As DataWithReceivedTimeArg
            SyncLock Me._FeedDataQueue
                If Me._FeedDataQueue.Count = 0 Then Exit While
                dataAndReceivedTime = Me._FeedDataQueue.Dequeue()
            End SyncLock
            ' Publish data to all clients.
            Me.SendDataToClients(dataAndReceivedTime)
        End While
    End If
End Sub

Private Sub SendDataToClients(ByVal data As DataWithReceivedTimeArg)
    Dim clientsToReceive As IEnumerable(Of ClientInformation)
    SyncLock Me._ClientInformation
        clientsToReceive = Me._ClientInformation.Values.Where(Function(c) Contract.CollectionContains(c.ContractSubscriptions, data.Data.Contract) AndAlso c.IsUsable).ToList()
    End SyncLock
    For Each clientInfo In clientsToReceive
        Dim futureChangeMethod As New InvokeClientCallbackDelegate(Of DataItem)(AddressOf Me.InvokeClientCallback)
        futureChangeMethod.BeginInvoke(clientInfo, data.Data, AddressOf Me.SendDataToClient)
    Next
End Sub

Private Sub SendDataToClient(ByVal callback As IFusionIndicatorClientCallback, ByVal data As DataItem)
    ' Send
    callback.ReceiveData(data)
End Sub

Private Sub InvokeClientCallback(Of DataT)(ByVal client As ClientInformation, ByVal data As DataT, ByVal method As InvokeClientCallbackMethodDelegate(Of DataT))
    Try
        ' Send
        If client.IsUsable Then
            method(client.CallbackObject, data)
            client.LastContact = DateTime.Now
        Else
            ' Make sure the callback channel has been removed.
            SyncLock Me._ClientInformation
                Me._ClientInformation.Remove(client.SessionId)
            End SyncLock
        End If
    Catch ex As CommunicationException
        ....
    Catch ex As ObjectDisposedException
        ....
    Catch ex As TimeoutException
        ....
    Catch ex As Exception
        ....
    End Try
End Sub
A sample of one of the message types:
<DataContract(), KnownType(GetType(DateTimeOffset)), KnownType(GetType(DataItemDepth)), KnownType(GetType(DataItemDepthDetail)), KnownType(GetType(DataItemHistory))> _
Public MustInherit Class DataItem
    Implements ICloneable

    Protected _Contract As String
    Protected _MessageId As Guid
    Protected _TradeDate As DateTime

    <DataMember()> _
    Public Property Contract() As String
        ...
    End Property

    <DataMember()> _
    Public Property MessageId() As Guid
        ...
    End Property

    <DataMember()> _
    Public Property TradeDate() As DateTime
        ...
    End Property

    Public MustOverride Function Clone() As Object Implements System.ICloneable.Clone
End Class

<DataContract()> _
Public Class DataItemDepth
    Inherits DataItem

    Protected _VolumnPriceDetail As IList(Of DataItemDepthItem)

    <DataMember()> _
    Public Property VolumnPriceDetail() As IList(Of DataItemDepthItem)
        ...
    End Property

    Public Overrides Function Clone() As Object
        ...
    End Function
End Class

<DataContract()> _
Public Class DataItemDepthItem
    Protected _Volume As Int32
    Protected _Price As Int32
    Protected _BidOrAsk As BidOrAsk ' BidOrAsk is an Int32 enum
    Protected _Level As Int32

    <DataMember()> _
    Public Property Volume() As Int32
        ...
    End Property

    <DataMember()> _
    Public Property Price() As Int32
        ...
    End Property

    <DataMember()> _
    Public Property BidOrAsk() As BidOrAsk ' BidOrAsk is an Int32 enum
        ...
    End Property

    <DataMember()> _
    Public Property Level() As Int32
        ...
    End Property
End Class
Comments (1)
After a long support request with Microsoft support, we managed to identify the issue.
Calling WCF channel methods using the delegate BeginInvoke/EndInvoke pattern actually results in synchronous calls, not asynchronous ones.
The correct way to call WCF methods asynchronously is any mechanism other than async delegates, such as the thread pool, raw threads or WCF async callbacks.
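For illustration, the thread pool option might look roughly like the sketch below, which queues each send with `ThreadPool.QueueUserWorkItem` rather than calling `BeginInvoke` on a delegate. Everything named here (`IClientCallback`, `RecordingClient`, `ThreadPoolSender`, `QueueSend`) is hypothetical: the interface stands in for the real WCF callback contract, and the in-memory client exists only so the sketch can run without a WCF host. This is not the code that ended up in production.

```vbnet
Imports System
Imports System.Threading

' Hypothetical stand-in for the WCF callback contract; in the service this
' would be the duplex callback channel.
Public Interface IClientCallback
    Sub ReceiveData(ByVal data As String)
End Interface

' Hypothetical in-memory client so the sketch runs without a WCF host.
Public Class RecordingClient
    Implements IClientCallback

    Public ReadOnly Received As New ManualResetEvent(False)
    Public LastData As String

    Public Sub ReceiveData(ByVal data As String) Implements IClientCallback.ReceiveData
        LastData = data
        Received.Set()
    End Sub
End Class

Public Module ThreadPoolSender
    ' Bundles the arguments for one send, because WaitCallback takes a single Object.
    Private Class SendWorkItem
        Public Callback As IClientCallback
        Public Data As String
    End Class

    ' Queues one send on the thread pool instead of using Delegate.BeginInvoke.
    Public Sub QueueSend(ByVal callback As IClientCallback, ByVal data As String)
        Dim item As New SendWorkItem()
        item.Callback = callback
        item.Data = data
        ThreadPool.QueueUserWorkItem(AddressOf SendOne, item)
    End Sub

    Private Sub SendOne(ByVal state As Object)
        Dim item As SendWorkItem = DirectCast(state, SendWorkItem)
        Try
            item.Callback.ReceiveData(item.Data)
        Catch ex As Exception
            ' Assumption: a real service would log the failure and drop the client here.
        End Try
    End Sub
End Module
```

Note that work items queued this way are not guaranteed to run in order, so a per-client queue may be needed if message ordering matters.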
In the end I used WCF async callbacks (which can be applied to a callback interface, although I couldn't find specific examples of that).
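Since examples of this are hard to find, here is a minimal sketch of what an asynchronous callback contract could look like, using `OperationContract(AsyncPattern:=True)` with a Begin/End method pair. The names `IClientCallbackAsync` and `AsyncCallbackSender` are hypothetical, the `DataItem` class is a cut-down placeholder for the one in the question, and the service contract is assumed to reference the interface through its `CallbackContract` property. Treat it as an outline under those assumptions rather than the exact code used.

```vbnet
Imports System
Imports System.Runtime.Serialization
Imports System.ServiceModel

' Placeholder for the DataItem contract from the question.
<DataContract()> _
Public Class DataItem
    <DataMember()> Public Contract As String
End Class

' Hypothetical asynchronous form of the callback contract. On the wire the
' operation is still named ReceiveData.
Public Interface IClientCallbackAsync
    <OperationContract(AsyncPattern:=True)> _
    Function BeginReceiveData(ByVal data As DataItem, ByVal callback As AsyncCallback, ByVal state As Object) As IAsyncResult

    Sub EndReceiveData(ByVal result As IAsyncResult)
End Interface

Public Module AsyncCallbackSender
    ' Starts the send; completion is reported to OnSendCompleted.
    Public Sub Send(ByVal client As IClientCallbackAsync, ByVal data As DataItem)
        client.BeginReceiveData(data, AddressOf OnSendCompleted, client)
    End Sub

    Private Sub OnSendCompleted(ByVal result As IAsyncResult)
        Dim client As IClientCallbackAsync = DirectCast(result.AsyncState, IClientCallbackAsync)
        Try
            ' EndReceiveData must be called; it surfaces any send failure.
            client.EndReceiveData(result)
        Catch ex As CommunicationException
            ' Assumption: drop or flag the client here.
        Catch ex As TimeoutException
            ' Assumption: drop or flag the client here.
        End Try
    End Sub
End Module
```

On the server, `client` would be the channel returned by `OperationContext.Current.GetCallbackChannel(Of IClientCallbackAsync)()`.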
The following link makes this more explicit:
https://learn.microsoft.com/en-us/archive/blogs/drnick/begininvoke-bugs