Help needed resolving AppFabric issues
My application uses AppFabric for distributed caching in a production web farm of 5 Windows web servers. The application is a .NET 4 C# web application. We are encountering some problems with AppFabric and have some questions about how it should be set up. The main issue is that if one of the 5 web servers is restarted, the site on the other servers also goes down for a short period, with AppFabric exceptions like the following appearing in our event logs:
- Message: ErrorCode:SubStatus:There is a temporary failure. Please retry later.
- ErrorCode:SubStatus:Region referred to does not exist. Use CreateRegion API to fix the error.
We have a cache provider wrapper class that creates the DataCacheFactory object etc. and acts as the intermediary between the web application and AppFabric. This is a singleton class, so only one instance of the DataCacheFactory object is created, in the Init of the class.
I believe I have found the reason for the second error above. In our code the region was created in Init, i.e. at the very start, but if a node that holds the region in its memory drops out of the cluster, the above error is the result. To resolve this, we should attempt to create the region on every request to AppFabric, but only create it if it does not already exist. Does this sound correct?
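As a rough illustration of that idea, the sketch below recreates the region only when the "region does not exist" error actually occurs, rather than checking on every request. It is written in Python purely as a language-neutral sketch: `RegionMissingError`, `FakeCache`, `create_region`, and `put` are hypothetical stand-ins, not AppFabric API names, and it assumes that creating a region which already exists is harmless.

```python
class RegionMissingError(Exception):
    """Hypothetical stand-in for the 'Region referred to does not exist' error."""


def put_in_region(cache, region, key, value):
    """Write to a region, recreating the region once if the cluster has lost it."""
    try:
        cache.put(key, value, region)
    except RegionMissingError:
        # Assumption: create_region is a no-op if another client created it first.
        cache.create_region(region)
        cache.put(key, value, region)


class FakeCache:
    """Tiny in-memory stand-in so the sketch runs without a cache cluster."""

    def __init__(self):
        self.regions = {}

    def create_region(self, region):
        self.regions.setdefault(region, {})

    def put(self, key, value, region):
        if region not in self.regions:
            raise RegionMissingError(region)
        self.regions[region][key] = value
```

The trade-off is that the normal path costs nothing extra, while a lost region costs one failed call plus one create call.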
Regarding the other error, I believe it may be down to the configuration. This is the cluster config XML file:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
<configSections>
<section name="dataCache" type="Microsoft.ApplicationServer.Caching.DataCacheSection, Microsoft.ApplicationServer.Caching.Core, Version=1.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" />
</configSections>
<dataCache size="Small">
<caches>
<cache consistency="StrongConsistency" name="App1Cache"
secondaries="1">
<policy>
<eviction type="Lru" />
<expiration defaultTTL="10" isExpirable="true" />
</policy>
</cache>
<cache consistency="StrongConsistency" name="App2Cache"
secondaries="1">
<policy>
<eviction type="Lru" />
<expiration defaultTTL="10" isExpirable="true" />
</policy>
</cache>
<cache consistency="StrongConsistency" name="App3Cache"
secondaries="1">
<policy>
<eviction type="Lru" />
<expiration defaultTTL="10" isExpirable="true" />
</policy>
</cache>
<cache consistency="StrongConsistency" name="default">
<policy>
<eviction type="Lru" />
<expiration defaultTTL="10" isExpirable="true" />
</policy>
</cache>
</caches>
<hosts>
<host replicationPort="22236" arbitrationPort="22235" clusterPort="22234"
hostId="724664608" size="1228" leadHost="true" account="SERVER1\user"
cacheHostName="AppFabricCachingService" name="SERVER1"
cachePort="22233" />
<host replicationPort="22236" arbitrationPort="22235" clusterPort="22234"
hostId="598646137" size="1228" leadHost="true" account="SERVER2\user"
cacheHostName="AppFabricCachingService" name="SERVER2"
cachePort="22233" />
<host replicationPort="22236" arbitrationPort="22235" clusterPort="22234"
hostId="358039700" size="1228" leadHost="true" account="SERVER3\user"
cacheHostName="AppFabricCachingService" name="SERVER3"
cachePort="22233" />
<host replicationPort="22236" arbitrationPort="22235" clusterPort="22234"
hostId="929915039" size="1228" leadHost="false" account="SERVER4\user"
cacheHostName="AppFabricCachingService" name="SERVER4"
cachePort="22233" />
<host replicationPort="22236" arbitrationPort="22235" clusterPort="22234"
hostId="1752630351" size="1228" leadHost="false" account="SERVER5\user"
cacheHostName="AppFabricCachingService" name="SERVER5"
cachePort="22233" />
</hosts>
<advancedProperties>
<securityProperties>
<authorization>
<allow users="everyone" />
</authorization>
</securityProperties>
</advancedProperties>
</dataCache>
</configuration>
Note: we have multiple caches set up because we have multiple applications using AppFabric, and we are seeing the same issues with all of them.
And this is the web.config entry in the application on each of the servers:
<dataCacheClient requestTimeout="15000" channelOpenTimeout="3000" maxConnectionsToServer="1">
<localCache isEnabled="true" sync="TimeoutBased" ttlValue="300" objectCount="10000" />
<clientNotification pollInterval="300" maxQueueLength="10000" />
<hosts>
<host name="SERVER1" cachePort="22233" />
<host name="SERVER2" cachePort="22233" />
<host name="SERVER3" cachePort="22233" />
<host name="SERVER4" cachePort="22233" />
<host name="SERVER5" cachePort="22233" />
</hosts>
<transportProperties connectionBufferSize="131072" maxBufferPoolSize="268435456" maxBufferSize="8388608" maxOutputDelay="2" channelInitializationTimeout="60000" receiveTimeout="600000" />
</dataCacheClient>
Does anyone see a problem with the above? As you can see, we have 3 lead hosts and 2 secondaries.
Some questions I have following on from this are:
- I have read about having a local cache - what is the technical benefit of this? i.e. will this give a local copy of the data per node?
- What is the best practice regarding ports? Are the above ports correct or could there be conflicts with the same ports being used?
- The 3 lead hosts and 2 secondaries, is this a recommended split? Does it mean there are 3 copies of the data?
When we are restarting the servers, we attempt to never restart the lead hosts at the same time.
Thanks for any feedback on this!
Comments (1)
We make extensive use of AppFabric caching. You are going to see the temporary failure error fairly often. It's probably best to write yourself a wrapper around AppFabric that automatically retries when this error is thrown. Ideally you would use exponential backoff, but failing that, randomizing the retry period may be enough.
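A minimal sketch of such a retry wrapper, combining exponential backoff with a randomized delay. It is written in Python as a language-neutral illustration rather than against the AppFabric client itself: `TransientCacheError`, `backoff_delays`, `with_retries`, and the default timings are all assumptions made for the example.

```python
import random
import time


class TransientCacheError(Exception):
    """Hypothetical stand-in for the 'temporary failure, retry later' error."""


def backoff_delays(attempts, base=0.05, cap=2.0, rng=random.random):
    """Yield one randomized delay (seconds) per retry.

    The window doubles on each retry up to `cap`; the actual delay is a
    random fraction of that window so clients do not retry in lockstep.
    """
    for attempt in range(attempts):
        window = min(cap, base * (2 ** attempt))
        yield rng() * window


def with_retries(operation, attempts=5, sleep=time.sleep, **backoff):
    """Call operation(); on TransientCacheError wait and try again.

    Makes at most `attempts` calls and re-raises the last error.
    """
    delays = backoff_delays(attempts - 1, **backoff)
    while True:
        try:
            return operation()
        except TransientCacheError:
            delay = next(delays, None)
            if delay is None:
                raise  # out of retries: surface the error to the caller
            sleep(delay)
```

A call site would then look like `with_retries(lambda: cache_get("some-key"))`, where `cache_get` is whatever function performs the real cache call and translates the client's transient error into `TransientCacheError`.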
The cache configuration in the Web.config file is only used to create the cache factory. It will contact one of the hosts and obtain the cluster configuration from that. The only benefit to listing all hosts in your Web.config is so that if a host is down it can contact another host. Even if you only listed a single host, provided that was present your caching would work fine.
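The host-list behaviour described above can be pictured roughly as follows. This is an illustrative Python sketch, not the client's actual bootstrap code: `fetch_cluster_config` and the `connect` callback are hypothetical names, and a failed contact is assumed to surface as `ConnectionError`.

```python
def fetch_cluster_config(hosts, connect):
    """Return the cluster configuration from the first listed host that answers."""
    last_error = None
    for host in hosts:
        try:
            return connect(host)  # one reachable host is enough
        except ConnectionError as error:
            last_error = error  # host is down: fall through to the next one
    raise ConnectionError("no listed cache host could be reached") from last_error
```

With a single host in the list the loop has nothing to fall back to, which is why listing every host only matters when one of them is down.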
Using a local cache is likely to improve performance if you read objects more frequently than you write them. You're going to have to tune the size of that by experimentation.
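To make the read-versus-write trade-off concrete, here is a small Python sketch of a timeout-based local cache sitting in front of a remote lookup. It only mirrors the idea behind the `ttlValue` and `objectCount` settings in the question; `LocalCache`, `fetch`, and the eviction order are assumptions for illustration, not a description of how AppFabric implements its local cache.

```python
import time
from collections import OrderedDict


class LocalCache:
    """In-process cache that serves repeat reads until an entry's timeout passes."""

    def __init__(self, fetch, ttl_seconds=300, max_objects=10000, clock=time.monotonic):
        self._fetch = fetch          # hypothetical function that reads from the cluster
        self._ttl = ttl_seconds
        self._max = max_objects
        self._clock = clock
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._items.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # served locally, no network round trip
        value = self._fetch(key)  # missing or timed out: go to the cluster
        self._items[key] = (now + self._ttl, value)
        self._items.move_to_end(key)
        while len(self._items) > self._max:
            self._items.popitem(last=False)  # drop the oldest stored entry
        return value
```

Repeat reads within the timeout never leave the process, which is where the gain comes from. The cost is that a value written elsewhere can stay stale locally for up to the timeout, so frequently written objects benefit less.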