Continuous performance monitoring of .NET applications in production?

Posted 2024-09-14 05:41:01

Given a relatively typical .NET 4 system in an SOA environment (i.e. Windows Server 2008 R2, RESTful web services on IIS 7, Windows services for NServiceBus messaging, SQL Server 2008 R2, etc.), what are the best practices or de facto solutions (without an enterprise price tag) for 24x7 performance monitoring in production?

I don't necessarily mean how much CPU, memory, or disk IO the system consumes, but rather, for example, how many createAccount() calls were made per minute and the average time the generateResponse() method takes. I also want to detect unusual spikes in the delta between, say, generateResponseStarted (the method was invoked, which in turn may call a 3rd party) and generateResponseComplete (the response is ready to be returned).
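To make the kind of measurement concrete, here is a minimal, language-neutral sketch written in Python. It is an illustration of the idea only, not a .NET implementation: the `CallStats` class, its `timed` wrapper, and the per-minute bucketing are all assumptions made up for this example. It records how many times an operation ran in each minute and how long it took on average.

```python
import threading
import time
from collections import defaultdict
from contextlib import contextmanager


class CallStats:
    """Hypothetical in-process aggregator: call count and total duration
    per operation name, bucketed by wall-clock minute."""

    def __init__(self, clock=time.time, timer=time.perf_counter):
        self._clock = clock    # wall clock, used only to pick the minute bucket
        self._timer = timer    # monotonic timer, used to measure durations
        self._lock = threading.Lock()
        # (operation name, minute number) -> [call count, total seconds]
        self._buckets = defaultdict(lambda: [0, 0.0])

    @contextmanager
    def timed(self, name):
        """Time the wrapped block, i.e. the gap between 'started' and 'complete'."""
        started = self._timer()
        try:
            yield
        finally:
            elapsed = self._timer() - started
            minute = int(self._clock() // 60)
            with self._lock:
                bucket = self._buckets[(name, minute)]
                bucket[0] += 1
                bucket[1] += elapsed

    def calls_per_minute(self, name, minute):
        with self._lock:
            return self._buckets.get((name, minute), (0, 0.0))[0]

    def average_seconds(self, name, minute):
        with self._lock:
            count, total = self._buckets.get((name, minute), (0, 0.0))
        return total / count if count else 0.0


stats = CallStats()
with stats.timed("generateResponse"):
    pass  # the real work, possibly including a 3rd-party call, would go here
```

The per-call cost in this sketch is two timer reads and a dictionary update under a lock. Whether that is acceptable in production, and how the buckets get shipped to a store that keeps history, are the open parts of the question.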

After some googling, the options seem to be low-level profilers (like dotTrace), or implementing Performance Counters and consuming them with PerfMon or some OpManager-type product.

What would you recommend? Would implementing performance counters in a live application significantly degrade performance on a production system? If not, are there any good libraries that streamline the implementation in .NET? If so, how do people monitor their applications' performance beyond memory, disk, and CPU?

@Ryan Hayes

Thanks. I'm looking for a way to spot unusual slowdowns or spikes on production systems. For example, everything was fine during stress testing, but then for some reason a 3rd party we rely on has problems, or the DB slows down due to thread locking, or the SAN gives way, or some other unexpected scenario occurs. Low-level profiling carries too much overhead, while turning counters on only once there is a problem is too late. We would also be missing historical data to compare against (I would need some sort of alert system for when the delta falls outside an acceptable threshold). I'm wondering how people monitor the performance of their production systems and, in their experience, what the best approach is for monitoring that isn't memory/CPU/server related.
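The alerting part could look something like the sketch below, again in Python and purely illustrative. The `SpikeDetector` class, the rolling window, the three-sigma rule, and the `min_delta` floor are all assumptions chosen for the example, not an established recommendation. It compares each new per-minute average against recent history and flags values that fall outside the threshold.

```python
from collections import deque
from statistics import mean, pstdev


class SpikeDetector:
    """Hypothetical alert rule: flag a sample that sits far above recent history."""

    def __init__(self, window=60, threshold_sigmas=3.0, min_samples=10, min_delta=0.05):
        self._history = deque(maxlen=window)  # rolling baseline of normal samples
        self._threshold = threshold_sigmas
        self._min_samples = min_samples
        self._min_delta = min_delta           # floor so a flat history does not alert on jitter

    def observe(self, value):
        """Record one sample (e.g. a per-minute average in seconds).
        Return True if it looks like a spike relative to the baseline."""
        is_spike = False
        if len(self._history) >= self._min_samples:
            baseline = mean(self._history)
            spread = pstdev(self._history)
            limit = baseline + max(self._threshold * spread, self._min_delta)
            is_spike = value > limit
        if not is_spike:
            # Spikes are kept out of the baseline so one incident does not raise the bar.
            self._history.append(value)
        return is_spike


detector = SpikeDetector(window=20, min_samples=5)
for sample in [0.20, 0.22, 0.19, 0.21, 0.20, 0.21, 1.50]:
    if detector.observe(sample):
        print(f"alert: {sample:.2f}s is outside the acceptable threshold")
```

A rule like this only works if the counters are always on, which is why the overhead question matters: the baseline has to exist before the problem starts.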
