Monitoring dashboard for IIS and SQL Server
We have developed a .NET web application that uses SQL Server as a backend. Now we would like to provide a monitoring dashboard app for the tech support team. The idea is that this monitoring app will show a global picture of the "health" of the web servers hosting the application and the database servers holding the data. This "health" measure should reflect the workload of each machine, and would be a number (between 0 and 100, let's say) computed from some inputs that I need to determine.
For the web servers, I imagine that HTTP requests per time unit must be considered, and perhaps bandwidth consumed.
For the database servers, I reckon that transactions per time unit and maybe locks or some other indicator of database concurrency should be used.
In addition, some other generic inputs, such as CPU load, memory usage and disk queue length, should also be taken into account.
All these factors should be weighted as necessary to obtain the final "health" figure for each server.
Edit: The idea is that the "health" measure gives the technician a global view of a server's workload. If a server appears with low "health", the technician will be able to drill down and look at the details of the machine to see what specific inputs are causing the low "health".
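To make the idea concrete, here is a minimal sketch of such a weighted score with a drill-down ranking. Everything in it is an assumption for illustration: the `Input` type, the `worst` thresholds, the weights and the sample readings are hypothetical, and a real dashboard would need to tune them per server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Input:
    name: str
    value: float   # latest reading of the metric
    worst: float   # reading treated as "fully loaded" (hypothetical threshold)
    weight: float  # relative importance of this input


def load_fraction(inp: Input) -> float:
    """Scale a reading to the range 0..1, clamping values beyond the threshold."""
    return min(max(inp.value / inp.worst, 0.0), 1.0)


def health(inputs: list[Input]) -> float:
    """Weighted health figure: 100 means idle, 0 means every input is saturated."""
    total_weight = sum(i.weight for i in inputs)
    if total_weight <= 0:
        raise ValueError("at least one input needs a positive weight")
    load = sum(i.weight * load_fraction(i) for i in inputs) / total_weight
    return 100.0 * (1.0 - load)


def worst_offenders(inputs: list[Input], top: int = 3) -> list[str]:
    """Drill-down: names of the inputs that pull the score down the most."""
    ranked = sorted(inputs, key=lambda i: i.weight * load_fraction(i), reverse=True)
    return [i.name for i in ranked[:top]]


# Hypothetical readings for one server.
readings = [
    Input("cpu_percent", value=50.0, worst=100.0, weight=3.0),
    Input("memory_percent", value=0.0, worst=100.0, weight=1.0),
    Input("disk_queue_length", value=4.0, worst=2.0, weight=1.0),
]
score = health(readings)
culprits = worst_offenders(readings, top=2)
```

Keeping the per-input contributions around is what makes the drill-down possible: the single figure is only a summary of them.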
My questions are:
- Do you think this "health" measure makes sense?
- I am thinking of using performance counters to capture the input data. Is this the best option?
- Can you suggest appropriate inputs for the web servers (IIS 7) and the database servers (SQL Server 2008)?
Thanks.
Comments (3)
No. The first thing someone will ask if your single number is off is "what's wrong?" Also, consider the fact that trend analysis can be very important for early error detection.
I think that would be an excellent starting point.
This is a big subject for a forum post, and the answer depends heavily on the details of your app. In broad terms, you want to look at things like the frequency of error conditions, some measure of throughput for each subsystem, counts of how often out-of-process calls exceed performance thresholds, etc. It's usually a good idea to show current numbers as well as history and trends.
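As a rough illustration of "current numbers plus trends", the sketch below compares the most recent samples of a metric against the earlier history and counts how often a call exceeded a threshold. The function names, window size and sample values are all hypothetical; this is one simple way to do it, not a recommendation of a specific method.

```python
from statistics import mean


def relative_trend(samples: list[float], recent: int = 5) -> float:
    """Relative change of the mean of the last `recent` samples vs. the earlier ones.

    0.0 means no change, 1.0 means the recent mean is double the baseline.
    """
    if len(samples) <= recent:
        raise ValueError("need more samples than the recent window")
    baseline = mean(samples[:-recent])
    latest = mean(samples[-recent:])
    if baseline == 0:
        return 0.0 if latest == 0 else float("inf")
    return (latest - baseline) / baseline


def count_over_threshold(durations_ms: list[float], threshold_ms: float) -> int:
    """How many calls took longer than the performance threshold."""
    return sum(1 for d in durations_ms if d > threshold_ms)


# Hypothetical history: requests per second, then durations of out-of-process calls.
requests_per_sec = [10.0] * 10 + [20.0] * 5
growth = relative_trend(requests_per_sec, recent=5)
slow_calls = count_over_threshold([120.0, 80.0, 450.0, 30.0, 501.0], threshold_ms=400.0)
```

A steadily positive trend on an otherwise "healthy" server is the kind of early warning a single current figure would hide.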
You might want to have a look at Microsoft's product in this area, System Center Operations Manager (SCOM), to see the types of things they do.
First of all, I think you are designing a different dashboard from the one you are describing. Tech support wants to know whether machines are up or down and what to do when there is a problem.
Requests and transactions per second are useful for capacity planning and/or system and application tuning, not for tech support.
Also, I believe a single figure makes no sense and helps nobody, because what would 87.75% mean?
So, I believe you want a dashboard for sysadmins and app developers, where this type of measurement makes sense, to tune the OS or know when to add a new machine or which query is bogging down SQL Server.
That said, performance counters already store much of the information you want to present, so that does make sense. Additionally, you can use SQL Server traces to measure performance data about the queries. The traces should not be run constantly, but at defined intervals.
Now, if you really wanted a dashboard for tech support, two types of monitors would be enough: server up/down and application responsive/unresponsive.
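A minimal sketch of those two monitors, using only the Python standard library, could look like the following. The state names, the `slow_after` threshold and the URL are assumptions for illustration, and a single HTTP probe cannot by itself tell a dead machine from a stopped web server, so "down" here only means that no HTTP response came back.

```python
import time
import urllib.error
import urllib.request
from typing import Optional


def classify(status: Optional[int], elapsed: float, slow_after: float = 2.0) -> str:
    """Map a probe result to one of three hypothetical dashboard states."""
    if status is None:
        return "down"  # no HTTP response at all
    if status >= 500 or elapsed > slow_after:
        return "unresponsive"  # answered, but with a server error or too slowly
    return "responsive"


def probe(url: str, timeout: float = 5.0) -> str:
    """Request `url` once and classify the outcome."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code  # the server answered with an error status
    except (urllib.error.URLError, OSError):
        status = None  # connection refused, DNS failure, timeout, ...
    return classify(status, time.monotonic() - start)


# Usage (hypothetical URL): probe("http://webserver01/health")
```

Separating `classify` from `probe` keeps the decision rule testable without touching the network.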
SQL Server 2008 comes with performance collection and a data warehouse out of the box; see SQL Server 2008 Data Collections and the Management Data Warehouse. SQL Server 2005 also has a similar Performance Dashboard. I'm not saying you should necessarily use these as your dashboard (although you could), but you should look at these two SQL dashboards to see what the MS team considered important to put in a dashboard.