Very simple enterprise application architecture - making it scalable

Posted on 2024-09-16

I'm using a very simple architecture for one of my intranet enterprise applications.

Client:

  • 1 agent running on each computer sending system config data (one time), reports (every 2 to 5 min) => size of the data flowing from client to server is a few hundred bytes and rarely touches a kB.

Server:

  • 1 web application (front-end to manage clients, view reports)
  • a web service to receive all incoming data (which it simply dumps in a table)
  • a system service to read the dump every few seconds and execute relevant queries - inserts, updates on the actual tables used for reporting (this step could probably be compared to ETL)

With thousands of clients sending data concurrently to the server, the server simply dumps this incoming data in a temporary table (one insert for each client sending data). A system service running in the background keeps flushing this temporary table - in the sense - every 10 seconds, it reads the top 100 rows from the dump table, organizes this data into the relevant tables used for reporting and removes these 100 rows from the dump and so on.
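As a rough illustration of the dump-and-drain cycle described above (a Python/SQLite stand-in for the .NET service and SQL Server; the table and column names are made up for the sketch):

```python
import sqlite3

BATCH = 100  # rows moved per pass, as in the 10-second cycle above

def drain_once(conn):
    """Move up to BATCH rows from the dump table into the reporting table."""
    rows = conn.execute(
        "SELECT rowid, client_id, payload FROM dump ORDER BY rowid LIMIT ?",
        (BATCH,)).fetchall()
    if not rows:
        return 0
    conn.executemany(
        "INSERT INTO reports (client_id, payload) VALUES (?, ?)",
        [(c, p) for _, c, p in rows])
    conn.executemany("DELETE FROM dump WHERE rowid = ?",
                     [(rid,) for rid, _, _ in rows])
    conn.commit()
    return len(rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dump (client_id INTEGER, payload TEXT)")
conn.execute("CREATE TABLE reports (client_id INTEGER, payload TEXT)")
conn.executemany("INSERT INTO dump VALUES (?, ?)",
                 [(i, "report") for i in range(250)])

moved = 0
while (n := drain_once(conn)):
    moved += n
print(moved)  # 250
```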

So far I've run my app in a network of 2,000 computers and it seems to be working well. Now I need to scale it to support a network of 25,000 clients. I'm going to run simulation tests at 25,000 requests per second and check whether the architecture holds up.

The server side is .NET based: an ASP.NET web application for the front-end, a web service to dump the data, a .NET system service to perform the ETL, and SQL Server 2005/2008 as the database server.

Hope to get some constructive criticism and guidance from the Stack Overflow community to improve this architecture. Do you feel it's good enough as it is to work with 25,000 clients on a single server? Which component do you think is most likely to break down as concurrent activity increases? Is it fundamentally flawed? All guidance welcome. Thanks.


5 Answers

柠檬色的秋千 2024-09-23 08:03:23

Evenly distributed, "worst case" you're at 12,500 trans/minute, which is about 209 trans per sec.

What you should probably best do is load balance the front end.

If you had 4 machines, you're down to 52 trans per sec on each machine. Each machine stores its trans data locally and then, in batches, makes bulk inserts into the back-end, final database. This keeps the trans volume low on the main database. The difference between inserting 1 row and 50 rows (depending on row size) is pretty minor. At some point it's "the same", depending on network overhead etc.

So, if we round down to 50 (for easy math), every 5 secs each front-end machine inserts 250 rows into the back-end database. That's not a lot of volume (again depending on the row size).
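The batching arithmetic in this answer can be checked in a few lines (a sketch; the 4-machine split, the round-down to 50, and the 5-second flush interval are the answer's own assumptions):

```python
clients = 25_000
worst_case_interval_s = 120                   # every client reports every 2 minutes
total_tps = clients / worst_case_interval_s   # ~208 transactions/sec overall
front_ends = 4
per_machine_tps = total_tps / front_ends      # ~52 transactions/sec per machine
rounded_tps = 50                              # the answer rounds down "for easy math"
flush_interval_s = 5
rows_per_flush = rounded_tps * flush_interval_s

print(int(per_machine_tps), rows_per_flush)  # 52 250
```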

You mention polling 100 records per pass on the back end. Whatever number you use here, combined with the processing time, needs to keep pace with your total traffic and your desired finish time.

Specifically, it's all right for the backend processing to be slower than the front end insertion rate in the short run, as long as in the long run, your backend catches up. For example, perhaps most of your traffic is from 8am-5pm, but all said and done your backend processing will be caught up by 9pm.

Otherwise, the backend never catches up, you're always behind, and the backlog is just getting larger and larger. So you need to make sure you can handle that properly as well.

If your report queries are expensive, it's best to offload those as well. Have the front-tier machines send raw data to the single middle-tier machine, then have a 3rd machine make large (perhaps daily) bulk exports into a local reporting database for your database queries.

Also, consider failure and availability scenarios (i.e. if you lose one of your load balanced front tier machines, can you still keep up with traffic, etc.). Lots of room for failure here.

Finally, as a rule, updates tend to be cheaper than deletes, so if you can delete on your down time rather than during the mainstream processing, you'll probably find some performance there if you need it.

⒈起吃苦の倖褔 2024-09-23 08:03:23

In the worst case, it means that your system needs to churn through 5,000-13,000 requests per minute. You need to compute the rough throughput of your system at 60-70% utilization (with, say, the current 2,000 clients) - if the web service takes approximately 50 milliseconds per request, that means it can support at most 1,200 requests per minute. A similar calculation can be done for the .NET service. As load increases, throughput is likely to decrease, so the actual number would be lower.
Based on such calculations, you need to decide whether you have to scale out your system. You can run your services on multiple servers and the load will get divided. If the DB server becomes a bottleneck, it can be clustered. The only thing you need to check is whether your .NET service implementation allows parallelism (IMO, the web service would be stateless and should scale without issue) - for example, do you need to insert records in the order they were received, etc.
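The back-of-the-envelope throughput calculation above can be written out as follows (a sketch; the 50 ms latency and 70% utilization figures come from the answer, and a single worker is assumed):

```python
def max_requests_per_minute(latency_ms, workers=1, utilization=1.0):
    """Rough throughput ceiling: how many requests fit in one minute
    given a per-request latency and a target utilization."""
    return int((60_000 / latency_ms) * workers * utilization)

print(max_requests_per_minute(50))                   # 1200, the answer's figure
print(max_requests_per_minute(50, utilization=0.7))  # 840 at 70% utilization
```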

梦太阳 2024-09-23 08:03:23

Run the simulation and see how it holds up. The likely bottlenecks are the network and possibly the disk I/O. In which case I can suggest a couple of things.

1st off, I hope you're using UDP not TCP??

Try making the service listen on multiple NICs. Run multiple instances of the app accessing the table. I don't know what database you're using, but SQLite would be perfect for this type of app... and it has some features that might help with performance without touching the disk too often.

Lots of memory in your server.

Assuming all that is done and if it still doesn't perform then

The next step would be to have a series of intermediary servers, each collecting the results for several thousand clients and then forwarding them over a higher-speed link to the main server for processing. You might even be able to batch-send them to the main server and have the data compressed over that link. Or just SCP them over to it and import the results in a batch.

Anyway, just my thoughts. I'm working on something similar, but my volume of data is going to be maxing out 1-2 Gbit links almost continuously, across a range of different high-end servers... so the intermediary server is what we're doing.

饮湿 2024-09-23 08:03:23

At 25k requests per second you need to scale out (even at 25k per minute - and 25k per second is actually a huge load, for which you'll need many servers). You must have a farm of WWW service servers, each dumping the request into a local storage (a queue). You can't have the WWW farm talk straight into the back end; it will die from contention (lock exclusion due to client requests attempting to insert/update in the same spot in the database). The WWW service just dumps the requests locally, then returns the HTTP response and continues. From the mid-tier WWW servers these requests have to be aggregated and loaded into the central servers. This loading has to be reliable, easily configurable, and quite fast. Don't fall for the trap of "I'll just write a copy utility myself with retry logic" - that road is paved with bodies. A good candidate for the local storage is a SQL Server Express instance, and a good candidate for the aggregation and loading is Service Broker. I know this architecture works because I've done projects that use it; see High Volume Contiguous Real Time Audit and ETL. And I know of projects that use this architecture to scale it (really high; see March Madness on Demand or Real Time Analytics with SQL Server 2008 R2 StreamInsight for how the Silverlight media streaming runtime intelligence is collected - the emphasis of the two links is on different technologies, but since I happen to know that project quite well, I know how they collect the data from the WWW web services into their back end).
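The store-and-forward pattern described here can be sketched roughly as follows. In the answer the local storage is a SQL Server Express instance and the transfer is Service Broker; the in-memory queue below is only an illustrative stand-in for that durable machinery:

```python
from collections import deque

class StoreAndForward:
    """Web tier writes locally and returns immediately; a separate step
    ships the accumulated requests to the central server in batches."""
    def __init__(self, batch_size=500):
        self.local = deque()          # stand-in for durable local storage
        self.batch_size = batch_size

    def handle_request(self, payload):
        self.local.append(payload)    # local dump only; no backend call
        return "202 Accepted"         # respond to the client right away

    def forward_batch(self, backend):
        """Aggregate and load one batch into the central store."""
        batch = [self.local.popleft()
                 for _ in range(min(self.batch_size, len(self.local)))]
        if batch:
            backend.extend(batch)     # one bulk load instead of N inserts
        return len(batch)

node = StoreAndForward(batch_size=3)
central = []
for i in range(7):
    node.handle_request({"client": i})
while node.forward_batch(central):
    pass
print(len(central))  # 7
```

In a real deployment the local store must survive process restarts, which is exactly why the answer recommends a durable queue (SQL Server Express plus Service Broker) rather than memory.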

深白境迁sunset 2024-09-23 08:03:23

By my calcs, in the worst case you have 25,000 inserts every 120 seconds. Every 10 seconds you read 100 rows, which means in 120 seconds you have read 1,200 rows. This means your temp table will keep accumulating data.
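The backlog arithmetic here, written out (the numbers come straight from the question: 25,000 clients reporting every 2 minutes in the worst case, 100 rows read every 10 seconds):

```python
window_s = 120
inserts_per_window = 25_000                 # worst case: every client in 2 minutes
reads_per_window = 100 * (window_s // 10)   # 100 rows every 10 s -> 1200 per window
backlog_growth = inserts_per_window - reads_per_window

print(reads_per_window, backlog_growth)  # 1200 23800
```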

What you would need to do for scaling a system is to think in terms of how you can add components to the system to handle load.

Design the web service to be able to fire off requests to "slaves" responsible for inserting the data into temp tables. The list of temp table names will need to be kept in some common naming service (something as simple as another table of names would also be ok).

Design the system ETL service in a similar fashion: pick up a temp table, read all its rows, do its job, mark the temp table as processed, and go back to sleep.

This way you can add additional processes for the inserts and for the ETL.
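A minimal sketch of the naming-service idea, assuming a fixed set of dump-table names handed out round-robin to the inserting "slaves" (the table names are hypothetical):

```python
import itertools

# Hypothetical stand-in for the "common naming service": a fixed list of
# dump-table names that inserting workers are assigned in rotation.
dump_tables = [f"dump_{i}" for i in range(4)]
assigner = itertools.cycle(dump_tables)

# Each incoming insert request is directed to the next table in rotation,
# spreading write contention across the four tables.
assignments = [next(assigner) for _ in range(6)]
print(assignments)  # the first six assignments cycle through the four tables
```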

Finally, your report repository is going to grow at an alarming rate. Hopefully the data there can be cleaned out every week or month?
