NEventStore 3.0 - Throughput / Performance
I have been experimenting with JOliver's Event Store 3.0 as a potential component in a project and have been trying to measure the throughput of events through the Event Store.
I started with a simple harness that iterated through a for loop, creating a new stream on each pass and committing a very simple event, comprising a GUID id and a string property, to a SQL Server 2008 R2 database. The dispatcher was essentially a no-op.
This approach achieved ~3K operations/second running on an 8-way HP G6 DL380 with the DB on a separate 32-way G7 DL580. The test machines were not resource bound; blocking looks to be the limit in my case.
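For readers who want to reproduce the shape of this measurement, here is a minimal sketch of such a harness. It is not the original .NET code: it assumes an in-memory SQLite database as a stand-in for the remote SQL Server instance, and the `commits` table and `run_benchmark` function are hypothetical names chosen for illustration. The absolute numbers it prints say nothing about NEventStore itself.

```python
import json
import sqlite3
import time
import uuid


def run_benchmark(commit_count: int) -> float:
    """Commit one tiny event to a brand-new stream per iteration; return commits/second."""
    # Assumption: in-memory SQLite stands in for the remote SQL Server instance.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE commits ("
        "stream_id TEXT NOT NULL, "
        "commit_sequence INTEGER NOT NULL, "
        "payload TEXT NOT NULL, "
        "PRIMARY KEY (stream_id, commit_sequence))"
    )
    started = time.perf_counter()
    for i in range(commit_count):
        stream_id = str(uuid.uuid4())  # a new stream on every iteration
        event = {"id": str(uuid.uuid4()), "text": f"event {i}"}
        with conn:  # one blocking transaction per commit
            conn.execute(
                "INSERT INTO commits VALUES (?, ?, ?)",
                (stream_id, 1, json.dumps(event)),
            )
    elapsed = time.perf_counter() - started
    stored = conn.execute("SELECT COUNT(*) FROM commits").fetchone()[0]
    conn.close()
    assert stored == commit_count
    return commit_count / elapsed


if __name__ == "__main__":
    print(f"{run_benchmark(10_000):.0f} commits/second")
```

Because every commit waits for the previous one to finish, a single-threaded loop like this measures round-trip latency more than the capacity of the database.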
Has anyone got any experience of measuring the throughput of the Event Store and what sort of figures have been achieved? I was hoping to get at least 1 order of magnitude more throughput in order to make it a viable option.
Comments (3)
I would agree that blocking IO is going to be the biggest bottleneck. One of the issues that I can see with the benchmark is that you're operating against a single stream. How many aggregate roots do you have in your domain with 3K+ events per second? The primary design of the EventStore is for multithreaded operations against multiple aggregates, which reduces contention and locks for real-world applications.
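To make the "multiple aggregates, multiple threads" point concrete, here is a minimal sketch in Python. The `InMemoryStore` class is a hypothetical stand-in, not the EventStore API; the only idea it illustrates is that writers to different streams hold different locks, so they contend only when they touch the same stream.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class InMemoryStore:
    """Hypothetical store: one lock per stream, so streams never block each other."""

    def __init__(self):
        self._streams = {}
        self._locks = {}
        self._registry_lock = threading.Lock()

    def _slot_for(self, stream_id):
        with self._registry_lock:  # guards creation of per-stream state only
            if stream_id not in self._locks:
                self._locks[stream_id] = threading.Lock()
                self._streams[stream_id] = []
            return self._locks[stream_id], self._streams[stream_id]

    def commit(self, stream_id, event):
        lock, events = self._slot_for(stream_id)
        with lock:  # contention happens only within a single stream
            events.append(event)

    def events(self, stream_id):
        _, events = self._slot_for(stream_id)
        return list(events)


def write_aggregate(store, stream_id, event_count):
    for n in range(event_count):
        store.commit(stream_id, f"{stream_id}-event-{n}")


def run(aggregate_count=8, events_per_aggregate=100):
    store = InMemoryStore()
    # Leaving the with-block waits for every submitted writer to finish.
    with ThreadPoolExecutor(max_workers=aggregate_count) as pool:
        for i in range(aggregate_count):
            pool.submit(write_aggregate, store, f"aggregate-{i}", events_per_aggregate)
    return store
```

Calling `run(8, 100)` writes 100 events to each of 8 streams concurrently. A benchmark shaped like this, rather than one sequential loop, is closer to the workload described above.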
Also, what serialization mechanism are you using? JSON.NET? I don't have a Protocol Buffers implementation (yet), but every benchmark shows that PB is significantly faster in terms of performance. It would be interesting to run a profiler against your application to see where the biggest bottlenecks are.
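As a rough illustration of what "run a profiler" can look like, the sketch below uses Python's built-in `cProfile` on a toy commit path. The `serialize`, `commit`, and `workload` functions are hypothetical, and JSON is used only because it is the serializer asked about above. A .NET application would need a .NET profiler instead.

```python
import cProfile
import io
import json
import pstats
import uuid


def serialize(event):
    return json.dumps(event)


def commit(storage, event):
    # Toy commit path: serialize, then "persist" by appending to a list.
    storage.append(serialize(event))


def workload(count):
    storage = []
    for i in range(count):
        commit(storage, {"id": str(uuid.uuid4()), "text": f"event {i}"})
    return storage


def profile_workload(count):
    """Run the workload under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    workload(count)
    profiler.disable()
    report = io.StringIO()
    pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(10)
    return report.getvalue()


if __name__ == "__main__":
    print(profile_workload(10_000))
```

The cumulative-time column shows how much of each commit is spent in serialization versus everything else, which is the kind of breakdown that tells you whether a faster serializer would matter.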
Another thing I noticed was that you're introducing a network hop into the equation, which increases latency (and blocking time) against any single stream. If you were writing to a local SQL instance backed by solid state drives, I could see the numbers being much higher than with a remote SQL instance running on magnetic drives with the data and log files on the same platter.
Lastly, did your benchmark application use System.Transactions or did it default to no transactions? (The EventStore is safe without use of System.Transactions or any kind of SQL transaction.)
Now, with all of that being said, I have no doubt that there are areas in the EventStore that could be dramatically optimized with a little bit of attention. As a matter of fact, I'm kicking around a few backward-compatible schema revisions for the 3.1 release to reduce the number of writes performed within SQL Server (and RDBMS engines in general) during a single commit operation.
One of the biggest design questions I faced when starting on the 2.x rewrite that serves as the foundation for 3.x is the idea of async, non-blocking IO. We all know that node.js and other non-blocking web servers beat threaded web servers by an order of magnitude. However, the potential for complexity introduced on the caller is increased and is something that must be strongly considered because it is a fundamental shift in the way most programs and libraries operate. If and when we do move to an evented, non-blocking model, it would be more in a 4.x time frame.
Bottom line: publish your benchmarks so that we can see where the bottlenecks are.
Excellent question Matt (+1), and I see Mr Oliver himself replied as the answer (+1)!
I wanted to throw in a slightly different approach that I myself am playing with to help with the 3,000 commits-per-second bottleneck you are seeing.
The CQRS pattern, which most people who use JOliver's EventStore seem to be following, allows for a number of "scale out" sub-patterns. The first thing people usually queue off is the Event commits themselves, which is where you are seeing a bottleneck. "Queue off" means taking them off the actual commit path and inserting them into some write-optimized, non-blocking I/O process, or "queue".
My loose interpretation is:
Command broadcast -> Command Handlers -> Event broadcast -> Event Handlers -> Event Store
There are actually two scale-out points here in these patterns: the Command Handlers and Event Handlers. As noted above, most start with scaling out the Event Handler portions, or the Commits in your case to the EventStore library, because this is usually the biggest bottleneck due to the need to persist it somewhere (e.g. Microsoft SQL Server database).
I myself am testing a few different providers to find the best performance for "queuing up" these commits: CouchDB and .NET's AppFabric Cache (which has a great GetAndLock() feature). [OT]I really like AppFabric's durable-cache features, which let you create redundant cache servers that back up your regions across multiple machines; your cache therefore stays alive as long as at least 1 server is up and running.[/OT]
So, imagine your Event Handlers do not write the commits to the EventStore directly. Instead, you have a handler insert them into a "queue" system, such as Windows Azure Queue, CouchDB, Memcache, AppFabric Cache, etc. The point is to pick a system with little to no blocking to queue up the events, but one that is durable with redundancy built in (Memcache being my least favorite for redundancy options). You must have that redundancy so that, if a server drops, you still have the event queued up.
To finally commit from this "Queued Event", there are several options. I like Windows Azure's Queue pattern for this, because of the many "workers" you can have constantly looking for work in the queue. But it doesn't have to be Windows Azure - I've mimicked Azure's Queue pattern in local code using a "Queue" and "Worker Roles" running in background threads. It scales really nicely.
Say you have 10 workers constantly looking into this "queue" for any User Updated events (I usually write a single worker role per Event type; it makes scaling out easier because you get to monitor the stats of each type). Two events get inserted into the queue, the first two workers instantly pick up a message each, and they insert them (commit them) directly into your EventStore at the same time: multithreading, as Jonathan mentioned in his answer. Your bottleneck with that pattern would be whatever database/eventstore backing you select. Say your EventStore is using MSSQL and the bottleneck is still 3,000 RPS. That is fine, because the system is built to 'catch up' when the rate drops down to, say, 50 RPS after a 20,000 burst. This is the natural pattern CQRS allows for: "Eventual Consistency."
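A minimal sketch of the queue-and-workers idea, in Python rather than .NET. Everything here is an assumption for illustration: the in-process `queue.Queue` is not durable (the answer stresses that a real queue must be), the list `store` stands in for the EventStore, and `drain_burst` and `catch_up_seconds` are hypothetical helper names.

```python
import queue
import threading

_STOP = object()  # sentinel telling a worker to exit


def worker(pending, store, store_lock):
    while True:
        event = pending.get()
        if event is _STOP:
            pending.task_done()
            return
        with store_lock:  # stands in for the blocking commit to the event store
            store.append(event)
        pending.task_done()


def drain_burst(events, worker_count=10):
    """Enqueue a burst of events and let background workers commit them."""
    pending = queue.Queue()
    store, store_lock = [], threading.Lock()
    workers = [
        threading.Thread(target=worker, args=(pending, store, store_lock))
        for _ in range(worker_count)
    ]
    for t in workers:
        t.start()
    for event in events:  # the handler only enqueues; it never waits on the store
        pending.put(event)
    pending.join()  # block until the whole backlog has been committed
    for _ in workers:
        pending.put(_STOP)
    for t in workers:
        t.join()
    return store


def catch_up_seconds(backlog, commit_rate, arrival_rate):
    """Seconds needed to clear a backlog while new events keep arriving."""
    if commit_rate <= arrival_rate:
        raise ValueError("the store can never catch up at these rates")
    return backlog / (commit_rate - arrival_rate)
```

With the figures from the paragraph above, `catch_up_seconds(20_000, 3_000, 50)` is 20,000 / 2,950, or roughly 6.8 seconds: a backlog of 20,000 events is cleared in under seven seconds once arrivals fall back to 50 per second.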
I said there were other scale-out patterns native to CQRS. Another, as I mentioned above, is the Command Handlers (or Command Events). This is one I have done as well, especially if you have a very rich domain as one of my clients does (dozens of processor-intensive validation checks on every Command). In that case, I'll actually queue off the Commands themselves, to be processed in the background by some worker roles. This gives you a nice scale-out pattern as well, because now your entire backend, including the EventStore commits of the Events, can be threaded.
Obviously, the downside to that is that you lose some real-time validation checks. I solve that by usually segmenting validation into two categories when structuring my domain. One is Ajax or real-time "lightweight" validations in the domain (kind of like a Pre-Command check). The others are hard-failure validation checks that are only done in the domain and are not available for real-time checking. You would then need to code for failure in the Domain model. Meaning, always code for a way out if something fails, usually in the form of a notification email back to the user that something went wrong. Because the user is no longer blocked by this queued Command, they need to be notified if the command fails.
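The two validation tiers can be sketched as follows. This is an illustration under assumptions, not code from the answer: `RegisterUser`, `Outbox`, `lightweight_check`, and `handle_in_background` are hypothetical names, and the outbox collects messages in a list instead of sending email.

```python
from dataclasses import dataclass, field


@dataclass
class RegisterUser:
    username: str
    email: str


@dataclass
class Outbox:
    """Hypothetical notifier; collects messages instead of sending email."""
    sent: list = field(default_factory=list)

    def notify(self, email, message):
        self.sent.append((email, message))


def lightweight_check(cmd):
    """Cheap pre-command validation, safe to run in real time from the UI."""
    errors = []
    if not cmd.username.strip():
        errors.append("username is required")
    if "@" not in cmd.email:
        errors.append("email looks invalid")
    return errors


def handle_in_background(cmd, taken_usernames, outbox):
    """Hard-failure validation in the domain; runs after the user has moved on."""
    if cmd.username in taken_usernames:
        # Code for failure: the user is not waiting, so tell them asynchronously.
        outbox.notify(cmd.email, f"Registration failed: '{cmd.username}' is taken")
        return False
    taken_usernames.add(cmd.username)
    return True
```

A command is queued only if `lightweight_check` returns no errors. A worker later calls `handle_in_background`, and a failure there reaches the user through the outbox rather than through the original request.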
And your validation checks that need to go to the 'backend' are going to your Query or "read-only" database, riiiight? Don't go into the EventStore to check for, say, a unique Email address. You'd be doing your validation against the highly available read-only datastore that serves the Queries of your front end. Heck, have a single CouchDB document dedicated to nothing but a list of all email addresses in the system as your Query portion of CQRS.
CQRS is just suggestions... If you really need realtime checking of a heavy validation method, then you can build a Query (read-only) store around that and speed up the validation at the PreCommand stage, before the command gets inserted into the queue. Lots of flexibility. And I would even argue that validating things like empty Usernames and empty Emails is not even a domain concern, but a UI responsibility (off-loading the need to do real-time validation in the domain). I've architected a few projects where I had very rich UI validation on my MVC/MVVM ViewModels. Of course my Domain had very strict validation, to ensure a command is valid before processing. But moving the mundane input-validation checks, or what I call "lightweight" validation, up into the ViewModel layers gives that near-instant feedback to the end user, without reaching into my domain. (There are tricks to keep that in sync with your domain as well.)
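Here is a minimal sketch of a query-side projection used for the unique-email check. The `EmailReadModel` class and the event shapes (`UserRegistered`, `EmailChanged`) are hypothetical; a real read model would live in something like the CouchDB document mentioned above rather than in a Python set.

```python
class EmailReadModel:
    """Hypothetical query-side projection: the set of all registered emails."""

    def __init__(self):
        self._emails = set()

    def apply(self, event):
        # Event handler: keeps the projection in sync with committed events.
        if event["type"] == "UserRegistered":
            self._emails.add(event["email"].lower())
        elif event["type"] == "EmailChanged":
            self._emails.discard(event["old_email"].lower())
            self._emails.add(event["new_email"].lower())

    def is_available(self, email):
        # Pre-command check: reads the projection, never the event store.
        return email.lower() not in self._emails
```

Because the projection is updated from events after they are committed, it can lag slightly behind the event store. The check is therefore a fast first filter, and the domain still has to cope with the rare duplicate that slips through.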
So in summary, possibly look into queuing off those Events before they are committed. This fits nicely with EventStore's multi-threading features as Jonathan mentions in his answer.
We built a small boilerplate for massive concurrency using Erlang/Elixir on top of Eventstore: https://github.com/work-capital/elixir-cqrs-eventsourcing. We still have to optimize db connections, pooling, etc., but the idea of having one process per aggregate with multiple db connections is aligned with your needs.
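The "one process per aggregate" idea can be approximated outside Erlang. The sketch below uses Python's `asyncio`, with one task and one mailbox per aggregate. It is not taken from the linked repository: `AggregateProcess` and `main` are hypothetical names, and the event list stands in for a database connection.

```python
import asyncio


class AggregateProcess:
    """One mailbox and one task per aggregate, loosely mimicking an Erlang process."""

    def __init__(self, aggregate_id):
        self.aggregate_id = aggregate_id
        self.events = []  # stand-in for commits written through a db connection
        self._mailbox = asyncio.Queue()
        self._task = asyncio.create_task(self._run())

    async def _run(self):
        while True:
            command = await self._mailbox.get()
            if command is None:  # shutdown signal
                return
            # Commands for one aggregate are handled strictly in order.
            self.events.append(f"{self.aggregate_id}:{command}")

    async def send(self, command):
        await self._mailbox.put(command)

    async def stop(self):
        await self._mailbox.put(None)
        await self._task


async def main(aggregate_count=3, commands_per_aggregate=5):
    processes = {f"agg-{i}": AggregateProcess(f"agg-{i}") for i in range(aggregate_count)}
    for n in range(commands_per_aggregate):
        for process in processes.values():
            await process.send(f"cmd-{n}")
    for process in processes.values():
        await process.stop()
    return {key: p.events for key, p in processes.items()}


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Each aggregate serializes its own commands, so no locking is needed inside an aggregate, while different aggregates make progress independently.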