Good design for a transparent SMTP proxy server in C#
If you were to design a transparent SMTP proxy in C# (.NET 4) to meet the following initial requirements:
- Scales well
- Logs all traffic to a database
- Can be extended easily, say for virus scanning attachments
Considering these factors, broadly speaking, how would your design look? Would you create concrete Listener, Sender and Logger classes, or something more abstract? And would you use callbacks, threads or processes, and why?
Comments (2)
This is a non-trivial application. Some ideas that should help:
SMTP Scalability
In general, scaling a network application means being able to scale out (as in more machines) rather than up (a bigger, more expensive machine). This means having multiple servers that can handle SMTP requests. Note that this will likely need support at the network level (routers that can distribute messages to an 'SMTP farm').
Yes, to make an SMTP server scale and perform, you'll likely want to utilize multiple threads (likely from some sort of thread pool). Note that a multithreaded sockets implementation is not trivial.
In terms of processes, I think one multithreaded process (likely a Windows Service) per SMTP server is a good way to go.
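As a rough illustration of the callback-plus-thread-pool approach, here is a minimal sketch of an accept loop built on `TcpListener.BeginAcceptTcpClient`. The class name `ProxyListener` and the `handleSession` delegate are hypothetical and not part of the original answer; the actual SMTP relaying would live inside that delegate. Blocking one pool thread per connection is the simplest option, not necessarily the one that scales best.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

// Hypothetical listener: accepts connections via callbacks and hands each
// session to the thread pool. The session logic is supplied by the caller.
public class ProxyListener
{
    private readonly TcpListener _listener;
    private readonly Action<TcpClient> _handleSession;
    private volatile bool _running;

    public ProxyListener(IPAddress address, int port, Action<TcpClient> handleSession)
    {
        _listener = new TcpListener(address, port);
        _handleSession = handleSession;
    }

    // Actual port in use (useful when the listener was created with port 0).
    public int Port
    {
        get { return ((IPEndPoint)_listener.LocalEndpoint).Port; }
    }

    public void Start()
    {
        _running = true;
        _listener.Start();
        AcceptNext();
    }

    public void Stop()
    {
        _running = false;
        _listener.Stop();
    }

    private void AcceptNext()
    {
        if (!_running) return;
        try { _listener.BeginAcceptTcpClient(OnAccept, null); }
        catch (InvalidOperationException) { } // listener stopped concurrently
        catch (SocketException) { }           // accept could not be started
    }

    private void OnAccept(IAsyncResult ar)
    {
        TcpClient client = null;
        try { client = _listener.EndAcceptTcpClient(ar); }
        catch (InvalidOperationException) { } // listener stopped or disposed
        catch (SocketException) { }           // this accept failed

        // Keep accepting before doing any per-session work.
        AcceptNext();
        if (client == null) return;

        ThreadPool.QueueUserWorkItem(delegate
        {
            try { _handleSession(client); }
            catch (Exception) { /* a real service would log this */ }
            finally { client.Close(); }
        });
    }
}
```

Hosting this inside a Windows Service would amount to calling `Start` and `Stop` from the service's start and stop handlers.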
Database Scalability
Keep in mind that the database can be a scalability bottleneck as well. To design for large loads, you would want to be able to horizontally scale your data tier too. That means being able to write to more than one db server. That in turn means reporting from a set of database servers (which is much more complicated than reporting from one).
SMTP Reliability
Is this a concern / requirement? If so, this is another reason for supporting a farm of servers (well, if we have multiple servers for reliability we might call it a cluster) instead of just one. Note that each server in the farm would need a way of letting the cluster know that it has failed (through some sort of heartbeat mechanism, perhaps).
Database Reliability
To make the database reliable, you would have to do some clustering as well. This is neither cheap nor trivial (but has been done a number of times with a number of database platforms).
Queuing
One way to handle surges in server load is to queue messages. This way, the server can keep passing messages through, but you're not waiting for the chain of extensible modules to finish their processing. Note that this adds another layer of complexity and a point of failure to the system.
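To make the idea concrete, here is a minimal in-process sketch using `BlockingCollection<T>` from .NET 4. The class name `MessageQueue`, the string payload and the `process` delegate are assumptions made for illustration. An in-memory queue like this one loses its backlog if the process crashes, so it only shows the decoupling, not durable queuing.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical bounded queue between the relay path and slower processing.
public class MessageQueue : IDisposable
{
    private readonly BlockingCollection<string> _queue;
    private readonly Task _worker;

    public MessageQueue(Action<string> process, int capacity = 1000)
    {
        _queue = new BlockingCollection<string>(capacity);

        // A single long-running consumer; more could be started if the
        // processing steps are safe to run in parallel.
        _worker = Task.Factory.StartNew(() =>
        {
            foreach (string message in _queue.GetConsumingEnumerable())
            {
                try { process(message); }
                catch (Exception) { /* a real system would log or retry */ }
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Returns false when the queue is full, so the caller can decide how to
    // push back (for example with a temporary SMTP failure reply).
    public bool TryEnqueue(string rawMessage)
    {
        return _queue.TryAdd(rawMessage);
    }

    // Stops accepting new messages and waits for the backlog to drain.
    public void Dispose()
    {
        _queue.CompleteAdding();
        _worker.Wait();
        _queue.Dispose();
    }
}
```

The relay path would call `TryEnqueue` and return immediately, while logging or scanning happens on the consumer side.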
Extensibility
One way to approach adding functionality such as database logging and attachment scanning is to add a chain of "MessageInspectors" or "MessageHandlers". You would probably want to allow these to be configured in a particular order (e.g. virus scan before logging so you don't log infected items).
Another aspect to consider is which plug-ins can block a message from passing through (such as a virus scanner) and which plug-ins execute after the message has passed (such as logging).
In terms of adding the plug-in support, you could use something like MEF (Managed Extensibility Framework).
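The two kinds of plug-in described above could be sketched as two interfaces and an ordered pipeline. All type names here (`MailMessageContext`, `IMessageInspector`, `IMessageHandler`, `MessagePipeline`, `MarkerScanner`, `ListLogger`) are hypothetical, and the "scanner" is only a substring check standing in for a real virus scanner. Plug-in discovery through MEF is left out; the lists are simply passed in.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message type; a real proxy would carry the envelope and data.
public class MailMessageContext
{
    public string Sender { get; set; }
    public string Body { get; set; }
}

// Runs before relaying; returning false blocks the message.
public interface IMessageInspector
{
    bool Inspect(MailMessageContext message);
}

// Runs after the relay decision has been made; cannot block.
public interface IMessageHandler
{
    void Handle(MailMessageContext message, bool relayed);
}

public class MessagePipeline
{
    private readonly IList<IMessageInspector> _inspectors;
    private readonly IList<IMessageHandler> _handlers;

    public MessagePipeline(IList<IMessageInspector> inspectors,
                           IList<IMessageHandler> handlers)
    {
        _inspectors = inspectors;
        _handlers = handlers;
    }

    // Runs inspectors in configured order, then handlers.
    // Returns true if the message may be relayed.
    public bool Process(MailMessageContext message)
    {
        bool relay = true;
        foreach (IMessageInspector inspector in _inspectors)
        {
            if (!inspector.Inspect(message)) { relay = false; break; }
        }
        foreach (IMessageHandler handler in _handlers)
        {
            handler.Handle(message, relay);
        }
        return relay;
    }
}

// Stand-in for a virus scanner: blocks bodies containing a marker string.
public class MarkerScanner : IMessageInspector
{
    private readonly string _marker;
    public MarkerScanner(string marker) { _marker = marker; }

    public bool Inspect(MailMessageContext message)
    {
        return message.Body == null || !message.Body.Contains(_marker);
    }
}

// Stand-in for database logging: records senders of relayed messages only.
public class ListLogger : IMessageHandler
{
    public readonly List<string> Entries = new List<string>();

    public void Handle(MailMessageContext message, bool relayed)
    {
        if (relayed) Entries.Add(message.Sender);
    }
}
```

Because handlers receive the relay decision, a logging handler can choose whether blocked messages are recorded at all.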
Reinventing the Wheel
Putting all of this functionality into place would take a considerable amount of development time. It might be cheaper / faster / easier to just purchase a solution off the shelf that does all of this for you (this problem has already been solved a number of times).
It sounds to me like you want to design something that is future proof, but without immediately doing the full monty of data partitioning, clustering, etc.
Partitioning
It is advisable to consider ways to partition your data load in advance. Initially you can apply partitioning logic that simply routes everything to the same destination, but this will allow you to divide the load easily when the need arises, and to verify in advance that the partitioning works.
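A minimal sketch of such a routing seam, assuming the partition key is a string such as the recipient domain and the destinations are connection strings. The class name `LogPartitioner` and the choice of an FNV-1a style hash are illustrative assumptions. With a single destination configured, everything goes to the same place.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical router: maps a message key to one of the configured
// destinations (for example, database connection strings).
public class LogPartitioner
{
    private readonly IList<string> _destinations;

    public LogPartitioner(IList<string> destinations)
    {
        if (destinations == null || destinations.Count == 0)
            throw new ArgumentException("At least one destination is required.");
        _destinations = destinations;
    }

    public string Route(string key)
    {
        // FNV-1a style hash over the characters, so the mapping does not
        // depend on string.GetHashCode, which can differ between runtimes.
        uint hash = 2166136261u;
        foreach (char c in key.ToLowerInvariant())
        {
            unchecked { hash = (hash ^ (uint)c) * 16777619u; }
        }
        return _destinations[(int)(hash % (uint)_destinations.Count)];
    }
}
```

Note that plain modulo hashing moves most keys when a destination is added, so historical data would need a migration or a lookup table.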
Queuing
I would highly recommend a queuing solution, as it allows you to separate the workload of receiving messages from the actual sending of them. Queuing is also great for reliable messaging, in that you will want to guarantee once-only delivery. Look at some of the questions comparing MSMQ with Service Broker, as they serve different audiences and both of them have various caveats.
SMTP
Most email servers allow you to deliver more than one email over a single connection (and I don't just mean the same mail with multiple recipients). This can dramatically boost the number of emails you can push to the remote mail server, but it depends on how that server has been configured. If you didn't configure the remote servers or don't know the allowed value, I'd recommend a probing strategy: start by trying to deliver two mails and log the result for the remote server. Try double that next time if it works, and reduce to half if it fails. It's sort of like how the TCP window grows while you transmit reliably.
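The probing strategy can be sketched as a small per-host counter. The class name `BatchSizeProbe`, the upper bound of 1024 and the floor of one mail per connection are assumptions added for the example; the starting value of two and the double/halve rule come from the description above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: tracks how many mails to attempt per connection for
// each remote host, doubling after a success and halving after a failure.
public class BatchSizeProbe
{
    private readonly Dictionary<string, int> _sizes = new Dictionary<string, int>();
    private readonly object _lock = new object();
    private readonly int _initial;
    private readonly int _max;

    public BatchSizeProbe(int initial = 2, int max = 1024)
    {
        _initial = initial;
        _max = max;
    }

    // Number of mails to try on the next connection to this host.
    public int GetBatchSize(string host)
    {
        lock (_lock)
        {
            int size;
            return _sizes.TryGetValue(host, out size) ? size : _initial;
        }
    }

    // Records the outcome of the last attempt and adjusts the batch size.
    public void Report(string host, bool succeeded)
    {
        lock (_lock)
        {
            int size;
            if (!_sizes.TryGetValue(host, out size)) size = _initial;
            size = succeeded ? Math.Min(size * 2, _max) : Math.Max(size / 2, 1);
            _sizes[host] = size;
        }
    }
}
```

A sender would call `GetBatchSize` before opening a connection and `Report` once it knows whether the remote server accepted the whole batch.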
Extensibility
I wouldn't give this much thought. Achieving scalability and reliability is MUCH harder and hence much more important to get right. Extensibility is simply the ability to hook in additional steps along the way, and you can add the seams for doing that when the core system is in place, or when you start adding functionality that you feel should be optional yet built-in.