Play framework: Impact of Jobs on the stateless model
One of the great things about the Play framework is that it is fully stateless and purely request/response-oriented. This is really nice, since it allows me to deploy my app to the cloud and scale the number of Play instances behind my load balancer without having to worry about state (session) replication.
Recently, however, I needed to execute some application logic outside of an HTTP request and found out that Play lets you define Jobs, which are fully managed by the framework. That sounds brilliant, but it raises the question: how do these jobs fit into the stateless model that Play uses?
Say I have a maintenance task that needs to run every hour and I define a scheduled job for that. If I then deploy multiple Play instances behind a load balancer, will that job be started at the same time on each instance? And if so, what would be a good approach to handle jobs that need to run "exclusively"?
I was thinking of creating a new Play instance on a non-clustered server, re-using the JPA model of the existing (clustered) instance and thus connecting to the same database. This new instance would contain only the maintenance jobs, and since it is hosted on a non-clustered server, there is no risk of a job running on several instances at once. At the same time, this would let me keep my existing clustered instance completely stateless and easy to host and load balance. Would this be a good approach?
Comments (3)
I would recommend clustering the job too. You could set a semaphore in the database to ensure that only one instance of the job is running.
Another idea is to have a look at the Akka framework, which will be included in Play 2.0. I think it has a built-in mechanism to handle this problem, but I'm not sure. I have no experience with Akka.
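The database semaphore idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the names JobLockStore, tryAcquire and release are hypothetical, and an in-memory map stands in for the lock table so the example runs on its own. In a real deployment the atomic step would be a conditional UPDATE or a unique-key INSERT against the shared database. The lease expiry is an added assumption so that a crashed instance does not hold the lock forever.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a lock table: one entry per job name. */
class JobLockStore {
    // jobName -> moment at which the current lease expires
    private final Map<String, Instant> leases = new ConcurrentHashMap<>();

    /** Takes the lock if it is free or its lease has expired; returns true on success. */
    boolean tryAcquire(String jobName, Instant now, Duration lease) {
        AtomicBoolean acquired = new AtomicBoolean(false);
        // compute() is atomic per key, playing the role of a conditional UPDATE in SQL
        leases.compute(jobName, (name, expiry) -> {
            if (expiry == null || !expiry.isAfter(now)) {
                acquired.set(true);
                return now.plus(lease);
            }
            return expiry; // someone else still holds a valid lease
        });
        return acquired.get();
    }

    /** Frees the lock once the job has finished. */
    void release(String jobName) {
        leases.remove(jobName);
    }
}

class JobLockDemo {
    public static void main(String[] args) {
        JobLockStore store = new JobLockStore();
        Instant now = Instant.parse("2024-01-01T00:00:00Z");
        Duration lease = Duration.ofMinutes(10);

        // Two instances fire the same scheduled job almost simultaneously
        System.out.println(store.tryAcquire("maintenance", now, lease));                // true
        System.out.println(store.tryAcquire("maintenance", now.plusSeconds(1), lease)); // false
    }
}
```

Each instance would call tryAcquire at the start of the job, skip the run when it returns false, and call release in a finally block when the work is done.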
As neils mentioned, keeping a flag in the DB helps to find out whether the job is already running. I use a DB semaphore together with other flags that give me the job status and extra info.
Another thing you could do is use Play.id to work out which instance should be running the jobs.
We use "play start --%prod", "play start --%prod1"... to start the apps and the following in my doJob() method:
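The snippet from this answer is not preserved here. What follows is a minimal sketch of the idea and not the author's code: the class JobInstanceGuard, the method shouldRunJobs and the choice of "prod" as the designated ID are all assumptions. In a Play 1.x job, the first argument would be the static field Play.id, and the check would sit at the top of doJob().

```java
/** Hypothetical guard that lets only one framework ID run scheduled jobs. */
class JobInstanceGuard {

    /** Returns true only on the instance whose framework ID matches the designated runner. */
    static boolean shouldRunJobs(String frameworkId, String jobRunnerId) {
        return frameworkId != null && frameworkId.equals(jobRunnerId);
    }

    public static void main(String[] args) {
        String jobRunnerId = "prod"; // assumption: the single ID allowed to run jobs

        // IDs as they would be set by "play start --%prod", "play start --%prod1", ...
        for (String id : new String[] {"prod", "prod1", "prod2"}) {
            System.out.println(id + " runs jobs: " + shouldRunJobs(id, jobRunnerId));
        }
    }
}
```

One trade-off of this approach is that the jobs stop running whenever the designated instance is down, so it is often combined with the database flag described above.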
Having had a quick look at the source code of the Play Framework (classes Job and JobsPlugin), I think these are not suitable for a clustered environment when it is important that the job runs only once per time interval, at least not without introducing ugly hacks. I see three possible solutions:
Use a job scheduler which supports clustering. The obvious choice is Quartz. Play also uses parts of Quartz (to parse the CRON expressions), but not the part which does the scheduling.
When using Play 2, possibly go for Akka, which offers a scheduler.
Change your job so that it doesn't matter if it runs twice (possible for some use cases).
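The third option, a job that is safe to run twice, could look something like the sketch below. It assumes a hypothetical cleanup task that deletes expired tokens, with a plain map standing in for the database table. Because the job removes everything that has expired as of a given moment, a second run at the same moment finds nothing left to do. A job that, say, increments a counter on every run would not have this property.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical maintenance job whose effect is the same whether it runs once or twice. */
class ExpiredTokenCleanup {

    /** Removes every token whose expiry is at or before now; returns how many were removed. */
    static int purgeExpired(Map<String, Instant> tokens, Instant now) {
        int before = tokens.size();
        tokens.values().removeIf(expiry -> !expiry.isAfter(now));
        return before - tokens.size();
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-01-01T12:00:00Z");
        Map<String, Instant> tokens = new HashMap<>();
        tokens.put("a", now.minusSeconds(60)); // already expired
        tokens.put("b", now.plusSeconds(60));  // still valid

        System.out.println(purgeExpired(tokens, now)); // 1 (first instance does the work)
        System.out.println(purgeExpired(tokens, now)); // 0 (second instance finds nothing)
    }
}
```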