Sending a notice from an Azure Worker Role to an Azure Web Role - best practice
Situation
Users can upload documents. For each upload, a message containing the document's ID is placed on a queue. The Worker Role picks up the message, fetches the document, and parses it completely with Lucene. Once parsing is complete, the Lucene IndexSearcher on the Web Role should be updated.
On the Web Role I keep a static Lucene IndexSearcher, because otherwise a new IndexSearcher would have to be created for every search request, which adds a lot of overhead.
What I want to do is send a notice from the Worker Role to the Web Role telling it to update its IndexSearcher.
Possible Solutions
- Make some sort of notice queue. The Web Role starts an endless task that keeps checking the notice queue. If it finds a message, it updates the IndexSearcher.
- Start a WCF service on the Worker Role and connect to it from the Web Role. Do a callback from the Worker Role to tell the Web Role, through the service, that it needs to update its IndexSearcher.
- Just update it at a regular interval.
What would be the best solution, or is there another solution for this?
Many thanks!
Comments (2)
If your worker roles write each finished job's details to a table using a PK of something like (DateTime.MaxValue - DateTime.UtcNow).Ticks.ToString("d19"), you will have a sorted list of the latest jobs that have been processed. Set your web role to poll that table.
For worker roles that do the indexing work, this is great because they can write indiscriminately to the table without worrying about conflicts. For you, it also provides an audit log of the jobs they are processing (assuming you put some details in there).
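To make the key trick above concrete, here is a minimal sketch of why that expression yields a newest-first ordering. The class and method names (JobKeys, ForCompletionTime) are made up for illustration and are not part of any Azure API; the only thing taken from the answer is the key expression itself.

```csharp
using System;

public static class JobKeys
{
    // Hypothetical helper: turns a completion time into a fixed-width key.
    // Later times give smaller tick differences, so their keys sort first
    // under ordinal string comparison. "d19" zero-pads to 19 digits, which
    // keeps string order consistent with numeric order.
    public static string ForCompletionTime(DateTime completedUtc)
    {
        return (DateTime.MaxValue - completedUtc).Ticks.ToString("d19");
    }
}
```

Calling JobKeys.ForCompletionTime(DateTime.UtcNow) in the worker role when a job finishes would give the key for that job's row. Two jobs finishing on the same tick would produce the same key, so a real table would probably also need a distinguishing row key.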
However, you have one remaining problem: it sounds like you have one web role that updates the index. That web role can of course poll this table at whatever frequency you choose (just track the LastIndexTime for searching later). Your issue is how to control concurrency if you have more than one web role instance. Does each instance maintain its own index, or do you have one index stored somewhere for all of them? Sorry if that should be obvious; I am not an expert in Lucene.
Anyhow, if you have multiple instances in your WebRole and a single index that all of them can see, you need to prevent multiple instances from updating the index over and over. You can do this by leasing the index (if it is stored in blob storage).
Update based on comment:
If each WebRole instance has its own index, then you don't have to worry about leasing; that only matters when instances share a blob resource. So this technique should work fine as-is. Your only potential obstacle is that the polling intervals of the web role instances could be slightly out of sync, so different instances may return somewhat different results until all of them have updated (depending on which instance you hit). Poll the table every 30 seconds and that will be your maximum time out of sync. Each web role instance simply needs to track the last time it updated and do incremental searches from that point.
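The "track the last update and search incrementally" idea can be sketched as follows. This is only an illustration under assumptions: CompletedJobsTable is an in-memory stand-in for the real table (not an Azure storage class), IndexRefreshPoller is a hypothetical name, and neither class is thread-safe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the jobs table; rows are kept in key order.
public sealed class CompletedJobsTable
{
    private readonly SortedDictionary<string, string> rows =
        new SortedDictionary<string, string>(StringComparer.Ordinal);

    public void Add(DateTime completedUtc, string documentId)
    {
        // Same inverted-tick key as in the answer: newest rows sort first.
        string key = (DateTime.MaxValue - completedUtc).Ticks.ToString("d19");
        rows[key] = documentId;
    }

    // Rows newer than lastSeenKey come before it in key order.
    public List<KeyValuePair<string, string>> NewerThan(string lastSeenKey)
    {
        return rows
            .TakeWhile(r => lastSeenKey == null ||
                            string.CompareOrdinal(r.Key, lastSeenKey) < 0)
            .ToList();
    }
}

// One poller per web role instance; it remembers where it left off.
public sealed class IndexRefreshPoller
{
    private readonly CompletedJobsTable table;
    private string lastSeenKey; // null until the first poll

    public IndexRefreshPoller(CompletedJobsTable table)
    {
        this.table = table;
    }

    // Returns the document IDs finished since the previous poll, newest first.
    public List<string> Poll()
    {
        var fresh = table.NewerThan(lastSeenKey);
        if (fresh.Count > 0)
        {
            lastSeenKey = fresh[0].Key; // smallest key = most recent job
        }
        return fresh.Select(r => r.Value).ToList();
    }
}
```

In a web role instance, Poll() could be called from a timer every 30 seconds, reopening the static IndexSearcher only when the returned list is non-empty.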
Depending on upload frequency, you may find that queue messages cause unneeded updates. For instance, if you get a dozen uploads and process them in quick succession, you'd have a dozen queue messages, each telling your web role to update. It would make more sense to keep a single signal (maybe a table row or a SQL Azure row). You could simply set a row value to 1, signaling the need to update. When your web role detects this change, it resets the value to 0 and starts the update. Note: if you use an Azure Table row, you'd need to poll for updates (and depending on traffic, you could start accumulating a large number of transactions). You could use the AppFabric Cache for this signal as well.
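The single-signal idea can be illustrated with an in-process flag. This is a sketch of the logic only: RefreshSignal is a hypothetical class, and in the actual deployment the flag would live in a table row, a SQL Azure row, or the cache, since the worker and web roles are separate processes.

```csharp
using System.Threading;

// A single "index is stale" signal: any number of uploads collapse into one update.
public sealed class RefreshSignal
{
    private int dirty; // 0 = up to date, 1 = update needed

    // Called once per processed upload; repeated calls have no extra effect.
    public void MarkDirty()
    {
        Interlocked.Exchange(ref dirty, 1);
    }

    // Atomically resets the flag and reports whether an update was pending.
    public bool TryConsume()
    {
        return Interlocked.Exchange(ref dirty, 0) == 1;
    }
}
```

A dozen MarkDirty() calls followed by one TryConsume() trigger a single update. Reading and resetting in one atomic step matters here: with a shared row, an equivalent conditional update would be needed so that a signal set between the read and the reset is not lost.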
You could also use a WCF service on an internal endpoint on your Web Role. However, you would still have the burst issue: if you get, say, a dozen uploads while the web role is updating, you don't want to then do another dozen updates.