Kinesis vs SQS: which is the best fit for this particular case?

Posted 2025-01-20 22:24:12

I have been reading about the differences between Kinesis and SQS and when to use each, but I'm struggling to work out which is the appropriate solution for this particular problem:

  • Strava-like app where users record their runs
  • 50 incoming runs per second
  • The processing of each run takes exactly 1 minute
  • I want the user to have their results in less than 5 minutes
  • A run is just a GUID; the job that processes it will get all the info from S3

If I understand correctly, in Kinesis you can have 1 worker per shard, correct? That would mean 1 run per minute per shard. Since I have 3000 incoming runs per minute, meeting the 5-minute deadline would mean I would need 600 shards with 1 worker each.

Is this assumption correct?
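As a rough check of the arithmetic in the question, here is a back-of-the-envelope sketch. It assumes a steady arrival rate of 50 runs per second and that each worker handles one run at a time; the function names are made up for illustration. Under those assumptions, keeping up with the load needs about 3000 concurrent workers, not 600. The 5-minute deadline only adds slack for queueing during bursts; it does not lower the steady-state worker count.

```python
import math

ARRIVALS_PER_SECOND = 50     # from the question
SERVICE_TIME_SECONDS = 60    # each run takes 1 minute to process


def workers_for_steady_state(arrival_rate: float, service_time: float) -> int:
    """Little's law: jobs in service = arrival rate * service time."""
    return math.ceil(arrival_rate * service_time)


def backlog_growth_per_minute(workers: int, arrival_rate: float,
                              service_time: float) -> float:
    """How many unprocessed runs pile up each minute with a fixed worker pool."""
    completed_per_minute = workers * 60 / service_time
    arrived_per_minute = arrival_rate * 60
    return arrived_per_minute - completed_per_minute


needed = workers_for_steady_state(ARRIVALS_PER_SECOND, SERVICE_TIME_SECONDS)
growth = backlog_growth_per_minute(600, ARRIVALS_PER_SECOND, SERVICE_TIME_SECONDS)
print(f"workers needed to keep up: {needed}")
print(f"backlog growth with 600 workers: {growth:.0f} runs per minute")
```

With 600 single-threaded workers the backlog would keep growing, so the deadline would be missed after a few minutes regardless of which service carries the messages.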

With SQS I can just have 1 queue and as many workers as I like, up to SQS's limit of 120,000 in-flight messages.

  • If 1 run errors during processing, I want to reprocess it a few more times and then store it for further inspection.

  • I don't need to process messages in order, and duplicates are totally fine.
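The "one queue, many workers" pattern with retries could look roughly like the sketch below. It assumes `sqs` is a `boto3.client("sqs")` (or anything with the same `receive_message` and `delete_message` methods); `poll_once`, `process_run`, and the 120-second visibility timeout are illustrative choices, not something stated in the question. A message is deleted only after successful processing, so a failed run becomes visible again once the visibility timeout expires and another worker can retry it.

```python
def poll_once(sqs, queue_url, process_run, visibility_timeout=120):
    """Fetch at most one run, process it, and delete it only on success.

    Returns True if a message was processed and deleted, False otherwise.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=1,      # each run takes about a minute, so take one at a time
        WaitTimeSeconds=20,         # long polling to avoid busy-looping on an empty queue
        VisibilityTimeout=visibility_timeout,  # should exceed the processing time
    )
    messages = response.get("Messages", [])
    if not messages:
        return False

    message = messages[0]
    try:
        process_run(message["Body"])  # the body is assumed to be the run's GUID
    except Exception:
        # Do not delete: the message reappears after the visibility timeout
        # and its receive count goes up, which is what drives retries.
        return False

    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
    return True
```

Each worker process would call `poll_once` in a loop; since duplicates are acceptable here, the at-least-once delivery of a standard queue is not a problem.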

Comments (1)

海螺姑娘 2025-01-27 22:24:13

"1 worker per message; after it's processed I no longer care about the message"

In that case, a queuing service such as SQS should be used. Kinesis is a streaming service, which persists data. This means that multiple workers can read the same messages from a stream for as long as they are valid, that is, until the stream's retention period expires. None of your workers would be able to remove a message from the stream.

Also, with SQS you can set up dead-letter queues, which would allow you to capture messages that fail to process after a predefined number of attempts.
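A dead-letter queue is attached to the main queue through its `RedrivePolicy` attribute. The sketch below only builds that attribute; the helper name `redrive_attributes`, the ARN, and the retry count of 3 are placeholders chosen for illustration.

```python
import json


def redrive_attributes(dead_letter_queue_arn: str, max_receive_count: int = 3) -> dict:
    """Build the queue attributes that route repeatedly failing messages to a DLQ."""
    return {
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dead_letter_queue_arn,
            # After this many receives without a delete, SQS moves the message to the DLQ.
            "maxReceiveCount": str(max_receive_count),
        })
    }


# Hypothetical ARN, for illustration only.
attrs = redrive_attributes("arn:aws:sqs:us-east-1:123456789012:runs-dlq")
print(attrs["RedrivePolicy"])
```

Assuming a boto3 SQS client, the result could be applied with `sqs.set_queue_attributes(QueueUrl=main_queue_url, Attributes=attrs)`, where `main_queue_url` is the URL of the queue the workers read from.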
