扩展数据库中的临时记录以计算总数以阻止过度订阅
我正在研究一个系统,人们可以在其中回复免费的网络会议,这些网络会议可能会变得非常忙碌,有很多兴趣,而且席位有限,所以必须有某种方法来确保它不会过度 -已订阅...因此,我们需要从他们预订座位到完成并确认订单的时间进行倒计时。这样,实际上点击“参加”的前 X 个人就获得了会议席位。如果用户没有完成预订,他们分配的座位将被放回池中供其他人抢用...
发生的情况概述(我如何看待这个最佳工作方式)
1)用户在事件上单击“RSVP”,这使得 发送到 /rsvp/{event_id}/ 的 AJAX 请求
并存储唯一的 id(在本问题的其余部分中称为令牌)以及时间戳。它也将该令牌存储在会话中。
然后通知用户他们有 X 时间(假设 5 分钟)来完成其余详细信息(姓名/电子邮件/出生日期等)
但是:在将任何内容放入数据库之前,它会检查待处理的数量是否订单少于可用座位总数,如果不是(会议没有剩余座位),则返回“抱歉,没有可用座位了,请继续检查,因为预订尚未完成,有更多可用座位”
2)如果用户填写此内容及时,它将他们的详细信息存储在数据库中作为“参加”......
但是,如果他们未能及时填写表格,则每秒都会运行一个 cron 作业,并检查并删除任何时间戳超过的令牌5 分钟前,他们将失去参加会议的机会,会议上的这个“席位”随后会被放回池中。 (他们会收到通知,他们不成功并返回到第一步)
所有这些都很容易编写,只需将记录放入数据库中,执行 COUNT(*) FROMending_bookings WHEREconference_id = {x}
计算出有多少席位已确认或待定订单,然后从会议可用席位总数中减去。
但我不认为使用 MySQL 在这方面具有很大的可扩展性 - 这些已经(而且我确信还会有)超过 20 万人试图抢占大约 200 个席位,为每个席位执行 COUNT(*)这些人会变得相当昂贵,而且我们无法做任何像样的缓存,因为它需要实时检查有多少人在这个过程中。
我已经研究过使用 Amazon SimpleDB 来实现此目的,只是因为它具有部署即用的可扩展性,但是我以前用过它,发现 COUNT() 不一定准确(我猜它是可扩展性的本质)-出于明显的原因,COUNT 需要 100% 准确,我需要能够添加记录和时间戳对它和能够删除超过五分钟的记录。
I'm looking at a system where people RSVP to a free web conference, these web conferences can get really busy with a lot of interest and there are limited places, so there has to be some way to ensure it doesn't get over-subscribed... So, we need to have a countdown from the time they book their seat to the time that their order is completed and confirmed. This way it's the first X amount of people to actually click 'Attending' that get the seats for the conference. If the user does not complete their booking their allocated seat is put back in the pool for someone else to grab...
An overview of what happens (how I see this best working)
1) User clicks "RSVP" on event, this makes an AJAX request to /rsvp/{event_id}/
This goes and stores a unique id (refered to as token in the rest of this question) along with a timestamp. It stores that token in a session too.
The user is then notified they have X amount of time (let's say 5 mins) to complete the rest of their details (name/email/d.o.b, etc)
However: before anything is put in the database, it checks if the amount of pending orders is less than the total seats available, if not (there are no seats left for the conference) then it returns saying "sorry, no seats available any more, keep checking as bookings aren't completed and more seats become available"
2) If the user fills this out in time, it stores their details in database as "attending"....
If however they fail to fill the form out in time, there is a cron job running every second and go through and delete any tokens that have a timestamp of over 5 minutes ago so they will loose their chance to attend, this 'seat' on the conference is then put back in to the pool. (They'll be notified they were unsuccessful and taken back to the first step)
All easy enough to write with simply putting records in a database, doing a COUNT(*) FROM pending_bookings WHERE conference_id = {x}
and working out how many seats are either confirmed or orders pending, then subtracting from the total seats available for the conference.
But I don't feel that using MySQL would be very scalable on this - these have (and I'm sure will have again) in excess of 200k people trying to grab around 200 seats, doing a COUNT(*) for each one of those people will get pretty expensive, and we can't do any decent caching as it needs to be checking in real time how many people are in the process.
I've looked into using Amazon SimpleDB for this, just for it's deploy-and-go scalability, but I've used it before and seen that COUNT() isn't necessarily accurate (nature of it's scalability I guess) - for obvious reasons that COUNT needs to be 100% accurate, I need to be able to add records along with a timestamp to it and be able to delete records from it older than five minutes.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(8)
在我看来,所有 20 万人都可以得到一个“令牌”来开始,但只有 200 人可以完成令牌。
所以我想到了两件事;
1) 为什么不跟踪活动本身的“剩余座位”,这样您就不会进行计数,并且不会发生超额预订,因为更新活动所需的锁定将防止其低于零。
2) 在个人订阅期间的任何时候,如果剩余席位降至零,所有剩余代币将变得无效,并且用户被“踢”出注册过程(对此表示友好并道歉,但所有席位现在都已满等)
Seems to me that all 200k people could get a "token" to start with but only 200 can complete the token.
So two things come to mind;
1) why not keep track of "Seats Remaining" on the event itself that way you're not doing a count and over bookings cannot occur because the lock needed to update the event would prevent it ever going below zero.
2) At anytime during the persons subscription if the seats remaining drops to zero all the remaining tokens become invalid and users are "kicked" out of the signup process (being nice about it and apologizing but all seats are now full etc)
你发现了吗,或者这是一种直觉?我宁愿对此进行基准测试(以及您的网络服务器是否可以承载该负载),而不是事先假设它。索引(以及正确选择存储引擎)在这里可能会有很大帮助。
此外,让人们在注册活动之前注册并输入帐户数据不是更方便用户吗?因为现在你似乎更喜欢打字速度快的人,或者使用不错的浏览器的人。
Have you found that out, or is it a gut feeling? I'd rather benchmark that (and also whether your web servers can carry that load) than assume it on beforehand. Indexing (and properly selecting a storage engine) might help a lot here.
Furthermore, wouldn't it be a lot more user-friendly to let people register and enter their account data before registering for an event? Because now you seem to favor people who can type real quick, or who use a decent browser.
您可以缓存数据库中的可用计数,并在每次发出(注册开始)、确认(注册完成)或撤销(超过 5 分钟)令牌时更新该计数。但如果您需要这种性能,那么这确实不是 SQL 数据库的用途。
您可以使用 堆 实现一个相当简单的令牌代理(基于 expiring-soonest )。每次请求令牌时,它都会检查最早过期的令牌并查看它是否已过期。如果是这样,则将其从分配给的人手中撤销,然后将其交给新人。如果没有,请告诉他们稍后再试。
注册完成后,您必须从堆中删除该令牌(这是一个相对昂贵的操作,但您只执行了 200 次)。因此,您的堆中的令牌始终等于空闲+待处理插槽的数量。
每秒处理数千个 find-soonest-expiring 请求,例如 Fibonacci 堆即使在普通的硬件上也是微不足道的。
我没有检查,但我怀疑其他人已经解决了大部分(如果不是全部)这个问题,您也许可以下载所述令牌代理。
You could cache the count available in the database, and update that whenever you issue (registration started), confirm (registration finished), or revoke (over 5 minutes) a token. But if you need that kind of performance, this really isn't what a SQL database is for.
You could implement a fairly simple token broker using a heap (based on expiring-soonest). Every time a token is requested, it'd check the soonest-expiring token and see if it has expired. If so, revoke it from whoever it was assigned to, and give it to the new person. If not, tell them to try again later.
When a registration is completed, you'd have to remove that token from the heap (a relatively expensive operation, but you're only doing that 200 times). So your heap would always have tokens equal to the number of free+pending slots.
Handling even thousands-per-second of find-soonest-expiring requests against e.g., a Fibonacci heap is trivial on even modest hardware.
I didn't check, but I suspect that someone else has already solved most if not all of this problem, you may be able to download said token broker.
抱歉误读了问题。在这种情况下,我建议使用一个缓存表(是的,你可以),该表将使用事务所在表上的触发器(ON INSERT / DELETE)进行更新。
当您允许用户进入事务时,您将他的哈希值和过期时间戳插入该表中。它触发一个触发器,该触发器将根据 - +1(插入)/ -1(删除)更新缓存值(即缓存表中的值)。
当检查是否有空闲座位时,您检查缓存表。
这种缓存可以工作吗? :)
Sorry for misreading the question. In this case I would suggest having a cache table (yes, you can), that will be updated using a trigger (ON INSERT / DELETE) on the table where the transactions are.
When you allow a user to go into the transaction you insert his hash and expiration timestamp into that table. It triggers a trigger, that will update the cached value (i.e. value in the cache table) according to that - +1 for insert / -1 for delete.
When checking wheteher there are free seats, you check the cache table.
Would this sort of caching work? :)
Redis http://redis.io 非常适合您想要做的事情。您可以保留可用座位的柜台,存储临时数据并使其自动过期......超级通用。
Redis http://redis.io is a good fit for what you're wanting to do. You can keep a counter of available seats, store the temporary data and have it auto-expire... Super versatile.
我同意 xQbert 的观点,每次成功出席填写完成后,您只需减少该活动的数据库行(或相邻表)中的总数即可。
如果您担心预订冲突,那么只需在填写考勤信息时将其递减即可;然后向临时表添加一行,表示具有此会话 ID 的人员正在填写此活动的出席表,并且从该时间戳开始他们有 5 分钟的时间来完成。当他们完成表格后,他们的条目将从临时表中删除。
然后在此表上运行一个 cron 作业,查询小于或等于当前时间(5 分钟)的时间戳。任何出现的事件都将从表中删除,并且他们的事件将重新获得席位。然后,当他们在超过 5 分钟的时间范围内提交表格时,系统会检查以确保他们在表中仍有条目,然后再永久保存座位。
如果没有,请检查是否还有剩余座位。如果有则减少座位数并插入其预订记录。如果没有,那么给他们一个错误,表明他们没有在规定的时间内完成表格,失去了排队的位置,并且你的座位用完了。
它非常简单,并且可以在 PHP 端完成所有操作。您可能想在他们的页面上放置一个 JS 计时器,以便他们知道还剩下多少时间。
I agree with xQbert, with each successful attendance fill out completion you just decrement the total in the database row for the event (or an adjoining table).
If your worried about reservation collisions then just decrement it at the beginning of filling out the attendance info; Then add a row to a temp table saying the person with this session id is filling out the attendance form for this event and they have 5 minutes from this timestamp to complete it. When they complete the form their entry is removed from this temp table.
Then run a cron job on this table that queries for timestamps that are less than or equal to the current time - 5 min. Any that come up will be removed from the table and their events will get a seat incremented back on it. Then when they submit the form past the 5 min timeframe it will check to make sure they still have an entry in the table before saving their seat permanently.
If they don't then check to see if there are any seats remaining. If there are then decrement the seat count and insert their reservation record. If not then give them an error that says that they didn't complete the form in the allotted time, lost their position in line and you ran out of seats.
Its simple enough and can be done all on the PHP side. You will probably want to put a JS timer on their page so that they know how much time they have left though.
也许值得重新审视您的设计:
剩余座位数是座位总数减去预订座位数。
该数字仅在特定时刻有效,因为预订座位数会发生变化。
要预订座位,您需要进行一项交易:用户发出信号表示她想要预订一个或多个座位,处理注册,然后最终预订(或不预订)。
只要这笔交易还没有完成,你计算出的剩余座位数很可能是错误的。
这里有两个选项:
对于第一个选项,每次用户启动事务并告诉应用程序应该预订 X 个席位时,这些 X 个席位都会被锁定。这并不意味着这些座位已经被预订,它们只是被锁定,因此用户有足够的时间来完成预订交易。如果交易完成,这些锁定的座位将被用户预订。
对于第二个,乐观锁定,用户只是开始注册,但没有任何内容被锁定。预订交易结束时,如果不再有足够的座位,交易可能会失败。
宽容锁定可能会阻止某些用户在开始事务之前进行订阅 - 即使他们可以在一天后进行订阅 - 乐观锁定可能会阻止某些用户在事务结束时进行订阅。
您需要找出最适合您的情况的方法。通常乐观锁定对用户来说更友好(因为只有一些在最后失败),但是宽容锁定将帮助您在事务结束时不会让用户感到沮丧。如果票证总是出现挤兑,也许宽松锁定是最好的。
您可以考虑如何通过发挥可用性来减少用户的挫败感。例如,使用乐观锁定,事务中的每个页面都可以在顶部有一个 AJAX 剩余计数器,提供当前剩余席位数,因此您可以在席位用完时尽早通知用户。因此,即使他们可能已经将某些内容放入表格中,他们也可以看到他们的做法是多么幸运(或足够快)。
顺便说一句,我不会限制用户注册的时间。这会给用户带来压力。通过乐观锁定和 AJAX bar,如果座位用完,用户将承受足够的压力。只是你的系统不需要关心注册。
如果您想放心地允许注册,则需要选择宽松锁定。然后你需要一个超时,但我会为每个用户操作执行此操作,因此如果用户处于活动状态,超时将再延长 15 分钟,直到完成。我会在这里选择一个较高的值,以免用户因超时而感到沮丧。
对于那些想要在所有可用席位都被注册交易锁定的情况下注册的用户,您应该提供一个后备列表,并告诉用户当前 X 个席位已被注册锁定,但他们可能会幸运地稍后获得另一个席位。或者,您允许超额预订锁定席位的数量,以便这些用户排队,以防其他用户未成功完成其交易。
顺便说一句,数据库设计应该反映您预先定义的过程,我认为只要您知道要实现的目标,您就不会遇到真正的问题。由于一切都是在商业交易中完成的,您甚至可以对每个事件的总座位数、预订座位数和锁定座位数进行简单的统计。这是一个简单的查询,不需要像
COUNT(*)
那样进行聚合。触发器和存储过程也很有帮助。It's probably worth to re-visit your design:
The remaining seats are the total number of seats minus booked seats.
This number is only valid in a certain moment of time as the number of booked seats change.
To book a seat, you're doing a transaction: A user signals that she wants to book one or multiple seats, processes the registration and then finally books (or not).
As long as this transaction has not been finished, the remaining seats number that you can calculate is probably wrong.
You have two options here:
With the first one, each time a user starts a transaction and tells the application that X seats should be booked, these X seats are locked. It does not mean that those seats are already booked, they are just locked, so the user has enough time to complete the booking transaction. If the transaction finishes, those locked seats are booked by the user.
With the second one, Optimistic Locking the user just starts to register but nothing is locked. At the end of the booking transaction, the transaction may fail if not enough seats are available any longer.
Permissive Locking might prevent some users from subscribing before they start a transaction - even if they could a day later - Optimistic Locking might prevent some users from subscribing at the end of the transaction.
You need to find out what's working best for your case. Normally optimistic locking is more nice to users (as only some fail at the end), however permissive locking will help you to not frustrate users at the end of a transaction. If there is always a run on the tickets, maybe permissive locking is best.
You can think about how to make things less frustrating for users by bringing usability into play. For example with optimistic locking, each page in the transaction could have a AJAX remaining counter at the top giving the current numbers of remaining seats, so you can inform users early if seats have run out. So even they might already have put something into forms, they can see how lucky (or fast enough) they are with their doing.
I would not time-limit users to sign-up btw.. That puts stress on the user. With optimistic locking and the AJAX bar users will be stressed enough if seats run out. It's just that your system does not need to care about sign-ups.
If you want to allow sign-ups with a peace of mind, you need to choose permissive locking. Then you need a timeout, but I would do this per user action, so if the user is active, the timeout will be extended with another 15 minutes until completed. I would choose a high value here to not frustrate the user of getting a timeout.
For those users who want to sign-up while all available seats are locked by signup transactions, you should offer a back-list and tell the users that currently X seats are locked with sign-ups but they might be lucky to catch another seat later. Or you allow to over-book the number of locked seats so those users are queued in case another user does not finish her transaction successfully.
BTW, the database design should reflect the procedure you define upfront, I don't think you run into real problems here as long as you know what you try to achieve. As everything is done in buisiness transactions you can even keep a simple count per each event of total, booked and locked seats. That's one simple query, no need for aggregation like with
COUNT(*)
. Triggers and stored procedures can be helpful, too.我认为应用于 200k 条记录的 COUNT(*) 可以相当快地检索到。我刚刚尝试查询 243k 条记录,花费了不到 1 秒的时间。
但我认为你甚至不需要 COUNT,除了最后的确认步骤。
我认为你也不需要任何“cron jobs”。这肯定比仅仅查询数据库更加密集。
更不用说如果活动已经订满,您甚至不必让用户检查有多少人参加。一旦达到 200 名与会者,您就可以更改另一个 db 值。如果您没有剩余座位,首页甚至不应该显示可点击的按钮。
另外,如果还有足够的座位,为什么要限制用户 5 分钟呢?为什么用“cron job”清理他们的会话并强迫他们重新开始?如果还有100个座位怎么办?相反,当 200 个座位被预订但尚未全部确认时,最新的答题器将旧的一一排挤。
在首页上有“注册按钮”,不需要执行 COUNT(*) 查询。
你可以让它这样做:
一旦所有座位都被确认,event_status 就可以更新。
然后使用 PHP 或其他任何东西,您要么显示按钮,要么显示“抱歉,活动已满”。
一旦 200 个座位全部被预订,但尚未全部确认,您就开始将人们赶出去。当新人单击该按钮时,您将执行此操作。
如果这给了你任何结果,那么你就可以将你的人从等候名单中剔除。只需将他的会话标记为“已过期”,如果他确实单击“提交”,请首先检查该事件是否已完全确认。如果是这样,请告诉他抱歉,他来得太晚了。如果没有,则重复上面的查询,将其他人从列表底部踢出,并确认他,除非所有等待列表上的人都小于5分钟,这种情况下你可以让他等几分钟,然后看看是否有空闲的地方。
每次成功确认后,您可以查询已确认客人的 COUNT,如果数量超过 200 人,您可以将 event_status 更改为“full”并防止其他人点击该按钮。他们只会看到活动已满的消息。您应该能够将这两个查询包装在一个事务中,以确保多人无法同时预订,从而导致您最终超额预订。
这只是我的2分钱。
I think COUNT(*) applied to 200k records could be retrieved fairly quickly. I just tried a query on 243k records and it took well under 1 second.
But I don't think you even need COUNT, except at the final confirmation step.
I don't think you need any "cron jobs", either. That would certainly be more intensive than just querying the database.
Not to mention the fact that you don't even have to let users check how many people are attending if the event is already fully booked. Once you hit 200 attendees, you can just change another db value. The front page should not even display a clickable button if you don't have seats left.
Also, if there are plenty of seats left, why limit a user to 5 minutes? Why clean out their session with a "cron job" and force them to start all over? What if there are still 100 seats left? Instead, when 200 seats are reserved, but not all confirmed, you have the newest clickers crowd out the older ones one-by-one.
On your front page, where you have the "sign up button", it doesn't need to perform the COUNT(*) query.
You can just have it do this:
The event_status can be updated as soon as all the seats are confirmed.
Then using your PHP or whatever, you either display the button, or display "Sorry, Event is Full".
Once all 200 seats are reserved, but not all are confirmed, you start kicking people out. When a new person clicks the button, you do this.
If that gives you any results, then you have your guy to kick off the waiting list. Just flag his session as "Expired" and if he ever does click submit, 1st check to see if the event is fully confirmed. If so, tell him sorry he was too late. If not, then repeat the above query, kick someone else off the bottom of the list, and confirm him, unless all people on the waiting list are less than 5 minutes old, in which case you can ask him to wait a few minutes and see if any spots free up.
After every successful confirmation, you can query the COUNT of confirmed guests and if the number is 200+, you change the event_status to "full" and prevent anyone else from clicking that button. They will just see the message that the even is full. You should be able to wrap those 2 queries in a transaction to ensure that multiple people cannot book at the exact same time and you end up overbooked.
That's just my 2 cents.