Integrating a dynamic feed into database results with pagination
I have two main data sources that I need to mash up in PHP. The first is an SQL database and the other is a feed. Both sources contain similar items that need to be sorted by "name" or "weight". Each source holds more than 1,000 items. Simply storing the feed in the database is not an option, because the feed changes far too often.
What I want to achieve is an accurate paginated output of the two combined. For example, if I display page two of the results, with 50 items per page sorted by "name", I don't want to read all the data from the feed and the database, only the necessary items from both sources.
I currently assume this requires treating the database data as the base and somehow merging the feed data into it, but: 1) I may be wrong, and 2) I cannot get my head around this no matter how hard I try.
Any suggestions on how to achieve this?
P.S. The results do not have to be completely accurate. They only have to seem accurate.
Comments (1)
Consider caching the feed results once in a while. Then you can work out the approximate position of the items you need.
For your example of a user requesting the second page: using your cache, you work out that (at the moment of caching) the accurate results would be items #20-50 from the database and items #30-50 from the feed. Since the feed changes often, you can't just read those same items and show them to the user. Instead, you can read some extra items, for example 10 on each side. For the ranges above, that would be items #10-60 from the database and #20-60 from the feed. By using the extra items and matching them against the cache, you will be able to provide more accurate results than you would by requesting the same #20-50 and #30-50.
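To make the padded read a bit more concrete, here is a minimal sketch, written in Python for brevity (the same logic ports to PHP). Everything in it is an assumption made for illustration: the names `read_window` and `serve_page`, the layout of the cache entry, the use of half-open ranges, and the idea of storing the sort key of the page's first item (`first_key`) so the page can be located again inside the merged window. Plain sorted lists stand in for the database query and the feed request.

```python
import heapq
from bisect import bisect_left

def read_window(sorted_keys, lo, hi, pad):
    """Stand-in for a LIMIT/OFFSET query or a ranged feed request.

    Returns the offset the window starts at and the padded slice."""
    start = max(0, lo - pad)
    return start, sorted_keys[start:hi + pad]

def serve_page(db_keys, feed_keys, cached, page_size, pad=10):
    """Serve one page from padded windows around the cached ranges.

    `cached` is a hypothetical cache entry:
    {"db": (lo, hi), "feed": (lo, hi), "first_key": key}
    where the ranges are half-open offsets into each sorted source."""
    db_lo, db_hi = cached["db"]
    feed_lo, feed_hi = cached["feed"]
    db_off, db_win = read_window(db_keys, db_lo, db_hi, pad)
    feed_off, feed_win = read_window(feed_keys, feed_lo, feed_hi, pad)

    # Merge both windows, remembering each item's source and offset.
    merged = list(heapq.merge(
        ((k, "db", db_off + i) for i, k in enumerate(db_win)),
        ((k, "feed", feed_off + i) for i, k in enumerate(feed_win)),
    ))

    # Re-locate the page start by matching against the cached key.
    keys = [k for k, _, _ in merged]
    start = bisect_left(keys, cached["first_key"])
    chosen = merged[start:start + page_size]

    def span(source, fallback):
        # Offsets of the chosen items that came from `source`.
        idx = [i for _, s, i in chosen if s == source]
        return (idx[0], idx[-1] + 1) if idx else fallback

    # Refreshed cache entry, so every request updates the cache a little.
    refreshed = {
        "db": span("db", cached["db"]),
        "feed": span("feed", cached["feed"]),
        "first_key": chosen[0][0] if chosen else cached["first_key"],
    }
    return [k for k, _, _ in chosen], refreshed
```

Anchoring on the cached key keeps a page's content stable when the feed shifts by a few items, at the cost of page boundaries drifting slightly from the exact positions. That fits the "only has to seem accurate" requirement, but it is a design choice of this sketch rather than the only way to match against the cache.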
This way the results won't be completely accurate, which you allow in your "P.S.". But the more often users make requests, the more accurate the answer will be, since every request updates the cache a little. You will also have to implement cache expiration tracking, in case some pages have not been requested for a long time and are therefore completely outdated.
A possible implementation of the above would be an extra table with a row for every "page"-"sort type" combination. In your example, with 50 items per page and 2000 items in total, there will be 40 rows for "sort by name" and 40 rows for "sort by weight". Each row contains information about which items were taken from both sources the last time that particular page was accessed, and the timestamp of that information's last update.
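As a rough illustration of such a table, the sketch below uses Python's built-in `sqlite3` module (again Python only for brevity; the SQL is what matters). The table name `page_cache`, its column names, the helpers `save_page` and `load_page`, and the 300-second `max_age` default are all hypothetical choices, not something taken from your setup. The ranges are stored as half-open offsets into each source.

```python
import sqlite3

# One row per (sort type, page) combination.
SCHEMA = """
CREATE TABLE IF NOT EXISTS page_cache (
    sort_type  TEXT    NOT NULL,  -- e.g. 'name' or 'weight'
    page       INTEGER NOT NULL,
    db_lo      INTEGER NOT NULL,  -- range taken from the database
    db_hi      INTEGER NOT NULL,
    feed_lo    INTEGER NOT NULL,  -- range taken from the feed
    feed_hi    INTEGER NOT NULL,
    updated_at INTEGER NOT NULL,  -- Unix timestamp of the last refresh
    PRIMARY KEY (sort_type, page)
)
"""

def save_page(conn, sort_type, page, db_range, feed_range, now):
    """Insert or overwrite the cached ranges for one page."""
    conn.execute(
        "INSERT OR REPLACE INTO page_cache VALUES (?, ?, ?, ?, ?, ?, ?)",
        (sort_type, page, *db_range, *feed_range, now),
    )
    conn.commit()

def load_page(conn, sort_type, page, now, max_age=300):
    """Return the cached ranges, or None if missing or expired."""
    row = conn.execute(
        "SELECT db_lo, db_hi, feed_lo, feed_hi, updated_at "
        "FROM page_cache WHERE sort_type = ? AND page = ?",
        (sort_type, page),
    ).fetchone()
    if row is None or now - row[4] > max_age:
        return None
    return {"db": (row[0], row[1]), "feed": (row[2], row[3])}
```

When `load_page` returns `None`, the page has never been cached or its entry has expired, so the ranges would have to be recomputed from a fresh feed snapshot before serving it.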
P.S. You didn't post any details, which makes it hard to come up with a more specific answer, but I hope what I wrote helps!