Redis set examples
I've been using Redis for a few weeks now, and I'm very impressed: the amount
I have achieved and the time I have saved using just the smallest
set of commands is great.
Using Wikipedia as my data, I made a small spider to crawl all the
pages on Wikipedia and download them.
I use Redis simply to keep a record of which pages have been downloaded,
to prevent duplicates.
As each page is downloaded I execute:
sadd wiki pagename
And check each page for existence with:
sismember wiki pagename
Sorry for the long explanation. My question is: what do the following
commands do, and when are they likely to be used or useful?
sdiff
sinter
sunion
My understanding so far: sdiff = subtract multiple sets.
Comments (2)
I think sdiff, sinter and sunion are reasonably well explained, with examples, in the Redis command documentation. These are the classic set-theory operations (difference, intersection and union), and they are useful whenever you need to compare or combine data across multiple sets that may contain some of the same items.
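As a rough illustration of what the three commands compute, here is a minimal sketch that uses plain Python sets in place of a live Redis server. The key names `wiki` and `wiki:queued` and the page titles are made up for this example; against a real server the corresponding commands would be `SDIFF wiki:queued wiki`, `SINTER wiki:queued wiki` and `SUNION wiki:queued wiki`.

```python
# Hypothetical data: pages already downloaded vs. pages the spider has queued.
wiki = {"Redis", "Python", "Set_theory"}
queued = {"Python", "Set_theory", "Jaccard_index"}

# SDIFF wiki:queued wiki
# Members of the first set that are in none of the following sets.
still_to_fetch = queued - wiki

# SINTER wiki:queued wiki
# Members present in every one of the given sets.
already_done = queued & wiki

# SUNION wiki:queued wiki
# Members present in at least one of the given sets.
all_known = queued | wiki

print(sorted(still_to_fetch))
print(sorted(already_done))
print(sorted(all_known))
```

Note that the argument order matters for the difference: swapping the two sets would instead give the downloaded pages that are not in the queue.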
Say you have a bookstore and you want to figure out which genres are related so you can recommend books in related genres. Not quite the now classic "customers who bought this also bought X", but more like recommending fantasy books to people who are interested in science fiction.
One way to do this would be to assign an ID to every customer and, for each purchased book, put that ID in a set that represents the book's genre. If you want to know which genres are related, you can then use set operations to compute interesting metrics. One of these is the Jaccard index: the size of the intersection divided by the size of the union. In other words, it is the number of customers who have bought at least one book from each of the two genres, divided by the number of customers who have bought a book in either genre. A lower index means less similarity, and a higher index means closer similarity. An index of zero means no one has bought books from both genres, and an index of one means that everyone who bought a book in one genre also bought a book in the other.
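The Jaccard index described above can be sketched as follows, again with plain Python sets standing in for Redis sets. The function name `jaccard_index`, the genre variables and the customer IDs are all assumptions made for the example. With Redis itself you could fetch the members with `SINTER` and `SUNION` and count them on the client, or store the results with `SINTERSTORE` / `SUNIONSTORE` and read their sizes with `SCARD`.

```python
def jaccard_index(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    if not union:
        # Two empty sets: treat the similarity as 0 to avoid dividing by zero.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical customer IDs, one set per genre.
scifi = {1, 2, 3, 4}
fantasy = {3, 4, 5}

# Customers 3 and 4 bought both genres; five distinct customers bought either.
print(jaccard_index(scifi, fantasy))  # 0.4
```

Returning 0 for two empty sets is a choice made here for convenience; another convention may suit your data better.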
You could also use the set difference to find the customers who bought a book in one genre but not in the other (and, if the two genres are similar, perhaps suggest that these customers try a book from the other genre).
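A minimal sketch of that last idea, using plain Python sets and made-up customer IDs; the key names in the comments (`genre:scifi`, `genre:fantasy`) are hypothetical.

```python
# Hypothetical customer IDs, one set per genre.
scifi = {1, 2, 3, 4}
fantasy = {3, 4, 5}

# SDIFF genre:scifi genre:fantasy
# Customers who bought science fiction but no fantasy: candidates for a
# fantasy recommendation if the two genres turn out to be similar.
scifi_only = scifi - fantasy

# SDIFF genre:fantasy genre:scifi
# The reverse direction gives a different result, so argument order matters.
fantasy_only = fantasy - scifi

print(sorted(scifi_only))
print(sorted(fantasy_only))
```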