Matching specific items across multiple discrete sets
I have a problem whereby I have several discrete lists of IDs, e.g.
List (A) 1,2,3,4,5,7,8
List (B) 2,3,4,5
List (C) 4,2,8,9,1
etc...
I then have another collection of IDs...
For example: 1,2,4
I need to try to match one into each list. If I can perfectly match all IDs in my secondary collection (one collection ID matched with an ID from each list) then I get a true result.
I have found that it becomes complicated: if you simply iterate over the lists, matching the first collection/list pair that you encounter, you may preclude a possible combination further down the line and so return a false negative.
For example:
List (A) 1,2,3,4
List (B) 1,2,3,4
List (C) 3,4
Collection is: 3,1,2
The first ID from the collection (3) matches an entry in list A, and the second ID in the collection (1) matches an item in list B. However, the final ID in the collection (2) DOESN'T match any entry in list C. If you rearrange the order of the collection to be 2,1,3 then a match is found. Therefore I am looking for some form of logic for attempting a match on all possible combinations in an efficient manner(?)
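The false negative described above can be reproduced in a few lines. This is only a sketch of the naive approach, assuming each collection ID is given to the first still-unused list that contains it; the function name `greedy_match` and the dictionary layout are illustrative, not part of the original question.

```python
def greedy_match(lists, collection):
    """Assign each ID to the first still-unused list that contains it."""
    used = set()
    for item in collection:
        for name, members in lists.items():
            if name not in used and item in members:
                used.add(name)
                break
        else:
            # No free list contains this ID, so the greedy pass gives up.
            return False
    return True


lists = {"A": {1, 2, 3, 4}, "B": {1, 2, 3, 4}, "C": {3, 4}}
print(greedy_match(lists, [3, 1, 2]))  # False: 2 is left with only list C
print(greedy_match(lists, [2, 1, 3]))  # True: same IDs, different order
```

The same IDs give different answers depending on order, which is exactly the ordering problem.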
To make it more complicated, the IDs are actually GUIDs, so they can't just be sorted in ascending order.
I hope I have described this well enough to make it clear what I am attempting, and with a bit of luck somebody will be able to tell me that what I need to do is very easy and I am missing something really simple!
I am forced to code this in VB6, but any methods or pseudocode would be great. The backend is SQL Server, so a solution using T-SQL would be even better, as all of the IDs are held in tables already.
Many thanks in advance.
Comments (2)
Jake, yep, the lists and the collection both contain GUIDs. I used plain integers to simplify the problem a bit.
Once a list has been matched it can't be searched again, hence the ordering problem that I tried to explain. If you mark a list as 'matched' then no further attempts to match it will be performed. It is this very behaviour that can cause a false negative.
'Sending' the collection in every possible order would work but would be a massive job .....
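For reference, the "every possible order" idea can be sketched as below. It assumes each collection ID must be paired with its own distinct list; `brute_force_match` is a hypothetical name. It only tests membership, so GUID strings would work the same as the integers used here, with no sorting needed.

```python
from itertools import permutations


def brute_force_match(lists, collection):
    """Try every way of pairing the collection IDs with distinct lists."""
    list_sets = list(lists.values())
    if len(collection) > len(list_sets):
        return False  # more IDs than lists, so some ID has no list of its own
    # Each permutation selects len(collection) distinct lists in some order.
    for chosen in permutations(list_sets, len(collection)):
        if all(item in members for item, members in zip(collection, chosen)):
            return True
    return False


lists = {"A": {1, 2, 3, 4}, "B": {1, 2, 3, 4}, "C": {3, 4}}
print(brute_force_match(lists, [3, 1, 2]))  # True regardless of input order
```

The number of orderings grows factorially with the number of lists, so this is only practical for a handful of lists.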
I feel I must be missing a really straightforward concept or solution here??!!
Thanks for your assistance so far.
I don't see a way around checking each GUID contained in the lists against each GUID in the collection. You would have to keep a record of which lists each GUID in the collection occurs in.
To use your example of the collection (3, 1, 2), 3 occurs in lists A, B and C.
You will basically be left with this dataset:
3: A, B, C
1: A, B
2: A, B
Once you have distilled it down to this dataset, you can determine whether any GUID has zero occurrences in the lists, which would result in a negative.
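Building that dataset and running the zero-occurrence check might look like the sketch below. The names `occurrences`, `table`, and `missing` are illustrative only, and the integers stand in for GUIDs as in the question.

```python
def occurrences(lists, collection):
    """Map each collection ID to the names of the lists that contain it."""
    return {
        item: [name for name, members in lists.items() if item in members]
        for item in collection
    }


lists = {"A": {1, 2, 3, 4}, "B": {1, 2, 3, 4}, "C": {3, 4}}
table = occurrences(lists, [3, 1, 2])
for item, names in table.items():
    print(item, names)

# Any ID that appears in no list at all makes the overall result negative.
missing = [item for item, names in table.items() if not names]
print(missing)
```

An empty `missing` list does not prove that a full match exists; it only rules out the cheapest kind of failure.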
I am not at all well versed in algorithms, but this is how I would proceed after that:
Start with the first set (A, B, C), and check how many times it occurs further on in the dataset. In this case no occurrences are found.
Moving on to the next set (A, B): if the number of occurrences of this set is greater than the length of the set, i.e. more than two occurrences, the result is a negative. If the number of occurrences matches the length exactly, as is the case here, the set (A, B) can be removed from any further consideration.
I guess you would continue to repeat the process until a negative is identified or all the occurrences have been excluded. There is probably a recognized algorithm for this sort of problem, but my knowledge is a bit lacking in that respect. :(
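One recognized way to frame this is as bipartite matching: collection IDs on one side, lists on the other, with an edge wherever a list contains an ID. The sketch below uses an augmenting-path search (often called Kuhn's algorithm), which is a swapped-in technique rather than the procedure described above. It assumes each ID needs its own distinct list; the names `can_match_all`, `owner`, and `try_assign` are hypothetical.

```python
def can_match_all(lists, collection):
    """Return True if every collection ID can be paired with its own list."""
    owner = {}  # list name -> index of the collection ID currently holding it

    def try_assign(i, seen):
        for name, members in lists.items():
            if collection[i] in members and name not in seen:
                seen.add(name)
                # Take a free list, or evict its holder if that holder can move.
                if name not in owner or try_assign(owner[name], seen):
                    owner[name] = i
                    return True
        return False

    return all(try_assign(i, set()) for i in range(len(collection)))


lists = {"A": {1, 2, 3, 4}, "B": {1, 2, 3, 4}, "C": {3, 4}}
print(can_match_all(lists, [3, 1, 2]))  # True, whatever the input order
```

Because an earlier assignment can be undone when a later ID needs that list, the result does not depend on the order of the collection. The recursion depth is bounded by the number of lists, so very large inputs may call for an iterative version.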