Unique IDs for multidimensional array elements in Python

Posted on 2024-12-21 00:20:44

I have a multidimensional array with elements that can be completely random. For example,

[
    [ [1, 2], [2, 1], [3, 1], [4, 2] ],
    [ [2, 1], [4, 3], [3, 4], [1, 3] ]
]

I'd like to assign an ID to each unique element (as in [1,2], not the elements within those) so that I can recognize it later on when this array is much larger, but I can't seem to figure it out. I've been searching the internet for a while now with no luck, so if someone could give me a push in the right direction I'd really appreciate it.

∞琼窗梦回ˉ 2024-12-28 00:20:44

How about using something like this?

class ItemUniqifier(object):
    def __init__(self):
        self.id = 0
        self.element_map = {}
        self.reverse_map = {}

    def getIdFor(self, obj):
        obj_id = self.element_map.get(obj)
        if obj_id is None:
            obj_id = self.id
            self.element_map[obj] = obj_id
            self.reverse_map[obj_id] = obj
            self.id += 1
        return obj_id

    def getObj(self, id):
        return self.reverse_map.get(id)

uniqifier = ItemUniqifier()
# Keys must be hashable, so pairs are passed as tuples rather than lists.
print(uniqifier.getIdFor((1, 2)))
print(uniqifier.getIdFor((1, 2)))
print(uniqifier.getIdFor("hello"))
print(uniqifier.getObj(0))
print(uniqifier.getObj(1))

This prints:

0
0
1
(1, 2)
hello

So, for example, to create a large array, you can do something like this:

uniqifier = ItemUniqifier()
sample_array = []
for j in range(3):
    inside_array = []
    for i in range(10):
        inside_array.append(uniqifier.getIdFor((i, i+1)))
    sample_array.append(inside_array)

import pprint
pprint.pprint(sample_array)

# Map each row of IDs back to the original objects, one row per line.
for inside in sample_array:
    print(*(uniqifier.getObj(elem) for elem in inside))

This prints:

[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]
(0, 1) (1, 2) (2, 3) (3, 4) (4, 5) (5, 6) (6, 7) (7, 8) (8, 9) (9, 10)
(0, 1) (1, 2) (2, 3) (3, 4) (4, 5) (5, 6) (6, 7) (7, 8) (8, 9) (9, 10)
(0, 1) (1, 2) (2, 3) (3, 4) (4, 5) (5, 6) (6, 7) (7, 8) (8, 9) (9, 10)
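
The inner elements in the question are lists, which are unhashable and so cannot be used as dictionary keys directly. Below is a minimal sketch of the same idea applied to the question's array, assuming each inner pair can be converted to a tuple first. The helper name `assign_ids` is hypothetical and not part of the answer above.

```python
def assign_ids(array):
    """Replace each inner pair with an integer ID; equal pairs share an ID."""
    ids = {}      # tuple(pair) -> ID
    lookup = []   # ID -> tuple(pair); the list index is the ID
    result = []
    for row in array:
        id_row = []
        for pair in row:
            key = tuple(pair)  # lists are unhashable, tuples are not
            if key not in ids:
                ids[key] = len(lookup)
                lookup.append(key)
            id_row.append(ids[key])
        result.append(id_row)
    return result, lookup


example_array = [
    [[1, 2], [2, 1], [3, 1], [4, 2]],
    [[2, 1], [4, 3], [3, 4], [1, 3]],
]

id_array, lookup = assign_ids(example_array)
print(id_array)   # [[0, 1, 2, 3], [1, 4, 5, 6]]
print(lookup[1])  # (2, 1)
```

Using a list for the reverse lookup works here because the IDs are consecutive integers starting at 0.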

岁月如刀 2024-12-28 00:20:44

The easiest way would be to use a dictionary, like so:

# example_array is the nested list from the question
id_map = { 'some_id'  : example_array[0][0], # maps 'some_id'  to [1, 2]
           'other_id' : example_array[1][2], # maps 'other_id' to [3, 4]
           # add more if wanted...
         }

While a dictionary CAN use both alphabetical and number keys, it is not recommended to use number keys to refer to indices, since it may lead to confusion with list index numbering.

In addition, dictionaries can add new keys on demand, like so:

id_map[new_key] = new_pair

Since you said the lists were dynamically generated, this is the best choice.

Since each number pair in your example is reached through two index lookups, perhaps you could make the ids two digits long? For example, [1, 2] would map to id '00' and [3, 4] to id '12'.
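
As a rough sketch of position-based ids, assuming the two-level nesting shown in the question (each pair sits at `example_array[i][j]`), the dictionary can be filled in a loop instead of by hand. The key format is an assumption made for illustration.

```python
example_array = [
    [[1, 2], [2, 1], [3, 1], [4, 2]],
    [[2, 1], [4, 3], [3, 4], [1, 3]],
]

id_map = {}
for i, row in enumerate(example_array):
    for j, pair in enumerate(row):
        # The key is built from the position of the pair, not its value.
        id_map[f"{i}{j}"] = pair

print(id_map["00"])  # [1, 2]
print(id_map["12"])  # [3, 4]
```

Two caveats: concatenated digits are only unambiguous while every index is below 10 (a tuple key such as `(i, j)` avoids that), and position-based ids label slots rather than values, so the pair [2, 1] ends up under both '01' and '10'.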

Dictionaries - Python Documentation

撩发小公举 2024-12-28 00:20:44

If each "element" is a sequence of two single-digit base-10 integers, you could generate a unique ID for each one from its contents like this:

def uniqueID(elem):
    return elem[0]*10 + elem[1]

The basic idea is to figure out some way to use the contents of an element to generate an ID. Exactly how that might be done would depend on what the content is, of course.
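
One way this could be generalized, as a sketch only: treat the element as the digits of a number in some base. The name `unique_id` and its `base` parameter are assumptions made for illustration. IDs are only guaranteed unique when every value is a non-negative integer below `base` and all elements have the same length.

```python
def unique_id(elem, base=10):
    """Encode a sequence of integers in range(base) as a single integer."""
    result = 0
    for value in elem:
        if not 0 <= value < base:
            raise ValueError(f"{value!r} is outside range({base})")
        result = result * base + value
    return result


print(unique_id([1, 2]))             # 12
print(unique_id([2, 1]))             # 21
print(unique_id([12, 3], base=100))  # 1203
```

With variable-length elements this scheme can collide, for example [0, 1] and [1] both encode to 1, so it fits best when the element shape is fixed.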

别念他 2024-12-28 00:20:44

Here's another answer that can handle mixed types -- i.e. lists, tuples, & strings -- of variable-length (even zero-length) sequences.

class EOS(object): pass  # end-of-sequence marker
EOS = EOS()  # singleton instance

class SeqID(object):
    """ Create or find a unique ID number for a given sequence. """

    class TreeNode(dict):
        """ Branch or leaf node of tree """
        def __missing__(self, key):
            ret = self[key] = self.__class__()
            return ret

    def __init__(self, first_ID=1):
        self._next_ID = first_ID
        self._root = self.__class__.TreeNode()

    def __getitem__(self, seq):
        # search tree for a leaf node corresponding
        # to given sequence and creates one if not found
        node = self._root
        for term in seq:
            node = node[term]
        if EOS not in node:  # first time seq encountered?
            node[EOS] = self._next_ID
            self._next_ID += 1
        return node[EOS]


elements = [
    [ [1, 2], [1, 3], [2, 1], [3, 1], [4, 2] ],
    [ [], [2, 1], [4, 3], [3, 4], (1, 3) ],
    [ [2, 2], [9, 5, 7], [1, 2], [2, 1, 6] ],
    [ 'ABC', [2, 1], [3, 4], [2, 3], [9, 5, 7] ]
]

IDs = SeqID(1000)
print('[')
for row in elements:
    print('  [ ', end=' ')
    for seq in row:
        # %r shows the sequence as written, followed by its ID
        print('%r: %s,' % (seq, IDs[seq]), end=' ')
    print(' ],')
print(']')

With the elements of the multidimensional array shown in the test case, which are similar to those in your example but with several additions, the following output is produced. Note that the generated ID numbers have been forced to start at 1000 to make them easier to spot in the output.

[
  [  [1, 2]: 1000, [1, 3]: 1001, [2, 1]: 1002, [3, 1]: 1003, [4, 2]: 1004,  ],
  [  []: 1005, [2, 1]: 1002, [4, 3]: 1006, [3, 4]: 1007, (1, 3): 1001,  ],
  [  [2, 2]: 1008, [9, 5, 7]: 1009, [1, 2]: 1000, [2, 1, 6]: 1010,  ],
  [  'ABC': 1011, [2, 1]: 1002, [3, 4]: 1007, [2, 3]: 1012, [9, 5, 7]: 1009,  ],
]

The code works by internally constructing a multi-branched search tree based on the order the elements in each sequence occur and what they are.

A potential caveat is that the IDs produced are dependent on the order in which each unique sequence is first seen since each new ID is simply one more than the last one.

Also note that sequences of the same elements held in different containers will generate the same ID since the type of sequence is ignored in the code shown -- but it could be changed to take type into account, too.
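
As a rough illustration of taking type into account, here is a sketch that swaps the search tree for a flat dictionary keyed on the container type together with the contents. `TypedSeqID` is a hypothetical name, and unlike the tree above it builds a tuple for every lookup. It assumes the items inside each sequence are hashable.

```python
class TypedSeqID:
    """Assign sequential IDs to sequences, distinguishing container types."""

    def __init__(self, first_id=1):
        self._next_id = first_id
        self._ids = {}  # (container type, contents as tuple) -> ID

    def __getitem__(self, seq):
        key = (type(seq), tuple(seq))
        if key not in self._ids:  # first time this typed sequence is seen
            self._ids[key] = self._next_id
            self._next_id += 1
        return self._ids[key]


ids = TypedSeqID(1000)
print(ids[[1, 3]])  # 1000
print(ids[(1, 3)])  # 1001, a tuple is no longer merged with the list
print(ids[[1, 3]])  # 1000
print(ids["ABC"])   # 1002
```

The IDs still depend on the order in which each sequence is first seen, as with the tree-based version.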
