How do I create a dictionary from an unordered list in which each key is followed by several values?

Posted on 2025-01-17 16:15:05, 1141 characters, 5 views, 0 comments

I have multiple lists that are ordered like the following one:

['SNOMEDCT:', '263681008,', '771269000', 'UMLS:', 'C0443147,', 'C1867440', 'HPO:', 'HP0000006', 'HPO:', 'HP0000006', 'UMLS:', 'C0443147']

I need to transform this list into a dictionary, using the words that end with ":" as keys. The lists change, so new words ending with ":" are sometimes added. The corresponding values always come directly after the word with ":" in the list.

When I start iterating over the list, it gets frustrating very quickly because there are too many possibilities for me to handle at the moment. So I would like to ask whether anyone knows a quick way to transform such a list into a dictionary.

I tried several iteration approaches, like the one below, to collect the words containing ':':

import math

checkwords = []
for charnum_list in df_new.char_num:
    try:
        # If every element is a float, math.isnan succeeds and the row is skipped
        for charnum in charnum_list:
            math.isnan(charnum)
    except TypeError:
        # math.isnan raises TypeError for strings, so this row holds text
        for charnum in charnum_list:
            charnum_new = charnum.replace('HP:', 'HP')
            charnum_new = charnum_new.replace('<', '').replace('>', '').split(' ')
            for word in charnum_new:
                checkwords.append(word)
# Unique words that contain a colon, i.e. the candidate keys
diagnosis_dictionaries = list(set(word for word in checkwords if ':' in word))

output:

diagnosis_dictionaries:

['HPO:', 'ICD9CM:', 'SNOMEDCT:', 'UMLS:', 'ICD10CM:']

Then I tried to iterate again, comparing the lists of keys and values against the list of keys above, but at this point I am really desperate because none of my ideas worked out.

It would be great if someone had a good idea or a better solution than mine.

Comments (2)

你如我软肋 2025-01-24 16:15:05

If I interpret your question correctly then I think you're looking to do this:

lst = ['SNOMEDCT:', '263681008,', '771269000', 'UMLS:', 'C0443147,', 'C1867440', 'HPO:', 'HP0000006', 'HPO:', 'HP0000006', 'UMLS:', 'C0443147']

dct = dict()
k = None
for e in lst:
    if e[-1] == ':':
        k = e[:-1]
    else:
        if k is not None:
            dct.setdefault(k, []).append(e)
    
print(dct)

Output:

{'SNOMEDCT': ['263681008,', '771269000'], 'UMLS': ['C0443147,', 'C1867440', 'C0443147'], 'HPO': ['HP0000006', 'HP0000006']}

Note:

The test "if k is not None" is not needed for the sample data in the question. However, if the list is modified so that its first element does not end with a colon, that element is ignored. The element types are not checked; they are assumed to be strings.
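
The loop above can also be wrapped in a reusable function. The following is only a sketch: the name list_to_dict is made up for illustration, and removing the trailing commas from the values is an assumption about what the asker wants. It uses str.endswith instead of indexing, so an empty string in the list does not raise an IndexError.

```python
def list_to_dict(items):
    """Group values under the most recent element that ends with ':'."""
    result = {}
    key = None
    for item in items:
        if item.endswith(':'):
            # A new key starts here; drop the trailing colon.
            key = item[:-1]
            result.setdefault(key, [])
        elif key is not None:
            # Values that appear before the first key are ignored.
            result[key].append(item.rstrip(','))
    return result


lst = ['SNOMEDCT:', '263681008,', '771269000', 'UMLS:', 'C0443147,', 'C1867440',
       'HPO:', 'HP0000006', 'HPO:', 'HP0000006', 'UMLS:', 'C0443147']
print(list_to_dict(lst))
```

Unlike the original loop, this variant also keeps a key that has no values, mapping it to an empty list. Because nothing about the key names is hard-coded, new words ending with ":" are picked up automatically.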

电影里的梦 2025-01-24 16:15:05

You can use itertools.groupby to create the dictionary. For example:

from itertools import groupby


lst = ['SNOMEDCT:', '263681008,', '771269000', 'UMLS:', 'C0443147,', 'C1867440', 'HPO:', 'HP0000006', 'HPO:', 'HP0000006', 'UMLS:', 'C0443147']


out = {}
for k, g in groupby(lst, lambda i: i.endswith(":")):
    if k:
        out.setdefault(key := next(g).strip(":"), [])
    else:
        out[key].extend(map(lambda s: s.strip(","), g))

print(out)

Prints:

{
    "SNOMEDCT": ["263681008", "771269000"],
    "UMLS": ["C0443147", "C1867440", "C0443147"],
    "HPO": ["HP0000006", "HP0000006"],
}
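
To see why this works, it may help to look at what groupby yields on its own. The sketch below uses a shortened version of the sample list (an assumption made only to keep the output brief) and prints each run of consecutive elements together with the result of the endswith(":") test.

```python
from itertools import groupby

lst = ['SNOMEDCT:', '263681008,', '771269000', 'HPO:', 'HP0000006']

# groupby splits the list into runs of consecutive elements
# that share the same result of the key function.
runs = [(is_key, list(group))
        for is_key, group in groupby(lst, lambda s: s.endswith(':'))]

for is_key, group in runs:
    print(is_key, group)
```

Each run flagged True holds a key and the following False run holds its values. Two caveats follow from this: if two keys appear back to back they fall into one run, so next(g) in the answer's code only reads the first of them, and if the list starts with a value, key has not been assigned yet and the answer's code raises a NameError. The := operator it uses requires Python 3.8 or later.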