Convert a dictionary of lists of dictionaries to a DataFrame
I have the following subsample of a dictionary of lists of dictionaries (taken from a larger dictionary with millions of items):
bool_dict = {0: [{0: 4680}, {1: 1185}],
1: [{0: 172}, {1: 9}],
2: [{0: 149}, {1: 1282}],
3: [{0: 20}, {1: 127}],
4: [{0: 0}, {1: 0}]}
which I converted to a dataframe of the form:
           0          1
0  {0: 4680}  {1: 1185}
1   {0: 172}     {1: 9}
2   {0: 149}  {1: 1282}
3    {0: 20}   {1: 127}
4     {0: 0}     {1: 0}
by doing the following:
import pandas as pd
test = pd.DataFrame(list(bool_dict.values()), columns=['0', '1'], index=list(bool_dict.keys())).sort_index()
The problem is that I only need each cell's value, not the key, in the dataframe. So, the desired output is:
      0     1
0  4680  1185
1   172     9
2   149  1282
3    20   127
4     0     0
I tried the following:
test['0'] = test['0'].apply(lambda x: x[0])
but then I get a key error on what I thought was a dictionary.
To make sure it indeed was a dictionary, I then tried
from ast import literal_eval
test['0']=test['0'].apply(lambda x: literal_eval(str(x)))
then tried this again
test['0'] = test['0'].apply(lambda x: x[0])
with no success (I also tried the key as '0').
UPDATE: to make sure the lambda was the issue, this works just fine:
test['0'].head():
0    {0: 4680}
1     {0: 247}
2       {0: 0}
3       {0: 0}
4     {0: 104}
I could do the hacky thing of splitting on the ':' and then removing the extraneous stuff, but that just feels wrong for so many reasons.
Comments (2)
One way is to convert the inner list into a dictionary then pass it to the DataFrame constructor:
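The answer's original code did not survive on this page, so what follows is only a minimal sketch of the idea, not the answerer's exact code. It assumes pandas is imported as pd and reuses bool_dict from the question; the names rows and df are illustrative.

```python
import pandas as pd

bool_dict = {0: [{0: 4680}, {1: 1185}],
             1: [{0: 172}, {1: 9}],
             2: [{0: 149}, {1: 1282}],
             3: [{0: 20}, {1: 127}],
             4: [{0: 0}, {1: 0}]}

# Merge each inner list of single-item dicts into one flat dict per row,
# e.g. [{0: 4680}, {1: 1185}] becomes {0: 4680, 1: 1185}.
rows = [{key: value for cell in cells for key, value in cell.items()}
        for cells in bool_dict.values()]

# A list of flat dicts yields one row per dict and one column per key.
df = pd.DataFrame(rows, index=list(bool_dict.keys())).sort_index()
print(df)
```

Because the flattening happens in plain Python before the frame is built, the cells never hold dictionaries in the first place, which avoids the per-cell lookup that raised the KeyError.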
Another option is to apply the str accessor on the columns, using the fact that the column names and the keys match for each column.
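The code for this option is also missing from the page. A hedged sketch of how the str accessor could be used this way is shown below. It assumes the frame is built without explicit column names, so the columns are labelled with the integers 0 and 1 (the question's string labels '0' and '1' would not match the integer keys); the names raw and result are illustrative.

```python
import pandas as pd

bool_dict = {0: [{0: 4680}, {1: 1185}],
             1: [{0: 172}, {1: 9}],
             2: [{0: 149}, {1: 1282}],
             3: [{0: 20}, {1: 127}],
             4: [{0: 0}, {1: 0}]}

# Without explicit column names the columns are labelled 0 and 1,
# which are exactly the keys used inside each column's dictionaries.
raw = pd.DataFrame(list(bool_dict.values()),
                   index=list(bool_dict.keys())).sort_index()

# col.name is the column label; .str[label] looks that key up in every dict.
result = raw.apply(lambda col: col.str[col.name]).astype(int)
print(result)
```

A cell whose dictionary lacks the expected key comes back as a missing value, in which case the final astype(int) would fail and should be dropped or replaced with explicit handling.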
You can iterate through each row with the first lambda, iterate through each cell in that row with the second lambda, and read the value of each dictionary.
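The code that accompanied this answer is not present on the page. The nested-lambda approach might look roughly like the sketch below, which assumes the frame test is built as in the question (with string column labels '0' and '1') and that every cell holds a single-item dictionary; the name result is illustrative.

```python
import pandas as pd

bool_dict = {0: [{0: 4680}, {1: 1185}],
             1: [{0: 172}, {1: 9}],
             2: [{0: 149}, {1: 1282}],
             3: [{0: 20}, {1: 127}],
             4: [{0: 0}, {1: 0}]}

test = pd.DataFrame(list(bool_dict.values()), columns=['0', '1'],
                    index=list(bool_dict.keys())).sort_index()

# Outer lambda: receives one row at a time (axis=1).
# Inner lambda: receives one cell (a single-item dict) and returns its only value.
result = test.apply(
    lambda row: row.apply(lambda cell: next(iter(cell.values()))),
    axis=1,
)
print(result)
```

Reading the value with next(iter(cell.values())) does not depend on knowing the key, so it works regardless of how the columns are labelled. It is a row-by-row Python loop, though, so it is likely to be slow on millions of rows.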