如何解析多层嵌套 (5/6) JSON 对象并将其转换为数据帧?
问题:我有一个多层嵌套的 JSON 文件,需要解析它并将其转换为 pandas 数据帧,其中每个字段都是一列。我采取了两种方法:
- 将原始文件转换为数据字典
- 将原始文件转换为 JSON 对象
对于我尝试过的数据字典:
df = pd.json_normalize(data)
- 这给我留下了一个解析并加载到
field12
的 DF(请参阅下面的虚拟 JSON),其余数据将加载到该单个单元格中。还尝试添加max_level=
,无论使用什么数字,我都会得到相同的结果
- 这给我留下了一个解析并加载到
我尝试过
dt = dt.explode('field12')
< /p>- 这使得第三个嵌套中的一列标题没有数据,其余数据丢失
For我尝试过的 JSON 对象:
pd.read_json(json_object)
- 这给我留下了一个 4x9 的表格,其中第一列是第二个嵌套中的标题,各列是第一个嵌套中的标题,其余数据再次存储在单个单元格中
请帮忙,对这个非常迷失!
下面是一个虚拟 JSON 对象,它是我正在使用的文件的精确结构副本:
{
"field1":"dummyString",
"field2": null,
"field3":" dummyString",
"field4":{
"field5":" dummyString",
"field6":" dummyString",
"field7":" dummyString",
"field8":" dummyString",
"field9":"dummyNumber",
"field10":null,
"field11":" dummyString",
"field12":[
{
"field13":" dummyString",
"field14":" dummyString",
"field15":{
"field16":" dummyString",
"field17":" dummyString",
"field18":" dummyString"
},
"field19":" dummyString",
"field20":" dummyString",
"field21":" dummyString",
"field22":" dummyString",
"field23":" dummyString",
"field24": "dummyNumber",
"field25": "dummyNumber",
"field26":null,
"field27":" dummyString",
"field28":" dummyString",
"field29":" dummyString",
"field30": "dummyBoolean",
"field31": "dummyNumber",
"field32":" dummyString",
"field33":null,
"field34":[
{
"field35":"dummyString",
"field36":null,
"field37":{
"field38":"dummyString",
"field39":"dummyString",
"field40":" dummyString",
"field41":" dummyString",
"field42":" dummyString",
"field43":" dummyString",
"field44":null,
"field45":" dummyString",
"field46":" dummyString"
},
"field47":null,
"field48":null,
"field49":null,
"field50":null,
"field51": "dummyNumber",
"field52":"dummyString",
"field53":"dummyString",
"field54":"dummyBoolean",
"field55": "dummyBoolean",
"field56":"dummyString",
"field57":null,
"field58":null
},{
"field59":{
"field60":null,
"field61":null,
"field62":null
}
}
]
}
],
"field63":"dummyStringl"
}
}
Problem: I have a multi nested JSON file that I need to parse and convert to a pandas dataframe where every field is a column. I've taken 2 approaches:
- Convert raw file to data dictionary
- Convert raw file to JSON object
For data dictionary I've tried:
df = pd.json_normalize(data)
- This leaves me with a DF that parses and loads up to
field12
(see dummy JSON below), the rest of the data is loaded into that single cell. Have also tried addingmax_level=
and I get the same result no matter what number I use
- This leaves me with a DF that parses and loads up to
I've tried
dt = dt.explode('field12')
- This leaves me with a single column of the headers in the 3rd nest with no data and the rest of the data is missing
For JSON object I've tried:
pd.read_json(json_object)
- This leaves me with a 4x9 table with the 1st column being the headers in the 2nd nest and the columns being the headers in the 1st nest, the rest of the data is stored in a single cell again
Please help, very lost with this one!
Below is a dummy JSON object that is an exact structure replica of the file I'm working with:
{
"field1":"dummyString",
"field2": null,
"field3":" dummyString",
"field4":{
"field5":" dummyString",
"field6":" dummyString",
"field7":" dummyString",
"field8":" dummyString",
"field9":"dummyNumber",
"field10":null,
"field11":" dummyString",
"field12":[
{
"field13":" dummyString",
"field14":" dummyString",
"field15":{
"field16":" dummyString",
"field17":" dummyString",
"field18":" dummyString"
},
"field19":" dummyString",
"field20":" dummyString",
"field21":" dummyString",
"field22":" dummyString",
"field23":" dummyString",
"field24": "dummyNumber",
"field25": "dummyNumber",
"field26":null,
"field27":" dummyString",
"field28":" dummyString",
"field29":" dummyString",
"field30": "dummyBoolean",
"field31": "dummyNumber",
"field32":" dummyString",
"field33":null,
"field34":[
{
"field35":"dummyString",
"field36":null,
"field37":{
"field38":"dummyString",
"field39":"dummyString",
"field40":" dummyString",
"field41":" dummyString",
"field42":" dummyString",
"field43":" dummyString",
"field44":null,
"field45":" dummyString",
"field46":" dummyString"
},
"field47":null,
"field48":null,
"field49":null,
"field50":null,
"field51": "dummyNumber",
"field52":"dummyString",
"field53":"dummyString",
"field54":"dummyBoolean",
"field55": "dummyBoolean",
"field56":"dummyString",
"field57":null,
"field58":null
},{
"field59":{
"field60":null,
"field61":null,
"field62":null
}
}
]
}
],
"field63":"dummyStringl"
}
}
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我写了一个函数来扁平化像你这样的字典。
输出:
I wrote a function to flatten a dict like yours.
Output: