Replace a single value in the second level of a pandas DataFrame MultiIndex when the level contains identical values
I have a DataFrame whose columns are a MultiIndex with two levels. Given the following example for the second level:
import pandas as pd

d = {
    "col1": [1, 2, 3, 4],
    "col2": [1, 2, 3, 4],
    "col3": [1, 2, 3, 4],
    "col4": [1, 2, 3, 4],
    "col5": [1, 2, 3, 4],
}
df = pd.DataFrame(data=d)
df.columns = pd.MultiIndex.from_product([df.columns, ["identical"]])
How do I change a single value so that the second level of the index looks like this?
['example', 'identical', 'identical', 'identical', 'identical']
I have tried to do it this way:
updated_columns = list(df.columns.get_level_values(1))
updated_columns[0] = 'example'
df.columns.set_levels(
    updated_columns, level=1, inplace=True, verify_integrity=False
)
My change is ignored in this case.
I have also tried the answer from this topic: pandas MultiIndex with duplicate values in one level
df.columns = pd.MultiIndex.from_tuples(
    df.columns.set_levels(updated_columns, 1, verify_integrity=False).values
)
This was also ignored.
I have also considered using the rename() method. Unfortunately, it only works if you provide the current value of the label being renamed. Since the values are identical, it cannot single out one of them.
For a regular (non-MultiIndex) index there is this method:
df.columns.values[0] = 'example'
But from what I have gathered, it will not work for a MultiIndex.
I added verify_integrity=False because the method would not otherwise allow me to set identical values.
Any help would be appreciated.
Comments (1)
One way would be to get the tuples that make up the MultiIndex, modify them directly, and build a new MultiIndex from the result.
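A minimal sketch of the tuple approach, assuming the `df` from the question (rebuilt here with a dict comprehension so the snippet runs on its own). It is one possible way to write it, not necessarily the exact code the answer had in mind:

```python
import pandas as pd

# Rebuild the question's DataFrame: five columns, all sharing one second-level label.
d = {f"col{i}": [1, 2, 3, 4] for i in range(1, 6)}
df = pd.DataFrame(data=d)
df.columns = pd.MultiIndex.from_product([df.columns, ["identical"]])

# A MultiIndex converts to a plain list of (level0, level1) tuples.
tuples = df.columns.tolist()

# Replace only the second-level label of the first column.
tuples[0] = (tuples[0][0], "example")

# Build a fresh MultiIndex from the edited tuples and assign it back.
df.columns = pd.MultiIndex.from_tuples(tuples, names=list(df.columns.names))

print(df.columns.get_level_values(1).tolist())
# ['example', 'identical', 'identical', 'identical', 'identical']
```

Assigning a new MultiIndex to `df.columns` avoids mutating the existing index in place, which is why this works where the in-place attempts did not.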
MultiIndexes are a bit weird. They're stored as a list of levels (which contain the unique label values) and codes (which are integer positions into those levels). For example, the second level of your current MultiIndex holds a single label, and all five of its codes point at it.
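To see that layout for the question's columns, the `levels` and `codes` attributes can be inspected directly. This sketch builds the same MultiIndex without the DataFrame; the variable names are only illustrative:

```python
import pandas as pd

cols = pd.MultiIndex.from_product(
    [["col1", "col2", "col3", "col4", "col5"], ["identical"]]
)

# Each level stores only its unique labels.
level_labels = [level.tolist() for level in cols.levels]
# Each code array stores, per column, the position of its label in that level.
level_codes = [codes.tolist() for codes in cols.codes]

print(level_labels)
# [['col1', 'col2', 'col3', 'col4', 'col5'], ['identical']]
print(level_codes)
# [[0, 1, 2, 3, 4], [0, 0, 0, 0, 0]]
```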
There's only one 'identical' string in the second level. Its repetitions are defined by the codes. So if you wanted to set the first label of the second level to 'example' by manipulating the levels and codes, you would add 'example' to that level's labels and point the first code at it.

The same manipulation can be wrapped in a function that sets all the labels of a particular level of a MultiIndex.
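A minimal sketch of such a function, working on levels and codes. The name `set_level_labels` and its signature are hypothetical, and `level` is assumed to be an integer position rather than a level name:

```python
import pandas as pd


def set_level_labels(mi, level, labels):
    """Return a new MultiIndex with every label of `level` replaced by `labels`.

    Hypothetical helper: `level` is an integer position, and `labels`
    must supply one label per entry of `mi`.
    """
    if len(labels) != len(mi):
        raise ValueError("labels must have the same length as the MultiIndex")
    # factorize() splits the labels into integer codes plus their unique values,
    # which mirrors the levels/codes layout a MultiIndex uses internally.
    codes, uniques = pd.Index(labels).factorize()
    new_levels = list(mi.levels)
    new_codes = list(mi.codes)
    new_levels[level] = uniques
    new_codes[level] = codes
    return pd.MultiIndex(levels=new_levels, codes=new_codes, names=mi.names)


d = {f"col{i}": [1, 2, 3, 4] for i in range(1, 6)}
df = pd.DataFrame(data=d)
df.columns = pd.MultiIndex.from_product([df.columns, ["identical"]])

df.columns = set_level_labels(
    df.columns, 1, ["example", "identical", "identical", "identical", "identical"]
)
print(df.columns.get_level_values(1).tolist())
# ['example', 'identical', 'identical', 'identical', 'identical']
```

Building a new MultiIndex from explicit levels and codes sidesteps the problem in the question, where passing repeated labels to `set_levels` is rejected because a level is expected to hold unique values.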
Another function, which uses the previous one, can set a particular label of a particular level of a MultiIndex (which is your question).
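A sketch of that second function follows. Both names (`set_level_labels`, `set_label`) and their parameters are hypothetical, and a compact copy of the level-setting helper is repeated so the snippet runs on its own. `level` and `position` are assumed to be integer positions:

```python
import pandas as pd


def set_level_labels(mi, level, labels):
    """Hypothetical helper: replace all labels of `level` (integer position)."""
    codes, uniques = pd.Index(labels).factorize()
    levels, all_codes = list(mi.levels), list(mi.codes)
    levels[level], all_codes[level] = uniques, codes
    return pd.MultiIndex(levels=levels, codes=all_codes, names=mi.names)


def set_label(mi, level, position, label):
    """Return a new MultiIndex with the label at `position` in `level` changed."""
    # Expand the level to one label per entry, edit a single slot, then rebuild.
    labels = mi.get_level_values(level).tolist()
    labels[position] = label
    return set_level_labels(mi, level, labels)


d = {f"col{i}": [1, 2, 3, 4] for i in range(1, 6)}
df = pd.DataFrame(data=d)
df.columns = pd.MultiIndex.from_product([df.columns, ["identical"]])

df.columns = set_label(df.columns, level=1, position=0, label="example")
print(df.columns.get_level_values(1).tolist())
# ['example', 'identical', 'identical', 'identical', 'identical']
```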