Python: strip column names

Posted on 2025-01-27 18:12:12

I have a DataFrame with columns that look like this:

import pandas as pd

df = pd.DataFrame(columns=['(NYSE_close, close)', '(NYSE_close, open)', '(NYSE_close, volume)',
                           '(NASDAQ_close, close)', '(NASDAQ_close, open)', '(NASDAQ_close, volume)'])

df:
(NYSE_close, close) (NYSE_close, open) (NYSE_close, volume) (NASDAQ_close, close) (NASDAQ_close, open) (NASDAQ_close, volume)

I want to remove everything after the underscore and append whatever comes after the comma to get the following:

df:
NYSE_close  NYSE_open  NYSE_volume  NASDAQ_close  NASDAQ_open  NASDAQ_volume

I tried to strip the column names, but they were replaced with NaN. Any suggestions on how to do this?

Thank you in advance.

Comments (3)

不念旧人 2025-02-03 18:12:12

You could use re.sub to capture the parts of each column name you want to keep and build the new name from them:

import re
import pandas as pd

df = pd.DataFrame(columns=['(NYSE_close, close)','(NYSE_close, open)','(NYSE_close, volume)', '(NASDAQ_close, close)','(NASDAQ_close, open)','(NASDAQ_close, volume)'])
# Group 1 is the prefix up to and including the underscore; group 2 is the word after the comma.
df.columns = [re.sub(r'\(([^_]+_)\w+, (\w+)\)', r'\1\2', c) for c in df.columns]

Output:

Empty DataFrame
Columns: [NYSE_close, NYSE_open, NYSE_volume, NASDAQ_close, NASDAQ_open, NASDAQ_volume]
Index: []
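
To see what the pattern captures, here is a minimal sketch that applies the same regular expression to plain strings, written with named groups. The helper name `rename_column` and the group names `prefix` and `field` are my own and not part of the answer above. Unlike re.sub, it uses fullmatch, so a name that does not have the expected shape is returned unchanged.

```python
import re

# Same pattern as in the answer, with named groups for readability.
# `prefix` and `field` are illustrative names, not anything pandas requires.
PATTERN = re.compile(r'\((?P<prefix>[^_]+_)\w+, (?P<field>\w+)\)')

def rename_column(name):
    """Turn '(NYSE_close, open)' into 'NYSE_open'; leave other names as-is."""
    match = PATTERN.fullmatch(name)
    if match is None:
        return name
    return match.group('prefix') + match.group('field')

columns = ['(NYSE_close, close)', '(NYSE_close, open)', '(NASDAQ_close, volume)', 'date']
print([rename_column(c) for c in columns])
# ['NYSE_close', 'NYSE_open', 'NASDAQ_volume', 'date']
```

A function like this could also be passed to df.rename(columns=rename_column), which returns a renamed copy of the DataFrame.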
夜雨飘雪 2025-02-03 18:12:12

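You could pass a function to rename:

import re

def cvt_col(x):
    # Replace the punctuation with spaces, then split: '(NYSE_close, open)' -> ['NYSE', 'close', 'open']
    s = re.sub('[()_,]', ' ', x).split()
    return s[0] + '_' + s[2]

# rename returns a new DataFrame, so assign the result back
df = df.rename(columns=cvt_col)
df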

Empty DataFrame
Columns: [NYSE_close, NYSE_open, NYSE_volume, NASDAQ_close, NASDAQ_open, NASDAQ_volume]
Index: []
柠檬色的秋千 2025-02-03 18:12:12

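Use two list comprehensions:

# Iterating over a DataFrame yields its column names
step1 = [ent.strip('()').split(',') for ent in df]

# Keep the part before the underscore and append the stripped part after the comma
df.columns = ["_".join([left.split('_')[0], right.strip()])
              for left, right in step1]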

df

Empty DataFrame
Columns: [NYSE_close, NYSE_open, NYSE_volume, NASDAQ_close, NASDAQ_open, NASDAQ_volume]
Index: []
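
The intermediate value may be easier to follow on a plain list. This is a small sketch of the same two steps without pandas; the variable names `columns` and `renamed` are mine, and the list is a shortened version of the one in the question.

```python
columns = ['(NYSE_close, close)', '(NYSE_close, open)', '(NASDAQ_close, volume)']

# Step 1: drop the surrounding parentheses and split on the comma.
step1 = [name.strip('()').split(',') for name in columns]
# step1 == [['NYSE_close', ' close'], ['NYSE_close', ' open'], ['NASDAQ_close', ' volume']]

# Step 2: keep the part before the underscore and append the right-hand part,
# stripped of the leading space left over from the split.
renamed = ['_'.join([left.split('_')[0], right.strip()]) for left, right in step1]
print(renamed)
# ['NYSE_close', 'NYSE_open', 'NASDAQ_volume']
```

Note that the split leaves a leading space on the second element, which is why the second step calls strip() on it.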