Pandas: count string pattern occurrences and append to MultiIndex columns

Posted on 2025-01-19 12:51:44

I have this DataFrame and want to count how many times a pattern occurs in each cell, then append the counts as a new column. The pattern I'm interested in is "MV=??", i.e. MV= followed by a value, such as MV=5455.

import pandas as pd

d = [{'AX':['Rec(POS=4,,REF=FF,, MV=55), Rec(POS=2,, REF=GH,, MV=23)'], 'AVF1':[], 'HI':['Rec(POS=2,,REF=RTD,, MV=23), Rec(POS=234,, REF=FFRE,, MV=00)'],'AV1':[], 'version_1':[]},
     {'AX':[], 'AVF1':['Rec(POS=43,,REF=FeF,, MV=5455), Rec(POS=2,, REF=GH,, MV=23), Rec(POS=231,, REF=JK, MV=TR)'], 'HI':[],'AV1':[], 'version_2':[]},
     {'AX':['Rec(POS=2342,,REF=FhF,, MV=1)'], 'AVF1':['Rec(POS=11,,REF=FF11,, MV=551)'], 'HI':[],'AV1':[], 'version_3':[]}]

frame = pd.DataFrame(d)

# Transpose so that AX, AVF1, ... become the row index
f = frame.T

# The last three index labels (version_1 .. version_3) become the top column level
lst = list(f.index[-3:])

f.columns = [lst, f.columns]
f

# Add an empty ALT column under each version
ALTS = pd.DataFrame(index=f.index, columns=pd.MultiIndex.from_product([f.columns.levels[0], ['ALT']]))

f = pd.concat([f,ALTS], axis=1).sort_index(level=0, axis=1)
# Drop the three version_* rows, which only carried the labels
f = f.drop(f.index[[-1,-2,-3]])

f

Desired output:
For example, row AX has two MV matches in column 0 (version_1) and one MV match in column 2 (version_3), and so on.

           version_1          version_2      version_3
           ALT                ALT            ALT

AX         2                  NaN            1
AVF1       NaN                3              1
HI         2                  NaN            NaN
AV1        NaN                NaN            NaN

The larger DataFrame I am working on has more columns, but my internet connection is pretty bad, so I can't upload the whole thing.

I was thinking of using something like the line below, but my columns are a MultiIndex:

f['ALT'] = f.0.str.extract('MV=??').count()
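
A side note on the line above: `f.0` is not valid Python (a column labelled 0 has to be selected with `f[0]`, or with a tuple for MultiIndex columns), `str.extract` needs a pattern with at least one capture group, and it returns only the first match per string. `str.count` returns the number of matches instead. A minimal sketch of the difference on a flat Series; the Series `s` and the regex `MV=\w+` (one reading of "MV=??") are assumptions made for illustration:

```python
import pandas as pd

# Made-up flat Series of record strings, plus a missing value
s = pd.Series([
    "Rec(POS=4,,REF=FF,, MV=55), Rec(POS=2,, REF=GH,, MV=23)",
    "Rec(POS=2342,,REF=FhF,, MV=1)",
    None,
])

# extract: first match only; one capture group gives a one-column DataFrame
first_match = s.str.extract(r"MV=(\w+)")[0]

# count: number of non-overlapping matches per string (missing stays missing)
n_matches = s.str.count(r"MV=\w+")

print(first_match.tolist())
print(n_matches.tolist())
```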


Comments (1)

你如我软肋 2025-01-26 12:51:44

Try with apply and str.count:

output = f.apply(lambda x: x.str[0].str.count("MV=")).dropna(how="all", axis=1)
output = output.rename(columns={c[1]: "ALT" for c in output.columns},level=1)

     version_1 version_2 version_3
           ALT       ALT       ALT
AX         2.0       NaN       1.0
AVF1       NaN       3.0       1.0
HI         2.0       NaN       NaN
AV1        NaN       NaN       NaN
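
The snippet above only looks at the first string in each list (`x.str[0]`) and returns floats. If a cell can hold several strings, or the counts should be integers as in the desired output, a variant along these lines might work. This is a minimal sketch rather than part of the original answer: the helper `count_mv`, the regex `MV=\w+`, and the way the sample frame is rebuilt here are all assumptions made for illustration.

```python
import re
import pandas as pd

# Assumed reading of "MV=??": the literal "MV=" followed by word characters
MV_PATTERN = re.compile(r"MV=\w+")


def count_mv(cell):
    """Count MV=... matches over every string in a list cell (hypothetical helper)."""
    if not isinstance(cell, list):
        return float("nan")
    total = sum(len(MV_PATTERN.findall(text)) for text in cell)
    # Report "no matches" as missing, as in the desired output
    return total if total else float("nan")


rows = ["AX", "AVF1", "HI", "AV1"]
f = pd.DataFrame({
    0: pd.Series([
        ["Rec(POS=4,,REF=FF,, MV=55), Rec(POS=2,, REF=GH,, MV=23)"],
        [],
        ["Rec(POS=2,,REF=RTD,, MV=23), Rec(POS=234,, REF=FFRE,, MV=00)"],
        [],
    ], index=rows),
    1: pd.Series([
        [],
        ["Rec(POS=43,,REF=FeF,, MV=5455), Rec(POS=2,, REF=GH,, MV=23), "
         "Rec(POS=231,, REF=JK, MV=TR)"],
        [],
        [],
    ], index=rows),
    2: pd.Series([
        ["Rec(POS=2342,,REF=FhF,, MV=1)"],
        ["Rec(POS=11,,REF=FF11,, MV=551)"],
        [],
        [],
    ], index=rows),
})
f.columns = pd.MultiIndex.from_arrays(
    [["version_1", "version_2", "version_3"], [0, 1, 2]]
)

# Apply the helper cell by cell, then switch to the nullable integer dtype
counts = f.apply(lambda col: col.map(count_mv)).astype("Int64")

# Relabel the second column level as ALT, keeping the version labels
counts.columns = pd.MultiIndex.from_arrays(
    [counts.columns.get_level_values(0), ["ALT"] * counts.shape[1]]
)
print(counts)
```

The nullable `Int64` dtype keeps whole-number counts next to missing values, which plain `int64` cannot do. Mapping the helper over each column with `Series.map` avoids relying on `DataFrame.applymap` or `DataFrame.map`, whose availability depends on the pandas version.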