Restarting a for loop in a dataframe

Posted 2025-02-12 00:06:10

I have this dataframe:

   index turns conv
0      0  utt1  yes
1      1  utt2  yes
2      2  utt3   no
3      3  utt4  yes
4      0  utt5  yes
5      1  utt6   no
6      2  utt7  yes

I want to print two consecutive elements of the 'turns' column together with the corresponding element of the 'conv' column, but restart the for loop whenever 'index' goes back to 0, so that utt4 and utt5 don't get connected. The code I have is this:

for i in range(len(df['turns'])):
    if i + 1 == len(df['turns']):
        break
    else:
        print(df['turns'][i], df['turns'][i+1], df['conv'][i+1])

But currently it outputs:

utt1 utt2 yes
utt2 utt3 no
utt3 utt4 yes
utt4 utt5 yes
utt5 utt6 no
utt6 utt7 yes

Whereas I need it to output:

utt1 utt2 yes
utt2 utt3 no
utt3 utt4 yes

utt5 utt6 no
utt6 utt7 yes

(The idea is that of a sliding window, but I couldn't figure out how to do that in a simpler way.)

Comments (1)

丢了幸福的猪 2025-02-19 00:06:10

If you just want to print, you could change your loop to:

for i in range(len(df['turns'])-1):
    # The next row starts a new block: print a blank line instead of a pair
    if df.loc[i+1, 'index'] == 0:
        print()
    else:
        print(df['turns'][i], df['turns'][i+1], df['conv'][i+1])

Output:

utt1 utt2 yes
utt2 utt3 no
utt3 utt4 yes

utt5 utt6 no
utt6 utt7 yes
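
If the pairs are needed as data rather than printed straight away, the same sliding window could also be applied block by block. The following is only a minimal sketch, assuming the sample frame from the question (rebuilt here so the snippet runs on its own); `windowed_lines` is a hypothetical helper name, not a pandas function.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'index': [0, 1, 2, 3, 0, 1, 2],
    'turns': ['utt1', 'utt2', 'utt3', 'utt4', 'utt5', 'utt6', 'utt7'],
    'conv': ['yes', 'yes', 'no', 'yes', 'yes', 'no', 'yes'],
})


def windowed_lines(frame):
    """Return the output lines; '' separates consecutive blocks.

    Hypothetical helper for illustration, not part of pandas.
    """
    lines = []
    # A new block starts wherever the 'index' column is 0.
    block_id = frame['index'].eq(0).cumsum()
    for _, block in frame.groupby(block_id):
        if lines:
            lines.append('')  # blank separator between blocks
        turns = block['turns'].tolist()
        conv = block['conv'].tolist()
        # Pair each turn with its successor inside the same block only.
        for prev, cur, label in zip(turns, turns[1:], conv[1:]):
            lines.append(f'{prev} {cur} {label}')
    return lines


print('\n'.join(windowed_lines(df)))
```

Because the window never crosses a block boundary, utt4 and utt5 are not paired, and the approach does not depend on the row labels being a default 0..n-1 range the way `df['turns'][i]` does.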

A vectorized solution, which writes the pairs to out.csv without the blank separator line, would be:

group = df['index'].eq(0).cumsum()
(df
 .assign(turns2=df.groupby(group)['turns'].shift())
 .dropna(subset=['turns2'])
 [['turns2', 'turns', 'conv']]
 .to_csv('out.csv', index=False, header=False, sep=' ')
)

out.csv:

utt1 utt2 yes
utt2 utt3 no
utt3 utt4 yes
utt5 utt6 no
utt6 utt7 yes
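
To see why the grouped shift works, it may help to look at the intermediate values. The sketch below rebuilds the sample frame from the question (an assumption so the snippet is self-contained) and prints the pairs instead of writing a file; the variable names `prev_turn` and `pairs` are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'index': [0, 1, 2, 3, 0, 1, 2],
    'turns': ['utt1', 'utt2', 'utt3', 'utt4', 'utt5', 'utt6', 'utt7'],
    'conv': ['yes', 'yes', 'no', 'yes', 'yes', 'no', 'yes'],
})

# eq(0) is True where a block begins; the running sum turns that into a block id.
group = df['index'].eq(0).cumsum()
print(group.tolist())  # [1, 1, 1, 1, 2, 2, 2]

# Shifting inside each block leaves a missing value on the first row of every
# block, which is exactly the row that has no predecessor to pair with.
prev_turn = df.groupby(group)['turns'].shift()

pairs = (df
         .assign(turns2=prev_turn)
         .dropna(subset=['turns2'])
         [['turns2', 'turns', 'conv']])

# Print one space-separated line per remaining row.
for row in pairs.itertuples(index=False):
    print(*row)
```

Dropping the rows with a missing `turns2` removes utt1 and utt5 as the second element of a pair, so no pair spans two blocks.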