How to drop a pandas column when every record in it mixes letters, digits, or special characters

Posted 2025-01-18 21:51:17 · 493 characters · 1 view · 0 comments

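I have a pandas DataFrame named df with about 2 million records. It has a column named transaction_id, which may contain:

purely alphabetic values for some records (e.g. "abscdwew")
purely numeric values for some records (e.g. "123454")
both alphabetic and numeric values for some records (e.g. "asd12354")
alphabetic, numeric, and special characters for some records (e.g. "asd435_!")
special characters only for some records (e.g. "_-!")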

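I want to drop the column if all of its values (i.e. across all records) contain:

a combination of alphabetic and numeric characters (e.g. "aseder345")
a combination of alphabetic and special characters (e.g. "asedre_!")
a combination of numeric and special characters (e.g. "123_!")
special characters only (e.g. "_!")

Is there a Pythonic way to do this?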

So, if a column contains, across all…


思慕 2025-01-25 21:51:18

Given the following toy dataframe, in which col1 should be removed and col2 should be kept according to your criteria:

import pandas as pd

df = pd.DataFrame(
    {
        "col1": [
            "abs@&wew",
            "123!45!4",
            "asd12354",
            "asdfzf_!",
            "123_!",
            "asd435_!",
            "_-!",
        ],
        "col2": [
            "abscdwew",
            "123454",
            "asd12354",
            "a_!sdfzf",
            "123_!",
            "asd435_!",
            "_-!",
        ],
    }
)

Here is one way to do it:

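# A value is "pure" if it consists only of letters or only of digits.
def is_pure(value):
    return value.isalpha() or value.isdigit()

# Keep a column only if at least one of its values is pure;
# a column in which every value is mixed or special-only is dropped.
cols_to_keep = df.apply(lambda col: any(is_pure(value) for value in col))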

df = df.loc[:, cols_to_keep]

print(df)
# Output
       col2
0  abscdwew
1    123454
2  asd12354
3  a_!sdfzf
4     123_!
5  asd435_!
6       _-!
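
With around 2 million records, a per-value Python loop may be slow, so a vectorized check could be worth trying. The following is only a sketch under a few assumptions that are not in the original answer: the columns being tested hold strings, "alphabetic" and "numeric" mean ASCII letters and ASCII digits (unlike `str.isalpha()`, which is Unicode-aware), and missing values count as not pure. The names `PURE` and `drop_mixed_columns` are hypothetical, chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "col1": ["abs@&wew", "123!45!4", "asd12354", "asdfzf_!", "123_!", "asd435_!", "_-!"],
        "col2": ["abscdwew", "123454", "asd12354", "a_!sdfzf", "123_!", "asd435_!", "_-!"],
    }
)

# Assumption: a "pure" value is ASCII letters only or ASCII digits only.
# The non-capturing group keeps the alternation intact when the pattern is anchored.
PURE = r"(?:[A-Za-z]+|[0-9]+)"


def drop_mixed_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Keep only the columns that contain at least one pure value."""
    keep = [
        col
        for col in frame.columns
        # na=False treats missing or non-string entries as "not pure".
        if frame[col].str.fullmatch(PURE, na=False).any()
    ]
    return frame[keep]


result = drop_mixed_columns(df)
print(list(result.columns))
# ['col2']
```

If only transaction_id needs checking, the same `str.fullmatch(...).any()` test can be applied to that single column and followed by `df.drop(columns="transaction_id")` when it returns False.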
