Python convert Counter to DataFrame columns
I haven't been able to find an answer here specific to my issue and I'm wondering if I could get some help (apologies for the links, I'm not allowed to embed images yet).
I have stored Counter objects within my DataFrame and also want them added to the DataFrame as a column for each counted element.
Beginning data
import re
from collections import Counter

import pandas as pd

data = {
    "words": ["ABC", "BCDB", "CDE", "F"],
    "stuff": ["abc", "bcda", "cde", "f"]
}
df = pd.DataFrame(data)

patternData = {
    "name": ["A", "B", "C", "D", "E", "F"],
    "rex": ["A{1}", "B{1}", "C{1}", "D{1}", "E{1}", "F{1}"]
}
patterns = pd.DataFrame(patternData)

def countFound(ps):
    # Count how many times each named pattern matches the string ps.
    result = Counter()
    for index, row in patterns.iterrows():
        findName = row['name']
        findRex = row['rex']
        found = re.findall(findRex, ps)
        if len(found) > 0:
            result.update({findName: len(found)})
    return result

# One Counter per row, stored in a new column.
df['found'] = df['words'].apply(countFound)
Desired result
words | stuff | found | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|
ABC | abc | {'A': 1, 'B': 1, 'C': 1} | 1 | 1 | 1 | 0 | 0 | 0 |
BCDB | bcda | {'B': 2, 'C': 1, 'D': 1} | 0 | 2 | 1 | 1 | 0 | 0 |
CDE | cde | {'C': 1, 'D': 1, 'E': 1} | 0 | 0 | 1 | 1 | 1 | 0 |
F | f | {'F': 1} | 0 | 0 | 0 | 0 | 0 | 1 |
Comments (2)
You can use json_normalize to expand each Counter into one column per counted element.
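The answer's original snippet did not survive on this page, so the following is only a minimal sketch of the json_normalize approach. It assumes the found column already holds the Counter objects from the question (rebuilt here by hand so the example runs on its own); the names counts and result are illustrative, not taken from the answer.

```python
from collections import Counter

import pandas as pd

# Stand-in for the question's DataFrame after countFound has been applied.
df = pd.DataFrame({
    "words": ["ABC", "BCDB", "CDE", "F"],
    "found": [
        Counter({"A": 1, "B": 1, "C": 1}),
        Counter({"B": 2, "C": 1, "D": 1}),
        Counter({"C": 1, "D": 1, "E": 1}),
        Counter({"F": 1}),
    ],
})

# json_normalize takes a list of dicts; Counters are converted to plain dicts first.
counts = pd.json_normalize([dict(c) for c in df["found"]])
counts.index = df.index                  # align with the original rows
counts = counts.fillna(0).astype(int)    # elements missing from a row count as 0

result = df.join(counts)
print(result)
```

Converting each Counter with dict() is a precaution rather than a requirement of the approach; it keeps the input to json_normalize as plain dictionaries.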
You can also get the columns directly, without your custom function. For this, use a dynamically crafted regex with named capturing groups and str.extractall. A variant without named capturing groups sets up the column names afterwards.
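Both variants described above might look roughly like this. It is a sketch, not the answer's original code: the variable names (named, unnamed, counts, out_named, out_unnamed) are made up for illustration, and the named-group variant assumes every entry in patterns['name'] is a valid regex group name.

```python
import pandas as pd

df = pd.DataFrame({
    "words": ["ABC", "BCDB", "CDE", "F"],
    "stuff": ["abc", "bcda", "cde", "f"],
})
patterns = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E", "F"],
    "rex": ["A{1}", "B{1}", "C{1}", "D{1}", "E{1}", "F{1}"],
})

# Variant 1: one named group per pattern, e.g. (?P<A>A{1})|(?P<B>B{1})|...
named = "|".join(
    f"(?P<{n}>{r})" for n, r in zip(patterns["name"], patterns["rex"])
)
counts = (
    df["words"].str.extractall(named)   # one row per match, one column per group
    .notna()                            # True where that group produced the match
    .groupby(level=0).sum()             # count matches per original row
    .reindex(df.index, fill_value=0)    # rows without any match get zeros
)
out_named = df.join(counts)

# Variant 2: unnamed groups; the columns come back as 0..5 and are renamed afterwards.
unnamed = "|".join(f"({r})" for r in patterns["rex"])
counts2 = df["words"].str.extractall(unnamed).notna().groupby(level=0).sum()
counts2.columns = list(patterns["name"])
counts2 = counts2.reindex(df.index, fill_value=0)
out_unnamed = df.join(counts2)

print(out_named)
```

Because the patterns are combined with alternation, each position in the string is credited to only one pattern. That matches countFound for the single-letter patterns here, but overlapping patterns would be counted differently.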
A Counter behaves a lot like a dictionary. Calling pd.DataFrame on a list of dictionaries will give you the matrix of counted values.
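A minimal sketch of that idea follows. As an assumption for brevity, Counter(word) stands in for the question's countFound, which gives the same counts for these single-letter patterns; counts and result are illustrative names.

```python
from collections import Counter

import pandas as pd

words = ["ABC", "BCDB", "CDE", "F"]
df = pd.DataFrame({
    "words": words,
    # Stand-in for df['words'].apply(countFound) from the question.
    "found": [Counter(w) for w in words],
})

# A list of Counters is treated like a list of dicts: one column per key.
counts = pd.DataFrame(df["found"].tolist(), index=df.index)
counts = counts.fillna(0).astype(int)   # keys absent from a row become 0

result = df.join(counts)
print(result)
```

Passing index=df.index keeps the count matrix aligned with the original rows, so the join also works when df does not have a default integer index.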