Melt columns in Python pandas and add the source file name as a column
I have 100+ XLS files in a folder, each with different column names and data types:
File_1.xls:
Id test category
1 ab 4
2 cs 3
3 cs 1
FILE_2.xls:
index remove stocks category
1 dr 4 a
2 as 3 b
3 ae 1 v
File 3: ....
File 100.....
This is my code:
import pandas as pd
import pathlib
data = []
for filename in pathlib.Path.cwd().iterdir():
    if filename.suffix.lower().startswith('.xls'):
        data.append(pd.read_excel(filename).melt())
df = pd.concat(data, ignore_index=True)
I would like the output DataFrame to look like this:
      filename  variable  value
0     File_1    Id        1
1     File_1    Id        2
2     File_1    Id        3
3     File_1    test      ab
4     File_1    test      cs
5     File_1    test      cs
6     File_1    category  4
7     File_1    category  3
8     File_1    category  1
9     FILE_2    index     1
10    FILE_2    index     2
11    FILE_2    index     3
12    FILE_2    remove    dr
13    FILE_2    remove    as
14    FILE_2    remove    ae
15    FILE_2    stocks    4
16    FILE_2    stocks    3
17    FILE_2    stocks    1
18    FILE_2    category  a
19    FILE_2    category  b
20    FILE_2    category  v
1000  FILE_100  ....      ..
How can I melt all my columns and keep the name of the source file in a "filename" column?
Comments (1)
Don't read your data in the loop but rather collect the filenames, then use a dictionary comprehension to add the filenames as concatenation keys:
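A minimal sketch of this approach, not necessarily the answerer's exact code. The helper names `melt_frames` and `melt_excel_files` are hypothetical, the file stem (`p.stem`) is assumed to be the wanted "filename" value, and the two in-memory frames stand in for the sample files from the question so the idea can be tried without any Excel files on disk:

```python
import pathlib

import pandas as pd


def melt_frames(frames):
    """Melt every DataFrame in a {name: DataFrame} dict and stack the results."""
    # Dictionary comprehension: each key is later used as a concatenation key.
    melted = {name: frame.melt() for name, frame in frames.items()}
    # The dict keys become the outer index level, named 'filename' here.
    out = pd.concat(melted, names=["filename"])
    # Move 'filename' from the index into a column, then renumber the rows.
    return out.reset_index(level="filename").reset_index(drop=True)


def melt_excel_files(folder):
    """Collect the .xls/.xlsx paths in `folder` first, then read and melt them."""
    # Reading legacy .xls files requires an Excel engine such as xlrd.
    files = [p for p in pathlib.Path(folder).iterdir()
             if p.suffix.lower().startswith(".xls")]
    return melt_frames({p.stem: pd.read_excel(p) for p in files})


# In-memory stand-ins for File_1.xls and FILE_2.xls from the question.
file_1 = pd.DataFrame({"Id": [1, 2, 3],
                       "test": ["ab", "cs", "cs"],
                       "category": [4, 3, 1]})
file_2 = pd.DataFrame({"index": [1, 2, 3],
                       "remove": ["dr", "as", "ae"],
                       "stocks": [4, 3, 1],
                       "category": ["a", "b", "v"]})

df = melt_frames({"File_1": file_1, "FILE_2": file_2})
print(df)
```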
output: