How can I get all subfolder names together with the files in each subfolder and write them to an XLSX or CSV file?

Posted 2025-02-07 11:13:49

My problem is a little tricky, so I hope I can explain it clearly.
I have a directory that contains 22 folders.

Folder 1
Folder 2
Folder 3
Folder 4
...
Folder 22

Each folder contains at least 12 images.

Folder 1/
  image1.jpg
  image2.jpg
  image3.jpg
....
  image12.jpg

Folder 2/
  image1.png
  image2.jpg
  image3.jpg
....
  image12.jpg

Folder 3/
  image1.jpg
  image2.png
  image3.jpg
....
  image12.jpg

First, I want one column with each image's name and a second column with the name of the folder that contains the image. The third column should hold the image's file extension, and the last column should hold the level of the folder. Finally, I want to save the result in XLSX or CSV format.

I tried:

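import os

path1 = '/content/drive/MyDrive/imagesFolders/'

# os.walk yields one (dirpath, dirnames, filenames) tuple per directory
subDirectory = list(os.walk(path1))
print(subDirectory)
sd = subDirectory[1][2]  # file names in the first subfolder that was visited
sd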

Output:

[('/content/drive/MyDrive/imagesFolders/', ['Folder1', 'Folder2', 'Folder3', 'Folder4', 'Folder5', 'Folder6', 'Folder7', 'Folder8', 'Folder9', 'Folder10', 'Folder11', 'Folder12', 'Folder13', 'Folder14', 'Folder15', 'Folder16', 'Folder17', 'Folder18', 'Folder19', 'Folder20', 'Folder21', 'Folder22'], []), ('/content/drive/MyDrive/imagesFolders/Folder1', [], ['IMG20220610090323.jpg', 'IMG20220610090325_BURST000_COVER.jpg', 'IMG20220610090325_BURST001.jpg', 'IMG20220610090325_BURST002.jpg', 'IMG20220610090325_BURST009.jpg', 'IMG20220610090325_BURST010.jpg', 'IMG20220610090325_BURST011.jpg', 'IMG20220610090325_BURST012.jpg', 'IMG20220610090331.jpg', 'IMG20220610090325_BURST019.jpg', 'IMG20220610090335_BURST000_COVER.jpg', 'IMG20220610090335_BURST017.jpg']), .............. #continued for 22 folders.

['IMG20220610090323.jpg',
 'IMG20220610090325_BURST000_COVER.jpg',
 'IMG20220610090325_BURST001.jpg',
 'IMG20220610090325_BURST002.jpg',
 'IMG20220610090325_BURST009.jpg',
 'IMG20220610090325_BURST010.jpg',
 'IMG20220610090325_BURST011.jpg',
 'IMG20220610090325_BURST012.jpg',
 'IMG20220610090331.jpg',
 'IMG20220610090325_BURST019.jpg',
 'IMG20220610090335_BURST000_COVER.jpg',
 'IMG20220610090335_BURST017.jpg']

A demo format is shared below:

Demo-File

奶茶白久 2025-02-14 11:13:49

I did all of this in Google Colab!

The solution is shared below:

import os
import xlsxwriter

Set the folder and file name under which the Excel file will be saved

# os.chdir() returns None, so build the full output path instead
output_dir = '/content/drive/MyDrive/Research/excel'
workbook = xlsxwriter.Workbook(os.path.join(output_dir, 'data_path_reader.xlsx'))

By default worksheet names in the spreadsheet will be
Sheet1, Sheet2, etc., but we can also specify a name.

worksheet = workbook.add_worksheet("First _Sheet")

Use the worksheet object to write
data via the write() method.

worksheet.write('A1', 'Image Name')
worksheet.write('B1', 'Folder Name')
worksheet.write('C1', 'Extension')
worksheet.write('D1', 'Label')

Set the path of the main directory

# Keep the path as a string; os.chdir() returns None and cannot be passed to os.walk()
path = '/content/drive/MyDrive/Research/images'
subDirectory = list(os.walk(path))

sd = subDirectory[0][1]      # names of the subfolders directly under path
lengthOFsd = len(sd) + 1     # +1 because the first os.walk entry is path itself

Initialize the row counter (row 0 holds the headers)

row = 1

This loop writes the image name to the first column, the folder name to the
second column, the file extension to the third column, and the label to the
fourth column.

for j in range(lengthOFsd):
    foldername = os.path.basename(subDirectory[j][0])
    # j == 0 is the main directory itself, so the first subfolder gets label 0.
    # os.walk visits subfolders in arbitrary order, and the labels follow that order.
    label = j - 1
    for imgname in subDirectory[j][2]:
        # splitext keeps only the final suffix, even if the name contains other dots
        extension = os.path.splitext(imgname)[1].lstrip('.')
        worksheet.write(row, 0, imgname)
        worksheet.write(row, 1, foldername)
        worksheet.write(row, 2, extension)
        worksheet.write(row, 3, label)
        row += 1

workbook.close()
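
Since the question also allows CSV output, here is a minimal sketch of the same table written with Python's standard csv module instead of xlsxwriter. It assumes the flat layout described in the question (one level of subfolders with the images directly inside them). The helper names collect_rows and write_csv are hypothetical and not part of the answer above, and numbering the labels from 0 in sorted folder-name order is an assumption made for the example.

```python
import csv
import os


def collect_rows(root):
    """Return (image name, folder name, extension, label) tuples for root."""
    rows = []
    # Sort the subfolder names so the labels do not depend on listing order.
    subfolders = sorted(
        entry.name for entry in os.scandir(root) if entry.is_dir()
    )
    for label, folder in enumerate(subfolders):
        folder_path = os.path.join(root, folder)
        for name in sorted(os.listdir(folder_path)):
            if not os.path.isfile(os.path.join(folder_path, name)):
                continue  # skip anything that is not a regular file
            extension = os.path.splitext(name)[1].lstrip('.')
            rows.append((name, folder, extension, label))
    return rows


def write_csv(rows, csv_path):
    """Write the header plus one line per image to csv_path."""
    with open(csv_path, 'w', newline='', encoding='utf-8') as handle:
        writer = csv.writer(handle)
        writer.writerow(['Image Name', 'Folder Name', 'Extension', 'Label'])
        writer.writerows(rows)
```

With the directories from the answer, this could be called as write_csv(collect_rows('/content/drive/MyDrive/Research/images'), '/content/drive/MyDrive/Research/excel/data_path_reader.csv'), where the CSV file name is only an example. Note that the sort is lexicographic, so Folder10 comes before Folder2.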
