How to write a tag deleter script in Python
I want to implement a script that reads files (in folders and subfolders), detects certain tags, and deletes those tags from the files.
The files are .cpp, .h, .txt and .xml, and there are hundreds of them under the same folder.
I have no idea about Python, but people told me that I can do it easily.
EXAMPLE:
My main folder is A: C:\A
Inside A, I have folders (B, C, D) and some files: A.cpp, A.h, A.txt and A.xml. In B, I have folders B1, B2 and B3; some of them have more subfolders, plus .cpp, .xml and .h files...
XML files contain some tags like
<!-- $Mytag: some text$ -->
.h and .cpp files contain another kind of tag, like
//$TAG some text$
.txt files have a different tag format:
#$This is my tag$
A tag always starts and ends with a $ symbol, but it always has a comment character (//,
The idea is to run one script and delete all tags from all files, so the script must:
- Read folders and subfolders
- Open files and find tags
- If they are there, delete them and save the files with the changes
WHAT I HAVE:
import os
for root, dirs, files in os.walk(os.curdir):
    if files.endswith('.cpp'):
        # Find //$ and delete until next $
    if files.endswith('.h'):
        # Find //$ and delete until next $
    if files.endswith('.txt'):
        # Find #$ and delete until next $
    if files.endswith('.xml'):
        # Find <!-- $ and delete until next $ and -->
Comments (2)
The general solution would be to:
- Use the os.walk() function to traverse the directory tree.
- Use fn_name.endswith('.cpp') with if/elif to determine which file you're working with.
- Use the re module to create a regular expression you can use to determine if a line contains your tag.
- Open the source file and a temporary file (see the tempfile module). Iterate over the source file line by line and output the filtered lines to your tempfile.
- Use os.unlink() plus os.rename() to replace your original file.
It's a trivial exercise for a Python adept, but for someone new to the language it'll probably take a few hours to get working. You probably couldn't ask for a better task to get introduced to the language, though. Good luck!
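The filtering and replacement steps above could be sketched roughly as follows, in Python 3. This is only a sketch under some assumptions: the names strip_tags and CPP_TAG are made up for illustration, the regular expression is guessed from the //$TAG some text$ example in the question, the files are assumed to be UTF-8, and os.replace() is used as a single-call stand-in for the os.unlink() plus os.rename() pair mentioned above.

```python
import os
import re
import tempfile

# Assumed tag shape for .cpp/.h files, based on the question's example:
# "//$", then anything except "$" or a newline, then a closing "$".
CPP_TAG = re.compile(r"//\$[^$\n]*\$")


def strip_tags(path, pattern):
    """Rewrite `path` without text matching `pattern`; return True if it changed."""
    changed = False
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the original so the final swap
    # never has to cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        # newline="" keeps the original line endings untouched.
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as dst, \
                open(path, "r", encoding="utf-8", newline="") as src:
            for line in src:
                cleaned = pattern.sub("", line)
                changed = changed or cleaned != line
                dst.write(cleaned)
        if changed:
            # Swap the filtered copy in place of the original.
            os.replace(tmp_path, path)
    finally:
        # Remove the temp file if it was not swapped in.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
    return changed
```

Calling strip_tags('A.cpp', CPP_TAG) would rewrite A.cpp only if a tag was found. Note that the replacement file is created by mkstemp, so it may not keep the original file's permissions; try it on a copy of the folder first.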
----- Update -----
The files value returned by os.walk is a list, so you'll need to iterate over it as well. Also, the files list will only contain the base name of each file. You'll need to use the root value in conjunction with os.path.join() to convert this to a full path name. Try doing just this first. If you're using Python 3, the print statements will need to be function calls, but the general idea remains the same.
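The traversal described in this update might look something like the sketch below, written for Python 3. The helper name iter_source_files and the EXTENSIONS tuple are illustrative names, not part of the original answer; the extensions themselves are the four listed in the question.

```python
import os

# The four extensions named in the question.
EXTENSIONS = (".cpp", ".h", ".txt", ".xml")


def iter_source_files(top):
    """Yield the full path of every matching file under `top`."""
    for root, dirs, files in os.walk(top):
        # `files` is a list of base names, so loop over it.
        for name in files:
            # str.endswith accepts a tuple of suffixes.
            if name.endswith(EXTENSIONS):
                # Join with `root` to get a usable path.
                yield os.path.join(root, name)
```

To see what it finds before changing anything, loop over iter_source_files(os.curdir) and call print(path) on each result.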
Try something like this:
A nice thing about os.path.splitext is that it does the right thing for filenames that start with a '.': it treats the leading dot as part of the name, not as the start of an extension.
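One way to use os.path.splitext here is to look the extension up in a table of patterns, as in the sketch below. The names TAG_PATTERNS, pattern_for and remove_tags are hypothetical, the regular expressions are guesses based on the three tag examples in the question, and lower-casing the extension is an extra assumption (handy on Windows, where A.CPP and A.cpp are the same kind of file).

```python
import os
import re

# Assumed tag shapes, one per extension, based on the question's examples.
TAG_PATTERNS = {
    ".cpp": re.compile(r"//\$[^$\n]*\$"),
    ".h": re.compile(r"//\$[^$\n]*\$"),
    ".txt": re.compile(r"#\$[^$\n]*\$"),
    ".xml": re.compile(r"<!--\s*\$[^$\n]*\$\s*-->"),
}


def pattern_for(filename):
    """Return the tag pattern for this file's extension, or None."""
    ext = os.path.splitext(filename)[1].lower()
    return TAG_PATTERNS.get(ext)


def remove_tags(filename, text):
    """Return `text` with the tags for this file type removed."""
    pattern = pattern_for(filename)
    return text if pattern is None else pattern.sub("", text)
```

A file literally named .cpp has no extension according to os.path.splitext, so pattern_for('.cpp') returns None and the file is left alone, whereas a plain endswith('.cpp') test would match it.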