Overwriting an XML file

Posted on 2024-11-15 13:06:47

I am trying to parse an XML file using elementtree. The file was exported from MySQL, and the export mangles some values: a database entry such as c:cygwin\bin has its '\b' turned into a backspace character. To work around this, I am trying to remove every backspace from the XML file before passing it to elementtree.parse(). For some reason, after the backspaces are removed, the entire file is not written back out.

Here is what I am doing:

def preprocess(file):
    #exporting from MySQL query browser adds a weird
    #character to the result set, remove it
    #so the XML parser can read the data
    print "in preprocess"
    lines = map(lambda line: line.replace("\b", " "), file)

    #go to the beginning of the file
    file.seek(0);

    #overwrite with correct data
    file.writelines(lines)
    sys.exit()


'''Entry into the program'''
#test the file to see if processing is needed before parsing
for line in xml_file:
    p = re.compile("\\b") #search for '\b'
    if(p.match(line)):
        processing = True
        break #only one match needed

if processing:
    preprocess(xml_file)

The result is an XML file with the header cut off, so the parser fails when the file is passed to it.

This is what gets cut out of the XML file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ROOT SYSTEM "diskreport.dtd">
<ROOT>
    <row>
      <field name="buildid">26960</field>
      <field name="cast(status as char)">Filesystem           1K-blocks      Used Available Use% Mounted on
C:cygwinin        285217976  88055920 197162056  31% /usr/bin

Any help/ideas would be awesome,
Thanks

Comments (1)

话少情深 2024-11-22 13:06:47

I figured out the problem: I was using p.match to look for '\b' when I needed p.search. p.match only looks at the beginning of the line, whereas p.search looks for occurrences anywhere in the line.
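
To illustrate the difference, here is a minimal Python 3 sketch. The sample lines and the helper name has_backspace are made up for illustration and are not from the original post. It also shows a detail worth checking in the code here: the pattern string "\\b" is the regex word-boundary assertion, not the backspace character, so the sketch builds its pattern from "\x08" to target the backspace itself.

```python
import re

# Hypothetical sample lines; "\x08" is the backspace character.
dirty_line = "C:cygwin\x08in        285217976  88055920"
clean_line = "<ROOT>"

word_boundary = re.compile("\\b")  # regex \b: zero-width word boundary
backspace = re.compile("\x08")     # a literal backspace character


def has_backspace(line):
    """Return True if the line contains a backspace anywhere."""
    # search() scans the whole line; match() would only try position 0.
    return backspace.search(line) is not None


# match() misses the backspace because it is not at the start of the line.
found_by_match = backspace.match(dirty_line) is not None
found_by_search = has_backspace(dirty_line)

# The word-boundary pattern "finds" something in a line with no backspace.
false_positive = word_boundary.search(clean_line) is not None
```

With these inputs, found_by_match is False while found_by_search is True, and false_positive is True because "<ROOT>" contains a word boundary between "<" and "R".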

Solution:

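import re

def preprocess(file):
    # Exporting from MySQL Query Browser adds a backspace character
    # to the result set; remove it so the XML parser can read the data.
    print("in preprocess")

    # Rewind first: the detection loop below has already consumed part of the file.
    file.seek(0)
    lines = [line.replace("\b", "") for line in file]

    # Go back to the beginning of the file and overwrite with the cleaned data.
    file.seek(0)
    file.writelines(lines)
    # The cleaned data is shorter than the original, so cut off the leftover tail.
    file.truncate()


# Entry into the program.
# Test the file to see if processing is needed before parsing.
# xml_file is assumed to be open for both reading and writing (for example, mode "r+").
processing = False
p = re.compile("\x08")  # literal backspace; the regex "\\b" would be a word boundary
for line in xml_file:
    if p.search(line):  # changed from p.match to p.search
        processing = True
        break  # only one match needed

if processing:
    preprocess(xml_file)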
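
As an alternative to rewriting through a shared file handle, the cleanup can be done by reading the whole file, then reopening it for writing. The following is a rough Python 3 sketch under a few assumptions: the file fits in memory, it is UTF-8 as the XML declaration in the question says, and the function name strip_backspaces is hypothetical. Opening with mode "w" truncates the file, so no stale bytes are left behind when the cleaned text is shorter.

```python
BACKSPACE = "\x08"


def strip_backspaces(path, encoding="utf-8"):
    """Rewrite the file at `path` without backspace characters.

    Returns the number of backspaces removed. The file is only
    rewritten when at least one backspace was found.
    """
    # newline="" keeps the original line endings untouched.
    with open(path, "r", encoding=encoding, newline="") as src:
        text = src.read()

    removed = text.count(BACKSPACE)
    if removed:
        # Mode "w" truncates the file before writing the cleaned text.
        with open(path, "w", encoding=encoding, newline="") as dst:
            dst.write(text.replace(BACKSPACE, ""))
    return removed
```

After a call such as strip_backspaces("report.xml") (the file name is a placeholder), the path can be passed to xml.etree.ElementTree.parse() as usual.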