Python lxml (objectify): XPath troubles

Posted 2024-10-27 10:12:10

I am attempting to parse an XML document, extracting data using lxml objectify and XPath. Here is a snippet of the document:

<?xml version="1.0" encoding="UTF-8"?>
<Assets>
 <asset name="Adham">
  <pos>
   <x>27913.769923</x>
   <y>5174.627773</y>
  </pos>
  <general>
   <description>Ba bla bla</description>
   <bar>(null)</bar>
  </general>
 </asset>
 <asset name="Adrian">
  <pos>
   <x>-179.477707</x>
   <y>5286.959359</y>
  </pos>
  <general>
   <commodities/>
   <description>test test test</description>
   <bar>more bla</bar>
  </general>
 </asset>
</Assets>

I have the following method:

def getALLattributesX(self, _root):
    '''Uses getattributeX and parses through the attribute dict, assigning
    values as it goes. _root is the main document root'''
    for k in self.attrib:
        self.getattributeX(_root, self.attribPaths[k], k)

...which calls this method:

def getattributeX(self, node, x_path, _attrib):
    '''Gets a value from an xml node indicated by an xpath
    and assigns it to the appropriate attribute. If the node does not
    exist, it assigns "error"
    '''

    print(node.xpath(x_path)[0].text)
    try:
        self.attrib[_attrib] = node.xpath(x_path)
    except KeyError:
        self.misload = True
    #except AttributeError:
        #self.attrib[_attrib] = "error loading " + _attrib
        #self.misload = True

The print statement is there for testing. When I execute the first method, it walks through the XML document, successfully stopping at each asset element. I have a dict of variables for it to find, and a complementary dict of paths for it to use, as defined here:

class tAssetList:

    alist = {} #dict of assets
    tlist = []
    tree = None # XML tree
    root = None #root elem

    def readXML(self, _filename):
        #Load file
        fileobject = open(_filename, "r") #read-only
        self.tree = objectify.parse(fileobject)
        self.root = self.tree.getroot()

        for elem in self.root.asset:
            temp_asset = tAsset()
            a_name = elem.get("name") # get name, which is the key for dict
            temp_asset.getALLattributesX(elem)
            self.alist[a_name] = temp_asset


class tAsset(obs.nxObject):
    def __init__(self):
        self.attrib = {"X_pos" : None,  "Y_pos" : None}
        self.attribPaths = {"X_pos" : '/pos/x',  "Y_pos" : '/pos/y'}

However, XPath doesn't seem to work when I call it on the node (which is an objectified XML node). If I assign the result directly, I just get an empty list ([ ]), and if I try [0].text, I get an index out of range error.
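
The symptom can be reproduced outside the classes above. The following is only a minimal sketch: it assumes a trimmed copy of the document with a single asset, and the names XML, result and raised are made up for the illustration.

```python
from lxml import objectify

# Trimmed, hypothetical copy of the document: one asset, position only.
XML = b"""<Assets>
 <asset name="Adham">
  <pos>
   <x>27913.769923</x>
   <y>5174.627773</y>
  </pos>
 </asset>
</Assets>"""

root = objectify.fromstring(XML)
asset = root.asset  # first <asset> element

# Same path as in attribPaths: the query returns an empty list.
result = asset.xpath('/pos/x')
print(result)

# Indexing the empty list is what produces the "index out of range" error.
try:
    result[0].text
    raised = False
except IndexError:
    raised = True
print(raised)
```

Note that the exception here is an IndexError, so the except KeyError clause in getattributeX would not catch it.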

What is going on here?

Comments (1)

烟火散人牵绊 2024-11-03 10:12:10

/pos/x and /pos/y are absolute XPath expressions: the leading slash makes them start at the document root, no matter which node xpath() is called on. They select nothing because the root element of the provided XML document is Assets, not pos.

Try:

pos/x

and

pos/y
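
As a rough illustration of the suggestion, the sketch below evaluates the relative paths against each asset element. It is not the asker's actual class design: the helper first_text, the dict attrib_paths and the trimmed XML constant are hypothetical names introduced for this example. The helper also returns a default when nothing matches, since indexing an empty result raises IndexError rather than KeyError.

```python
from lxml import objectify

# Trimmed, hypothetical copy of the document from the question.
XML = b"""<Assets>
 <asset name="Adham">
  <pos>
   <x>27913.769923</x>
   <y>5174.627773</y>
  </pos>
 </asset>
 <asset name="Adrian">
  <pos>
   <x>-179.477707</x>
   <y>5286.959359</y>
  </pos>
 </asset>
</Assets>"""


def first_text(node, path, default=None):
    """Return the text of the first match of path under node, or default."""
    matches = node.xpath(path)
    return matches[0].text if matches else default


root = objectify.fromstring(XML)

# Relative paths (no leading slash) are evaluated from the context node.
attrib_paths = {"X_pos": "pos/x", "Y_pos": "pos/y"}

assets = {}
for asset in root.asset:  # iterates over all <asset> siblings
    assets[asset.get("name")] = {
        key: first_text(asset, path) for key, path in attrib_paths.items()
    }

print(assets["Adham"]["X_pos"])

# The absolute path from the question still matches nothing,
# so the helper falls back to the default.
print(first_text(root.asset, "/pos/x", default="error"))
```

An absolute path can still be used if it spells out the whole route from the root, for example /Assets/asset/pos/x, but that selects the x element of every asset rather than only the one being processed.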