Trouble getting data out of an XML file

Posted on 2024-09-15 03:24:22

I am trying to parse some information out of Google's geocoding API response, but I am having a little trouble efficiently getting the data out of the XML. See link for example

All I really care about is getting the short_name from the address_component whose type is administrative_area_level_1, and the long_name from the one whose type is administrative_area_level_2. However, in my test program, both XPath queries return no results.

public static void Main(string[] args)
{
    using(WebClient webclient = new WebClient())
    {
        webclient.Proxy = null;
        string locationXml = webclient.DownloadString("http://maps.google.com/maps/api/geocode/xml?address=1600+Amphitheatre+Parkway,+Mountain+View,+CA&sensor=false");
        using(var reader = new StringReader(locationXml))
        {
            var doc = new XPathDocument(reader);
            var nav = doc.CreateNavigator();
            Console.WriteLine(nav.SelectSingleNode("/GeocodeResponse/result/address_component[type=administrative_area_level_1]/short_name").InnerXml);
            Console.WriteLine(nav.SelectSingleNode("/GeocodeResponse/result/address_component[type=administrative_area_level_2]/long_name").InnerXml);

        }
    }
}

Can anyone help me find what I am doing wrong, or recommend a better way?

你丑哭了我 2024-09-22 03:24:22

You need to put the value of the node you're looking for in quotes:

".../address_component[type='administrative_area_level_1']/short_name"
                            ↑                           ↑
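
To see why the quotes matter: inside an XPath predicate, a bare name is a location step, so `type=administrative_area_level_1` compares the `type` children against child elements named `administrative_area_level_1`. No such elements exist, so the predicate is never true. A minimal sketch against an inline sample follows; the XML is a trimmed, hand-written stand-in for the API response, and `XPathQuotesDemo` and `Select` are illustrative names, not part of the original code.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

static class XPathQuotesDemo
{
    // Trimmed, hand-written stand-in for the geocoding response (assumption).
    const string SampleXml = @"<GeocodeResponse>
  <result>
    <address_component>
      <long_name>California</long_name>
      <short_name>CA</short_name>
      <type>administrative_area_level_1</type>
      <type>political</type>
    </address_component>
  </result>
</GeocodeResponse>";

    // Returns the text of the first node the expression selects,
    // or null when nothing matches.
    public static string Select(string xpath)
    {
        using (var reader = new StringReader(SampleXml))
        {
            XPathNavigator nav = new XPathDocument(reader).CreateNavigator();
            XPathNavigator node = nav.SelectSingleNode(xpath);
            return node == null ? null : node.Value;
        }
    }

    public static void Main()
    {
        const string prefix = "/GeocodeResponse/result/address_component";

        // Unquoted: the right-hand side is a child-element step, so nothing matches.
        Console.WriteLine(
            Select(prefix + "[type=administrative_area_level_1]/short_name") ?? "(no match)");

        // Quoted: the right-hand side is a string literal.
        Console.WriteLine(
            Select(prefix + "[type='administrative_area_level_1']/short_name"));
    }
}
```

`SelectSingleNode` returns null when nothing matches, which is why the sketch checks for null before reading the value; calling `.InnerXml` directly on the result, as the question's code does, fails on the unquoted query.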

梦冥 2024-09-22 03:24:22

I'd definitely recommend using LINQ to XML instead of XPathNavigator. It makes XML querying a breeze, in my experience. In this case I'm not sure exactly what's wrong... but I'll come up with a LINQ to XML snippet instead.

using System;
using System.Linq;
using System.Net;
using System.Xml.Linq;

class Test
{
    public static void Main(string[] args)
    {
        using(WebClient webclient = new WebClient())
        {
            webclient.Proxy = null;
            string locationXml = webclient.DownloadString
                ("http://maps.google.com/maps/api/geocode/xml?address=1600"
                 + "+Amphitheatre+Parkway,+Mountain+View,+CA&sensor=false");
            XElement root = XElement.Parse(locationXml);

            XElement result = root.Element("result");
            Console.WriteLine(result.Elements("address_component")
                                    .Where(x => (string) x.Element("type") ==
                                           "administrative_area_level_1")
                                    .Select(x => x.Element("short_name").Value)
                                    .First());
            Console.WriteLine(result.Elements("address_component")
                                    .Where(x => (string) x.Element("type") ==
                                           "administrative_area_level_2")
                                    .Select(x => x.Element("long_name").Value)
                                    .First());
        }
    }
}

Now this is more code¹... but I personally find it easier to get right than XPath, because the compiler is helping me more.

EDIT: I feel it's worth going into a little more detail about why I generally prefer code like this over using XPath, even though it's clearly longer.

When you use XPath within a C# program, you have two different languages - but only one is in control (C#). XPath is relegated to the realm of strings: Visual Studio doesn't give an XPath expression any special handling; it doesn't understand that it's meant to be an XPath expression, so it can't help you. It's not that Visual Studio doesn't know about XPath; as Dimitre points out, it's perfectly capable of spotting errors if you're editing an XSLT file, just not a C# file.

This is the case whenever you have one language embedded within another and the tool is unaware of it. Common examples are:

SQL
Regular expressions
HTML
XPath

When code is presented as data within another language, the secondary language loses a lot of its tooling benefits.

While you can context switch all over the place, pulling out the XPath (or SQL, or regular expressions etc) into their own tooling (possibly within the same actual program, but in a separate file or window), I find this makes for harder-to-read code in the long run. If code were only ever written and never read afterwards, that might be okay - but you do need to be able to read code afterwards, and I personally believe the readability suffers when this happens.

The LINQ to XML version above only ever uses strings for pure data - the names of elements etc - and uses code (method calls) to represent actions such as "find elements with a given name" or "apply this filter". That's more idiomatic C# code, in my view.

Obviously others don't share this viewpoint, but I thought it worth expanding on to show where I'm coming from.

Note that this isn't a hard and fast rule of course... in some cases XPath, regular expressions etc are the best solution. In this case, I'd prefer the LINQ to XML, that's all.

¹ Of course I could have kept each Console.WriteLine call on a single line, but I don't like posting code with horizontal scrollbars on SO. Note that writing the correct XPath version with the same indentation as the above and avoiding scrolling is still pretty nasty:

            Console.WriteLine(nav.SelectSingleNode("/GeocodeResponse/result/" +
                "address_component[type='administrative_area_level_1']" +
                "/short_name").InnerXml);

In general, long lines work a lot better in Visual Studio than they do on Stack Overflow...
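
Since the two queries differ only in the type value and the child element to read, they could be folded into one helper. This is a minimal sketch, assuming the response shape shown in the question; `GeocodeXml.ComponentValue` is a hypothetical name, and the inline XML is a hand-written stand-in rather than real API output.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class GeocodeXml
{
    // Returns the named child (e.g. "short_name") of the first address_component
    // under <result> that has a <type> equal to componentType, or null if none.
    public static string ComponentValue(XElement root, string componentType, string childName)
    {
        return root.Elements("result")
                   .Elements("address_component")
                   .Where(c => c.Elements("type").Any(t => t.Value == componentType))
                   .Select(c => (string) c.Element(childName))
                   .FirstOrDefault();
    }

    public static void Main()
    {
        // Hand-written sample data (assumption), not a recorded API response.
        XElement root = XElement.Parse(@"<GeocodeResponse>
  <result>
    <address_component>
      <long_name>Santa Clara</long_name>
      <short_name>Santa Clara</short_name>
      <type>administrative_area_level_2</type>
      <type>political</type>
    </address_component>
    <address_component>
      <long_name>California</long_name>
      <short_name>CA</short_name>
      <type>administrative_area_level_1</type>
      <type>political</type>
    </address_component>
  </result>
</GeocodeResponse>");

        Console.WriteLine(ComponentValue(root, "administrative_area_level_1", "short_name"));
        Console.WriteLine(ComponentValue(root, "administrative_area_level_2", "long_name"));
    }
}
```

Two choices here differ from the snippet above: the filter checks every `type` child rather than only the first one, and `FirstOrDefault` with a `(string)` cast yields null instead of throwing when a component or child element is missing.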

一指流沙 2024-09-22 03:24:22

I would recommend just typing the XPath expression as part of an XSLT file in Visual Studio. You'll get error messages "as you type" -- this is an excellent XML/XSLT/XPath editor.

For example, I am typing:

<xsl:apply-templates select="@* | node() x"/>

and immediately get in the Error List window the following error:

Error   9   Expected end of the expression, found 'x'.  @* | node()  -->x<--

XSLTFile1.xslt  9   14  Miscellaneous Files

Only when the XPath expression raises no errors (I might also test that it selects the intended nodes) would I put it into my C# code.

This ensures that I will have no XPath -- syntax and semantic -- errors when I run the C# program.
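
A syntax check along the same lines can also be done from C# itself, for example in a unit test. This is a rough sketch, assuming `System.Xml.XPath.XPathExpression.Compile`, which throws `XPathException` for a malformed expression; `XPathCheck.IsValidSyntax` is an illustrative name.

```csharp
using System;
using System.Xml.XPath;

static class XPathCheck
{
    // True when the expression parses as XPath; false when the parser rejects it.
    public static bool IsValidSyntax(string xpath)
    {
        try
        {
            XPathExpression.Compile(xpath);
            return true;
        }
        catch (XPathException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // The malformed expression from the XSLT example above.
        Console.WriteLine(IsValidSyntax("@* | node() x"));

        // The same expression without the stray token.
        Console.WriteLine(IsValidSyntax("@* | node()"));

        // The unquoted predicate from the question.
        Console.WriteLine(IsValidSyntax(
            "/GeocodeResponse/result/address_component[type=administrative_area_level_1]/short_name"));
    }
}
```

A check like this only catches syntax errors. The unquoted predicate from the question is syntactically valid XPath, so confirming that an expression selects the intended nodes still means running it against sample XML.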

方觉久 2024-09-22 03:24:22

dtb's response is accurate. I'd add that you can use an XPath testing tool like the one linked below to help find the correct XPath:

http://www.bit-101.com/xpath/

谜兔 2024-09-22 03:24:22

string url = @"http://maps.google.com/maps/api/geocode/xml?address=1600+Amphitheatre+Parkway,+Mountain+View,+CA&sensor=false";
string value = "administrative_area_level_1";

using(WebClient client = new WebClient())
{
    string wcResult = client.DownloadString(url);

    XDocument xDoc = XDocument.Parse(wcResult);

    var result = xDoc.Descendants("address_component")
                    .Where(p=>p.Descendants("type")
                                .Any(q=>q.Value.Contains(value))
                    );

}

The result is an enumeration of "address_component" elements that have at least one "type" node whose value contains the string you're searching for. For the query above, it yields a single XElement with the following data.

<address_component>
  <long_name>California</long_name>
  <short_name>CA</short_name>
  <type>administrative_area_level_1</type>
  <type>political</type>
</address_component>

I would really recommend spending a little time learning LINQ in general, because it's very useful for manipulating and querying in-memory objects and querying databases, and it tends to be easier than XPath when working with XML. My favorite site to reference is http://www.hookedonlinq.com/
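
The query above stops at the matching elements; reading the value the question asks for takes one more projection. Here is a small sketch using the sample element shown above as input. `ReadMatches.ShortNames` is a hypothetical name, and the filter uses exact equality instead of `Contains`, which is a deliberate swap so that a type value merely containing the search string would not match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class ReadMatches
{
    // Yields the short_name of every address_component that has a <type>
    // exactly equal to the given value.
    public static IEnumerable<string> ShortNames(XDocument doc, string type)
    {
        return doc.Descendants("address_component")
                  .Where(c => c.Elements("type").Any(t => t.Value == type))
                  .Select(c => (string) c.Element("short_name"));
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(@"<address_component>
  <long_name>California</long_name>
  <short_name>CA</short_name>
  <type>administrative_area_level_1</type>
  <type>political</type>
</address_component>");

        foreach (string name in ShortNames(doc, "administrative_area_level_1"))
        {
            Console.WriteLine(name);
        }
    }
}
```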
