How to solve XML parsing performance problems on Android

Posted on 2024-12-01 14:32:26

I have to read an XML file of about 4000 lines on Android. First I tried the SimpleXML library because it's the easiest, and it took about 2 minutes on my HTC Desire. So I thought maybe SimpleXML is so slow because of reflection and all the other magic that this library uses. I rewrote my parser to use the built-in DOM parsing method, paying special attention to performance. That helped a bit, but it still took about 60 seconds, which is still totally unacceptable. After a bit of research I found this article on developer.com. It has some graphs showing that the other two available methods, the SAX parser and Android's XML Pull-Parser, are equally slow. At the end of the article you'll find the following statement:

The first surprise I had was at how slow all three methods were. Users
don't want to wait long for results on mobile phones, so parsing
anything more than a few dozen records may mandate a different method.

What might be a "different method"? What should you do if you have more than "a few dozen records"?

Comments (8)

空心↖ 2024-12-08 14:32:27

Original answer, in 2012

(note: make sure you read the 2016 update below!)

I just did some perf testing comparing parsers on Android (and other platforms). The XML file being parsed is only 500 lines or so (it's a Twitter search Atom feed), but Pull and DOM parsing can only churn through about 5 such documents a second on a Samsung Galaxy S2 or Motorola Xoom2. SimpleXML (pink in the chart), as used by the OP, ties for slowest with DOM parsing.

SAX Parsing is an order of magnitude faster on both of my Android devices, managing 40 docs/sec single-threaded, and 65+/sec multi-threaded.
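For readers who have not used SAX, here is a minimal sketch of what such a parser looks like. It is not the benchmark code; the class name `SaxTitleReader` and the simplified feed layout are assumptions made for illustration. It collects the `<title>` of every `<entry>` without building a tree, which is where SAX saves time compared with DOM.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: collects the title of every entry in an Atom-like feed. */
final class SaxTitleReader extends DefaultHandler {
    private final List<String> titles = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean inEntry;
    private boolean inTitle;

    // Depending on namespace settings, a parser may fill in only one of
    // localName and qName, so fall back from one to the other.
    private static String nameOf(String localName, String qName) {
        return (localName == null || localName.isEmpty()) ? qName : localName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        String name = nameOf(localName, qName);
        if (name.equals("entry")) {
            inEntry = true;
        } else if (inEntry && name.equals("title")) {
            inTitle = true;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one text node, so append.
        if (inTitle) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String name = nameOf(localName, qName);
        if (name.equals("entry")) {
            inEntry = false;
        } else if (inTitle && name.equals("title")) {
            titles.add(text.toString().trim());
            inTitle = false;
        }
    }

    /** Parses the given XML text and returns the entry titles in document order. */
    static List<String> readTitles(String xml) {
        SaxTitleReader handler = new SaxTitleReader();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalArgumentException("Could not parse feed", e);
        }
        return handler.titles;
    }

    public static void main(String[] args) {
        String feed = "<feed><title>Search</title>"
                + "<entry><title>First</title></entry>"
                + "<entry><title>Second</title></entry></feed>";
        System.out.println(readTitles(feed));
    }
}
```

The feed-level `<title>` is skipped because the handler only records titles while it is inside an `<entry>`.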

Android 2.3.4:

performance comparison of xml parsing methods on Android

The code is available on GitHub, and there is a discussion here.

Update 18th March 2016

OK, so it's been almost 4 years and the world has moved on. I finally got around to re-running the tests on:

  1. A Samsung Galaxy S3 running Android 4.1.2
  2. A Nexus7 (2012) running Android 4.4.4
  3. A Nexus5 running Android 6.0.1

Somewhere between Android 4.4.4 and Android 6.0.1 the situation changed drastically and we have a new winner: Pull Parsing FTW at more than twice the throughput of SAX. Unfortunately I don't know exactly when this change arrived as I don't have any devices running Android > 4.4.4 and < 6.0.1.

Android 4.1.2:

performance comparison of xml parsing methods on Android 4.1.2

Android 4.4.4:

performance comparison of xml parsing methods on Android 4.4.4

Android 6.0.1:

performance comparison of xml parsing methods on Android 6.0.1

阳光①夏 2024-12-08 14:32:27

I think the best way to work with XML on Android is to use the VTD-XML library.

My XML file contains more than 60,000 lines, and VTD-XML handles it as follows:

Nexus 5: 2055 milliseconds

Galaxy Note 4: 2498 milliseconds

You can find more benchmark reports at this link: VTD-XML Benchmark

A short example of the XML file:

<database name="products">
    <table name="category">
        <column name="catId">20</column>
        <column name="catName">Fruit</column>
    </table>
    <table name="category">
        <column name="catId">31</column>
        <column name="catName">Vegetables</column>
    </table>
    <table name="category">
        <column name="catId">45</column>
        <column name="catName">Rice</column>
    </table>
    <table name="category">
        <column name="catId">50</column>
        <column name="catName">Potatoes</column>
    </table>
</database>

Configuration of the "build.gradle" file:

dependencies {
    // "compile" was removed in Gradle 7; "implementation" is its replacement
    implementation files('libs/vtd-xml.jar')
}

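Source code example:

import android.util.Log;

import com.ximpleware.AutoPilot;
import com.ximpleware.VTDGen;
import com.ximpleware.VTDNav;


// The calls below throw VTD-XML checked exceptions, so this snippet has to sit
// inside a method that declares or catches them.
String tag = "VtdXmlExample";  // logcat tag; android.util.Log.d takes (tag, message)
String fileName = "products.xml";

VTDGen vg = new VTDGen();

// The second argument enables namespace awareness; parseFile returns false on failure
if (vg.parseFile(fileName, true)) {

    VTDNav vn = vg.getNav();
    AutoPilot table = new AutoPilot(vn);
    table.selectXPath("/database/table");

    // evalXPath() is the iteration call that pairs with selectXPath();
    // it returns -1 once there are no more matches
    while (table.evalXPath() != -1) {
        String tableName = vn.toString(vn.getAttrVal("name"));

        if (tableName.equals("category")) {
            vn.push();  // remember the <table> position before descending
            AutoPilot column = new AutoPilot(vn);
            column.selectElement("column");

            // iterate() visits each <column> element below the current cursor
            while (column.iterate()) {
                String text = vn.toNormalizedString(vn.getText());
                String name = vn.toString(vn.getAttrVal("name"));

                if (name.equals("catId")) {
                    Log.d(tag, "Category ID = " + text);
                } else if (name.equals("catName")) {
                    Log.d(tag, "Category Name = " + text);
                }
            }
            vn.pop();  // restore the cursor for the next evalXPath() call
        }
    }
}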

Result

Category ID = 20
Category Name = Fruit

Category ID = 31
Category Name = Vegetables

Category ID = 45
Category Name = Rice

Category ID = 50
Category Name = Potatoes

It works for me, and I hope it helps you.

一笑百媚生 2024-12-08 14:32:27

Using the SAX parser, I can parse a 15,000-line XML file in around 10 seconds on my HTC Desire. I suspect there is some other issue involved.

Are you populating a database from the XML? If so, are you remembering to wrap your entire parse operation in a DB transaction? That alone can speed things up by an order of magnitude.

别把无礼当个性 2024-12-08 14:32:27

If you are parsing dates within your XML, that can significantly slow down your parsing. With more recent versions of Android this is less of a problem (as they optimised the loading of timezone info).

If dates are being parsed and you don't need them, you could use a SAX parser and ignore any of the date elements.

Or, if you can change your XML schema, consider storing the dates as integers rather than formatted strings.

You mentioned you are making string comparisons; these can be pretty expensive as well. Consider using a HashMap for the strings you are comparing, which can give noticeable performance benefits.
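A minimal sketch of the last two suggestions. The element names (`catId`, `catName`, `created`) and the choice of epoch milliseconds for the timestamp are assumptions for illustration, not taken from the question. The map turns a chain of `equals` calls into a single hash lookup, and the timestamp is read with `Long.parseLong` instead of a date formatter.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper showing tag lookup via HashMap and integer timestamps. */
final class FieldDispatch {
    enum Field { ID, NAME, CREATED, UNKNOWN }

    // Built once; each lookup is one hash probe instead of several string comparisons.
    private static final Map<String, Field> FIELDS = new HashMap<>();
    static {
        FIELDS.put("catId", Field.ID);
        FIELDS.put("catName", Field.NAME);
        FIELDS.put("created", Field.CREATED);
    }

    /** Maps an element name to a field, or UNKNOWN if it is not one we care about. */
    static Field lookup(String tagName) {
        Field field = FIELDS.get(tagName);
        return field != null ? field : Field.UNKNOWN;
    }

    /** Reads a timestamp stored as epoch milliseconds; no date formatter involved. */
    static long parseEpochMillis(String text) {
        return Long.parseLong(text.trim());
    }

    public static void main(String[] args) {
        System.out.println(lookup("catName"));
        System.out.println(lookup("somethingElse"));
        System.out.println(parseEpochMillis(" 1331856000000 "));
    }
}
```

In a SAX or pull parser, the result of `lookup` can drive a `switch`, so elements mapped to `UNKNOWN` are skipped without any further string work.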

霓裳挽歌倾城醉 2024-12-08 14:32:27

It's very hard to tell you why your code is slow without seeing your code, and it's very hard to believe your assertion that the slowness is due to the XML parser when you haven't provided details of any measurements to prove this.

堇色安年 2024-12-08 14:32:27

We're using the pull parser very effectively for 1 MB XML files, and they are read in about 10-20 seconds on my Desire. So if your code is okay, the speed will be as well. It's obvious that DOM is very slow in a limited-memory environment, but pull or SAX really aren't.

情话已封尘 2024-12-08 14:32:27

If you're parsing from a Socket, it's the I/O that's taking the time, not the parsing. Try consuming the data first, then parse once it is loaded, and measure the performance of each step. If the file is too big for that, consider a BufferedInputStream with a very large buffer; this should improve performance for you.

I very seriously doubt Simple XML is going to take 2 minutes to load 4000 lines. I realise a handset is going to be a lot slower than a workstation; however, I can load 200,000 lines of XML in 600 ms on my workstation.
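A rough sketch of measuring the two phases separately. The class name `ReadThenParse`, the 64 KB buffer size and the element-counting handler are assumptions for illustration. In `main`, a `ByteArrayInputStream` stands in for the socket stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: drain the stream first, then parse from memory. */
final class ReadThenParse {

    /** Reads the whole stream into memory so I/O time can be measured on its own. */
    static byte[] readFully(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[64 * 1024];
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Parses the in-memory bytes with SAX and returns the number of elements seen. */
    static int countElements(byte[] xml) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                count[0]++;
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml), handler);
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        // Stand-in for socket.getInputStream()
        InputStream source = new ByteArrayInputStream(
                "<root><item/><item/></root>".getBytes(StandardCharsets.UTF_8));

        long t0 = System.nanoTime();
        byte[] data = readFully(source);
        long t1 = System.nanoTime();
        int elements = countElements(data);
        long t2 = System.nanoTime();

        System.out.println("read:  " + (t1 - t0) / 1_000 + " us");
        System.out.println("parse: " + (t2 - t1) / 1_000 + " us, " + elements + " elements");
    }
}
```

If the read phase dominates, the parser is not the bottleneck and changing parser libraries is unlikely to help much.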

诠释孤独 2024-12-08 14:32:27

Rather than making it a synchronous process, make it asynchronous. You can have a button that starts an IntentService which will process the data for you and will update the results and show a notification when it is done. That way you don't stop the UI thread.
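IntentService is Android-specific, so the sketch below shows the same idea with a plain `java.util.concurrent` executor as a stand-in. This is a substitute for illustration, not the IntentService API. The class name `BackgroundParse` and the callback shape are assumptions. The work runs on a single background thread, and the callback fires when it finishes. On Android, the callback would still need to hand its result back to the UI thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical helper: runs a slow job off the calling thread and reports completion. */
final class BackgroundParse {
    // One worker thread, so jobs run one after another in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Submits the job; onDone is invoked on the worker thread with the result. */
    <T> Future<T> runAsync(Callable<T> job, Consumer<T> onDone) {
        return worker.submit(() -> {
            T result = job.call();
            onDone.accept(result);
            return result;
        });
    }

    /** Lets already submitted jobs finish, then stops the worker thread. */
    void shutdown() {
        worker.shutdown();
    }

    public static void main(String[] args) throws Exception {
        BackgroundParse background = new BackgroundParse();
        Future<Integer> pending = background.runAsync(
                () -> "<a><b/></a>".length(),           // stand-in for a slow parse
                n -> System.out.println("done: " + n)); // stand-in for a notification
        System.out.println("caller is free to do other work here");
        pending.get();  // only so the demo waits before shutting down
        background.shutdown();
    }
}
```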
