Hpple in Objective-C can't find a specific element (XML/HTML parser)

Posted 2024-10-06 13:54:20

For any veterans who haven't tried Hpple: it's great. It uses XPath to search through HTML/XML documents, it gets the job done, and it's easy enough for a newbie like me to understand. However, I'm having trouble.

I have this chunk of HTML:

    <ul class="challengesList dailyChallengesList">

<li>
<div class="corner topLeft"></div>
<img id="ctl00_mainContent_dailyChallengesRepeater_ctl00_challengeImage" title="Gunslinger" src="/images/reachstats/challenges/0.png" alt="Gunslinger" style="border-width:0px;">
<div class="info">
<div class="rFloat">
<p id="ctl00_mainContent_dailyChallengesRepeater_ctl00_challengeExpiration" class="timeDisplay dailyExpirationCountdown"><span>0d</span><span>19h</span><span>9m</span><span class="seconds">37s</span></p>
<p>1500cR</p>
</div>
<h5>Gunslinger</h5>
<p class="description">Kill 150 enemies in multiplayer Matchmaking.</p>
<div class="reward">

<div id="ctl00_mainContent_dailyChallengesRepeater_ctl00_progressBox" class="barContainer">
<div id="ctl00_mainContent_dailyChallengesRepeater_ctl00_progressBar" class="bar" style="width:21%;"><span></span></div> 
<p>31/150</p>
</div>
</div>
</div>
<div class="clear"></div>
</li>

<li>
<div class="corner topLeft"></div>
<img id="ctl00_mainContent_dailyChallengesRepeater_ctl01_challengeImage" title="A Great Friend" src="/images/reachstats/challenges/0.png" alt="A Great Friend" style="border-width:0px;">
<div class="info">
<div class="rFloat">
<p id="ctl00_mainContent_dailyChallengesRepeater_ctl01_challengeExpiration" class="timeDisplay dailyExpirationCountdown"><span>0d</span><span>19h</span><span>9m</span><span class="seconds">37s</span></p>
<p>1400cR</p>
</div>
<h5>A Great Friend</h5>
<p class="description">Earn 15 assists today in multiplayer Matchmaking.</p>
<div class="reward">

<div id="ctl00_mainContent_dailyChallengesRepeater_ctl01_progressBox" class="barContainer">
<div id="ctl00_mainContent_dailyChallengesRepeater_ctl01_progressBar" class="bar" style="width:40%;"><span></span></div> 
<p>6/15</p>
</div>
</div>
</div>
<div class="clear"></div>
</li>

<li>
<div class="corner topLeft"></div>
<img id="ctl00_mainContent_dailyChallengesRepeater_ctl02_challengeImage" title="Cannon Fodder" src="/images/reachstats/challenges/2.png" alt="Cannon Fodder" style="border-width:0px;">
<div class="info">
<div class="rFloat">
<p id="ctl00_mainContent_dailyChallengesRepeater_ctl02_challengeExpiration" class="timeDisplay dailyExpirationCountdown"><span>0d</span><span>19h</span><span>9m</span><span class="seconds">37s</span></p>
<p>1000cR</p>
</div>
<h5>Cannon Fodder</h5>
<p class="description">Kill 50 infantry-class foes in the Campaign today.</p>
<div class="reward">

<div id="ctl00_mainContent_dailyChallengesRepeater_ctl02_progressBox" class="barContainer">
<div id="ctl00_mainContent_dailyChallengesRepeater_ctl02_progressBar" class="bar" style="width:0%;"><span></span></div> 
<p>0/50</p>
</div>
</div>
</div>
<div class="clear"></div>
</li>

<li>
<div class="corner topLeft"></div>
<img id="ctl00_mainContent_dailyChallengesRepeater_ctl03_challengeImage" title="Heroic Demon" src="/images/reachstats/challenges/3.png" alt="Heroic Demon" style="border-width:0px;">
<div class="info">
<div class="rFloat">
<p id="ctl00_mainContent_dailyChallengesRepeater_ctl03_challengeExpiration" class="timeDisplay dailyExpirationCountdown"><span>0d</span><span>19h</span><span>9m</span><span class="seconds">37s</span></p>
<p>1500cR</p>
</div>
<h5>Heroic Demon</h5>
<p class="description">Kill 30 Elites in Firefight Matchmaking on Heroic or harder.</p>
<div class="reward">

<div id="ctl00_mainContent_dailyChallengesRepeater_ctl03_progressBox" class="barContainer">
<div id="ctl00_mainContent_dailyChallengesRepeater_ctl03_progressBar" class="bar" style="width:0%;"><span></span></div> 
<p>0/30</p>
</div>
</div>
</div>
<div class="clear"></div>
</li>

</ul>

The nutty part is, I cannot get Hpple to "see" the <div class="reward">. I'm using the following to find it:

NSArray * rawProgress = [doc search:@"//ul[@class='challengesList']"
                                    @"/li/div[@class='info']"
                                    @"/div[@class='reward']/p"];

This always returns an empty array. It's driving me nuts, as the same kind of thing worked for all of the other elements in this project...

Any help would be appreciated :)

EDIT

This works:

NSArray * rawDescriptions = [doc search:@"//ul[@class='challengesList']"
                                        @"/li/div[@class='info']"
                                        @"/p[@class='description']"];

This doesn't:

NSArray * rawProgress = [doc search:@"//ul[@class='challengesList']"
                                    @"/li/div[@class='info']"
                                    @"/div[@class='reward']"
                                    @"/div[@id]//p"];

Furthermore, trying to list the child nodes of rFloat or reward produces a crash :(

Comments (2)

赠佳期 2024-10-13 13:54:20

Your "p" element is not an immediate child of <div class="reward">; it sits one level deeper, inside the nested <div class="barContainer">.

Using the XML you provided, the XPath expression

div[@class='info']/div[@class='reward']//p

will work.
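
The difference between "/p" (direct children only) and "//p" (descendants at any depth) can be checked outside of Hpple. The following is a minimal sketch using Python's standard xml.etree.ElementTree rather than Hpple itself, run against a trimmed, well-formed stand-in for one <li> from the question. The fragment and the variable names are assumptions made for illustration, and the real page is HTML that a strict XML parser may reject.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed stand-in for one <li> from the question.
# Assumption: the real page is HTML and would need an HTML-tolerant parser.
FRAGMENT = """
<ul class="challengesList dailyChallengesList">
  <li>
    <div class="info">
      <p class="description">Kill 150 enemies in multiplayer Matchmaking.</p>
      <div class="reward">
        <div class="barContainer">
          <div class="bar"><span></span></div>
          <p>31/150</p>
        </div>
      </div>
    </div>
  </li>
</ul>
"""

root = ET.fromstring(FRAGMENT)

# "/p" selects only <p> elements that are direct children of div.reward.
direct = root.findall(".//div[@class='info']/div[@class='reward']/p")

# "//p" selects <p> elements at any depth below div.reward.
nested = root.findall(".//div[@class='info']/div[@class='reward']//p")

direct_texts = [p.text for p in direct]
nested_texts = [p.text for p in nested]

print(direct_texts)  # []
print(nested_texts)  # ['31/150']
```

With this fragment the child-axis query comes back empty, which matches the symptom in the question, while the descendant query reaches the progress text. Whether Hpple behaves the same way on the full page is not something this sketch can confirm.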

葬﹪忆之殇 2024-10-13 13:54:20
  • See this SO question for a similar report on problems with Hpple and a list of alternatives.

You may be seeing a bug. According to this page,

It's classified as an experimental
project by the developer, but so far
it's "worked for me"

UPDATE: seems to be kinda broken now.
Anyone got a better solution?

You may want to file a bug report; if the project is still being maintained, the developer may respond with a fix or a workaround. Or you could leave a comment on this page that recommended hpple, and see if that blogger or one of his readers can address the problem or tell you whether hpple is active at all.

You could also see if you can find HyperParser. "It's a simple HTML parser that has API similar to NSXMLParser. Designed specially to parse semi-valid HTML." But it doesn't seem to be there at the link where it used to be.
