Multiple finds with QueryPath in a foreach
I'm using QueryPath and PHP.
This finds the .eventdate okay, but doesn't return anything for .dtstart:
$qp = htmlqp($url);
foreach ($qp->find('table#schedule')->find('tr') as $tr){
echo 'date: ';
echo $tr->find('.eventdate')->text();
echo ' time: ';
echo $tr->find('.dtstart')->text();
echo '<br>';
}
If I swap the two, .dtstart works okay, but .eventdate doesn't return anything. So it seems that find() in QueryPath destroys the element and only returns the value it needs, which makes it impossible to search a single $tr for multiple items while iterating.
Here's example HTML for a TR I'm dealing with:
<tr class="event"><th class="date first" scope="row"><abbr class="eventdate" title="Thursday, February 01, 2011" >02/01</abbr><span class="eventtime" ><abbr class="dtstart" title="2012-02-01T19:00:00" >7:00 PM</abbr><abbr class="dtend" title="2012-02-01T21:00:00" >9:00 PM</abbr></span></th><td class="opponent summary"><ul><li class="first">@ <a class="team" href="/high-schools/ridge-wolves/basketball-winter-11-12/schedule.htm" >Ridge </a> <span class="game-note">*</span></li><li class="location" title="Details: Ridge High School">Details: Ridge High School</li><li class="last"><a class="" href="/local/stats/pregame.aspx?contestid=4255-4c6c-906d&ssid=381d-49f5-9f6d" >Preview Game</a></li></ul></td><td class="result last"><a class="pregame" href="/local/stats/pregame.aspx?contestid=4255-4c6c-906d&ssid=381d-49f5-9f6d">Preview</a></td></tr>
I tried copying the $tr before the first find and replacing it before the second, but that didn't work.
How can I search during each $tr for certain variables?
FYI, beyond .eventdate and .dtstart, I also want the .opponent cell, the href of the opponent's a element, and that a element's anchor text.
Comments (3)
QueryPath maintains its state internally (unlike jQuery) for performance reasons, so branch() is the way to go.
As a modification to the proposed solution, though, I would suggest minimizing the number of find() calls by doing this:
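The snippet that followed this sentence is not preserved here. As a rough sketch of the idea, assuming it meant collapsing the two chained find() calls into a single selector and branching for each lookup inside the row, something like the following might work. The sample markup, the Composer autoload path, and the $rows array are illustrative assumptions, not part of the original answer.

```php
<?php
// Assumption: QueryPath was installed with Composer. With the 2.x tarball
// you would require 'QueryPath/QueryPath.php' instead.
require 'vendor/autoload.php';

// Trimmed-down stand-in for the schedule page in the question.
$html = '<html><body><table id="schedule">'
      . '<tr class="event"><th class="date first" scope="row">'
      . '<abbr class="eventdate" title="Thursday, February 01, 2011">02/01</abbr>'
      . '<span class="eventtime">'
      . '<abbr class="dtstart" title="2012-02-01T19:00:00">7:00 PM</abbr>'
      . '</span></th></tr>'
      . '</table></body></html>';

$rows = array();

// One selector replaces find('table#schedule')->find('tr').
foreach (htmlqp($html)->find('table#schedule tr') as $tr) {
    // Each branch() is a copy of $tr, so $tr itself stays on the row.
    $rows[] = array(
        'date' => $tr->branch()->find('.eventdate')->text(),
        'time' => $tr->branch()->find('.dtstart')->text(),
    );
}

foreach ($rows as $row) {
    echo 'date: ' . $row['date'] . ' time: ' . $row['time'] . '<br>';
}
```

Collecting the values into an array first keeps the lookups separate from the output, which makes it easier to add more fields per row later.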
Finally, any time you do a "destructive" action (like a find()), you can always go back one step using end(). So the above could also be done by calling end() after each find() instead of branching. This is a VERY VERY minor performance improvement, but I prefer the branch() method unless I'm working with documents larger than 1M.
In QueryPath 3.x, which has a whole bunch of new performance enhancements, I am toying with the idea of going with the jQuery way of creating a new object for each function. Unfortunately, this method will use a LOT more memory, so I may not keep it. While branch() takes a little while to learn, it does have its advantages.
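A minimal sketch of the end() approach, under the same assumptions as any QueryPath example here (Composer autoload path, made-up sample markup, hypothetical $rows array). It calls end() on the object returned by find() rather than on $tr directly, so that it should behave the same whether find() changes the object in place or hands back a new one.

```php
<?php
// Assumption: QueryPath was installed with Composer. With the 2.x tarball
// you would require 'QueryPath/QueryPath.php' instead.
require 'vendor/autoload.php';

// Trimmed-down stand-in for the schedule page in the question.
$html = '<html><body><table id="schedule">'
      . '<tr class="event"><th class="date first" scope="row">'
      . '<abbr class="eventdate" title="Thursday, February 01, 2011">02/01</abbr>'
      . '<span class="eventtime">'
      . '<abbr class="dtstart" title="2012-02-01T19:00:00">7:00 PM</abbr>'
      . '</span></th></tr>'
      . '</table></body></html>';

$rows = array();

foreach (htmlqp($html)->find('table#schedule tr') as $tr) {
    $match = $tr->find('.eventdate'); // selection moves from the row to the abbr
    $date  = $match->text();
    $row   = $match->end();           // step back to the row
    $time  = $row->find('.dtstart')->text();

    $rows[] = array('date' => $date, 'time' => $time);
}

foreach ($rows as $row) {
    echo 'date: ' . $row['date'] . ' time: ' . $row['time'] . '<br>';
}
```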
I'm just learning QueryPath myself, but I think you should branch the row object. Otherwise, $tr->find('.eventdate') will take you to the abbr element contained in the row, and each following find() will try to find elements beneath that abbr, resulting in no matches. branch() (see documentation) creates a copy of the QueryPath object, leaving the original object (in this case $tr) intact.
So your code would be:
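The code from this answer is missing here. A minimal sketch of the asker's loop with branch() added before each find() could look like the following. It also pulls out the opponent link the question asks about. The inline markup (a shortened copy of the row from the question), the Composer autoload path, and the $games array are assumptions made for illustration.

```php
<?php
// Assumption: QueryPath was installed with Composer. With the 2.x tarball
// you would require 'QueryPath/QueryPath.php' instead.
require 'vendor/autoload.php';

// Shortened copy of the example row, wrapped in the table#schedule element.
$html = '<html><body><table id="schedule">'
      . '<tr class="event"><th class="date first" scope="row">'
      . '<abbr class="eventdate" title="Thursday, February 01, 2011">02/01</abbr>'
      . '<span class="eventtime">'
      . '<abbr class="dtstart" title="2012-02-01T19:00:00">7:00 PM</abbr>'
      . '</span></th>'
      . '<td class="opponent summary"><ul><li class="first">@ '
      . '<a class="team" href="/high-schools/ridge-wolves/basketball-winter-11-12/schedule.htm">Ridge </a>'
      . '</li></ul></td></tr>'
      . '</table></body></html>';

$games = array();
$qp = htmlqp($html);

foreach ($qp->find('table#schedule')->find('tr') as $tr) {
    // branch() copies $tr, so every lookup starts again from the row.
    $link = $tr->branch()->find('.opponent a.team');

    $games[] = array(
        'date'     => $tr->branch()->find('.eventdate')->text(),
        'time'     => $tr->branch()->find('.dtstart')->text(),
        'opponent' => trim($link->text()),
        'href'     => $link->attr('href'),
    );
}

foreach ($games as $game) {
    echo 'date: ' . $game['date'] . ' time: ' . $game['time']
       . ' opponent: ' . $game['opponent'] . ' (' . $game['href'] . ')<br>';
}
```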
I don't know if this is the preferred way to achieve what you want, but it seems to work.
Yeah, you are right. I actually had this problem today. In jQuery you can just query, query, query with no problems, but in QueryPath a query changes the internal "state" of the object, so a second query is applied against the current state.
So if you want to query multiple "separate" locations in the document, you have to branch before each query:
$q = qp("something.html");
$a = $q->branch()->find("tr");
$b = $q->branch()->find("a");
That seems to work in my code, so I suppose it will work in yours.