HTML parsing on iPhone

Posted on 2024-10-31 04:25:41

I want to parse this site:

http://its.wonju.go.kr/movinginfo2/DetailSub/StopDetail.asp?StopID=1959#

So I tried using TFHpple, like this:

// Fetch the page. -2147481280 is 0x80000940 read as a signed 32-bit integer,
// the NSStringEncoding value that corresponds to EUC-KR.
NSString *htmlWillInsert = [NSString stringWithContentsOfURL:
                            [NSURL URLWithString:@"http://its.wonju.go.kr/movinginfo2/DetailSub/StopDetail.asp?StopID=1959#"]
                                                    encoding:-2147481280
                                                       error:nil];

// Re-encode the decoded string as UTF-16 for the parser.
NSData *htmlData = [htmlWillInsert dataUsingEncoding:NSUnicodeStringEncoding];

// Select every <a class="new"> that sits inside a table cell.
TFHpple *xpathParser = [[TFHpple alloc] initWithHTMLData:htmlData];
NSArray *busNumbs = [xpathParser search:@"//td//a[@class='new']"];
NSLog(@"%@", busNumbs);

// Log the text content of each matched anchor.
for (NSUInteger i = 0; i < [busNumbs count]; i++) {
    TFHppleElement *busNumb = [busNumbs objectAtIndex:i];
    NSString *numb = [busNumb content];
    NSLog(@"%@", numb);
}

But it isn't working :(
It loads the anchor text: <a href="#" onClick="MapMove(22);" class="new" onFocus='this.blur();'">▷ 22번(순환)
But it doesn't load the nested tag: <font color='red'> 108분후 도착</font>
The output looks like this:

2011-04-09 10:42:07.413 GangwonBus[5461:207] ▷ 1번(순환)
2011-04-09 10:42:07.414 GangwonBus[5461:207] ▷ 2번(순환)
2011-04-09 10:42:07.415 GangwonBus[5461:207] ▷ 21번(순환)
2011-04-09 10:42:07.415 GangwonBus[5461:207] ▷ 22번(순환)
2011-04-09 10:42:07.416 GangwonBus[5461:207] ▷ 23번(순환)
2011-04-09 10:42:07.417 GangwonBus[5461:207] ▷ 24번(순환)
2011-04-09 10:42:07.417 GangwonBus[5461:207] ▷ 25번(순환)
2011-04-09 10:42:07.422 GangwonBus[5461:207] ▷ 32번(순환)
2011-04-09 10:42:07.423 GangwonBus[5461:207] ▷ 41번(순환)
2011-04-09 10:42:07.424 GangwonBus[5461:207] ▷ 41-1번(순환)
2011-04-09 10:42:07.424 GangwonBus[5461:207] ▷ 42번(순환)
2011-04-09 10:42:07.425 GangwonBus[5461:207] ▷ 51번(순환)
2011-04-09 10:42:07.426 GangwonBus[5461:207] ▷ 52번(순환)
2011-04-09 10:42:07.426 GangwonBus[5461:207] ▷ 53번(순환)
2011-04-09 10:42:07.427 GangwonBus[5461:207] ▷ 54번(순환)
2011-04-09 10:42:07.428 GangwonBus[5461:207] ▷ 55번(순환)
2011-04-09 10:42:07.428 GangwonBus[5461:207] ▷ 56번(순환)
2011-04-09 10:42:07.429 GangwonBus[5461:207] ▷ 57번(순환)
2011-04-09 10:42:07.430 GangwonBus[5461:207] ▷ 58번(순환)
2011-04-09 10:42:07.430 GangwonBus[5461:207] ▷ 64번(기점 -> 종점)
2011-04-09 10:42:07.431 GangwonBus[5461:207] ▷ 70번(순환)
2011-04-09 10:42:07.432 GangwonBus[5461:207] ▷ 71번(순환)
2011-04-09 10:42:07.432 GangwonBus[5461:207] ▷ 72번(기점 -> 종점)
2011-04-09 10:42:07.433 GangwonBus[5461:207] ▷ 73번(순환)
2011-04-09 10:42:07.434 GangwonBus[5461:207] ▷ 74번(순환)
2011-04-09 10:42:07.434 GangwonBus[5461:207] ▷ 81-1번(순환)
2011-04-09 10:42:07.435 GangwonBus[5461:207] ▷ 82번(순환)
2011-04-09 10:42:07.435 GangwonBus[5461:207] ▷ 83번(순환)
2011-04-09 10:42:07.436 GangwonBus[5461:207] ▷ 84번(순환)
2011-04-09 10:42:07.437 GangwonBus[5461:207] ▷ 90번(순환)
2011-04-09 10:42:07.437 GangwonBus[5461:207] ▷ 91번(순환)

What's wrong?

I want the parsed output to look like this:

2011-04-09 10:42:07.413 GangwonBus[5461:207] ▷ 1번(순환)
2011-04-09 10:42:07.414 GangwonBus[5461:207] ▷ 2번(순환)
2011-04-09 10:42:07.415 GangwonBus[5461:207] ▷ 21번(순환)
2011-04-09 10:42:07.415 GangwonBus[5461:207] ▷ 22번(순환)
2011-04-09 10:42:07.416 GangwonBus[5461:207] ▷ 23번(순환)
2011-04-09 10:42:07.417 GangwonBus[5461:207] ▷ 24번(순환)
2011-04-09 10:42:07.417 GangwonBus[5461:207] ▷ 25번(순환)
2011-04-09 10:42:07.422 GangwonBus[5461:207] ▷ 32번(순환)
2011-04-09 10:42:07.423 GangwonBus[5461:207] ▷ 41번(순환)
2011-04-09 10:42:07.424 GangwonBus[5461:207] ▷ 41-1번(순환)
2011-04-09 10:42:07.424 GangwonBus[5461:207] ▷ 42번(순환)
2011-04-09 10:42:07.425 GangwonBus[5461:207] ▷ 51번(순환)
2011-04-09 10:42:07.426 GangwonBus[5461:207] ▷ 52번(순환)
2011-04-09 10:42:07.426 GangwonBus[5461:207] ▷ 53번(순환)
2011-04-09 10:42:07.427 GangwonBus[5461:207] ▷ 54번(순환)
2011-04-09 10:42:07.428 GangwonBus[5461:207] ▷ 55번(순환)
2011-04-09 10:42:07.428 GangwonBus[5461:207] ▷ 56번(순환)
2011-04-09 10:42:07.429 GangwonBus[5461:207] ▷ 57번(순환)
2011-04-09 10:42:07.430 GangwonBus[5461:207] ▷ 58번(순환)
2011-04-09 10:42:07.430 GangwonBus[5461:207] ▷ 64번(기점 -> 종점)
2011-04-09 10:42:07.431 GangwonBus[5461:207] ▷ 70번(순환)
2011-04-09 10:42:07.432 GangwonBus[5461:207] ▷ 71번(순환)
2011-04-09 10:42:07.432 GangwonBus[5461:207] ▷ 72번(기점 -> 종점)
2011-04-09 10:42:07.433 GangwonBus[5461:207] ▷ 73번(순환) 77분후 도착
2011-04-09 10:42:07.434 GangwonBus[5461:207] ▷ 74번(순환)
2011-04-09 10:42:07.434 GangwonBus[5461:207] ▷ 81-1번(순환)
2011-04-09 10:42:07.435 GangwonBus[5461:207] ▷ 82번(순환)
2011-04-09 10:42:07.435 GangwonBus[5461:207] ▷ 83번(순환)
2011-04-09 10:42:07.436 GangwonBus[5461:207] ▷ 84번(순환)
2011-04-09 10:42:07.437 GangwonBus[5461:207] ▷ 90번(순환)
2011-04-09 10:42:07.437 GangwonBus[5461:207] ▷ 91번(순환) 108분후 도착

Comments (1)

俯瞰星空 2024-11-07 04:25:41

The <font> tag is a child node of the anchor tag. [aTFHppleElement content] is incorrectly described as the innerHTML: it contains only the textual content of the element itself, not the content of its child nodes. So you will need to access the child nodes to display that information. As it stands, the code in the repository doesn't provide the necessary API to access the child nodes, but one of its forks has the changes.

[aTFHppleElement children] provides access to the array of child nodes. It is an array of TFHppleElement objects. Looking at the HTML structure, only a few of the anchors have child nodes, so:

for (NSUInteger i = 0; i < [busNumbs count]; i++) {
    TFHppleElement *busNumb = [busNumbs objectAtIndex:i];
    // Start with the anchor's own text.
    NSString *out = [busNumb content];
    // If the anchor has child nodes (the <font> tag), append the first child's text.
    if ([[busNumb children] count] != 0) {
        out = [out stringByAppendingFormat:@" %@", [[busNumb firstChild] content]];
    }
    NSLog(@"%@", out);
}