whitespaceAndNewlineCharacterSet seems to be removing the space before special characters
I'm using NSXMLParser to parse an RSS feed. But I'm getting some strange behavior that I believe I've narrowed down to stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet].
If I have a sentence like this:
Hello, my name is "Sonny."
It will end up getting displayed like this:
Hello, my name is"Sonny."
Here is my foundCharacters method:
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    if(!currentNodeContent)
        currentNodeContent = [[NSMutableString alloc] initWithString:string];
    else
    {
        [currentNodeContent appendString:string];
        NSString *trimmedString = currentNodeContent;
        trimmedString = [trimmedString stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
        [currentNodeContent setString:trimmedString];
    }
}
I tried changing whitespaceAndNewlineCharacterSet to newlineCharacterSet, which fixed the problem but caused all kinds of unwanted whitespace and carriage returns to show up. Any thoughts on why this is happening and what I can do to fix it?
UPDATE
I updated my code based on Dirk's answer below, and this seems to have done the trick nicely.
- (void) parser:(NSXMLParser *)parser didEndElement:(NSString *)elementname namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if ([elementname isEqualToString:@"item"])
    {
        [comments addObject:currentComment];
        currentComment = nil;
    }
    NSString *trimmedString = [tempString stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
    [currentNodeContent setString:trimmedString];
    tempString = nil;
    currentNodeContent = nil;
}
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    if(!currentNodeContent) {
        currentNodeContent = [[NSMutableString alloc] initWithString:string];
        tempString = [[NSMutableString alloc] init];
    } else {
        [tempString appendString:string];
    }
}
Comments (1)
In a situation like this:
<element>Some Content</element>
you should not rely on receiving exactly the following sequence of events:
startElement "element"
characterData "Some Content"
endElement "element"
It could just as well be (depending on parser internals such as buffer size):
startElement "element"
characterData "So"
characterData "me Cont"
characterData "ent"
endElement "element"
To be safe, you should simply store the characters received until the end-of-element event is seen, and only then apply the trimming operation to the result.
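This can be sketched as a small delegate, assuming ARC. The class name TitleCollector, the helper TitlesInXML, and the choice of the title element are made up for illustration and do not come from the question's code. It also suggests where the space in the question most likely went: when a chunk such as `Hello, my name is ` ends with a space and the accumulated string is trimmed right away, that trailing space is gone before the next chunk (`"Sonny."`) is appended.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate (not from the question): collects the trimmed text
// of every <title> element. Assumes ARC.
@interface TitleCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray<NSString *> *titles;
@property (nonatomic, strong) NSMutableString *buffer;
@end

@implementation TitleCollector

- (instancetype)init {
    if ((self = [super init])) {
        _titles = [NSMutableArray array];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser
didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qName
    attributes:(NSDictionary<NSString *, NSString *> *)attributeDict {
    // Start a fresh, empty buffer for each element.
    self.buffer = [NSMutableString string];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // May be called several times per element: only append, never trim here.
    [self.buffer appendString:string];
}

- (void)parser:(NSXMLParser *)parser
 didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"title"] && self.buffer != nil) {
        // The element is complete, so trimming is safe now.
        NSCharacterSet *ws = [NSCharacterSet whitespaceAndNewlineCharacterSet];
        [self.titles addObject:[self.buffer stringByTrimmingCharactersInSet:ws]];
    }
    // Text between elements is ignored (messages to nil are no-ops).
    self.buffer = nil;
}

@end

// Hypothetical helper: returns the trimmed text of each <title> in the XML string.
static NSArray<NSString *> *TitlesInXML(NSString *xml) {
    NSData *data = [xml dataUsingEncoding:NSUTF8StringEncoding];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
    TitleCollector *collector = [[TitleCollector alloc] init];
    parser.delegate = collector; // the parser does not retain its delegate
    [parser parse];
    return [collector.titles copy];
}

int main(void) {
    @autoreleasepool {
        NSString *xml = @"<rss><item><title>\n  Hello, my name is &quot;Sonny.&quot;\n</title></item></rss>";
        for (NSString *title in TitlesInXML(xml)) {
            NSLog(@"[%@]", title);
        }
    }
    return 0;
}
```

With this arrangement the sample title should come out with the space before the opening quotation mark intact, however the parser happens to split the text into chunks.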
From the NSXMLParser documentation: