whitespaceAndNewlineCharacterSet seems to be removing the space before special characters

Posted 2024-12-31 23:19:54

I'm using NSXMLParser to parse an RSS feed, but I'm getting some strange behavior that I believe I've narrowed down to stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet].

If I have a sentence like this:

Hello, my name is "Sonny."

It will end up getting displayed like this:

Hello, my name is"Sonny."

Here is my foundCharacters method:

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string { 
    if(!currentNodeContent) 
        currentNodeContent = [[NSMutableString alloc] initWithString:string];
    else
    {
        [currentNodeContent appendString:string];        
        NSString *trimmedString = currentNodeContent;
        trimmedString = [trimmedString stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
        [currentNodeContent setString:trimmedString];
    }
}

I tried changing whitespaceAndNewlineCharacterSet to newlineCharacterSet, which fixed the problem but caused all kinds of unwanted whitespace and carriage returns to show up. Any thoughts on why this is happening and what I can do to fix it?

UPDATE

I updated my code based on Dirk's answer below, and this seems to have done the trick nicely.

- (void) parser:(NSXMLParser *)parser didEndElement:(NSString *)elementname namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{    
    if ([elementname isEqualToString:@"item"]) 
    {
        [comments addObject:currentComment];
        currentComment = nil;
    }

    NSString *trimmedString = [tempString stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
    [currentNodeContent setString:trimmedString];
    tempString = nil;
    currentNodeContent = nil;
}

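- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string { 
    // Only accumulate here; trimming happens in parser:didEndElement:.
    if(!currentNodeContent) {
        currentNodeContent = [[NSMutableString alloc] initWithString:string];
        // Seed tempString with the first chunk so it is not dropped.
        tempString = [[NSMutableString alloc] initWithString:string];
    } else {
        [tempString appendString:string];
    }
}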

Comments (1)

寒冷纷飞旳雪 2025-01-07 23:19:54

In a situation like this:

<element>Some Content</element>

you should not rely on receiving exactly the following sequence of events:

  • startElement "element"
  • characterData "Some Content"
  • endElement "element"

It could just as well be (depending on parser internals such as buffer size):

  • startElement "element"
  • characterData "So"
  • characterData "me Cont"
  • characterData "ent"
  • endElement "element"

To be safe, you should simply store the characters received until the end-of-element event is seen, and only then apply the trimming operation on the result.
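
To make the failure mode concrete, here is a minimal sketch that replays a made-up chunk sequence without involving any parser. The chunk boundaries and the helper names TrimAfterEachAppend and AccumulateThenTrim are assumptions for illustration only; a real NSXMLParser may split the text differently.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: mimics the question's approach of trimming the
// accumulated string after every appended chunk.
static NSString *TrimAfterEachAppend(NSArray<NSString *> *chunks) {
    NSCharacterSet *ws = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSMutableString *content = nil;
    for (NSString *chunk in chunks) {
        if (!content) {
            content = [[NSMutableString alloc] initWithString:chunk];
        } else {
            [content appendString:chunk];
            // A chunk ending in a space loses that space here,
            // before the next chunk has a chance to follow it.
            NSString *trimmed = [content stringByTrimmingCharactersInSet:ws];
            [content setString:trimmed];
        }
    }
    return content ?: @"";
}

// Hypothetical helper: collects every chunk first and trims exactly once.
static NSString *AccumulateThenTrim(NSArray<NSString *> *chunks) {
    NSMutableString *content = [NSMutableString string];
    for (NSString *chunk in chunks) {
        [content appendString:chunk];
    }
    return [content stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceAndNewlineCharacterSet]];
}

int main(void) {
    @autoreleasepool {
        // Assumed chunk boundaries, chosen to resemble a split around quote entities.
        NSArray<NSString *> *chunks =
            @[@"\n  ", @"Hello, my name is ", @"\"", @"Sonny.", @"\"", @"\n"];
        NSLog(@"%@", TrimAfterEachAppend(chunks)); // Hello, my name is"Sonny."
        NSLog(@"%@", AccumulateThenTrim(chunks));  // Hello, my name is "Sonny."
    }
    return 0;
}
```

With these chunks, the per-append version drops the space in front of the opening quote, which matches the symptom in the question, while the trim-once version keeps it.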

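From the NSXMLParser documentation (http://developer.apple.com/library/mac/#documentation/Cocoa/Reference/NSXMLParserDelegate_Protocol/Reference/Reference.html#//apple_ref/occ/intf/NSXMLParserDelegate):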

The parser object may send the delegate several parser:foundCharacters: messages to report the characters of an element. Because string may be only part of the total character content for the current element, you should append it to the current accumulation of characters until the element changes.
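
A delegate that follows this advice could look roughly like the sketch below. The class name TitleCollector, its properties, and the sample XML are hypothetical and not taken from the question's project; the sketch also assumes ARC and that only leaf elements such as title carry text of interest.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical delegate for illustration; names are not from the original post.
@interface TitleCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *buffer;            // raw text of the current element
@property (nonatomic, strong) NSMutableArray<NSString *> *titles; // trimmed results
@end

@implementation TitleCollector

- (instancetype)init {
    if ((self = [super init])) {
        _titles = [NSMutableArray array];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary<NSString *, NSString *> *)attributeDict {
    // Start a fresh buffer whenever an element opens.
    self.buffer = [NSMutableString string];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // Only accumulate; this may be a partial chunk, so do not trim yet.
    [self.buffer appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"title"] && self.buffer) {
        // The element is complete, so trimming once is now safe.
        NSString *trimmed = [self.buffer stringByTrimmingCharactersInSet:
                             [NSCharacterSet whitespaceAndNewlineCharacterSet]];
        [self.titles addObject:trimmed];
    }
    self.buffer = nil;
}

@end

int main(void) {
    @autoreleasepool {
        NSString *xml = @"<item><title>\n  Hello, my name is &quot;Sonny.&quot;\n</title></item>";
        NSXMLParser *parser = [[NSXMLParser alloc]
            initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
        TitleCollector *collector = [[TitleCollector alloc] init];
        parser.delegate = collector; // the delegate is not retained by the parser
        if ([parser parse]) {
            NSLog(@"%@", collector.titles.firstObject);
        }
    }
    return 0;
}
```

Because trimming happens only in parser:didEndElement:, the result does not depend on how many parser:foundCharacters: calls the parser makes for the element.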
