How to use a regular expression to split a string on , (comma) in an iPhone application

Posted on 2024-12-31 21:48:49

I have to read a .csv file that has three columns. While parsing it, I get a string in this format: Christopher Bass,\"Cry the Beloved Country Final Essay\",[email protected]. I want to store the values of the three columns in an array, so I used the componentsSeparatedByString:@"," method. It successfully returns an array with three components:

  1. Christopher Bass
  2. Cry the Beloved Country Final Essay
  3. [email protected]

But when a column value already contains a comma, like this:
Christopher Bass,\"Cry, the Beloved Country Final Essay\",[email protected]
it splits the string into four components, because there is a , (comma) after Cry:

  1. Christopher Bass
  2. Cry
  3. the Beloved Country Final Essay
  4. [email protected]

So, how can I handle this using a regular expression? I have the RegexKitLite classes, but which regular expression should I use? Please help!

Thanks-

Comments (5)

相权↑美人 2025-01-07 21:48:49

Any regular expression will probably run into the same problem. What you need is to sanitize your entries or strings, either by escaping the commas or by quoting strings like this: "My string". Otherwise the problem will not go away. Good luck.

For your example you would probably need to do something like:

\"Christopher Bass\",\"Cry\, the Beloved Country Final Essay\",\"[email protected]\"

That way you could use a regexp or even the same method from the NSString class.

Not related at all, but on the importance of sanitizing strings: http://xkcd.com/327/ hehehe.
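
If you control the code that writes the CSV, the quoting can be done on the producing side. The following is only a minimal sketch: it uses the common convention of wrapping every field in double quotes and doubling any embedded quote, which is a different convention from the backslash-escaped comma shown above. QuoteCSVField and CSVLineFromFields are hypothetical helper names, not part of Foundation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: wrap one field in double quotes and double any
// embedded quote character, so commas inside the field stay unambiguous.
static NSString *QuoteCSVField(NSString *field) {
    NSString *escaped = [field stringByReplacingOccurrencesOfString:@"\""
                                                         withString:@"\"\""];
    return [NSString stringWithFormat:@"\"%@\"", escaped];
}

// Hypothetical helper: build one CSV line from an array of NSString fields.
static NSString *CSVLineFromFields(NSArray *fields) {
    NSMutableArray *quoted = [NSMutableArray arrayWithCapacity:[fields count]];
    for (NSString *field in fields) {
        [quoted addObject:QuoteCSVField(field)];
    }
    return [quoted componentsJoinedByString:@","];
}
```

A reader of such a file then has to honor the same convention when splitting; a plain split on "," is still not enough.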

魄砕の薆 2025-01-07 21:48:49

How about this:

componentsSeparatedByRegex:@",\\\"|\\\","

This should split your string wherever " and , appear together in either order, resulting in a three-member array. This of course assumes that the second element in the string is always enclosed in quotation marks, and that the characters " and , never appear consecutively within the three components.

If either of these assumptions is incorrect, other methods of identifying the string components may be used, but it should be made clear that no generic solution exists. If the three component strings can contain " and , anywhere, then not even a limited solution is possible in cases like this:

Doe, John,\"\"Why Unescaped Strings Suck\", And Other Development Horror Stories\",Doe, John <[email protected]>

Hopefully there is nothing like the above in your CSV data. If there is, the data is basically unusable, and you should look into a better CSV exporter.
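
If the data does follow the usual CSV convention (fields that contain commas are wrapped in double quotes, and a quote inside a quoted field is written as two quotes), a character scan handles it without any regex. This is a sketch under those assumptions, not a full CSV parser; it also assumes ARC and that the backslashes in the question's sample are display escaping, so the data holds plain " characters. SplitCSVLine is a hypothetical helper name and the email address is a placeholder.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: split one CSV line into fields, treating commas
// inside double quotes as data. Assumes well-formed quoting.
static NSArray *SplitCSVLine(NSString *line) {
    NSMutableArray *fields = [NSMutableArray array];
    NSMutableString *current = [NSMutableString string];
    BOOL inQuotes = NO;
    NSUInteger length = [line length];

    for (NSUInteger i = 0; i < length; i++) {
        unichar c = [line characterAtIndex:i];
        if (c == '"') {
            if (inQuotes && i + 1 < length && [line characterAtIndex:i + 1] == '"') {
                [current appendString:@"\""];   // doubled quote -> one literal quote
                i++;                            // skip the second quote
            } else {
                inQuotes = !inQuotes;           // opening or closing quote
            }
        } else if (c == ',' && !inQuotes) {
            [fields addObject:[current copy]];  // comma outside quotes ends the field
            [current setString:@""];
        } else {
            [current appendString:[NSString stringWithCharacters:&c length:1]];
        }
    }
    [fields addObject:[current copy]];          // the last field has no trailing comma
    return fields;
}

// Usage (placeholder data):
// NSArray *parts = SplitCSVLine(@"Christopher Bass,\"Cry, the Beloved Country Final Essay\",user@example.com");
// parts then holds three strings, and the quotes around the title are dropped.
```

Input like the unescaped example above still cannot be parsed reliably; the scan only works because the quoting rules make field boundaries unambiguous.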

终难愈 2025-01-07 21:48:49

The regex you're looking for is: \\"(.*)\\"[ ^,]*|([^,]*),

Read as a logical expression: (('\"' && string_1 && '\"' && 0-n spaces) || string_2 except comma) && comma

NSString *str = @"Christopher Bass,\"Cry, the Beloved Country ,Final Essay\",[email protected],som";
NSString *regEx = @"\\\"(.*)\\\"[ ^,]*|([^,]*),";
NSMutableArray *split = [[str componentsSeparatedByRegex:regEx] mutableCopy];
[split removeObject:@""]; // both groups are always returned, even when one of them is empty
NSLog(@"%@", split);

// OUTPUT:
2012-02-07 17:42:18.778 tmpapp[92170:c03] (
    "Christopher Bass",
    "Cry, the Beloved Country ,Final Essay",
    "[email protected]",
    som
)

RegexKitLite will add both captured strings to the array, so you will end up with empty objects in it. removeObject:@"" deletes those, but if you need to keep genuinely empty values (e.g. your source has val,,ue), you have to change the code to the following:

str = [str stringByReplacingOccurrencesOfRegex:regEx withString:@"$1$2∏"];
NSArray *split = [str componentsSeparatedByString:@"∏"];

$1 and $2 are the two captured strings mentioned above; ∏ is, in this case, a character that will most likely never appear in normal text (and is easy to remember: option-shift-p).
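
For comparison, here is a sketch of the same idea with Foundation's NSRegularExpression instead of RegexKitLite, matching the fields rather than the separators. This swaps in a different pattern and API from the answer above. It has known limits: empty unquoted fields are skipped, quoted fields must not contain a quote character, and there must be no spaces between a quoted field and its neighboring commas. FieldsByRegex is a hypothetical helper name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: extract CSV fields by matching them one at a time.
static NSArray *FieldsByRegex(NSString *line) {
    // Either a quoted run (group 1, without the quotes)
    // or a run of one or more non-comma characters (group 2).
    NSString *pattern = @"\"([^\"]*)\"|([^,]+)";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        return nil; // the pattern failed to compile; details are in `error`
    }

    NSMutableArray *fields = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:line
                                      options:0
                                        range:NSMakeRange(0, [line length])];
    for (NSTextCheckingResult *match in matches) {
        // A group that did not take part in the match reports NSNotFound.
        NSRange quoted = [match rangeAtIndex:1];
        NSRange range = (quoted.location != NSNotFound) ? quoted
                                                        : [match rangeAtIndex:2];
        [fields addObject:[line substringWithRange:range]];
    }
    return fields;
}
```

Because it matches fields instead of splitting on separators, no placeholder character such as ∏ is needed, at the cost of the limits listed above.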

滥情稳全场 2025-01-07 21:48:49

The last part looks like it will never contain a comma. Neither will the first one, as far as I can see...

What about splitting the string like this:

NSArray *splitArr = [str componentsSeparatedByString:@","];
NSString *nameStr = [splitArr objectAtIndex:0];
NSString *emailStr = [splitArr lastObject];

// Collect the middle pieces and rejoin them with the commas the split removed.
// Note: contentStr still includes the surrounding quotation marks.
NSMutableArray *middle = [NSMutableArray array];
for (NSUInteger i = 1; i + 1 < [splitArr count]; ++i) {
    [middle addObject:[splitArr objectAtIndex:i]];
}
NSString *contentStr = [middle componentsJoinedByString:@","];

This uses the first and last strings as they are and combines the rest into the content.

Kind of a hack, but a name and an email address will never contain a comma, right?

久伴你 2025-01-07 21:48:49

Is the title guaranteed to have the quotation marks? And is it the only component that can have them? Because then componentsSeparatedByString:@"\"" should get you this:

  1. Christopher Bass,
  2. Cry, the Beloved Country Final Essay
  3. ,[email protected]

Then use componentsSeparatedByString:@"," or substringFromIndex:/substringToIndex: to get rid of the comma at the end of the first component and the one at the start of the last component.

Here's a solution using substring:

NSString* input = @"Christopher Bass,\"Cry, the Beloved Country Final Essay\",[email protected]";
NSArray* split = [input componentsSeparatedByString:@"\""];
NSString* part1 = [split objectAtIndex:0];
NSString* part2 = [split objectAtIndex:1];
NSString* part3 = [split objectAtIndex:2];
part1 = [part1 substringToIndex:[part1 length] - 1];
part3 = [part3 substringFromIndex:1];

// Pass a format string; never use data as the NSLog format itself.
NSLog(@"%@", part1);
NSLog(@"%@", part2);
NSLog(@"%@", part3);