Split a string on different conditions using LINQ in C#

Posted on 2024-10-21 10:38:36

I need to extract and remove a word from a string. The word should be upper-case and follow one of the delimiters /, ;, (, - or a space.

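Some examples:

"this is test A/ABC"
Expected output: "this is test A" and "ABC"
"this is a test; ABC/XYZ"
Expected output: "this is a test; ABC" and "XYZ"
"This TASK is assigned to ANIL/SHAM in our project"
Expected output: "This TASK is assigned to ANIL in our project" and "SHAM"
"This TASK is assigned to ANIL/SHAM in OUR project"
Expected output: "This TASK is assigned to ANIL/SHAM in project" and "OUR"
"this is test AWN.A"
Expected output: "this is test" and "AWN.A"
"XETRA-DAX"
Expected output: "XETRA" and "DAX"
"FTSE-100"
Expected output: "-100" and "FTSE"
"ATHEX"
Expected output: "" and "ATHEX"
"Euro-Stoxx-50"
Expected output: "Euro-Stoxx-50" and ""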

How can I achieve this?

心意如水 2024-10-28 10:38:36

An "intelligent" version:

    string strValue = "this is test A/ABC";
    int ix = strValue.LastIndexOfAny(new[] { '/', ' ', ';', '(', '-' });
    var str1 = strValue.Substring(0, ix);
    var str2 = strValue.Substring(ix + 1);

A "stupid LINQ" version:

    var str3 = new string(strValue.Reverse().SkipWhile(p => p != '/' && p != ' ' && p != ';' && p != '(' && p != '-').Skip(1).Reverse().ToArray());
    var str4 = new string(strValue.Reverse().TakeWhile(p => p != '/' && p != ' ' && p != ';' && p != '(' && p != '-').Reverse().ToArray());

Both versions are WITHOUT checks. The OP can add checks if he wants them.
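
For what it's worth, here is a minimal sketch of what the "intelligent" version could look like with checks added. The class and method names are made up for illustration, and returning an empty first part when no delimiter is found is an assumption taken from the "ATHEX" example in the question. Like the snippets above, it does not verify that the extracted word is upper-case.

```csharp
using System;

public static class LastDelimiterSplit
{
    // Same delimiters as the unchecked snippet.
    private static readonly char[] Delimiters = { '/', ' ', ';', '(', '-' };

    // Splits at the last delimiter. Without any delimiter the whole
    // input is returned as the second part (assumption, see "ATHEX").
    public static (string Head, string Tail) Split(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));

        int ix = value.LastIndexOfAny(Delimiters);
        if (ix < 0) return ("", value);

        return (value.Substring(0, ix), value.Substring(ix + 1));
    }

    public static void Main()
    {
        foreach (var s in new[] { "this is test A/ABC", "XETRA-DAX", "ATHEX" })
        {
            var (head, tail) = Split(s);
            Console.WriteLine("\"" + head + "\" and \"" + tail + "\"");
        }
        // "this is test A" and "ABC"
        // "XETRA" and "DAX"
        // "" and "ATHEX"
    }
}
```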

For the second question, using LINQ is REALLY too difficult. With a Regex it's "easily doable".

    var regex = new Regex("^(.*[A-Z]+)([-/ ;(]+)([A-Z]+)(.*?)$");

    var strValueWithout = regex.Replace(strValue, "$1$4");
    var extractedPart = regex.Replace(strValue, "$3");

For the third question:

    var regex = new Regex("^(.*?)([A-Z.]*)([-/ ;(]+)([A-Z.]+)(.*?)$", RegexOptions.RightToLeft);

    var strValueWithout = regex.Replace(strValue, "$1$2$5");
    var extractedPart = regex.Replace(strValue, "$4");

With code sample: http://ideone.com/5OSs0

Another update (it's becoming BORING):

    Regex Regex = new Regex(@"^(?<1>.*?)(?<2>[-/ ;(]*)(?<=\b)(?<3>[A-Z.]+)(?=\b)(?<4>.*?)$|^(?<1>.*)$", RegexOptions.RightToLeft);
    Regex Regex2 = new Regex(@"^(?<1>.*?)(?<2>[-/ ;(]*)(?<=\b)(?<3>(?:\p{Lu}|\.)+)(?=\b)(?<4>.*?)$|^(?<1>.*)$", RegexOptions.RightToLeft);

    var str1 = Regex.Replace(str, "$1$4");
    var str2 = Regex.Replace(str, "$3");

The difference between the two is that the first treats only A-Z as upper-case characters, while the second uses \p{Lu} and so also accepts other upper-case characters, for example ÀÈÉÌÒÙ.

With code sample: http://ideone.com/FqcmY
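
To make the [A-Z] versus \p{Lu} difference concrete, here is a small sketch that checks a few strings against both character classes. The helper names are hypothetical, and the accented letters are written as escapes so the source file encoding does not matter.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UpperCaseClasses
{
    // True when the string consists only of ASCII capitals.
    public static bool IsAsciiUpper(string s) => Regex.IsMatch(s, "^[A-Z]+$");

    // True when the string consists only of Unicode upper-case letters.
    public static bool IsUnicodeUpper(string s) => Regex.IsMatch(s, @"^\p{Lu}+$");

    public static void Main()
    {
        // "\u00C0\u00C8\u00C9" is "ÀÈÉ"
        foreach (var s in new[] { "ABC", "\u00C0\u00C8\u00C9", "abc" })
        {
            Console.WriteLine(s + ": [A-Z] " + IsAsciiUpper(s)
                                + ", \\p{Lu} " + IsUnicodeUpper(s));
        }
    }
}
```

Only the second check accepts the accented capitals; both reject lower-case input.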

百变从容 2024-10-28 10:38:36

This should work according to the new requirements: it should find the last separator that is wrapped with uppercase words:

    Match lastSeparator = Regex.Match(strExample,
                                      @"(?<=\b\p{Lu}+)[-/ ;(](\p{Lu}+)\b",
                                      RegexOptions.RightToLeft); // last match
    string main = lastSeparator.Result("$`$'");  // before and after the match
    string word = lastSeparator.Groups[1].Value; // word after the separator

This regex is a little tricky. Main tricks:

Use RegexOptions.RightToLeft to find the last match.
Use of Match.Result for a replace.
$`$' as replacement string: http://www.regular-expressions.info/refreplace.html
\p{Lu} for upper-case letters, you can change that to [A-Z] if you're more comfortable with that.

If the word shouldn't follow an upper case word, you can simplify the regex to:

    @"[-/ ;(](\p{Lu}+)\b"

If you want other characters as well, you can use a character class (and maybe remove \b). For example:

    @"[-/ ;(]([\p{Lu}.,]+)"

Working example: http://ideone.com/U9AdK
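
One thing to watch for: Match.Result throws if the match failed, so input without a matching separator needs a guard. Below is a sketch of the same regex wrapped in a helper with that guard. The class and method names are hypothetical, and returning the unchanged input with an empty word on no match is an assumption based on the "Euro-Stoxx-50" example in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LastUpperWord
{
    // Same pattern as above: a separator wrapped by upper-case words,
    // searched from the right so the last occurrence wins.
    private static readonly Regex Pattern = new Regex(
        @"(?<=\b\p{Lu}+)[-/ ;(](\p{Lu}+)\b", RegexOptions.RightToLeft);

    public static (string Rest, string Word) Extract(string input)
    {
        Match m = Pattern.Match(input);
        if (!m.Success) return (input, ""); // nothing to extract (assumption)

        // $` is the text before the match, $' the text after it.
        return (m.Result("$`$'"), m.Groups[1].Value);
    }

    public static void Main()
    {
        foreach (var s in new[] { "this is test A/ABC", "XETRA-DAX", "Euro-Stoxx-50" })
        {
            var (rest, word) = Extract(s);
            Console.WriteLine("\"" + rest + "\" and \"" + word + "\"");
        }
        // "this is test A" and "ABC"
        // "XETRA" and "DAX"
        // "Euro-Stoxx-50" and ""
    }
}
```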

や莫失莫忘 2024-10-28 10:38:36

Use a List of strings and put all the words of the sentence into it.

Find the index of the / and then use ElementAt() to pick the word to split off, which is "SHAM" in your question.

In the sentence below, the index of / will be 6 (assuming the / is stored as its own element of the list):

    string strSentence = "This TASK is assigned to ANIL/SHAM in our project";

Then use ElementAt(index + 1), where index is the index of the / in your List<string> (called str here):

    string word = str.ElementAt(index + 1);

This will return SHAM.

    str.RemoveAt(index + 1);

This will delete SHAM from the list; then just print the sentence without SHAM.

If you don't want to use a list of strings, I think you can use " " to determine the words in your sentence, but that would be a long way to go.

I think the idea is right, but the code may not be flawless.
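
A runnable sketch of this word-list idea might look like the following. Everything in it beyond ElementAt() is my assumption rather than the answer's code: padding the / with spaces so it becomes its own list element, the helper name, and removing the / together with the word. It only handles the first / and collapses repeated spaces.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordListSplit
{
    public static (string Rest, string Word) RemoveWordAfterSlash(string sentence)
    {
        // Pad the slash so it ends up as a separate element of the list.
        List<string> words = sentence.Replace("/", " / ")
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .ToList();

        int index = words.IndexOf("/");
        if (index < 0 || index + 1 >= words.Count) return (sentence, "");

        string word = words.ElementAt(index + 1); // the word after the slash
        words.RemoveRange(index, 2);              // drop "/" and that word
        return (string.Join(" ", words), word);
    }

    public static void Main()
    {
        var (rest, word) = RemoveWordAfterSlash(
            "This TASK is assigned to ANIL/SHAM in our project");
        Console.WriteLine(rest); // This TASK is assigned to ANIL in our project
        Console.WriteLine(word); // SHAM
    }
}
```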

暮色兮凉城 2024-10-28 10:38:36

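You can combine the string.Split() method with the Regex class. A simple Split works for simple cases, such as splitting on the character /. Regular expressions are a good fit for matching more complex patterns.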
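
As a rough illustration of the two tools side by side, here is a small sketch (the class and method names are invented for the example). Neither call checks for upper-case words; they only show plain splitting on one character versus on a set of delimiters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitVersusRegex
{
    // Simple case: one fixed separator character.
    public static string[] SplitOnSlash(string s) => s.Split('/');

    // More complex case: any of the delimiters from the question.
    public static string[] SplitOnAnyDelimiter(string s) => Regex.Split(s, "[-/;( ]");

    public static void Main()
    {
        Console.WriteLine(string.Join(" | ", SplitOnSlash("this is test A/ABC")));
        // this is test A | ABC
        Console.WriteLine(string.Join(" | ", SplitOnAnyDelimiter("XETRA-DAX")));
        // XETRA | DAX
    }
}
```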

浪漫之都 2024-10-28 10:38:36

As a proof of concept, you could re-implement Split in LINQ using TakeWhile and SkipWhile:

    string strValue = "this is test A/ABC";
    // everything before the first '/'
    var s1 = new string(
        strValue
        .TakeWhile(c => c != '/')
        .ToArray());
    // everything after the first '/'
    var s2 = new string(
        strValue
        .SkipWhile(c => c != '/')
        .Skip(1)
        .ToArray());

I think the resulting code is so mind-blowingly ugly that I hope you'll decide not to use LINQ.
