.NET - How can you split a "caps" delimited string into an array?
How do I go from this string: "ThisIsMyCapsDelimitedString"
...to this string: "This Is My Caps Delimited String"
Fewest lines of code in VB.net is preferred but C# is also welcome.
Cheers!
Comments (19)
A simple solution, which should be order(s) of magnitude faster than a regex solution (based on the tests I ran against the top solutions in this thread), especially as the size of the input string grows:
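The code for this answer is not preserved on this page. As a minimal sketch of the kind of single-pass loop it describes (the class and method names here are hypothetical, and this version deliberately leaves runs of capitals such as HTML unsplit), something along these lines would work:

```csharp
using System.Text;

public static class CapsSplitter
{
    // Hypothetical helper: inserts a space before each uppercase letter
    // that follows a character which is neither uppercase nor whitespace.
    public static string SplitOnCapitals(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        var sb = new StringBuilder(input.Length + 8);
        sb.Append(input[0]);
        for (int i = 1; i < input.Length; i++)
        {
            char current = input[i];
            char previous = input[i - 1];
            if (char.IsUpper(current) && !char.IsUpper(previous) && !char.IsWhiteSpace(previous))
            {
                sb.Append(' ');
            }
            sb.Append(current);
        }
        return sb.ToString();
    }
}
```

Calling CapsSplitter.SplitOnCapitals("ThisIsMyCapsDelimitedString") gives "This Is My Caps Delimited String". The loop touches each character once and allocates a single StringBuilder, which is where the speed advantage over a regex would come from.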
It deals with all Unicode characters, and it also works fine if your string is a regular sentence that contains a camel case expression (where you want to keep the sentence intact but break the camel case into words, without duplicating spaces etc).
I took Markus Jarderot's answer, which is excellent (so credit to him), replaced [A-Z] with \p{Lu} and [a-z] with \p{Ll}, and modified the last part to deal with numbers.
If you want numbers to trail after acronyms (e.g. HTML5Guide ⮕ HTML5 Guide):
Another approach
Just another approach to solve the problem:
More Options
If you want numbers to trail after acronyms (e.g. HTML5Guide ⮕ HTML5 Guide):
If you want numbers to trail after any word (e.g. Html5Guide ⮕ Html5 Guide):
If you don't want to deal with numbers and you're sure to not have them in the string:
For a simpler version (ignoring special Unicode characters like é as in fiancé), pick any of the above regexes and simply replace \p{Lu} with [A-Z], \p{Ll} with [a-z], and \p{L} with [A-Za-z].
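The regexes for the Unicode-aware answer above are not preserved on this page. The following is only a sketch of the idea it describes, using \p{Lu}, \p{Ll} and \p{Nd} in zero-width lookarounds. The pattern and the names UnicodeCamelCase and InsertSpaces are assumptions for illustration, not the answerer's original code:

```csharp
using System.Text.RegularExpressions;

public static class UnicodeCamelCase
{
    // Zero-width boundaries (assumed pattern, not the original answer's):
    //  1. lowercase letter followed by an uppercase letter   (myCaps  -> my|Caps)
    //  2. uppercase letter followed by uppercase + lowercase (XMLFile -> XML|File)
    //  3. decimal digit followed by an uppercase letter      (HTML5Guide -> HTML5|Guide)
    private const string Boundary =
        @"(?<=\p{Ll})(?=\p{Lu})|(?<=\p{Lu})(?=\p{Lu}\p{Ll})|(?<=\p{Nd})(?=\p{Lu})";

    public static string InsertSpaces(string input)
    {
        // Every boundary is zero-width, so no characters are consumed or lost.
        return Regex.Replace(input, Boundary, " ");
    }
}
```

With this pattern, digits stay attached to whatever precedes them, so both HTML5Guide and Html5Guide come out with the 5 trailing the first word. Because a boundary is only matched between two letters (or a digit and a letter), existing spaces in a normal sentence are never doubled.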
Procedural and fast impl:
Tests:
To match between non-uppercase and Uppercase Letter Unicode Category :
(?<=\P{Lu})(?=\p{Lu})
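As a small illustration of how that zero-width pattern could be used to get the array the question title asks for, here is a sketch with Regex.Split (the class and method names are hypothetical):

```csharp
using System.Text.RegularExpressions;

public static class BoundarySplit
{
    // Splits at every position that has a non-uppercase character on the
    // left and an uppercase letter on the right.
    public static string[] SplitWords(string input)
    {
        return Regex.Split(input, @"(?<=\P{Lu})(?=\p{Lu})");
    }

    // Convenience wrapper that rejoins the pieces with single spaces.
    public static string InsertSpaces(string input)
    {
        return string.Join(" ", SplitWords(input));
    }
}
```

Since the lookbehind needs a character to its left, there is no match at the start of the string and therefore no leading empty element. A run of capitals such as HTMLGuide is not split by this pattern.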
Naive regex solution. Will not handle O'Conner, and adds a space at the start of the string as well.
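The snippet itself is missing here. A naive regex with exactly the two weaknesses mentioned would look roughly like the sketch below (the pattern is an assumption, and the names are hypothetical):

```csharp
using System.Text.RegularExpressions;

public static class NaiveCapsSplit
{
    // Puts a space in front of every ASCII capital letter, including the
    // very first character of the string.
    public static string Split(string input)
    {
        return Regex.Replace(input, "([A-Z])", " $1");
    }
}
```

For "ThisIsMyCapsDelimitedString" this returns " This Is My Caps Delimited String" with a leading space, and "O'Conner" becomes " O' Conner". Calling TrimStart() on the result would remove the leading space but would not fix the apostrophe case.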
For C# building on this awesome answer by @ZombieSheep but now using a compiled regex for better performance:
Sample code:
Result:
A plus point of this one is that it also works for strings that contain digits/numbers.
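Neither the sample code nor the referenced answer's pattern is shown on this page. As a sketch only, assuming a pattern of the form \B[A-Z] (a capital letter that is not at a word boundary), a compiled, reusable regex could be set up like this; the class and member names are hypothetical:

```csharp
using System.Text.RegularExpressions;

public static class CompiledCapsSplit
{
    // Created once and reused; RegexOptions.Compiled trades start-up cost
    // for faster matching on repeated calls.
    private static readonly Regex InnerCapital =
        new Regex(@"\B[A-Z]", RegexOptions.Compiled);

    public static string InsertSpaces(string input)
    {
        // $0 is the whole match, i.e. the capital letter itself.
        return InnerCapital.Replace(input, " $0");
    }
}
```

Digits count as word characters, so a capital directly after a digit is still an inner capital: "Line1Address" becomes "Line1 Address". The first letter of the string sits on a word boundary and gets no leading space. Note that this assumed pattern splits every capital in an acronym, so "HTMLGuide" becomes "H T M L Guide".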
There's probably a more elegant solution, but this is what I came up with off the top of my head:
Try using:
The result also works for strings that mix letters and numbers.
Implementing the pseudocode from: https://stackoverflow.com/a/5796394/4279201
I made this a while ago. It matches each component of a CamelCase name.
For example:
To convert that to just insert spaces between the words:
If you need to handle digits:
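The regexes from this answer are missing on this page. As a hedged sketch of the letters-only idea (match each component of the name, then join the matches), a pattern like the one below would do it. The pattern and the names CamelCaseWords, Components and InsertSpaces are assumptions, and digit handling is left out of this sketch:

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class CamelCaseWords
{
    // A component is either a run of capitals that ends the string or is
    // followed by a capitalised word (an acronym), or an optional capital
    // followed by lowercase letters (an ordinary word).
    private static readonly Regex Component =
        new Regex(@"[A-Z]+(?=$|[A-Z][a-z])|[A-Z]?[a-z]+");

    public static string[] Components(string name)
    {
        return Component.Matches(name).Cast<Match>().Select(m => m.Value).ToArray();
    }

    public static string InsertSpaces(string name)
    {
        return string.Join(" ", Components(name));
    }
}
```

For "SimpleHTTPServer" the components are Simple, HTTP and Server. Anything the pattern does not match (digits, punctuation) is skipped by Components, so this version should only be used on purely alphabetic names.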
Great answer, MizardX! I tweaked it slightly to treat numerals as separate words, so that "AddressLine1" would become "Address Line 1" instead of "Address Line1":
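The tweaked regex is not shown here. One way to get the behaviour described (numerals as separate words) is sketched below with zero-width boundaries; the pattern and names are assumptions, not the commenter's actual code:

```csharp
using System.Text.RegularExpressions;

public static class NumeralAwareSplit
{
    // Boundaries (assumed pattern): lower|Upper, acronym|Word,
    // letter|digit and digit|letter.
    private const string Boundary =
        @"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])|(?<=[A-Za-z])(?=[0-9])|(?<=[0-9])(?=[A-Za-z])";

    public static string InsertSpaces(string input)
    {
        return Regex.Replace(input, Boundary, " ");
    }
}
```

With this sketch "AddressLine1" becomes "Address Line 1" and "Address2Line" becomes "Address 2 Line". Dropping the last two alternatives would keep digits attached to the neighbouring word instead.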
Just for a little variety... Here's an extension method that doesn't use a regex.
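The extension method itself did not survive on this page. A sketch of what a regex-free extension method could look like follows; the names StringExtensions and SplitCamelCase are hypothetical, and the acronym rule (split before the last capital of a run when a lowercase letter follows) is a design choice of this sketch, not necessarily the answerer's:

```csharp
using System.Text;

public static class StringExtensions
{
    // Inserts a space before an uppercase letter when it follows a
    // lowercase letter, or when it ends a run of capitals and starts a
    // new capitalised word (e.g. the P in XMLParser).
    public static string SplitCamelCase(this string source)
    {
        if (string.IsNullOrEmpty(source)) return source;

        var sb = new StringBuilder(source.Length * 2);
        for (int i = 0; i < source.Length; i++)
        {
            char c = source[i];
            if (i > 0 && char.IsUpper(c))
            {
                bool afterLower = char.IsLower(source[i - 1]);
                bool startsWordAfterAcronym = char.IsUpper(source[i - 1])
                    && i + 1 < source.Length
                    && char.IsLower(source[i + 1]);
                if (afterLower || startsWordAfterAcronym) sb.Append(' ');
            }
            sb.Append(c);
        }
        return sb.ToString();
    }
}
```

Usage is "ThisIsMyCapsDelimitedString".SplitCamelCase(). Digits are neither upper nor lower case, so a capital that directly follows a digit is left attached in this sketch.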
Grant Wagner's excellent comment aside:
I needed a solution that supports acronyms and numbers. This Regex-based solution treats the following patterns as individual "words":
You could do it as a one-liner:
A more readable approach might be better:
Here's an extract from the (XUnit) tests:
For more variety, using plain old C# objects, the following produces the same output as @MizardX's excellent regular expression.
Below is a prototype that converts the following to Title Case:
Obviously you would only need the "ToTitleCase" method yourself.
The console out would be as follows:
Blog Post Referenced
Regex is about 10-12 times slower than a simple loop: