Regex pattern for auto-correcting encoding names
I am building an auto-correct feature for encoding names entered as strings, and I want to build a regex for the encoding pattern.
For example:
var encoding = "utd-8";
Correct c = new Correct(encoding);
var corrected = c.Correct();
The output is utf-8.
I have done most of the work (using some open source code from some great people who wrote beautiful stuff). Can someone help, please?
UPDATE
What I need in the end is a regex pattern that matches valid encoding names. The user enters an encoding name such as iso-8859-1, and the code checks whether it is valid.
Comments (1)
You shouldn't decide which technology to use before you have figured out how to solve the problem. Are regular expressions really necessary?
If I understand your question correctly, you want to check whether the input string looks a lot like one of the supported encodings. Before writing a single line of code, you'll have to figure out:
... UTF-16 is the same as Unicode)?
... UTF-16 or UTF-32?
Perhaps you can take a look at one of the string distance algorithms (for example, http://en.wikipedia.org/wiki/Levenshtein_distance) for inspiration on the subject. There are a ton of links in the "See also" section there.
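As a rough illustration of the string-distance idea, here is a minimal sketch in Python (chosen only for brevity; the snippet in the question looks like C#, and the logic should port directly). The SUPPORTED_ENCODINGS list, the correct_encoding helper, and the max_distance threshold of 2 are all assumptions made up for this example, not part of any library. It normalizes the input, accepts exact matches, and otherwise suggests the closest supported name by Levenshtein distance, refusing to guess when two names are equally close.

```python
from typing import Optional

# Hypothetical list of supported names; a real application would supply its own.
SUPPORTED_ENCODINGS = ["utf-8", "utf-16", "utf-32", "iso-8859-1", "us-ascii"]


def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def correct_encoding(name: str, max_distance: int = 2) -> Optional[str]:
    """Return the supported encoding closest to name, or None.

    None means nothing is within max_distance edits (an assumed threshold),
    or two candidates tie for closest, so the input is ambiguous.
    """
    normalized = name.strip().lower()
    if normalized in SUPPORTED_ENCODINGS:
        return normalized  # already a valid name
    scored = sorted((levenshtein(normalized, enc), enc)
                    for enc in SUPPORTED_ENCODINGS)
    best_distance, best_name = scored[0]
    if best_distance > max_distance:
        return None
    if len(scored) > 1 and scored[1][0] == best_distance:
        return None  # ambiguous: do not guess
    return best_name


print(correct_encoding("utd-8"))  # closest supported name to the typo
print(correct_encoding("utf-6"))  # equally close to utf-8 and utf-16
```

With this list, "utd-8" resolves to utf-8, while "utf-6" is one edit away from both utf-8 and utf-16, so the sketch returns None rather than picking one. How to handle such ties, and which aliases count as the same encoding, are the decisions that have to be made before any matching code is written.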