Many web-based user authentication systems don't allow usernames that contain characters other than letters, numbers and underscores.

Could there be a technical reason for that?

A well-designed system doesn't necessarily need to prevent any special characters in usernames.

That said, the reason underscores have traditionally been accepted is that the underscore is typically treated as a "word" character, along with letters and numbers. It is usually the only other character given this distinction. This is true in regular expressions, and even at a base level in most operating systems: type an underscore in a word and double-click the letters, and the selection will extend past the underscore. Try the same with a dash and it most likely will not.
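The "word character" behaviour is easy to check in a regex engine. A minimal sketch in Python (the helper name and the sample strings are made up for illustration; `re.ASCII` is passed so that `\w` means exactly `[a-zA-Z0-9_]`):

```python
import re

# With re.ASCII, \w+ is equivalent to [a-zA-Z0-9_]+
WORD = re.compile(r"\w+", re.ASCII)

def is_single_word(name: str) -> bool:
    """Return True if the whole string consists only of word characters."""
    return WORD.fullmatch(name) is not None

# The underscore counts as part of the word; the dash and the dot do not.
for sample in ("like_this", "like-this", "like.this"):
    print(sample, is_single_word(sample))
```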


Yes: to avoid having to escape special characters. Lazy programmers will just drop whatever the user types straight into the code somewhere, and this is what leads to injection attacks.

Even if it's not used maliciously, allowing the user to type characters that will conflict somewhere else can be more hassle than necessary. For example, if you decide to create a filesystem directory per user, to store their uploads in, then the username must conform to directory naming rules on that OS (e.g. no \/:*?"<>| on Windows).

Once you've avoided clashes like the directory naming one, and stripped out "';% and // to avoid injection attacks, you have removed most punctuation, and "why does someone even need punctuation in their user name"?

It was far easier to write a quick regex to validate usernames against [a-zA-Z0-9_] and be done with it, than faff about with figuring out all the possible punctuation that will not clash, or mapping them to other characters in some way.
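As a rough illustration of that "quick regex" approach, here is a sketch in Python. The function name `upload_dir` and the base path are hypothetical, and `PurePosixPath` is used only so the example builds a path without touching a real filesystem:

```python
import re
from pathlib import PurePosixPath

# Whitelist: letters, digits and underscore only.
USERNAME_RE = re.compile(r"[a-zA-Z0-9_]+")

def upload_dir(base: str, username: str) -> PurePosixPath:
    """Build a per-user directory path, rejecting anything off the whitelist."""
    if USERNAME_RE.fullmatch(username) is None:
        raise ValueError(f"invalid username: {username!r}")
    return PurePosixPath(base) / username

print(upload_dir("/srv/uploads", "bob_42"))

# Path separators, quotes and colons never get as far as the filesystem.
for bad in ("../etc", "o'brien", "a:b"):
    try:
        upload_dir("/srv/uploads", bad)
    except ValueError as exc:
        print(exc)
```

Because the check is a whitelist, there is no need to enumerate which punctuation is dangerous for the filesystem, the database or the HTML output; everything not explicitly allowed is rejected.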

Then, like many things in computing, as soon as enough sites allowed only letters, numbers and underscores in usernames, and people started making usernames to that spec, it became the de facto standard and now self-perpetuates!

When not specified I use this:

(updated regex to fix the backtracking @abney317 mentioned)


(original regex)


This requires a length of 4 with maximum 32 characters. It must start with a word character and can have non continuous dots and dashes. The only reason I use this is because it's strict enough to integrate with almost anything :)
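The rules described above can be sketched in Python as follows. This is not the regex referred to in this answer, just one possible reading of the description; the names `PATTERN` and `is_valid` are made up, and two points are assumptions: `\w` is restricted to ASCII, and a name may not end with a dot or dash.

```python
import re

# One or more word characters, then any number of groups made of
# exactly one separator (dot or dash) followed by word characters.
# Separators therefore cannot be adjacent, leading or trailing.
PATTERN = re.compile(r"\w+(?:[.-]\w+)*", re.ASCII)

def is_valid(name: str) -> bool:
    """Check the 4..32 length limit, then the character layout."""
    return 4 <= len(name) <= 32 and PATTERN.fullmatch(name) is not None

for sample in ("john.doe", "a-b.c", "abc", "john..doe", ".john"):
    print(sample, is_valid(sample))
```

Checking the length in plain code rather than with a lookahead keeps the pattern short, and since a separator can never match `\w`, the engine has only one way to split any input.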

Valid :


Invalid :


Limiting it to these characters (or even the ASCII subset of them) prevents usernames like ???????????????? from being accepted. By not accepting these characters, you can prevent a wide range of usernames-that-look-like-other-usernames.
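To make the look-alike problem concrete, here is a small Python sketch. The username `paypal` and the helper names are made-up examples; the second string swaps the Latin "a" for the Cyrillic letter U+0430, which looks identical in most fonts.

```python
import re

UNICODE_WORD = re.compile(r"\w+")          # \w matches any Unicode letter or digit
ASCII_WORD = re.compile(r"\w+", re.ASCII)  # \w matches [a-zA-Z0-9_] only

def accepts_unicode(name: str) -> bool:
    return UNICODE_WORD.fullmatch(name) is not None

def accepts_ascii(name: str) -> bool:
    return ASCII_WORD.fullmatch(name) is not None

latin = "paypal"
mixed = "p\u0430yp\u0430l"  # U+0430 is CYRILLIC SMALL LETTER A

print(latin == mixed)          # the two strings are different usernames
print(accepts_unicode(mixed))  # the Unicode-aware check lets the look-alike in
print(accepts_ascii(mixed))    # the ASCII-only check rejects it
```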

I don't like the readability argument when it interferes with the ability for people to use their native language in usernames.

I recommend you experiment with using character classes that incorporate Unicode general categories (http://msdn.microsoft.com/en-us/library/20bw873z.aspx#SupportedUnicodeGeneralCategories) or named blocks (http://msdn.microsoft.com/en-us/library/20bw873z.aspx#SupportedNamedBlocks). I haven't tried this, but


might be worth an experiment.
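For anyone who wants to try the general-category idea outside .NET, here is a rough Python sketch. Python's built-in `re` module has no `\p{...}` syntax, so this uses `unicodedata.category` instead. The function name and the choice of allowed categories (any letter, decimal digits, plus the underscore) are assumptions for illustration only.

```python
import unicodedata

def is_valid_username(name: str) -> bool:
    """Accept letters from any script, decimal digits and underscores."""
    # Normalise first so that "e" + combining accent becomes a single letter.
    name = unicodedata.normalize("NFC", name)
    if not name:
        return False
    for ch in name:
        category = unicodedata.category(ch)
        # "L*" covers Lu, Ll, Lt, Lm and Lo; "Nd" is decimal digits.
        # Scripts that rely on combining marks would also need Mn/Mc.
        if not (category.startswith("L") or category == "Nd" or ch == "_"):
            return False
    return True

for sample in ("jos\u00e9", "\u540d\u524d", "user_1", "bad name", "~-|x|-~"):
    print(sample, is_valid_username(sample))
```

Note that accepting letters from every script brings back the look-alike problem raised in another answer, so this is a trade-off rather than a free win.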

Because it allows multiple words to be represented in a somewhat readable manner.

Personally, I really, really wish folks would expand things a bit to allow dashes and apostrophes. This would allow people to use non-English phonetic names (e.g. Native American tribal names like She-Ki and Ke`Xthsa-Tse).


The main reason websites enforce such rules is readability (because usernames like ~-|this<>one|-~ are annoying). It might also be because it's less work (underscores get matched by a \w+ regex, while dashes and other special characters don't), but I doubt that's a major reason.

There is no "standard", so if neither of the above reasons bothers you, do whatever you'd like. Personally I'd like to see more websites accept dashes and periods, but it really comes down to a personal preference between readability and consistency on one side and expression on the other.

Depends how your usernames are used. There isn't a general rule, without knowing the context.

The underscore was traditionally allowed in identifiers in most programming languages, and was generally the only "special" character allowed.
But many web logins still do not accept ANY special character and are limited to lower/upper case letters and digits...
And others are fine with really special ones ;-)

People may want to write their usernames like_this rather than likethis or LikeThis.

