Why are tags usually lowercase?

Posted on 2024-07-13 14:38:14

Comments (8)

风和你 2024-07-20 14:38:15

Ask an engineer the reason why something is a certain way, and they'll go to great lengths to figure it out. ;)

In this case, I'd be inclined to explain the prevalence of lowercase by a combination of laziness (programmers not willing to consider the points you bring up) and imitation (once you see it done a certain way on site S, you tend to reimplement it for site S' with similar assumptions).

It certainly seems feasible to store tags in such a way that case doesn't matter (for purposes of sorting, querying and so on) but display the tags with the capitalization originally intended.
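A minimal sketch of that idea, assuming an in-memory registry: the lookup key is the case-folded tag, and the display form is whatever capitalization was entered first. The class name `TagRegistry` and the "first spelling wins" policy are illustrative assumptions, not something any particular site is known to do.

```python
class TagRegistry:
    """Stores tags under a case-insensitive key but remembers a display form."""

    def __init__(self):
        # case-folded key -> capitalization used when the tag was first added
        self._display = {}

    def add(self, tag: str) -> str:
        """Register a tag and return the display form that is kept for it."""
        cleaned = tag.strip()
        key = cleaned.casefold()
        # setdefault keeps the first spelling seen and ignores later variants
        return self._display.setdefault(key, cleaned)

    def lookup(self, tag: str):
        """Return the stored display form, or None if the tag is unknown."""
        return self._display.get(tag.strip().casefold())

    def __len__(self):
        return len(self._display)


registry = TagRegistry()
registry.add("Objective-C")
registry.add("objective-c")   # same tag, different case: no new entry
```

Sorting and querying would use the case-folded key, while pages would show the stored display form.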

岁月如刀 2024-07-20 14:38:15

Different cases should always be considered equivalent for tags.

That is another reason to store your tags normalized. The single normalized row holds the accepted capitalization, and tags are linked to posts through a many-to-many link table. Comparison against the tag table is case-insensitive, so there will never be duplicates.
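One way this layout could look, sketched with Python's built-in `sqlite3` module. The table names (`tag`, `post`, `post_tag`) and the helper `tag_post` are made up for illustration, and the schema relies on SQLite's `NOCASE` collation, which only folds ASCII letters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tag (
    id   INTEGER PRIMARY KEY,
    -- NOCASE makes both the UNIQUE constraint and '=' comparisons ignore ASCII case
    name TEXT NOT NULL UNIQUE COLLATE NOCASE
);
CREATE TABLE post (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
-- many-to-many link table between posts and tags
CREATE TABLE post_tag (
    post_id INTEGER NOT NULL REFERENCES post(id),
    tag_id  INTEGER NOT NULL REFERENCES tag(id),
    PRIMARY KEY (post_id, tag_id)
);
""")


def tag_post(conn, post_id, tag_name):
    """Link a post to a tag, creating the tag row only if no case variant exists."""
    conn.execute("INSERT OR IGNORE INTO tag (name) VALUES (?)", (tag_name,))
    (tag_id,) = conn.execute(
        "SELECT id FROM tag WHERE name = ?", (tag_name,)
    ).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO post_tag (post_id, tag_id) VALUES (?, ?)",
        (post_id, tag_id),
    )
    return tag_id


conn.execute("INSERT INTO post (id, title) VALUES (1, 'First post')")
conn.execute("INSERT INTO post (id, title) VALUES (2, 'Second post')")
tag_post(conn, 1, "IBM")
tag_post(conn, 2, "ibm")   # resolves to the existing 'IBM' row
```

The first spelling inserted stays in the `tag` table as the accepted case; later variants only add rows to the link table.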

自我难过 2024-07-20 14:38:15

(I am not giving advice for any particular site or system in this answer; each specific system may have its own considerations.)

I guess the reason is to prevent duplication and ease sorting or identification (it's easier if you do not need to consider multiple options). And possibly to maintain some consistency, as many web user interfaces are geared towards people who will sometimes bother to capitalize correctly and sometimes not.

But then, those are problems anyway, because there is all too often more than one way to refer to something. If your tags are ever used as symbols in some sort of script, configuration, or code (e.g. mail filters, settings files, command lines), it's good to have some simple convention for specifying them, and if all symbols are of similar significance, allowing or distinguishing between different case variations, delimiters, etc. can be problematic. As a Unix user, I try to keep file names simple, short, lowercase, and without special characters, and more so when they are (for example) mailbox names or source files, as they are likely to have to be typed and specified in many contexts where doing otherwise would be inconvenient.

On the other hand, when using a sophisticated graphical or web-based interface which allows easy selection among a list, completion of typed entry, suggests closest matches, etc., it makes sense to allow some sort of mapping. Give each tag a short simple lowercase identifying name, but allow giving it also a "long" or "human" name, which will be shown where it makes sense. Tags can be uniquely identified and specified by their short name, but read more conveniently by their long name.
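The short-name/long-name mapping could be sketched roughly like this. The names `Tag`, `make_short_name` and `register`, and the slug rule itself (lowercase, runs of other characters collapsed to a dash, `+` and `#` kept), are assumptions chosen for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    short_name: str  # unique lowercase identifier, used in URLs, scripts, queries
    long_name: str   # human-readable label, shown wherever it makes sense


def make_short_name(long_name: str) -> str:
    """Derive a simple lowercase identifier from a human-readable name."""
    lowered = long_name.strip().lower()
    # Collapse anything that is not a-z, 0-9, '+' or '#' into a single dash
    return re.sub(r"[^a-z0-9+#]+", "-", lowered).strip("-")


def register(tags: dict, long_name: str) -> Tag:
    """Add a tag keyed by its short name; an existing entry is left unchanged."""
    short = make_short_name(long_name)
    return tags.setdefault(short, Tag(short, long_name))


tags = {}
register(tags, "Objective-C")
register(tags, "Visual Studio 2008")
```

Typed input, filters and command lines would use `short_name`; listings and tag pages would show `long_name`.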

This is similar to how usernames work in many systems. I wouldn't choose a mixed-case username, and would rather have usernames treated case-insensitively (so I would just use the case that makes sense on the system I am on, which is lowercase on Unix but uppercase on some older systems). Most systems also store other information about users, such as their long or full name, which is nicer to read, and many user interfaces (e.g. Windows XP, Mac OS, and I guess also some newer Unix desktop interfaces like GNOME and KDE) therefore display it in desktop login choosers, messages, etc.

In the case of tags for community systems on the web, I guess the solution to the duplication problem is some level of moderation to tags, even if just by the community itself, and the ability to rename and merge tags (unlike usernames in most cases) or edit their long names, in case something was mistagged.
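As a rough illustration of merging, here is a sketch in which a merged tag becomes a synonym of the surviving one, so later uses of the old name land on the canonical tag. The `TagMerger` class and its in-memory dictionaries are hypothetical; a real site would keep this in its database and restrict who may merge.

```python
class TagMerger:
    """Keeps posts per canonical tag and redirects merged tags to their target."""

    def __init__(self):
        self.posts_by_tag = {}  # canonical tag -> set of post ids
        self.synonyms = {}      # merged tag -> tag it was merged into

    def resolve(self, tag: str) -> str:
        """Follow synonym links until a canonical tag is reached."""
        tag = tag.lower()
        while tag in self.synonyms:
            tag = self.synonyms[tag]
        return tag

    def tag_post(self, post_id: int, tag: str) -> None:
        self.posts_by_tag.setdefault(self.resolve(tag), set()).add(post_id)

    def merge(self, source: str, target: str) -> None:
        """Move every post from `source` to `target` and remember the redirect."""
        source, target = self.resolve(source), self.resolve(target)
        if source == target:
            return  # already the same tag, nothing to do
        moved = self.posts_by_tag.pop(source, set())
        self.posts_by_tag.setdefault(target, set()).update(moved)
        self.synonyms[source] = target


merger = TagMerger()
merger.tag_post(1, "objc")
merger.tag_post(2, "objective-c")
merger.merge("objc", "objective-c")
merger.tag_post(3, "ObjC")   # redirected to 'objective-c'
```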

苄①跕圉湢 2024-07-20 14:38:15

I'd like to see tags being representative of what they categorise. In this respect, tags should follow the exact same form as the thing they are describing.

From a technical point of view I can see where problems may arise; however, I don't see that as a reason not to fully investigate a solution.

I work in digital publishing and I can see the benefit of following correct usage. On the flip side, you'd be hard pushed to see full lowercase being used in a magazine, book or newspaper (unless it was a stylistic choice).

http://en.wikipedia.org/wiki/List_of_case-sensitive_English_words

That said, the beauty of the English lexicon is its ability to adapt, modify and evolve.

娇女薄笑 2024-07-20 14:38:15

That sounds like a valid point to me. I'm sure they could come up with some simple parsing to capitalize each word (separated by dashes), but how would you know that it's supposed to be IBM instead of Ibm? I think someone would have to manually change the tag lookup table to accomplish this.
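A small sketch of that approach: capitalize each dash-separated word, and fall back to a hand-maintained table for names the rule gets wrong. The `OVERRIDES` table and the `display_name` function are hypothetical, and the entries are just examples.

```python
# Hand-maintained exceptions; someone has to add these manually.
OVERRIDES = {
    "ibm": "IBM",
    "objective-c": "Objective-C",
}


def display_name(tag: str) -> str:
    """Turn a lowercase tag into a display form, word by word."""
    key = tag.lower()
    if key in OVERRIDES:
        return OVERRIDES[key]
    # Capitalize each dash-separated word unless that word has its own override
    words = key.split("-")
    return "-".join(OVERRIDES.get(word, word.capitalize()) for word in words)


display_name("visual-studio")  # 'Visual-Studio'
display_name("ibm")            # 'IBM', but only because of the override table
```

Without the table entry, the simple rule would produce "Ibm", which is exactly the problem raised above.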

偏爱你一生 2024-07-20 14:38:15

I agree that in principle this could be done in a more sophisticated manner. For example, you could implement a similarity metric that could recognize all of these as being likely synonyms:

  • IBM
  • ibm
  • I B M
  • I. B. M.
  • I.B.M.

However, there's a tradeoff between the increased runtime (not to mention development effort) and the increase in utility.

It's also been my general experience that as heuristics become more complex, their failure modes become more mysterious and bizarre. At least the convert-alphabetics-to-standard-case technique is easy for humans to understand and do in their heads when they have questions.
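To make the tradeoff concrete, here is a deliberately crude stand-in for such a heuristic. It is not a real similarity metric, just a normalization key that case-folds the tag and drops everything except letters and digits; the function name `loose_key` is made up for the example.

```python
def loose_key(tag: str) -> str:
    """Case-fold the tag and keep only letters and digits."""
    return "".join(ch for ch in tag.casefold() if ch.isalnum())


variants = ["IBM", "ibm", "I B M", "I. B. M.", "I.B.M."]
keys = {loose_key(v) for v in variants}   # every variant maps to 'ibm'

# The same rule also merges tags that should stay distinct:
surprising = {loose_key(t) for t in ["C", "C++", "C#"]}   # all map to 'c'
```

All five spellings from the list collapse to one key, but so do "C", "C++" and "C#", which is the kind of surprising failure mode described above. Plain case conversion would have kept those three apart.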

白龙吟 2024-07-20 14:38:15

When typing, you would have to turn on caps lock to make everything upper-case. People are lazy.

情话墙 2024-07-20 14:38:14

As you already noticed, it prevents duplication. People are not consistent in their capitalization. Just look at the tags here and notice that people can't decide whether it's "objective-c", "objc" or "objectivec". Throw in "Objective-C", "Objective-c" and so on, and you'd have a real mess.

Note I'm not saying it would be impossible to deal with capitals, just difficult. For example, how do you know the correct capitalization? Just accept the first one entered as correct? Rely on moderators to clean up?
