
Browser code page detection

Posted 2024-10-01 23:01:51

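I have an ASP.Net page where the user can enter some text in a TEXTAREA and submit it to the server. The text is stored in a database and displayed in a WinForms application.

How can I make sure the WinForms application displays exactly the characters the user entered in the TEXTAREA?

That is, do I have a potential problem if, for instance, the user enters language-specific letters such as Æ, Ø and Å (Danish letters)?
These letters have different codes depending on the code page, so as far as I can tell I need to know which code page the TEXTAREA control uses for its input. Or am I missing something here?

I have tried to find material about this on the net, but it is hard to find anything that addresses it. What I usually find are pages discussing which code page the server asks the browser to use in order to display the data it sends correctly.

But my problem is the other way around, i.e. from the client to the server.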


满栀 2024-10-08 23:01:52

You could also use HEBCI (HTML Entity-Based Codepage Inference) if you really want to be sure that users sending text with crappy browsers don't corrupt your data.

In essence this is how it works:

Every code page has its own fingerprint. For instance, the single entity "&ordm;" (º) is enough to distinguish between the Big Three: ISO-8859-1/Windows-1252 (=BA), MacRoman (=BC), and UTF-8 (=C2BA).
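
To make the fingerprint idea concrete, here is a minimal Python sketch that encodes º (U+00BA) in the encodings named above and prints the bytes as hex. Python and its codec names (`latin-1`, `cp1252`, `mac_roman`, `utf-8`) are my choice for illustration only, and `hex_fingerprint` is a hypothetical helper name.

```python
# Encode the masculine ordinal indicator (U+00BA, the &ordm; entity) in
# several code pages and show the resulting bytes as lowercase hex.
PROBE = "\u00ba"


def hex_fingerprint(char: str, encoding: str) -> str:
    """Return the bytes of `char` in `encoding` as a lowercase hex string."""
    return char.encode(encoding).hex()


if __name__ == "__main__":
    for enc in ("latin-1", "cp1252", "mac_roman", "utf-8"):
        print(f"{enc:10} {hex_fingerprint(PROBE, enc)}")
```

The same character yields a different byte sequence in each family, which is what lets the server tell them apart.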

In a form, you simply add a hidden input containing those fingerprint characters as entities (like &deg;, &divide;, and &mdash;). When the user submits the form, you check the returned hex values and compare them against your fingerprint table.
If this does not give a match, only then fall back to other solutions.
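
The two halves of that step (emitting the hidden input, then reading back the raw bytes) might look roughly like the Python sketch below. Several things here are assumptions rather than part of the technique as described: the field name `_fp`, the ASCII comma used to separate probes, and the idea that your framework gives you the raw, still percent-encoded request body. The sketch also assumes that a browser submits a character it cannot encode as a numeric reference such as `&#8212;`, which is recorded here as an empty string.

```python
from urllib.parse import unquote_to_bytes

FP_ENTITIES = ["deg", "divide", "mdash"]  # probe entities named in the text
FP_FIELD = "_fp"                          # hypothetical hidden field name


def hidden_input(entities=FP_ENTITIES, field=FP_FIELD) -> str:
    """Build the hidden input; probes are separated by ASCII commas."""
    value = ",".join(f"&{name};" for name in entities)
    return f'<input type="hidden" name="{field}" value="{value}">'


def submitted_hex(raw_body: bytes, field=FP_FIELD) -> list:
    """Return one hex string per probe from a raw urlencoded form body.

    The body must be the undecoded bytes of the request: once a framework
    has decoded it to text, the byte-level fingerprint is gone.
    """
    for pair in raw_body.split(b"&"):
        name, _, value = pair.partition(b"=")
        if unquote_to_bytes(name) == field.encode("ascii"):
            raw = unquote_to_bytes(value.replace(b"+", b" "))
            # Assumption: a numeric character reference means the browser
            # could not encode that probe, so it gets an empty fingerprint.
            return ["" if c.startswith(b"&#") else c.hex()
                    for c in raw.split(b",")]
    raise KeyError(field)


if __name__ == "__main__":
    print(hidden_input())
    print(submitted_hex(b"comment=hi&_fp=%C2%B0%2C%C3%B7%2C%E2%80%94"))
```

The list returned by `submitted_hex` is what gets compared against the fingerprint table.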

A slightly larger implementation works great with only five codepoints:

# Probe entities, in the order used by every fingerprint row below.
my @fp_ents = qw/deg divide mdash bdquo euro/;

# Hex bytes expected for each probe, per code page. An empty string
# presumably marks a probe that the code page cannot encode.
my %fingerprints = (
  "UTF-8"        => ['c2b0','c3b7','e28094','e2809e','e282ac'],
  "WINDOWS-1252" => ['b0','f7','97','84','80'],
  "MAC"          => ['a1','d6','d1','e3','db'],
  "MS-HEBR"      => ['b0','ba','97','84','80'],
  "MAC-CYRILLIC" => ['a1','d6','d1','d7',''],
  "MS-GREEK"     => ['b0','','97','84','80'],
  "MAC-IS"       => ['a1','d6','d0','e3',''],
  "MS-CYRL"      => ['b0','','97','84','88'],
  "MS932"        => ['818b','8180','815c','',''],
  "WINDOWS-31J"  => ['818b','8180','815c','',''],
  "WINDOWS-936"  => ['a1e3','a1c2','a1aa','',''],
  "MS_KANJI"     => ['818b','8180','','',''],
  "ISO-8859-15"  => ['b0','f7','','','a4'],
  "ISO-8859-1"   => ['b0','f7','','',''],
  "CSIBM864"     => ['80','dd','','',''],
);
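
The answer stops at the table, so the lookup itself is left open. One possible way to do it, sketched in Python with the table values copied from above, is an exact match on all five probes. The function name `infer_codepages` and the choice to return every matching name are my assumptions; a list is returned because some rows (MS932 and WINDOWS-31J) share the same fingerprint.

```python
# Probe entities, in the order used by each fingerprint row.
FP_ENTITIES = ("deg", "divide", "mdash", "bdquo", "euro")

# Values copied from the five-probe table in the answer.
FINGERPRINTS = {
    "UTF-8":        ("c2b0", "c3b7", "e28094", "e2809e", "e282ac"),
    "WINDOWS-1252": ("b0", "f7", "97", "84", "80"),
    "MAC":          ("a1", "d6", "d1", "e3", "db"),
    "MS-HEBR":      ("b0", "ba", "97", "84", "80"),
    "MAC-CYRILLIC": ("a1", "d6", "d1", "d7", ""),
    "MS-GREEK":     ("b0", "", "97", "84", "80"),
    "MAC-IS":       ("a1", "d6", "d0", "e3", ""),
    "MS-CYRL":      ("b0", "", "97", "84", "88"),
    "MS932":        ("818b", "8180", "815c", "", ""),
    "WINDOWS-31J":  ("818b", "8180", "815c", "", ""),
    "WINDOWS-936":  ("a1e3", "a1c2", "a1aa", "", ""),
    "MS_KANJI":     ("818b", "8180", "", "", ""),
    "ISO-8859-15":  ("b0", "f7", "", "", "a4"),
    "ISO-8859-1":   ("b0", "f7", "", "", ""),
    "CSIBM864":     ("80", "dd", "", "", ""),
}


def infer_codepages(observed):
    """Return the names of all code pages whose fingerprint equals `observed`.

    `observed` holds one hex string per probe, in FP_ENTITIES order.
    An empty result means no match, so the caller should fall back to
    another detection strategy.
    """
    key = tuple(h.lower() for h in observed)
    return [name for name, fp in FINGERPRINTS.items() if fp == key]


if __name__ == "__main__":
    print(infer_codepages(["b0", "f7", "97", "84", "80"]))
    print(infer_codepages(["818b", "8180", "815c", "", ""]))
```

An empty list corresponds to the "no match" case described earlier, where other fall-back solutions take over.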
