Sending binary-safe data over the network in Perl

Posted 2024-12-10 03:07:05

I'm implementing a network client that sends messages to a server. The messages are streams of bytes, and the protocol requires that I send the length of each stream beforehand.

If the message that I am given (by the code using my module) is a byte string, then the length is given easily enough by length $string. But if it's a string of characters, I'll need to massage it to get the raw bytes. What I'm doing now is basically this:

my $msg = shift;   # some message from calling code
my $bytes;
if ( utf8::is_utf8( $msg ) ) { 
    $bytes = Encode::encode( 'utf-8', $msg );
} else { 
    $bytes = $msg;
}

my $length = length $bytes;

Is this the correct way to handle this? It seems to work so far, but I haven't done any serious testing yet. What potential pitfalls are there with this approach?

Thanks

梦里泪两行 2024-12-17 03:07:05

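You shouldn't really guess at what your input is. Define your code to accept either byte strings or Unicode strings, and leave it to the caller to convert the input to the correct format (or give the caller some way to specify which kind of string they are supplying).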

If you define your code to accept byte strings, then any character above \xFF is an error.
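
One way to enforce the byte-string rule is sketched below. The helper name assert_bytes and its error message are illustrative assumptions, not part of the answer; the check itself only relies on a regex character class, which matches by code point regardless of how Perl stores the string.

```perl
use strict;
use warnings;

# Hypothetical helper (name and message are assumptions for illustration):
# accept only byte strings, i.e. strings whose code points are all <= 0xFF.
sub assert_bytes {
    my ($msg) = @_;
    die "Wide character in byte string\n" if $msg =~ /[^\x00-\xFF]/;
    return $msg;
}

# Every code point here is <= 0xFF, so the string is accepted unchanged.
my $ok = assert_bytes("caf\xE9");

# U+263A is above 0xFF, so assert_bytes dies and eval returns undef.
my $bad = eval { assert_bytes("\x{263A}") };
```

With this contract, length($ok) is already the number of bytes to announce to the server, and a caller holding text is expected to encode it first.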

If you define your code to accept Unicode strings, then you can convert them to bytes with Encode::encode_utf8(), and you should do so regardless of how Perl represents them internally.

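Either way, calling utf8::is_utf8() is usually a mistake: your program shouldn't care about the internal representation of strings, only about the actual data (the sequence of characters) they contain. Whether some of those characters (specifically, those in the range \x80 to \xFF) are represented internally by one byte or two shouldn't matter.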
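
A small sketch of why the internal flag is a poor guide, assuming the message is the single character U+00E9. The variable names are illustrative; utf8::upgrade, utf8::is_utf8 and Encode::encode_utf8 are standard Perl functions.

```perl
use strict;
use warnings;
use Encode qw( encode_utf8 );

# Two copies of the same one-character string, U+00E9.
my $downgraded = "\xE9";
my $upgraded   = "\xE9";
utf8::upgrade($upgraded);   # changes only the internal storage, not the characters

# The strings are equal as character sequences...
my $same_chars = ($downgraded eq $upgraded);

# ...but utf8::is_utf8() reports different internal flags for them.
my $flag_down = utf8::is_utf8($downgraded) ? 1 : 0;
my $flag_up   = utf8::is_utf8($upgraded)   ? 1 : 0;

# A check based on is_utf8 would send 1 byte for one copy and 2 for the other.
# Encoding unconditionally yields the same 2 bytes (C3 A9) for both.
my $bytes_down = encode_utf8($downgraded);
my $bytes_up   = encode_utf8($upgraded);
```

Because the two copies compare equal but would be sent differently under an is_utf8 test, the server could receive different payloads for what the caller considers the same message.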

P.S. Reading perldoc Encode may help clarify the issue of bytes and characters in Perl.

短叹 2024-12-17 03:07:05

The sender:

use Encode qw( encode_utf8 );

sub pack_text {
   my ($text) = @_;
   my $bytes = encode_utf8($text);
   die "Text too long" if length($bytes) > 4294967295;
   return pack('N/a*', $bytes);
}

The receiver:

use Encode qw( decode_utf8 );

sub read_bytes {
   my ($fh, $to_read) = @_;
   my $buf = '';
   while ($to_read > 0) {
      my $bytes_read = read($fh, $buf, $to_read, length($buf));
      die $! if !defined($bytes_read);
      die "Premature EOF" if !$bytes_read;
      $to_read -= $bytes_read;
   }
   return $buf;
}

sub read_uint32 {
   my ($fh) = @_;
   return unpack('N', read_bytes($fh, 4));
}

sub read_text {
   my ($fh) = @_;
   return decode_utf8(read_bytes($fh, read_uint32($fh)));
}
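
To make the wire format concrete, here is a minimal sketch of the frame that a pack('N/a*', ...) call like the sender's produces, taken apart by hand. The sample text and variable names are assumptions for illustration; a real receiver would read from a socket as in read_bytes above rather than from a string.

```perl
use strict;
use warnings;
use Encode qw( encode_utf8 decode_utf8 );

# Sample text (an assumption for illustration): 6 characters, 9 bytes in UTF-8.
my $text  = "caf\x{E9} \x{263A}";
my $frame = pack('N/a*', encode_utf8($text));

# The first 4 bytes hold the payload length as a big-endian 32-bit integer.
my $len = unpack('N', substr($frame, 0, 4));

# The remaining bytes are the UTF-8 payload itself.
my $payload = substr($frame, 4);
my $decoded = decode_utf8($payload);
```

The length prefix counts encoded bytes, not characters, which is why the text is encoded before its length is measured.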
