Why do we use the flush parameter in the Encoder.GetBytes method?
This link explains the Encoder.GetBytes method, including a bool parameter called flush. The explanation of flush is:
"true if this encoder can flush its state at the end of the conversion; otherwise, false. To ensure correct termination of a sequence of blocks of encoded bytes, the last call to GetBytes can specify a value of true for flush."
But I didn't understand what flush does; maybe I am drunk or something :). Can you explain it in more detail, please?
Suppose you receive data over a socket connection. You will receive a long text as several byte[] blocks.
It is possible that one Unicode character occupies two or more bytes in a UTF-8 stream and that it is split across two byte blocks. Decoding the two byte blocks separately (and concatenating the strings) would corrupt that character.
So you can only specify flush=true on the last block. And of course, if you only have one block, then that is also the last.
Tip: Use a TextReader and let it handle this problem for you.
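
The split-character case described above can be sketched roughly as follows. This is only an illustration, not code from the answer: the sample bytes, the two-block split, and the helper name AppendDecoded are made up, and it assumes the default Encoding.UTF8, which substitutes replacement characters for invalid input rather than throwing.

```csharp
using System;
using System.Text;

// "é" (U+00E9) is the two bytes 0xC3 0xA9 in UTF-8.
// Assumed scenario: the character is split across two received blocks.
byte[] block1 = { 0x63, 0x61, 0x66, 0xC3 }; // "caf" + first byte of "é"
byte[] block2 = { 0xA9, 0x21 };             // second byte of "é" + "!"

// Naive approach: decode each block on its own and concatenate.
string naive = Encoding.UTF8.GetString(block1) + Encoding.UTF8.GetString(block2);

// Stateful approach: one Decoder keeps the dangling byte between calls.
Decoder decoder = Encoding.UTF8.GetDecoder();
var text = new StringBuilder();
AppendDecoded(block1, flush: false); // more blocks will follow
AppendDecoded(block2, flush: true);  // last block, so flush

Console.WriteLine(naive == "caf\u00E9!");           // the split character is damaged
Console.WriteLine(text.ToString() == "caf\u00E9!"); // the split character survives

// Hypothetical helper: decodes one block and appends the resulting chars.
void AppendDecoded(byte[] block, bool flush)
{
    char[] chars = new char[Encoding.UTF8.GetMaxCharCount(block.Length)];
    int written = decoder.GetChars(block, 0, block.Length, chars, 0, flush);
    text.Append(chars, 0, written);
}
```

The point of the sketch is only that the Decoder instance, not the individual call, owns the leftover byte, which is why flush should stay false until the final block.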
Edit
The mirror problem (the one that was actually asked about: GetBytes) is slightly harder to explain.
Using flush=true is much like calling Encoder.Reset() after GetBytes(...): it clears the 'state' of the encoder, after first writing out anything that is still pending.
The basic idea is the same: when converting from string to blocks of bytes, or vice versa, the blocks are not independent.
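
For the GetBytes direction, one way to see the state is a surrogate pair split across two char blocks. The following is a minimal sketch under assumptions: the sample text, the block boundaries, and the helper EncodeInBlocks are hypothetical, and it relies on the default Encoding.UTF8 replacing a lone surrogate with U+FFFD.

```csharp
using System;
using System.Text;

// U+1F600 is a single character but two UTF-16 chars (a surrogate pair).
// Assumed scenario: the pair is split across two char blocks.
char[] block1 = { 'h', 'i', '\uD83D' }; // ends with the high surrogate
char[] block2 = { '\uDE00', '!' };      // starts with the low surrogate

string correct = EncodeInBlocks(flushFirstBlock: false);
string broken = EncodeInBlocks(flushFirstBlock: true);

Console.WriteLine(correct); // bytes when only the last call flushes
Console.WriteLine(broken);  // bytes when the first call flushes too early

// Hypothetical helper: encodes both blocks with one Encoder and
// returns the produced bytes as a hex string.
string EncodeInBlocks(bool flushFirstBlock)
{
    Encoder encoder = Encoding.UTF8.GetEncoder();
    byte[] buffer = new byte[16];
    // With flush=false the encoder holds on to the dangling high surrogate.
    int n1 = encoder.GetBytes(block1, 0, block1.Length, buffer, 0, flushFirstBlock);
    // The last block is always flushed.
    int n2 = encoder.GetBytes(block2, 0, block2.Length, buffer, n1, flush: true);
    return BitConverter.ToString(buffer, 0, n1 + n2);
}
```

When the first call flushes, the encoder has to deal with the lone high surrogate immediately, so the two halves are each replaced instead of being combined into one four-byte sequence.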
Flushing will reset the internal state of the encoder instance used to encode the text into bytes. Why does it need internal state, you ask? Well, to quote MSDN:
Hence, if you're using multiple GetBytes() calls, you would want to flush the internal state at the end to terminate any character sequences that need terminating, but only at the end, since terminating sequences might otherwise be introduced in the middle of words.
Note that this may be a purely theoretical problem these days. And you'd be better off using higher-level wrappers anyway. If you do, being drunk will not be a problem.
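
As a rough illustration of the "higher-level wrapper" advice, a StreamWriter can take the same kind of split input and manage the encoder itself. This is a sketch under assumptions, not the answerer's code: the sample blocks, the MemoryStream target, and the buffer size of 1024 are arbitrary choices.

```csharp
using System;
using System.IO;
using System.Text;

// Assumed scenario: text arrives as char blocks, with a surrogate pair
// (U+1F600) split across the block boundary.
char[] block1 = { 'h', 'i', '\uD83D' };
char[] block2 = { '\uDE00', '!' };

using var stream = new MemoryStream();

// UTF8Encoding(false) means no byte order mark is written.
using (var writer = new StreamWriter(stream, new UTF8Encoding(false), 1024, leaveOpen: true))
{
    writer.Write(block1); // no flush argument to think about
    writer.Write(block2);
} // disposing the writer flushes it and its encoder

byte[] bytes = stream.ToArray();
Console.WriteLine(BitConverter.ToString(bytes));
```

The caller never passes a flush flag here; the writer owns the Encoder and flushes it once, when it is closed.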
Internally the Encoder would be implemented with a buffer. This buffer may need to be flushed (cleared) in order to end the read correctly or prepare the Encoder for the next read.
Here is one explanation of buffer flushing.
The exact usage of the flush parameter is described here: