How do I implement a lazy stream chunk enumerator?

Posted on 2025-01-02 22:11:17

I'm trying to split a byte stream into chunks of increasing size.

The source stream contains an unknown number of bytes and is expensive to read. The output of the enumerator should be byte arrays of increasing size, starting at 8 KB and growing up to 1 MB.

This would be simple if I read the whole stream into an array and took the relevant pieces out. However, since the stream may be very large, reading it all at once is infeasible. Also, while performance is not the main concern, it is important to keep system load very low.

While implementing this I noticed that it's relatively difficult to keep the code short and maintainable. There are a few stream-related issues to keep in mind, too (for instance, Stream.Read might not fill the buffer even though it succeeded).

I did not find any existing classes that help for my case, nor could I find something close on the net. How would you implement such a class?


缱绻入梦 2025-01-09 22:11:17
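public IEnumerable<BufferWrapper> getBytes(Stream stream)
{
    // Buffer sizes to step through, smallest first.
    List<int> bufferSizes = new List<int>() { 8192, 65536, 220160, 1048576 };
    int count = 0;
    int bufferSizePosition = 0;
    byte[] buffer = new byte[bufferSizes[0]];
    bool done = false;
    while (!done)
    {
        BufferWrapper nextResult = new BufferWrapper();
        // Read may return fewer bytes than requested; bytesRead says how many are valid.
        nextResult.bytesRead = stream.Read(buffer, 0, buffer.Length);
        // The same array is reused until the size changes, so the caller
        // must finish with it before asking for the next item.
        nextResult.buffer = buffer;
        done = nextResult.bytesRead == 0;
        if (!done)
        {
            yield return nextResult;
            count++;
            // Step up after more than 10 reads, unless already at the largest size.
            // (Checking against Count - 1 avoids indexing past the end of the list.)
            if (count > 10 && bufferSizePosition < bufferSizes.Count - 1)
            {
                count = 0;
                bufferSizePosition++;
                buffer = new byte[bufferSizes[bufferSizePosition]];
            }
        }
    }
}

public class BufferWrapper
{
    public byte[] buffer { get; set; }
    public int bytesRead { get; set; }
}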

Obviously the logic for when to move up in buffer size, and how to choose what that size is could be altered.

Someone could also probably find a better way of handling the last buffer to be sent, as this isn't the most efficient way.
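
One way to deal with Stream.Read returning fewer bytes than requested is to loop until the buffer is full or the stream ends. The following is only a minimal sketch of that idea; the class name StreamReadHelper and the method name FillBuffer are made up for illustration and are not part of the answer above. Newer .NET versions (7 and later, as far as I know) also offer Stream.ReadAtLeast for the same purpose.

```csharp
using System;
using System.IO;

public static class StreamReadHelper
{
    // Hypothetical helper: keeps calling Read until the buffer is full
    // or the stream ends, and returns the number of bytes actually stored.
    public static int FillBuffer(Stream stream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = stream.Read(buffer, total, buffer.Length - total);
            if (read == 0)
            {
                break; // end of stream
            }
            total += read;
        }
        return total;
    }
}
```

With a helper like this, the call to stream.Read in getBytes could be swapped for StreamReadHelper.FillBuffer(stream, buffer), so that only the final chunk can be shorter than the buffer.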

是你 2025-01-09 22:11:17

For reference, here is the implementation I currently use, which already includes improvements based on the answer by @Servy:

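private const int InitialBlockSize = 8 * 1024;
private const int MaximumBlockSize = 1024 * 1024;

private Stream _Stream;
private int _Size = InitialBlockSize; // a negative value means the stream is exhausted

public byte[] Current
{
    get;
    private set;
}

public bool MoveNext ()
{
    if (_Size < 0) {
        return false;
    }

    var buf = new byte[_Size];
    int count = 0;

    // Stream.Read may return fewer bytes than requested, so keep reading
    // until the block is full or the stream ends.
    while (count < _Size) {
        int read = _Stream.Read (buf, count, _Size - count);

        if (read == 0) {
            break;
        }

        count += read;
    }

    if (count == 0) {
        // Nothing left to read: do not report an empty block.
        _Size = -1;
        return false;
    }

    if (count == _Size) {
        Current = buf;
        // Double the block size until it reaches the maximum.
        if (_Size <= MaximumBlockSize / 2) {
            _Size *= 2;
        }
    }
    else {
        // Last, partially filled block: trim it to the bytes actually read.
        Current = new byte[count];
        Array.Copy (buf, Current, count);
        _Size = -1;
    }

    return true;
}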
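
The same logic can also be written as an iterator method, which lets the compiler generate the enumerator state machine. This is a rough sketch rather than a drop-in replacement; the names StreamChunker and ReadChunks and the optional parameters are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class StreamChunker
{
    // Hypothetical names and defaults, chosen for this sketch only.
    public static IEnumerable<byte[]> ReadChunks (
        Stream stream, int initialSize = 8 * 1024, int maximumSize = 1024 * 1024)
    {
        int size = initialSize;

        while (true) {
            var buffer = new byte[size];
            int count = 0;

            // Stream.Read may return fewer bytes than requested, so loop
            // until the chunk is full or the stream ends.
            while (count < size) {
                int read = stream.Read (buffer, count, size - count);
                if (read == 0) {
                    break;
                }
                count += read;
            }

            if (count == 0) {
                yield break; // stream exhausted, no empty chunk is produced
            }

            if (count < size) {
                // Final, partially filled chunk: trim it and stop.
                Array.Resize (ref buffer, count);
                yield return buffer;
                yield break;
            }

            yield return buffer;

            // Double the chunk size, capped at the maximum.
            size = size <= maximumSize / 2 ? size * 2 : maximumSize;
        }
    }
}
```

Nothing is read until the caller starts iterating, and each chunk is read only when the next item is requested. For a 20,000-byte stream, a foreach over StreamChunker.ReadChunks (stream) would produce two arrays, of 8,192 and 11,808 bytes.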