A better way to split a binary into chunks, preferably using a bitstring comprehension

Posted on 2024-11-04 15:50:00

I'm trying to replace the following function with something more elegant:

split_packet(_, <<>>) ->
    [];
split_packet(Size, P) when byte_size(P) < Size ->
    [ P ];
split_packet(Size, P) ->
    {Chunk, Rest} = split_binary(P, Size),
    [ Chunk | split_packet(Size, Rest) ].

(I know this is not tail recursive. I wanted to keep it simple, and performance-wise it does not matter in newer Erlang versions.)

Example output:

1> split_packet(3, <<1,2,3,4,5,6,7,8>>).
[<<1,2,3>>,<<4,5,6>>,<<7,8>>]

An elegant solution using list comprehensions would be preferable: the result is further processed by a list comprehension, so the two could then be merged into a single comprehension.

I tried

[ X || <<X:Size/binary>> <= P ].

but this drops the last chunk if byte_size(P) is not a multiple of Size:

2>  [ X || <<X:3/binary>> <= <<1,2,3,4,5,6,7,8>> ].
[<<1,2,3>>,<<4,5,6>>]

Comments (3)

楠木可依 2024-11-11 15:50:00

Frankly, I don't see much wrong with your current version. As you state, you can't do it with a binary/list comprehension because the last fragment will be discarded.

The only thing I can think of is reordering the clauses to match the most frequent case first:

split_packet(Size, P) when byte_size(P) >= Size ->
    {Chunk, Rest} = split_binary(P, Size),
    [Chunk|split_packet(Size, Rest)];
split_packet(_Size, <<>>) ->
    [];
split_packet(_Size, P)  ->
    [P].
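
A comprehension on its own drops the tail, but it can be combined with an explicit remainder. The following is a minimal sketch, not part of the answer above: it assumes Size is a positive integer, and the module name chunk_split is purely illustrative. The idea is to cut P into a prefix whose length is a multiple of Size plus a short tail, run the comprehension over the prefix, and append the tail only if it is non-empty.

```erlang
-module(chunk_split).
-export([split_packet/2]).

%% Assumes Size > 0; the guard makes a zero or negative Size fail fast.
split_packet(Size, P) when is_integer(Size), Size > 0, is_binary(P) ->
    %% Number of bytes covered by whole chunks.
    Full = (byte_size(P) div Size) * Size,
    <<Head:Full/binary, Tail/binary>> = P,
    %% Whole chunks come from the bitstring generator; the second
    %% comprehension has only a filter, so it yields [Tail] or [].
    [X || <<X:Size/binary>> <= Head] ++ [Tail || Tail =/= <<>>].
```

With the question's input, chunk_split:split_packet(3, <<1,2,3,4,5,6,7,8>>) gives [<<1,2,3>>,<<4,5,6>>,<<7,8>>], and an empty binary gives [].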

烟织青萝梦 2024-11-11 15:50:00

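You could pad the input binary up to a multiple of Size, run it through your list comprehension [ X || <<X:Size/binary>> <= P ], and then chop the padding off the last chunk. The number of padding bits is:

Y = ((Size - (byte_size(Binary) rem Size)) rem Size) * 8

[ X || <<X:Size/binary>> <= <<Binary/binary, 0:Y>> ]

The outer rem Size keeps Y at 0 when byte_size(Binary) is already a multiple of Size; without it, a whole extra chunk of zeros would be appended.

The last chunk then carries Y div 8 padding bytes, which still have to be chopped off.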
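
The pad-then-trim idea can be sketched end to end as follows. This is only an illustration under stated assumptions: the module name pad_split and the helper trim_last/2 are hypothetical, Size is assumed to be a positive integer, and the padding is taken modulo Size so that nothing is appended when the input is already a multiple of Size.

```erlang
-module(pad_split).
-export([split_packet/2]).

%% An empty input has no chunks at all, so there is nothing to trim.
split_packet(_Size, <<>>) ->
    [];
split_packet(Size, Binary) when is_integer(Size), Size > 0 ->
    %% Bytes needed to reach the next multiple of Size (0 if already there).
    PadBytes = (Size - (byte_size(Binary) rem Size)) rem Size,
    PadBits = PadBytes * 8,
    Chunks = [X || <<X:Size/binary>> <= <<Binary/binary, 0:PadBits>>],
    %% Only the last chunk contains padding; keep its real bytes.
    trim_last(Chunks, Size - PadBytes).

%% trim_last/2 (hypothetical helper): shorten the final chunk to Keep bytes.
trim_last([Last], Keep) ->
    [binary:part(Last, 0, Keep)];
trim_last([H | T], Keep) ->
    [H | trim_last(T, Keep)].
```

Compared with the plain recursive versions, this copies the input once to pad it and walks the result list a second time to trim it, so it is arguably more work for the same output.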

慕巷 2024-11-11 15:50:00

A variation of your original that might be a little more efficient:

split_packet(Size, Data) when Size > 0 ->
    case Data of
        <<Packet:Size/binary, Rest/binary>> ->
            [Packet | split_packet(Size, Rest)];
        <<>> ->
            [];
        _ ->
            [Data]
    end.
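
Since the question wants to feed the chunks into a further list comprehension, here is a small sketch of how the function above composes with one. The module name split_demo and the framed/2 step are hypothetical, chosen only for illustration: framed/2 prefixes each chunk with its length in one byte, so it assumes Size is at most 255.

```erlang
-module(split_demo).
-export([split_packet/2, framed/2]).

%% Same function as in the answer above. The Size > 0 guard matters:
%% with Size = 0 the first case clause would always match and recurse forever.
split_packet(Size, Data) when Size > 0 ->
    case Data of
        <<Packet:Size/binary, Rest/binary>> ->
            [Packet | split_packet(Size, Rest)];
        <<>> ->
            [];
        _ ->
            [Data]
    end.

%% Hypothetical downstream step: prepend a 1-byte length to every chunk.
%% Assumes Size =< 255 so the length fits in 8 bits.
framed(Size, Data) ->
    [<<(byte_size(C)):8, C/binary>> || C <- split_packet(Size, Data)].
```

For example, split_demo:framed(3, <<1,2,3,4,5,6,7,8>>) gives [<<3,1,2,3>>,<<3,4,5,6>>,<<2,7,8>>].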