Erlang pattern matching bitstrings

Posted 2024-11-03 14:43:44

I'm writing code to decode messages from a binary protocol. Each message type is assigned a 1 byte type identifier and each message carries this type id. Messages all start with a common header consisting of 5 fields. My API is simple:

decoder:decode(Bin :: binary()) -> my_message_type() | {error, binary()}

My first instinct is to lean heavily on pattern matching: write one decode clause per message type and decode that message type completely in the function head.

decode(<<Hdr1:8, ?MESSAGE_TYPE_ID_X:8, Hdr3:8, Hdr4:8, Hdr5:32,
         TypeXField1:32, TypeXFld2:32, TypeXFld3:32>>) ->
    #message_x{hdr1=Hdr1, hdr3=Hdr3 ... fld4=TypeXFld3};

decode(<<Hdr1:8, ?MESSAGE_TYPE_ID_Y:8, Hdr3:8, Hdr4:8, Hdr5:32,
         TypeYField1:32, TypeYFld2:16, TypeYFld3:4, TypeYFld4:32,
         TypeYFld5:64>>) ->
    #message_y{hdr1=Hdr1, hdr3=Hdr3 ... fld5=TypeYFld5}.

Note that while the first 5 fields of the messages are structurally identical, the fields after that vary for each message type.

I have roughly 20 message types and thus 20 clauses similar to the above. Am I decoding the full message multiple times with this structure? Is it idiomatic? Would I be better off decoding only the message type field in the function head and then decoding the full message in the body of the function?

岁月静好 2024-11-10 14:43:44

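Just to agree that your style is very idiomatic Erlang. Don't split the decoding into separate parts unless you feel it makes your code clearer. Sometimes that kind of grouping is the more logical choice.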

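The compiler is smart and compiles the pattern matching so that the message is not decoded more than once. It first decodes the first two fields (bytes) and then uses the value of the second field, the message type, to decide how to handle the rest of the message. This works regardless of how long the common part of the binary is.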
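To make the pattern concrete, here is a minimal, compilable sketch of the multi-clause style from the question. The module name, the type id values (1 and 2), the record fields, and the body layouts are all assumptions made for illustration; the real protocol is not shown in the question.

```erlang
-module(decoder_sketch).
-export([decode/1]).

%% Hypothetical type ids; the real values are not given in the question.
-define(MESSAGE_TYPE_ID_X, 1).
-define(MESSAGE_TYPE_ID_Y, 2).

%% Hypothetical records; the field names are invented for this sketch.
-record(message_x, {hdr1, hdr3, hdr4, hdr5, fld1, fld2, fld3}).
-record(message_y, {hdr1, hdr3, hdr4, hdr5, fld1, fld2}).

%% Each clause spells out the complete layout of one message type.
%% The second byte is matched against a constant, so it selects the clause.
decode(<<Hdr1:8, ?MESSAGE_TYPE_ID_X:8, Hdr3:8, Hdr4:8, Hdr5:32,
         F1:32, F2:32, F3:32>>) ->
    #message_x{hdr1 = Hdr1, hdr3 = Hdr3, hdr4 = Hdr4, hdr5 = Hdr5,
               fld1 = F1, fld2 = F2, fld3 = F3};
decode(<<Hdr1:8, ?MESSAGE_TYPE_ID_Y:8, Hdr3:8, Hdr4:8, Hdr5:32,
         F1:32, F2:16>>) ->
    #message_y{hdr1 = Hdr1, hdr3 = Hdr3, hdr4 = Hdr4, hdr5 = Hdr5,
               fld1 = F1, fld2 = F2};
%% Anything else (unknown type id or wrong length) is returned as an error.
decode(Bin) when is_binary(Bin) ->
    {error, Bin}.
```

After `c(decoder_sketch).` in the shell, `decoder_sketch:decode(<<7, 1, 8, 9, 10:32, 11:32, 12:32, 13:32>>)` should return a `#message_x{}` record. If you want to see how the clauses were compiled, `erlc -S decoder_sketch.erl` writes the generated BEAM assembly to `decoder_sketch.S`, which you can read to check how the match is organized on your compiler version.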

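So there is no need to try to "help" the compiler by splitting the decoding into separate parts; it will not make it more efficient. Again, only do it if it makes your code clearer.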
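For comparison, this is roughly what the split version would look like: the common header is matched once, and a helper dispatches on the type byte to decode the body. It is a sketch for readability purposes only, not a faster alternative. The module name, type ids, records, and body layouts are hypothetical.

```erlang
-module(decoder_split_sketch).
-export([decode/1]).

%% Hypothetical type ids and records, invented for this sketch.
-define(MESSAGE_TYPE_ID_X, 1).
-define(MESSAGE_TYPE_ID_Y, 2).

-record(header, {hdr1, type, hdr3, hdr4, hdr5}).
-record(message_x, {header, fld1, fld2, fld3}).
-record(message_y, {header, fld1, fld2}).

%% Step 1: match the common header and keep the rest as Body.
decode(<<Hdr1:8, Type:8, Hdr3:8, Hdr4:8, Hdr5:32, Body/binary>> = Bin) ->
    Header = #header{hdr1 = Hdr1, type = Type,
                     hdr3 = Hdr3, hdr4 = Hdr4, hdr5 = Hdr5},
    case decode_body(Type, Header, Body) of
        error -> {error, Bin};
        Msg -> Msg
    end;
%% Input too short to contain the header.
decode(Bin) when is_binary(Bin) ->
    {error, Bin}.

%% Step 2: one clause per message type for the type-specific fields.
decode_body(?MESSAGE_TYPE_ID_X, Header, <<F1:32, F2:32, F3:32>>) ->
    #message_x{header = Header, fld1 = F1, fld2 = F2, fld3 = F3};
decode_body(?MESSAGE_TYPE_ID_Y, Header, <<F1:32, F2:16>>) ->
    #message_y{header = Header, fld1 = F1, fld2 = F2};
decode_body(_Type, _Header, _Body) ->
    error.
```

The trade-off is mostly about structure: the header layout is written once instead of 20 times, at the cost of an extra function and a nested header record.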

拥抱我好吗 2024-11-10 14:43:44

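Your current approach is idiomatic Erlang, so keep going in this direction. Don't worry about performance; the Erlang compiler does a good job here. If your messages really have exactly the same format, you could write a macro for it, but it should generate the same code under the hood, and macros usually make code harder to maintain anyway. Just out of curiosity: why generate different record types when they all have exactly the same fields? An alternative is to translate the message type from a constant into an Erlang atom and store it in a single record type.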
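The single-record alternative could be sketched as follows. It only applies if every message type really has the same layout, which is this answer's premise rather than something the question confirms. The module name, the id-to-atom mapping, and the field names are hypothetical.

```erlang
-module(decoder_atom_sketch).
-export([decode/1]).

%% One record for all message types; the type is stored as an atom.
-record(message, {type, hdr1, hdr3, hdr4, hdr5, fld1, fld2, fld3}).

%% Assumes all message types share this exact layout.
decode(<<Hdr1:8, TypeId:8, Hdr3:8, Hdr4:8, Hdr5:32,
         F1:32, F2:32, F3:32>> = Bin) ->
    case type_atom(TypeId) of
        unknown ->
            {error, Bin};
        Type ->
            #message{type = Type, hdr1 = Hdr1, hdr3 = Hdr3, hdr4 = Hdr4,
                     hdr5 = Hdr5, fld1 = F1, fld2 = F2, fld3 = F3}
    end;
decode(Bin) when is_binary(Bin) ->
    {error, Bin}.

%% Hypothetical mapping from wire type ids to atoms.
type_atom(1) -> message_x;
type_atom(2) -> message_y;
type_atom(_) -> unknown.
```

Callers can then match on `#message{type = message_x}` instead of on a dedicated record per type.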
