Mysterious bit string added when running tail on a binary file in Erlang

Posted on 2024-08-31 07:31:33

I want to run tail on a named pipe to facilitate some binary logfile processing. The problem is that mysterious data is being added to the beginning of the stream. I run my tests by starting an Erlang process that opens the port (open_port), and then I use another shell to cat the binary file into the named pipe.

Here is a simple function for getting data from the port:

bin_from_tail() ->
  open_port({spawn,"/usr/bin/tail -F named_pipe"},
                             [binary,in,eof]),
  receive
  {_,{data,<<Data/binary>>}} -> Data
  end.

So here are two ways for me to grab the same data:

  1. Create the named pipe:

    mkfifo named_pipe

  2. Read the data through tail. This call blocks until you run "cat log.bin > named_pipe" from another shell:

    TailBin = bin_from_tail().

  3. Read the entire file into memory using the Erlang file library:

    {ok,FileBin} = file:read_file("log.bin").

But TailBin and FileBin are not the same! TailBin has a mysterious 120-bit (15-byte) string at the beginning:

<<40,6,161,69,172,216,56,14,100,0,80,6,0,0,0>>
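To pin down how two such binaries differ, it may help to compare their sizes and how many leading and trailing bytes they share. The following is only a minimal sketch: the module name bin_compare and the function compare/2 are hypothetical and not part of the original post, while byte_size/1, binary:longest_common_prefix/1 and binary:longest_common_suffix/1 are standard library calls.

```erlang
-module(bin_compare).
-export([compare/2]).

%% Hypothetical helper (not from the original post): summarize how two
%% binaries relate by size, shared prefix and shared suffix, all in bytes.
compare(A, B) when is_binary(A), is_binary(B) ->
    #{size_a => byte_size(A),
      size_b => byte_size(B),
      %% Number of identical bytes at the start of both binaries.
      common_prefix => binary:longest_common_prefix([A, B]),
      %% Number of identical bytes at the end of both binaries.
      common_suffix => binary:longest_common_suffix([A, B])}.
```

Calling bin_compare:compare(TailBin, FileBin) and finding that common_suffix equals the size of the smaller binary would suggest that one binary is simply the tail end of the other.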

Comments (2)

你另情深 2024-09-07 07:31:33

Thanks for the idea about the endlessly looping cat/restarting a dead port. It appears that named pipes buffer just a little bit, so if the port opens up fast enough the writer process (another program) won't crash! Definitely risky stuff, but as far as hacks go... it works.

Because all the mailing list posts just said do this, do that without examples, I'm going to post how mine works! If anyone wants to offer up improvements, please feel free to do so. My solution:

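read() ->
  %% Spawn cat on the named pipe; cat exits once the writer closes the pipe.
  Port = open_port({spawn,"/bin/cat /path/to/pipe"},
                   [binary,in,eof]),
  do_read(Port).

do_read(Port) ->
  receive
    {Port,{data,Data}} ->
      case do_something:with(Data) of
        ok ->
          io:format("G"); % Good
        _Other ->
          io:format("B")  % Bad
      end,
      do_read(Port);
    {Port,eof} ->
      %% With the eof option the port stays open after end of file,
      %% so close it before opening a fresh one.
      port_close(Port),
      read();
    Any ->
      io:format("No match fifo_client:do_read/1, ~p~n",[Any]),
      do_read(Port)
  end.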
二智少女猫性小仙女 2024-09-07 07:31:33

I found that the same thing happens outside Erlang. The problem is that tail is trying to show you the end of the file, not the whole file. If you use it on a normal file, anything written later is new and gets picked up by -f, but in this case it looks like tail waits until the end of the file (the eof that comes through the pipe) and then shows the last 10 lines (treating the binary as text).

tail -F -c 9999999

(assuming your log is 9999999 bytes or less) would probably work.
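The "last 10 lines" behaviour described above can be imitated in Erlang to see why the output begins at an arbitrary point in binary data. This is a rough sketch under stated assumptions: the module tail_sim and the function last_lines/2 are hypothetical names, lines are split only on the newline byte (10), and no claim is made that real tail implementations work exactly this way.

```erlang
-module(tail_sim).
-export([last_lines/2]).

%% Rough, hypothetical emulation of `tail -n N` applied to a binary:
%% return the bytes that make up the last N newline-delimited "lines".
last_lines(Bin, N) when is_binary(Bin), is_integer(N), N > 0 ->
    Size = byte_size(Bin),
    %% Positions of every newline byte (10) in the binary.
    Newlines = [Pos || {Pos, _Len} <- binary:matches(Bin, <<"\n">>)],
    %% A newline at the very end terminates the last line; it does not
    %% start a new one, so it is not counted as a line break.
    Breaks = [P || P <- Newlines, P < Size - 1],
    case length(Breaks) >= N of
        true ->
            %% Start right after the N-th line break counted from the end.
            Start = lists:nth(length(Breaks) - N + 1, Breaks) + 1,
            binary:part(Bin, Start, Size - Start);
        false ->
            %% Fewer than N+1 lines: the whole binary is returned.
            Bin
    end.
```

For a binary log, any byte with value 10 counts as a line break, so tail_sim:last_lines(FileBin, 10) would start wherever the tenth such byte from the end happens to fall.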

Maybe try using cat instead of tail -F; that seemed to work for me. Then you just need to work around the fact that cat exits upon eof, which I assume you were trying to avoid by using tail.

So a shell script which loops cat endlessly, maybe?

Or have Erlang close and recreate the port when it dies, since you're getting the eof signal anyway. Or use the exit_status flag to open_port to be signalled when the process exits, in case you need to distinguish eof from process exit. (If you use both exit_status and eof, the eof never comes, as a brief test with cat < /dev/null indicates.)
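The exit_status suggestion could look roughly like the following. This is a sketch only: the module name port_collect and the function run/1 are hypothetical, and it assumes that all data messages from the port are delivered before the exit status message. open_port/2 and its exit_status option are standard Erlang.

```erlang
-module(port_collect).
-export([run/1]).

%% Hypothetical helper: run Command through a port and collect everything
%% it writes to stdout until the external process exits.
run(Command) ->
    Port = open_port({spawn, Command}, [binary, in, exit_status]),
    collect(Port, []).

collect(Port, Acc) ->
    receive
        {Port, {data, Data}} ->
            %% Accumulate chunks in reverse order; they are reversed at the end.
            collect(Port, [Data | Acc]);
        {Port, {exit_status, Status}} ->
            %% Assumption: all data messages have arrived by this point.
            {Status, iolist_to_binary(lists:reverse(Acc))}
    end.
```

A caller could restart the command when it sees the exit status, for example by calling port_collect:run("/bin/cat /path/to/pipe") again in a loop.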
