The most efficient way to read a file into a list of strings

Posted 2024-08-20 10:26:07

In terms of time consumed, what is the most efficient way to read a text file into a list of binary strings in Erlang? The obvious solution

-module(test).
-export([run/1]).

open_file(FileName, Mode) ->
    {ok, Device} = file:open(FileName, [Mode, binary]),
    Device.

close_file(Device) ->
    ok = file:close(Device).

read_lines(Device, L) ->
    case io:get_line(Device, L) of
        eof ->
            lists:reverse(L);
        String ->
            read_lines(Device, [String | L])
    end.

run(InputFileName) ->
    Device = open_file(InputFileName, read),
    Data = read_lines(Device, []),
    close_file(Device),
    io:format("Read ~p lines~n", [length(Data)]).

becomes too slow when the file contains more than 100000 lines.

Comments (2)

傲世九天 2024-08-27 10:26:07
{ok, Bin} = file:read_file(Filename).

Or, if you need the contents line by line:

%% File is an I/O device returned by file:open/2, not a file name.
read(File) ->
    case file:read_line(File) of
        {ok, Data} -> [Data | read(File)];
        eof        -> []
    end.
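
As a rough sketch of how the line-by-line approach above might be wired up end to end, the module below opens the file in `raw` mode with `read_ahead` buffering and collects the lines with a tail-recursive loop. The module name `read_lines_sketch` and the helper `collect/2` are made up for illustration, and whether this is faster for a given file is something to measure. It also avoids one likely cost in the question's code, which passes its growing accumulator `L` as the prompt argument of `io:get_line/2`.

```erlang
-module(read_lines_sketch).
-export([read_lines/1]).

%% Read FileName into a list of binaries, one per line.
%% Each line keeps its trailing newline, as file:read_line/1 returns it.
read_lines(FileName) ->
    %% raw bypasses the I/O server process; read_ahead buffers the reads.
    {ok, Device} = file:open(FileName, [read, binary, raw, read_ahead]),
    try
        collect(Device, [])
    after
        file:close(Device)
    end.

%% Tail-recursive loop; Acc holds the lines read so far in reverse order.
collect(Device, Acc) ->
    case file:read_line(Device) of
        {ok, Line}      -> collect(Device, [Line | Acc]);
        eof             -> lists:reverse(Acc);
        {error, Reason} -> erlang:error({read_line_failed, Reason})
    end.
```

Calling `read_lines_sketch:read_lines("input.txt")` returns the lines as binaries, which is the shape the question asks for.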
草莓味的萝莉 2024-08-27 10:26:07

Read the entire file into a binary, convert it to a list, and split out the lines.

This is far more efficient than any other method. If you don't believe me, time it.

file2lines(File) ->
   {ok, Bin} = file:read_file(File),
   string2lines(binary_to_list(Bin), []).

%% Acc holds the characters of the current line in reverse order.
string2lines("\n" ++ Str, Acc) -> [lists:reverse([$\n|Acc]) | string2lines(Str,[])];
string2lines([H|T], Acc)       -> string2lines(T, [H|Acc]);
string2lines([], Acc)          -> [lists:reverse(Acc)].
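
The code above yields lists of characters, while the question asks for binary strings. A possible variant, sketched here under the assumption that lines are separated by `"\n"`, keeps everything as binaries by splitting with `binary:split/3`, and adds a small `timer:tc/1` wrapper for timing it as suggested. The module name `split_lines_sketch` and the function `timed/1` are illustrative only. Unlike the version above, this one drops the newline from each line.

```erlang
-module(split_lines_sketch).
-export([file2lines/1, timed/1]).

%% Read the whole file into one binary, then split it on "\n".
%% A file that ends with a newline yields a trailing empty binary.
file2lines(File) ->
    {ok, Bin} = file:read_file(File),
    binary:split(Bin, <<"\n">>, [global]).

%% Time file2lines/1 with timer:tc/1 and print the elapsed microseconds.
timed(File) ->
    {Micros, Lines} = timer:tc(fun() -> file2lines(File) end),
    io:format("Read ~p lines in ~p microseconds~n", [length(Lines), Micros]),
    Lines.
```

Running `split_lines_sketch:timed("input.txt")` next to the other approaches on the same file gives a direct comparison. The resulting lines are sub-binaries of the file contents, so the whole file stays in memory for as long as any of them is referenced.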