mmap slower than getline?
I face the challenge of reading/writing files (gigabytes in size) line by line.
In many forum threads and sites (including a number of SO posts), mmap is suggested as the fastest option for reading/writing files. However, when I implement my code with both getline and mmap techniques, mmap is the slower of the two. This is true for both reading and writing. I have been testing with files ~600 MB in size.
My implementations parse the file line by line and then tokenize each line. I will present file input only.
Here is the getline implementation:
#include <cstdio>     // perror
#include <fstream>
#include <iostream>   // std::ios
#include <string>
using namespace std;

void two(char* path) {
    std::ios::sync_with_stdio(false);
    ifstream pFile(path);
    string mystring;
    if (pFile.is_open()) {
        while (getline(pFile, mystring)) {
            // c style tokenizing
        }
    }
    else perror("error opening file");
    pFile.close();
}
and here is the mmap version:
#include <cstdio>      // fopen, fseek, ftell, perror
#include <sstream>     // stringstream
#include <fcntl.h>     // open
#include <sys/mman.h>  // mmap, munmap
#include <unistd.h>    // close
using namespace std;

void four(char* path) {
    int fd;
    char *map;
    char *FILEPATH = path;
    unsigned long FILESIZE;
    // find the file size
    FILE* fp = fopen(FILEPATH, "r");
    fseek(fp, 0, SEEK_END);
    FILESIZE = ftell(fp);
    fseek(fp, 0, SEEK_SET);
    fclose(fp);
    fd = open(FILEPATH, O_RDONLY);
    map = (char *) mmap(0, FILESIZE, PROT_READ, MAP_SHARED, fd, 0);
    // read the file char-by-char from the mmap
    char c;
    stringstream ss;
    for (unsigned long i = 0; i < FILESIZE; ++i) {
        c = map[i];
        if (c != '\n') {
            ss << c;
        }
        else {
            // c style tokenizing
            ss.str("");
        }
    }
    if (munmap(map, FILESIZE) == -1) perror("Error un-mmapping the file");
    close(fd);
}
I omitted much error checking in the interest of brevity.
Is my mmap implementation incorrect, and thus affecting performance? Perhaps mmap is not ideal for my application?
Thanks for any comments or help!
4 Answers
The real power of mmap is being able to freely seek in a file, use its contents directly as a pointer, and avoid the overhead of copying data from kernel cache memory to userspace. However, your code sample is not taking advantage of this.

In your loop, you scan the buffer one character at a time, appending to a stringstream. The stringstream doesn't know how long the string is, and so has to reallocate several times in the process. At this point you've killed off any performance increase from using mmap; even the standard getline implementation avoids multiple reallocations (by using a 128-byte on-stack buffer, in the GNU C++ implementation).

If you want to use mmap to its fullest power, use functions such as strnchr or memchr to find newlines; these make use of hand-rolled assembler and other optimizations to run faster than most open-coded search loops.
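A minimal sketch of that approach, assuming map and filesize come from mmap as in the question; the scan_lines name and the commented-out tokenize call are illustrative placeholders, not part of this answer:

#include <cstddef>   // std::size_t
#include <cstring>   // std::memchr

// Sketch: walk an mmap'd buffer line by line without copying characters out.
void scan_lines(const char* map, std::size_t filesize) {
    const char* pos = map;
    const char* end = map + filesize;
    while (pos < end) {
        // one optimized pass to find the next newline
        const char* nl = static_cast<const char*>(std::memchr(pos, '\n', end - pos));
        if (nl == nullptr) {
            // tokenize(pos, end - pos);   // hypothetical callback; final line has no '\n'
            break;
        }
        // tokenize(pos, nl - pos);        // the line is the byte range [pos, nl)
        pos = nl + 1;                      // continue after the newline
    }
}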
Whoever told you to use mmap does not know very much about modern machines.

The performance advantages of mmap are a total myth. In the words of Linus Torvalds:

The problem with mmap is that every time you touch a page in the mapped region for the first time, it traps into the kernel and actually maps the page into your address space, playing havoc with the TLB.

Try a simple benchmark reading a big file 8K at a time using read and then again with mmap. (Using the same 8K buffer over and over.) You will almost certainly find that read is actually faster.

Your problem was never with getting data out of the kernel; it was with how you handle the data after that. Minimize the work you are doing character-at-a-time; just scan to find the newline and then do a single operation on the block. Personally, I would go back to the read implementation, using (and re-using) a buffer that fits in the L1 cache (8K or so).

Or at least, I would try a simple read vs. mmap benchmark to see which is actually faster on your platform.

[Update]

I found a couple more sets of commentary from Mr. Torvalds:

http://lkml.iu.edu/hypermail/linux/kernel/0004.0/0728.html
http://lkml.iu.edu/hypermail/linux/kernel/0004.0/0775.html

The summary:

In my experience, reading and processing a large file sequentially is one of the "many cases" where using (and re-using) a modest-sized buffer with read/write performs significantly better than mmap.
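A rough sketch of the read half of such a benchmark, assuming POSIX open/read; the 8K buffer is reused on every iteration, and the newline count is only a stand-in for whatever per-block work you actually do:

#include <cstdio>      // perror
#include <fcntl.h>     // open
#include <unistd.h>    // read, close

// Sketch: read a file 8K at a time into one reusable buffer and do a single
// cheap pass over each chunk (here, counting newlines).
long count_newlines_with_read(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return -1; }
    char buf[8192];                        // small enough to stay in L1 cache
    long newlines = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; ++i)    // one pass per chunk, no per-line copies
            if (buf[i] == '\n') ++newlines;
    }
    close(fd);
    return newlines;
}

Timing this against an equivalent pass over an mmap'd buffer gives the head-to-head comparison described above.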
You can use memchr to find line endings. It will be much faster than adding to a stringstream one character at a time.
You're using stringstreams to store the lines you identify. This is not comparable with the getline implementation: the stringstream itself adds overhead. As others suggested, you can store the beginning of the line as a char*, and maybe the length of the line (or a pointer to the end of the line). Note also that this is much more efficient because you don't do any per-character work (in your version you were appending each character to the stringstream). The body of the read loop would be something like the sketch below.
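A minimal sketch of such a loop, assuming map and filesize come from mmap as in the question; process_lines and the commented-out tokenize_line call are illustrative names, not code from this answer:

#include <cstddef>   // std::size_t

// Sketch: remember each line as a (start pointer, length) pair instead of
// copying its characters into a stringstream.
void process_lines(const char* map, std::size_t filesize) {
    std::size_t line_start = 0;                   // index of the current line's first byte
    for (std::size_t i = 0; i < filesize; ++i) {
        if (map[i] == '\n') {
            // the line is map[line_start .. i), length i - line_start
            // tokenize_line(map + line_start, i - line_start);
            line_start = i + 1;                   // next line begins after the '\n'
        }
    }
    if (line_start < filesize) {
        // tokenize_line(map + line_start, filesize - line_start);   // trailing line without '\n'
    }
}

Each line reaches the tokenizer as a pointer plus a length, so nothing is copied out of the mapping until the tokenizer itself needs to copy.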