The grep I built is slower than the grep that ships with Linux

Posted on 2024-08-16 10:33:43

I am trying to understand why the grep I built is much slower than the one that ships with the system, and to find out which compiler options the system grep was built with.

OS Version: CentOS release 5.3 (Final)
grep on system:

  Version: grep (GNU grep) 2.5.1
  Size: 88896 bytes
  ldd output: 
 libpcre.so.0 => /lib64/libpcre.so.0 (0x0000003991800000)
 libc.so.6 => /lib64/libc.so.6 (0x0000003985a00000)
 /lib64/ld-linux-x86-64.so.2 (0x0000003984a00000)

grep built by me:

  Version: 2.5.1
  Size: 256437 bytes
  ldd output:
 libpcre.so.0 => /lib64/libpcre.so.0 (0x0000003991800000)
 libc.so.6 => /lib64/libc.so.6 (0x0000003985a00000)
 /lib64/ld-linux-x86-64.so.2 (0x0000003984a00000)

The system grep (330 ms) is far faster than the grep I built (22430 ms) when running a regex search on a large text file (large_list.txt).

These are the commands I used for timing. My build:

% time src/grep ".*asa.*" large_list.txt > /dev/null
real 0m22.430s
user 0m22.291s
sys 0m0.080s

The system grep:

% time bin/grep ".*asa.*" large_list.txt > /dev/null
real 0m0.331s
user 0m0.236s
sys 0m0.081s

The system grep is clearly using some optimization options that give it a huge performance advantage.

Can somebody help me work out which options the system grep may have been built with?

Here are the compile options for one of the source files in my build:
gcc -DLIBDIR=\"/usr/local/lib\" -DHAVE_CONFIG_H -I. -I.. -I.. -I. -I../intl -g -O2 -MT xstrtol.o -MD -MP -MF .deps/xstrtol.Tpo -c -o xstrtol.o xstrtol.c

The output of ./configure:

checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking for gawk... (cached) gawk
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables... 
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking for style of include used by make... GNU
checking dependency style of gcc... gcc3
checking for a BSD-compatible install... /usr/bin/install -c
checking for ranlib... ranlib
checking for getconf... getconf
checking for CFLAGS value to request large file support... 
checking for LDFLAGS value to request large file support... 
checking for LIBS value to request large file support... 
checking for _FILE_OFFSET_BITS... no
checking for _LARGEFILE_SOURCE... no
checking for _LARGE_FILES... no
checking for function prototypes... yes
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for string.h... (cached) yes
checking for size_t... yes
checking for ssize_t... yes
checking for an ANSI C-conforming const... yes
checking for inttypes.h... yes
checking for unsigned long long... yes
checking for ANSI C header files... (cached) yes
checking for string.h... (cached) yes
checking for stdlib.h... (cached) yes
checking sys/param.h usability... yes
checking sys/param.h presence... yes
checking for sys/param.h... yes
checking for memory.h... (cached) yes
checking for unistd.h... (cached) yes
checking libintl.h usability... yes
checking libintl.h presence... yes
checking for libintl.h... yes
checking wctype.h usability... yes
checking wctype.h presence... yes
checking for wctype.h... yes
checking wchar.h usability... yes
checking wchar.h presence... yes
checking for wchar.h... yes
checking for dirent.h that defines DIR... yes
checking for library containing opendir... none required
checking whether stat file-mode macros are broken... no
checking for working alloca.h... yes
checking for alloca... yes
checking whether closedir returns void... no
checking for stdlib.h... (cached) yes
checking for unistd.h... (cached) yes
checking for getpagesize... yes
checking for working mmap... yes
checking for btowc... yes
checking for isascii... yes
checking for iswctype... yes
checking for mbrlen... yes
checking for memmove... yes
checking for setmode... no
checking for strerror... yes
checking for wcrtomb... yes
checking for wcscoll... yes
checking for wctype... yes
checking whether mbrtowc and mbstate_t are properly declared... yes
checking for stdlib.h... (cached) yes
checking for mbstate_t... yes
checking for memchr... yes
checking for stpcpy... yes
checking for strtoul... yes
checking for atexit... yes
checking for fnmatch... yes
checking for stdlib.h... (cached) yes
checking whether  defines strtoumax as a macro... no
checking for strtoumax... yes
checking whether strtoul is declared... yes
checking whether strtoull is declared... yes
checking for strerror in -lcposix... no
checking for inline... inline
checking for off_t... yes
checking whether we are using the GNU C Library 2.1 or newer... yes
checking argz.h usability... yes
checking argz.h presence... yes
checking for argz.h... yes
checking limits.h usability... yes
checking limits.h presence... yes
checking for limits.h... yes
checking locale.h usability... yes
checking locale.h presence... yes
checking for locale.h... yes
checking nl_types.h usability... yes
checking nl_types.h presence... yes
checking for nl_types.h... yes
checking malloc.h usability... yes
checking malloc.h presence... yes
checking for malloc.h... yes
checking stddef.h usability... yes
checking stddef.h presence... yes
checking for stddef.h... yes
checking for stdlib.h... (cached) yes
checking for string.h... (cached) yes
checking for unistd.h... (cached) yes
checking for sys/param.h... (cached) yes
checking for feof_unlocked... yes
checking for fgets_unlocked... yes
checking for getcwd... yes
checking for getegid... yes
checking for geteuid... yes
checking for getgid... yes
checking for getuid... yes
checking for mempcpy... yes
checking for munmap... yes
checking for putenv... yes
checking for setenv... yes
checking for setlocale... yes
checking for stpcpy... (cached) yes
checking for strchr... yes
checking for strcasecmp... yes
checking for strdup... yes
checking for strtoul... (cached) yes
checking for tsearch... yes
checking for __argz_count... yes
checking for __argz_stringify... yes
checking for __argz_next... yes
checking for iconv... yes
checking for iconv declaration... 
         extern size_t iconv (iconv_t cd, char * *inbuf, size_t *inbytesleft, char * *outbuf, size_t *outbytesleft);
checking for nl_langinfo and CODESET... yes
checking for LC_MESSAGES... yes
checking whether NLS is requested... yes
checking whether included gettext is requested... no
checking for libintl.h... (cached) yes
checking for GNU gettext in libc... yes
checking for dcgettext... yes
checking for msgfmt... /usr/bin/msgfmt
checking for gmsgfmt... /usr/bin/msgfmt
checking for xgettext... /usr/bin/xgettext
checking for bison... bison
checking version of bison... 2.3, ok
checking for catalogs to be installed...  af be bg ca cs da de el eo es et eu fi fr ga gl he hr hu id it ja ko ky lt nb nl pl pt pt_BR ro ru rw sk sl sr sv tr uk vi zh_TW
checking for dos file convention... no
checking host system type... (cached) x86_64-unknown-linux-gnu
checking host system type... (cached) x86_64-unknown-linux-gnu
checking for DJGPP environment... no
checking for environ variable separator... :
checking for working re_compile_pattern... yes
checking for getopt_long... yes
configure: WARNING: Included lib/regex.c not used
checking whether strerror_r is declared... yes
checking for strerror_r... yes
checking whether strerror_r returns char *... no
checking for strerror... (cached) yes
checking for strerror_r... (cached) yes
checking for vprintf... yes
checking for doprnt... no
checking for ANSI C header files... (cached) yes
checking for working malloc... yes
checking for working realloc... yes
checking for pcre_exec in -lpcre... yes
configure: creating ./config.status
config.status: creating Makefile
config.status: creating lib/Makefile
config.status: creating lib/posix/Makefile
config.status: creating src/Makefile
config.status: creating tests/Makefile
config.status: creating po/Makefile.in
config.status: creating intl/Makefile
config.status: WARNING:  intl/Makefile.in seems to ignore the --datarootdir setting
config.status: creating doc/Makefile
config.status: creating m4/Makefile
config.status: creating vms/Makefile
config.status: creating bootstrap/Makefile
config.status: creating config.h
config.status: config.h is unchanged
config.status: executing depfiles commands
config.status: executing default-1 commands
config.status: creating po/POTFILES
config.status: creating po/Makefile
config.status: executing stamp-h commands

Thanks,
kumar

Comments (6)

为你鎻心 2024-08-23 10:33:43

Why don't you just get CentOS's SRPM for the grep binary and compare their compile options to yours? I would guess that this is much more efficient than having the entire StackOverflow community blindly poke around in the dark until they hit something.
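As a rough sketch of that comparison on an RPM-based system: the commands below ask rpm which package owns the installed grep and print the distribution's default compiler flags. `rpm -qf` and the `%{optflags}` macro are standard rpm features. The `yumdownloader --source grep` step in the comment is an assumption: it needs the yum-utils package and a configured source repository, which may not be the case on your machine.

```shell
# Sketch: find out how the distribution builds its packages.
# Assumes an RPM-based system; prints a message elsewhere.
if command -v rpm >/dev/null 2>&1; then
  grep_path=$(command -v grep)
  # Which package (with exact version and release) owns the installed grep?
  rpm -qf "$grep_path"
  # Default compiler flags the distribution passes to package builds.
  distro_flags=$(rpm --eval '%{optflags}')
  echo "distro CFLAGS: $distro_flags"
  rpm_status="checked"
else
  echo "rpm is not available here; run this on the CentOS machine"
  rpm_status="skipped"
fi

# To see the patches as well, fetch and unpack the source package
# (assumes yum-utils is installed and a source repo is configured):
#   yumdownloader --source grep
#   rpm -ivh grep-*.src.rpm
# The spec file in the source package lists every Patch line that is
# applied on top of the upstream tarball.
```

If the flags printed here match your own `-g -O2`, the difference is more likely to come from the patches than from the compiler options.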

EDIT: Are you using a locale with a multibyte encoding? (Note: if you have no idea what that means, then the answer is probably "Yes", since UTF-8 has been the default for most Linux distributions for several years now and indeed RedHat (and thus CentOS) were the very first to make the switch).

In that case, GNU grep is dog slow. And this not only applies to GNU grep but to pretty much all GNU tools that do some kind of text processing. The FSF refuses to accept any patches to improve multibyte performance, unless those patches are proven to not slow down fixed-width encodings. However, since any patch to improve performance for multibyte encodings must at least contain some if statement somewhere, it is actually impossible to write a patch that does not at least slow down fixed-width encodings by at least the overhead of that if statement. Thus, UTF-8 performance of GNU tools will continue to suck until the end of time.

Anyway, most Linux distributors don't give a rat's bleep what the FSF thinks and patch GNU grep anyway. The Fedora Rawhide SRPM contains a patch called grep-2.5.3-egf-speedup.patch, which speeds up the UTF-8 performance of GNU grep by several orders of magnitude. (Since this patch is already from 2005, I assume that it is also used in CentOS.) This patch is also used in Mac OSX, Debian, Ubuntu, ..., pretty much nobody uses GNU grep as distributed by GNU. Text processing in a multibyte encoding will never be as fast as in a fixed-width encoding, but it should at least be comparable, not 50x (or even 1500x as some people have reported) slower.

There's also another patch called dfa-optional, which makes grep simply use GNU libc's regex engine instead of its own, which is not only much faster when dealing with UTF-8 but also has far fewer bugs.

So, you might want to re-run your benchmarks with export LC_ALL=POSIX set. If that fixes your problem, you need to apply either one of the two above-mentioned patches.
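A minimal sketch of that re-run, assuming a POSIX-like shell. The generated sample file is only a hypothetical stand-in for large_list.txt, and `grep` here is whatever is first in your PATH; substitute `src/grep` to test your own build.

```shell
# Build a throwaway input file (a stand-in for large_list.txt).
sample=$(mktemp)
i=0
while [ "$i" -lt 2000 ]; do
  echo "entry $i casa"
  echo "entry $i nothing"
  i=$((i + 1))
done > "$sample"

# Show which locale setting is currently in effect.
echo "current locale: ${LC_ALL:-${LC_CTYPE:-${LANG:-unset}}}"

# Same search under the current locale and under POSIX.
# The match counts should agree; only the run time should differ.
count_default=$(grep -c ".*asa.*" "$sample")
count_posix=$(env LC_ALL=POSIX grep -c ".*asa.*" "$sample")
echo "matches: default=$count_default posix=$count_posix"

# Time both variants; env sets the locale for that one command only.
time grep ".*asa.*" "$sample" > /dev/null
time env LC_ALL=POSIX grep ".*asa.*" "$sample" > /dev/null

rm -f "$sample"
```

If the POSIX run of your own build drops to roughly the speed of the system grep, the multibyte handling is the likely culprit rather than the compiler flags.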

More information is also available in these two RedHat bugreports:

The moral of the story: despite popular belief, the Linux distributors do know what they are doing, at least sometimes. Don't second-guess them.

糖粟与秋泊 2024-08-23 10:33:43

You compiled with the -O2 flag. Why didn't you use the -O3 flag? See here for an explanation of the optimization options available with gcc.

Using Intel's ICC compiler can also help to improve performance, though this really depends on the app. Also, it's not free.

Edit: I just saw the -g flag on your compile line. Remove it, as it turns on debug information, and this can cause a pretty serious performance hit.

无人接听 2024-08-23 10:33:43

Another thing to note besides the -O options: it looks like you are building with debugging symbols (-g).

Debug information usually increases binary size and can reduce the binary's performance. I would imagine grep is pretty stable, so you don't really need debug symbols for it.
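One way to check how much of the size difference is debug information is to strip a copy of the binary and compare sizes. This is only a sketch: `target` is a hypothetical variable that defaults to the grep found in PATH, so point it at `src/grep` to examine your own build. Note that `strip` removes symbol and debug sections but leaves the machine code untouched, so if the stripped copy is just as slow, the slowdown probably lies elsewhere.

```shell
# Copy a binary, strip the copy, and compare the two sizes.
# target defaults to the grep in PATH; set target=src/grep for your build.
target=${target:-$(command -v grep)}
copy=$(mktemp)
cp "$target" "$copy"

orig_size=$(wc -c < "$copy" | tr -d ' ')

# strip only discards symbols and debug sections from the copy.
if command -v strip >/dev/null 2>&1; then
  strip "$copy" 2>/dev/null || echo "strip could not process $target"
else
  echo "strip is not installed; sizes below will be identical"
fi

stripped_size=$(wc -c < "$copy" | tr -d ' ')
echo "original: $orig_size bytes, stripped: $stripped_size bytes"

rm -f "$copy"
```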

无远思近则忧 2024-08-23 10:33:43

What version of GCC are you using? IIRC, GCC 4 was significantly redesigned, which invalidated some of the optimization code for a while.

九命猫 2024-08-23 10:33:43

With that big of a performance gap, it's probably an algorithm/code difference, not just a difference in compiler optimization level. What makes you suspect the compiler?

甩你一脸翔 2024-08-23 10:33:43

The glibc library contains a regex engine, while grep also contains a copy of the same regex engine, in case you're not building against glibc. The copy of grep that came with CentOS is built to use the regex engine from glibc. It appears that your locally compiled copy of grep is using its own regex engine, which explains its inflated size.

Even though the two regex engines ultimately derive from the same source, the copy from grep will be missing some important features that rely on internal glibc data structures, such as POSIX character equivalence classes and collating elements.

When calling configure, you can use the flag --with-included-regex or --without-included-regex to force the use of grep or glibc's regex engine, respectively.
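A small sketch of how those two flags could be used. The `regex_flag` helper and the `glibc`/`bundled` labels are hypothetical names for illustration; only the two configure flags come from the text above. It is also worth noting that the configure log in the question already prints "WARNING: Included lib/regex.c not used", so it may be that the build is using glibc's engine after all; rebuilding both ways and timing each would settle it.

```shell
# Map the engine you want to the corresponding configure flag.
# (regex_flag and its labels are illustrative, not part of grep.)
regex_flag() {
  case "$1" in
    glibc)   echo "--without-included-regex" ;;
    bundled) echo "--with-included-regex" ;;
    *)       echo "unknown engine: $1" >&2; return 1 ;;
  esac
}

flag=$(regex_flag glibc)
echo "configure flag: $flag"

# Only attempt the rebuild inside an unpacked grep source tree.
if [ -x ./configure ] && [ -d src ]; then
  ./configure "$flag" && make && ls -l src/grep
else
  echo "no ./configure here; would run: ./configure $flag && make"
fi
```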
