What does the preg_match_all `u` flag depend on?

Posted 2024-12-05 15:07:36 · 1764 characters · 0 views · 0 comments

I have some code in a PHP application that returns null when I try to use it on the production server, but works fine on the development server. Here is the line of code:

// use the regex unicode support to separate the UTF-8 characters into an array
preg_match_all( '/./us', $str, $match );
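
For context, a defensive variant of this split might look like the sketch below. The function name `utf8_chars` is hypothetical, and the byte-level fallback pattern is an assumption about what an acceptable substitute would be: it does not use the `u` flag, so it should compile even where `u` is rejected, but it only splits correctly when `$str` is already valid UTF-8.

```php
<?php
// Hypothetical helper: split a UTF-8 string into an array of characters.
function utf8_chars($str)
{
    // Preferred path: let PCRE handle UTF-8 via the `u` modifier.
    // @ hides the warning; preg_match_all() returns false on failure.
    if (@preg_match_all('/./us', $str, $match) !== false) {
        return $match[0];
    }

    // Fallback (assumption): match one ASCII byte, or one lead byte
    // followed by one to three continuation bytes, without the `u` modifier.
    preg_match_all('/[\x00-\x7F]|[\xC2-\xF4][\x80-\xBF]{1,3}/', $str, $match);
    return $match[0];
}

// "a", "é" (2 bytes), "€" (3 bytes), written as byte escapes.
$chars = utf8_chars("a\xC3\xA9\xE2\x82\xAC");
echo count($chars), "\n";
```

The fallback silently skips bytes that are not part of a well-formed sequence, so it is a stopgap rather than a replacement for working `u` support.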

What does the u flag depend on? I tested with mbstring enabled and disabled, and it does not seem to make a difference.

The error I'm getting is:

preg_match_all: Compilation failed: unknown option bit(s) set at offset -1
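
To reproduce the failure in isolation, a small probe along these lines may help. It is only a sketch: `pcre_supports_u_modifier` is a hypothetical helper name, and the one-character pattern and subject are arbitrary choices made for illustration.

```php
<?php
// Hypothetical helper: reports whether this PHP build accepts the `u` modifier.
function pcre_supports_u_modifier()
{
    // preg_match() returns 1 on a match, 0 on no match, and false when the
    // pattern fails to compile. @ suppresses the "Compilation failed" warning.
    return @preg_match('/./u', 'a') === 1;
}

echo pcre_supports_u_modifier()
    ? "u modifier works\n"
    : "u modifier is rejected by this PCRE build\n";
```

Running the same script on the development and production servers should show whether the difference lies in the PCRE library rather than in the application code.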

More info:

This is one of the configure options on the production server:

'--with-pcre-regex=/opt/pcre'

and here are the pcre sections

I believe this is the note @Wesley was referring to:

In order process UTF-8 strings, you must build PCRE to include UTF-8
support in the code, and, in addition, you must call pcre_compile()
with the PCRE_UTF8 option flag, or the pattern must start with the
sequence (*UTF8). When either of these is the case, both the pattern
and any subject strings that are matched against it are treated as
UTF-8 strings instead of strings of 1-byte characters.

Any links or tips on how to "build PCRE to include UTF-8"?

via

Results of pcretest -C:

PCRE version 6.6 06-Feb-2006
Compiled with
  UTF-8 support
  Unicode properties support
  Newline character is LF
  Internal link size = 2
  POSIX malloc threshold = 10
  Default match limit = 10000000
  Default recursion depth limit = 10000000
  Match recursion uses stack
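
The output above describes the library that this `pcretest` binary belongs to. Whether PHP itself loads that same build is a separate question, and the sketch below is one way to compare the two. It assumes the `PCRE_VERSION` constant is available in the PHP version in use; `pcre_version_number` is a hypothetical helper, and the `pcretest` banner line is pasted in by hand.

```php
<?php
// Hypothetical helper: pull the "major.minor" number out of a version string
// such as "PCRE version 6.6 06-Feb-2006" or "6.6 06-Feb-2006".
function pcre_version_number($versionString)
{
    return preg_match('/(\d+\.\d+)/', $versionString, $m) ? $m[1] : null;
}

// First line of the pcretest -C output, copied from above.
$fromPcretest = pcre_version_number('PCRE version 6.6 06-Feb-2006');

// Version reported by PHP's own PCRE extension, if the constant exists.
$fromPhp = defined('PCRE_VERSION') ? pcre_version_number(PCRE_VERSION) : null;

echo "pcretest: ", $fromPcretest, "\n";
echo "PHP:      ", ($fromPhp === null ? 'unknown' : $fromPhp), "\n";
echo ($fromPhp === $fromPcretest) ? "versions match\n" : "versions differ\n";
```

If the two numbers differ, the `pcretest` results say little about the PCRE build that `preg_match_all` actually runs against.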
