preg_match function not working properly in some PHP scripts
I am using the preg_match function to filter unwanted characters from a textarea form in two PHP scripts I wrote, but in one of them it does not seem to work.
Here's the script with the problem:
<?php
// Database connection, etc......
mysql_select_db("etc", $con);
$errmsg = '';
$chido = $_POST['chido'];
$gacho = $_POST['gacho'];
$maestroid = $_POST['maestroid'];
$comentario = $_POST['comment'];
$voto = $_POST['voto'];
if ($_POST['enviado'] == 1) {
    if (preg_match('/[^a-zA-Z áéíóúüñÁÉÍÓÚÜÑ]/i', $comentario))
        $errmsg = 1;
    if ($errmsg == '') {
        // here's some queries, etc
    }
}
if ($errmsg == 1)
    echo "ERROR: You inserted invalid characters...";
?>
So, as you can see, the preg_match just filters unwanted characters like !"#$%&/() etc.
But every time I type a special character like 'ñ' or 'á', it triggers the error.
I have this very similar script that works perfectly with the same preg_match and filters only the unwanted characters:
<?php
// Database connection, etc..
mysql_select_db("etc", $con);
$errmsg = '';
if ($_POST['enviado'] == 1) {
    $nombre = $_POST['nombre'];
    $apodo = $_POST['apodo'];
    $mat1 = $_POST['mat1'];
    $mat2 = $_POST['mat2'];
    $mat3 = $_POST['mat3'];
    if (preg_match('/[^a-zA-Z áéíóúüñÁÉÍÓÚÜÑ]/i', $nombre))
        $errmsg = 1;
    if ($errmsg == '') {
        // more queries after validation
    }
}
if ($errmsg == 1)
    echo "ERROR: etc.......";
?>
So the question is: what am I doing wrong in the first script?
I have tried everything, but it always fails and shows the error.
Any suggestions?
Comments (6)
Try adding a u modifier at the end, along with your i, so that the pattern and the subject are treated as Unicode (UTF-8).
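A minimal sketch of what the u modifier changes, assuming the source file is saved as UTF-8. The variable names and the sample word are illustrative only, and the sketch does not claim to reproduce the asker's exact setup; it only shows how the same kind of pattern behaves with Latin-1 and UTF-8 input.

```php
<?php
// Assumption: this file is saved as UTF-8, so the accented literals
// in the patterns are multi-byte sequences.
$withoutU = '/[^a-z áéíóúüñ]/i';   // subject is matched byte by byte
$withU    = '/[^a-z áéíóúüñ]/iu';  // pattern and subject are read as UTF-8

$latin1 = "ni\xF1o";      // "niño" in ISO-8859-1: ñ is the single byte 0xF1
$utf8   = "ni\xC3\xB1o";  // "niño" in UTF-8: ñ is the two bytes 0xC3 0xB1

// Without /u the class holds the raw UTF-8 bytes of the literals
// (0xC3, 0xA1, 0xB1, ...), so the Latin-1 byte 0xF1 is reported as invalid.
$latin1Rejected = preg_match($withoutU, $latin1) === 1;

// With /u a valid UTF-8 subject is compared character by character.
$utf8Rejected = preg_match($withU, $utf8) === 1;

// With /u a subject that is not valid UTF-8 makes preg_match() return false.
$latin1WithU = preg_match($withU, $latin1);

var_dump($latin1Rejected, $utf8Rejected, $latin1WithU);
```

If the two forms are submitted in different encodings, this would be one way for the same pattern to accept 'ñ' in one script and reject it in the other.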
Hi. Before, I was using a match expression that accepted letters from a to z, digits from 0 to 9 and the underscore '_', with the plus sign '+' to repeat through the whole string and the i modifier for case-insensitive matching. But I needed to accept the letter 'ñ'.
What I tried, and what worked for me, was adding '\w' to accept any word character and adding a u after the i modifier, so that the pattern and the subject are treated as UTF-8.
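The commenter's actual expressions are not shown, so the pattern below is a hypothetical stand-in that follows the description: '\w', a '+' quantifier, and the i and u modifiers. It assumes UTF-8 input and a UTF-8 source file.

```php
<?php
// Hypothetical pattern built from the description above;
// it is not the commenter's original regex.
$pattern = '/^\w+$/iu';

// With /u, \w also covers non-ASCII letters such as ñ.
$plainOk   = preg_match($pattern, 'usuario_01');
$accentOk  = preg_match($pattern, 'año_2024');

// Punctuation is still outside \w, so this subject does not match.
$symbolBad = preg_match($pattern, 'año-2024!');

var_dump($plainOk, $accentOk, $symbolBad);
```

Note that '\w' accepts letters from every script, not just Spanish ones, which may be broader than an explicit character list.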
This might help: http://www.phpwact.org/php/i18n/charsets
I added this to the form.
Now it seems to work.
Why are you specifying /i yet enumerating all the upper- and lower-case letters separately?
Also: this won't work at all if you don't normalize your input. Consider how ñ can be either the single character U+F1 or the character U+6E followed by U+303!
Unicode Normalization Form D will guarantee that both U+F1 and U+6E,U+303 turn into the canonically decomposed form U+6E,U+303.
Unicode Normalization Form C will guarantee that both U+F1 and U+6E,U+303 turn into the form U+F1, because it uses canonical decomposition followed by canonical composition.
Based on your pattern, it looks like you want the NFC form.
From PHP, you'll need to use the Normalizer class on these to get it working reliably.
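As a rough illustration of the normalization point, the sketch below converts input to NFC before matching. It assumes PHP 7 or later and the intl extension, which provides the Normalizer class; to_nfc() is a hypothetical helper name, and it falls back to returning the input unchanged when intl is not loaded.

```php
<?php
// Hypothetical helper: normalize to NFC when the intl extension is available.
function to_nfc(string $text): string
{
    if (!class_exists('Normalizer')) {
        return $text; // intl not loaded: leave the input unchanged
    }
    $normalized = Normalizer::normalize($text, Normalizer::FORM_C);
    return $normalized === false ? $text : $normalized;
}

$composed   = "\u{00F1}";   // ñ as a single code point
$decomposed = "n\u{0303}";  // n followed by a combining tilde

// Assumption: this file is saved as UTF-8.
$pattern = '/[^a-z áéíóúüñ]/iu';

// The decomposed form contains U+0303, which is not in the class,
// so it is reported as invalid until it has been normalized.
var_dump(preg_match($pattern, $composed));
var_dump(preg_match($pattern, $decomposed));
var_dump(preg_match($pattern, to_nfc($decomposed)));
```

Normalizing once, right after reading the input, keeps every later check working on the same form.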
I don't know if this helps, but I had exactly the same problem with these special characters, and it drove me crazy for many days. In the end I understood that the problem was an htmlentities() call sanitizing the string before it reached preg_match(). Moving the htmlentities() call after preg_match() made it work.
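A short sketch of why the order matters, assuming UTF-8 input and a UTF-8 source file. The variable names are illustrative; the pattern is a simplified version of the one in the question.

```php
<?php
// Assumption: this file is saved as UTF-8.
$pattern = '/[^a-z áéíóúüñ]/iu';
$raw = "niño";

// Escaping first turns ñ into an entity, which introduces '&' and ';'.
$escapedFirst = htmlentities($raw, ENT_QUOTES, 'UTF-8');
$rejectedAfterEscaping = preg_match($pattern, $escapedFirst) === 1;

// Validating the raw input first, then escaping for output, avoids that.
$rejectedRaw = preg_match($pattern, $raw) === 1;
$safeForHtml = $rejectedRaw ? '' : htmlentities($raw, ENT_QUOTES, 'UTF-8');

var_dump($escapedFirst, $rejectedAfterEscaping, $rejectedRaw, $safeForHtml);
```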