PHP keyword function breaks after merging two arrays

Posted on 2025-01-04 12:32:59

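This PHP function extracts a list of the most common words in a string and excludes any word that appears on a blacklist.

Array1: a,b,c

A default blacklist is useful, but I also needed to add words to the blacklist from a database.

Array2: d,e,f

I added a MySQL query that fetches the additional list from a field in our services table. I split that field on \n into an array and merge the two arrays at the beginning of the function, so that the blacklist is now:

Array3: a,b,c,d,e,f

To test, I used print_r to display the merged array, and the merge does succeed.

The problem is this:

If I manually add d,e,f to the default array, the script returns a clean list of words. If I merge the two arrays into one, it returns a list of words with the blacklisted words still in it.

Why would the merged array behave any differently from adding the words to the default array directly?

Here is the function: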

function extractCommonWords($string,$init_blacklist){

    /// the default blacklist words
    $stopWords = array('a','b','c');

    /// select the additional blacklist words from the database
    $gettingblack_sql = "SELECT g_serv_blacklist FROM services WHERE g_serv_id='".$init_blacklist."' LIMIT 1";
    $gettingblack_result = mysql_query($gettingblack_sql) or die(mysql_error());
    $gettingblack_row = mysql_fetch_array($gettingblack_result);
    $removingblack_array = explode("\n", $gettingblack_row["g_serv_blacklist"]);

    // this adds the d,e,f array from the database to the default a,b,c blacklist
    $stopWords = array_merge($stopWords,$removingblack_array);

    // replace whitespace
    $string = preg_replace('/\s\s+/i', '', $string); 
    $string = trim($string);

    // only take alphanumerical chars, but keep the spaces and dashes too
    $string = preg_replace('/[^a-zA-Z0-9 -]/', '', $string); 

    // make it lowercase
    $string = strtolower($string); 

    preg_match_all('/\b.*?\b/i', $string, $matchWords);
    $matchWords = $matchWords[0];

    // drop empty matches, blacklisted words, and words of three characters or fewer
    foreach ($matchWords as $key => $item) {
        if ($item == '' || in_array(strtolower($item), $stopWords) || strlen($item) <= 3) {
            unset($matchWords[$key]);
        }
    }

    $wordCountArr = array();

    if (is_array($matchWords)) {
        foreach ($matchWords as $key => $val) {
            $val = strtolower($val);
            if (isset($wordCountArr[$val])) {
                $wordCountArr[$val]++;
            } else {
                $wordCountArr[$val] = 1;
            }
        }
    }
    arsort($wordCountArr);
    $wordCountArr = array_slice($wordCountArr, 0, 30);
    return $wordCountArr;
}
/// end of function



    /// posted string =  a b c d e f g
    $generate = $_POST["generate"];

    /// the unique id of the row to retrieve additional blacklist keywords from
    $generate_id = $_POST["generate_id"];

    /// run the function by passing the text string and the id 
    $generate = extractCommonWords($generate, $generate_id);

    /// update the database with the result
    $update_data = "UPDATE services SET 
    g_serv_tags='".implode(',', array_keys($generate))."' 
    WHERE g_serv_acct='".$_SESSION["session_id"]."' 
    AND g_serv_id='".$generate_id."' LIMIT 1";
    $update_result = mysql_query($update_data);
    if(!$update_result){die('Invalid query:' . mysql_error());}
    else{echo str_replace(",",", ",implode(',', array_keys($generate)));}
    /// end of database update

Comments (1)

故人爱我别走 2025-01-11 12:32:59

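If the extra blacklist in the database was populated through an admin panel from a Windows client, there is likely to be a stray \r at the end of each word. Your list would then be a,b,c,d\r,e\r,f\r, and those entries never match the cleaned-up words from the input string.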
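
A minimal sketch of why print_r can show a seemingly correct merge while in_array still misses the words. The stored value here is hypothetical, standing in for a g_serv_blacklist field saved with Windows (CRLF) line endings and no trailing line break:

```php
<?php
// Hypothetical field contents: three words separated by CRLF line endings.
$stored = "d\r\ne\r\nf";

// Splitting on "\n" alone leaves the "\r" attached to every word except the last.
$words = explode("\n", $stored);

// print_r would display these as d, e, f; json_encode makes the hidden character visible.
echo json_encode($words), PHP_EOL; // ["d\r","e\r","f"]

// The blacklist lookup fails for "d", because the stored entry is really "d\r" ...
var_dump(in_array('d', $words)); // bool(false)

// ... and succeeds only for the last word, which has no trailing "\r".
var_dump(in_array('f', $words)); // bool(true)
```

Printing the array with json_encode or var_dump instead of print_r is a quick way to confirm whether this is what is happening.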

Try replacing this line:

$removingblack_array = explode("\n", $gettingblack_row["g_serv_blacklist"]);

with this:

$removingblack_array = preg_split('/(\r|\n|\r\n)/', $gettingblack_row["g_serv_blacklist"]);
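
One detail about the pattern above: because \r is tried before \r\n, a CRLF pair is split as two separate breaks, which leaves an empty string between words. That is harmless here, since the function already skips empty items, but a slightly stricter variant can be sketched as follows. The helper name splitBlacklist and the sample input are assumptions for illustration, not part of the original code:

```php
<?php
/**
 * Hypothetical helper: turn a stored blacklist field into clean, lowercase words.
 */
function splitBlacklist($raw)
{
    // "\r\n" is listed first so a CRLF pair counts as one break;
    // PREG_SPLIT_NO_EMPTY drops the empty entries produced by blank lines.
    $words = preg_split('/\r\n|\r|\n/', (string) $raw, -1, PREG_SPLIT_NO_EMPTY);

    // Trim stray spaces and lowercase, since the function lowercases the text it scans.
    $words = array_map(function ($word) {
        return strtolower(trim($word));
    }, $words);

    // Remove entries that became empty after trimming, then reindex.
    return array_values(array_filter($words, 'strlen'));
}

// Sample input with CRLF endings, a blank line, padding, and mixed case.
$stopWords = array_merge(array('a', 'b', 'c'), splitBlacklist("d\r\ne\r\n\r\n F \n"));
echo implode(',', $stopWords), PHP_EOL; // a,b,c,d,e,f
```

Inside extractCommonWords, the same call would replace the explode line, with $gettingblack_row["g_serv_blacklist"] passed as the argument.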
