PHP/MySQL:突出显示“听起来像”查询结果

发布于 2024-08-12 09:51:09 字数 670 浏览 9 评论 0原文

快速 MYSQL/PHP 问题。如果使用普通搜索查询找不到结果,我将使用“不太严格”的搜索查询作为后备,调整如下:

foreach($find_array as $word) { 
  clauses[] = "(firstname SOUNDS LIKE '$word%' OR lastname SOUNDS LIKE '$word%')";
}
if (!empty($clauses)) $filter='('.implode(' AND ', $clauses).')';
$query = "SELECT * FROM table WHERE $filter";

现在,我使用 PHP 来突出显示结果,例如:

foreach ($find_array as $term_to_highlight){
    foreach ($result as $key => $result_string){
        $result[$key]=highlight_stuff($result_string, $term_to_highlight);
    }
}

但是这个方法当我不知道要突出显示什么时,它就会失败。在运行 mysql 查询时,有什么方法可以找出“声音相似”的匹配是什么?

也就是说,如果有人搜索“Joan”,我希望它突出显示“John”。

Quick MYSQL/PHP question. I'm using a "not-so-strict" search query as a fallback if no results are found with a normal search query, to the tune of:

foreach($find_array as $word) { 
  clauses[] = "(firstname SOUNDS LIKE '$word%' OR lastname SOUNDS LIKE '$word%')";
}
if (!empty($clauses)) $filter='('.implode(' AND ', $clauses).')';
$query = "SELECT * FROM table WHERE $filter";

Now, I'm using PHP to highlight the results, like:

foreach ($find_array as $term_to_highlight){
    foreach ($result as $key => $result_string){
        $result[$key]=highlight_stuff($result_string, $term_to_highlight);
    }
}

But this method falls on its ass when I don't know what to highlight. Is there any way to find out what the "sound-alike" match is when running that mysql query?

That is to say, if someone searches for "Joan" I want it to highlight "John" instead.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

莫相离 2024-08-19 09:51:09

请注意,SOUNDS LIKE 并不像您想象的那样工作。它不等同于 MySQL 中的 LIKE,因为它不支持 % 通配符。

这意味着您的查询在搜索“John”时不会找到“John David”。如果这只是您的后备方案,这可能是可以接受的,但这并不理想。

所以这是一个不同的建议(可能需要改进);首先使用 PHP 的 soundex() 函数来查找您要查找的关键字的 soundex。

$soundex = soundex($word);
$soundexPrefix = substr($soundex, 0, 2); // first two characters of soundex
$sql = "SELECT lastname, firstname ".
    "FROM table WHERE SOUNDEX(lastname) LIKE '$soundexPrefix%' ".
    "OR SOUNDEX(firstname) LIKE '$soundexPrefix%'";

现在,您将获得一个名字和姓氏的列表,这些名字和姓氏在发音上具有模糊的相似性(这可能是很多条目,并且您可能希望增加用于搜索的 soundex 前缀的长度)。然后,您可以计算每个单词的发音与您的搜索词之间的编辑距离,并按该距离进行排序。

其次,您应该查看 MySQL 中的参数化查询,以避免 SQL 注入错误。

Note that SOUNDS LIKE does not work as you think it does. It is not equivalent to LIKE in MySQL, as it does not support the % wildcard.

This means your query will not find "John David" when searching for "John". This might be acceptable if this is just your fallback, but it is not ideal.

So here is a different suggestion (that might need improvement); first use PHPs soundex() function to find the soundex of the keyword you are looking for.

$soundex = soundex($word);
$soundexPrefix = substr($soundex, 0, 2); // first two characters of soundex
$sql = "SELECT lastname, firstname ".
    "FROM table WHERE SOUNDEX(lastname) LIKE '$soundexPrefix%' ".
    "OR SOUNDEX(firstname) LIKE '$soundexPrefix%'";

Now you'll have a list of firstnames and lastnames that has a vague similarity in sounding (this might be a lot entries, and you might want to increase the length of the soundex prefix you use for your search). You can then calculate the Levenshtein distance between the soundex of each word and your search term, and sort by that.

Second, you should look at parameterized queries in MySQL, to avoid SQL injection bugs.

帅气尐潴 2024-08-19 09:51:09

SOUND LIKE 条件只是比较两个单词的 SOUNDEX 键,您可以使用 PHP soundex() 函数生成相同的键。

因此,如果您找到匹配的行并需要找出要突出显示的单词,您可以获取名字和姓氏,然后使用 PHP 查找匹配的行并仅突出显示该单词。

我编写这段代码只是为了尝试一下。 (必须测试我的理论 xD)

<?php
// A space seperated string of keywords, presumably from a search box somewhere.
$search_string = 'John Doe';

// Create a data array to contain the keywords and their matches.
// Keywords are grouped by their soundex keys.
$data = array();
foreach(explode(' ', $search_string) as $_word) {
    $data[soundex($_word)]['keywords'][] = $_word;
}

// Execute a query to find all rows matching the soundex keys for the words.
$soundex_list = "'". implode("','", array_keys($data)) ."'";
$sql = "SELECT id, firstname, lastname
        FROM   sounds_like
        WHERE  SOUNDEX(firstname) IN({$soundex_list})
        OR     SOUNDEX(lastname)  IN({$soundex_list})";
$sql_result = $dbLink->query($sql);

// Add the matches to their respective soundex key in the data array.
// This checks which word matched, the first or last name, and tags
// that word as the match so it can be highlighted later.
if($sql_result) {
    while($_row = $sql_result->fetch_assoc()) {
        foreach($data as $_soundex => &$_elem) {
            if(soundex($_row['firstname']) == $_soundex) {
                $_row['matches'] = 'firstname';
                $_elem['matches'][] = $_row;
            }
            else if(soundex($_row['lastname']) == $_soundex) {
                $_row['matches'] = 'lastname';
                $_elem['matches'][] = $_row;
            }
        }
    }
}

// Print the results as a simple text list.
header('content-type: text/plain');
echo "-- Possible results --\n";

foreach($data as $_group) {
    // Print the keywords for this group's soundex key.
    $keyword_list = "'". implode("', '", $_group['keywords']) ."'";
    echo "For keywords: {$keyword_list}\n";

    // Print all the matches for this group, if any.
    if(isset($_group['matches']) && count($_group['matches']) > 0) {
        foreach($_group['matches'] as $_match) {
            // Highlight the matching word by encapsulatin it in dashes.
            if($_match['matches'] == 'firstname') {
                $_match['firstname'] = "-{$_match['firstname']}-";
            }
            else {
                $_match['lastname'] = "-{$_match['lastname']}-";
            }

            echo " #{$_match['id']}: {$_match['firstname']} {$_match['lastname']}\n";
        }
    }
    else {
        echo " No matches.\n";
    }
}
?>

一个更通用的函数,从字符串中提取匹配的 soundex 单词可能如下所示:

<?php
/**
 * Attempts to find the first word in the $heystack that is a soundex
 * match for the $needle.
 */
function find_soundex_match($heystack, $needle) {
    $words = explode(' ', $heystack);
    $needle_soundex = soundex($needle);
    foreach($words as $_word) {
        if(soundex($_word) == $needle_soundex) {
            return $_word;
        }
    }
    return false;
}
?>

如果我理解正确的话,可以在您之前发布的文章中使用它代码为:

foreach ($find_array as $term_to_highlight){
    foreach ($result as $key => $result_string){
        $match_to_highlight = find_soundex_match($result_string, $term_to_highlight);
        $result[$key]=highlight_stuff($result_string, $match_to_highlight);
    }
}

这不会像第一个片段中更有针对性的代码那样有效。

The SOUND LIKE condition just compares the SOUNDEX key of both words, and you can use the PHP soundex() function to generate the same key.

So, if you found a matching row and needed to find out which word to highlight, you can fetch both the firstname and lastname, and then use PHP to find which one matches and highlight just that word.

I made this code just to try this out. (Had to test my theory xD)

<?php
// A space seperated string of keywords, presumably from a search box somewhere.
$search_string = 'John Doe';

// Create a data array to contain the keywords and their matches.
// Keywords are grouped by their soundex keys.
$data = array();
foreach(explode(' ', $search_string) as $_word) {
    $data[soundex($_word)]['keywords'][] = $_word;
}

// Execute a query to find all rows matching the soundex keys for the words.
$soundex_list = "'". implode("','", array_keys($data)) ."'";
$sql = "SELECT id, firstname, lastname
        FROM   sounds_like
        WHERE  SOUNDEX(firstname) IN({$soundex_list})
        OR     SOUNDEX(lastname)  IN({$soundex_list})";
$sql_result = $dbLink->query($sql);

// Add the matches to their respective soundex key in the data array.
// This checks which word matched, the first or last name, and tags
// that word as the match so it can be highlighted later.
if($sql_result) {
    while($_row = $sql_result->fetch_assoc()) {
        foreach($data as $_soundex => &$_elem) {
            if(soundex($_row['firstname']) == $_soundex) {
                $_row['matches'] = 'firstname';
                $_elem['matches'][] = $_row;
            }
            else if(soundex($_row['lastname']) == $_soundex) {
                $_row['matches'] = 'lastname';
                $_elem['matches'][] = $_row;
            }
        }
    }
}

// Print the results as a simple text list.
header('content-type: text/plain');
echo "-- Possible results --\n";

foreach($data as $_group) {
    // Print the keywords for this group's soundex key.
    $keyword_list = "'". implode("', '", $_group['keywords']) ."'";
    echo "For keywords: {$keyword_list}\n";

    // Print all the matches for this group, if any.
    if(isset($_group['matches']) && count($_group['matches']) > 0) {
        foreach($_group['matches'] as $_match) {
            // Highlight the matching word by encapsulatin it in dashes.
            if($_match['matches'] == 'firstname') {
                $_match['firstname'] = "-{$_match['firstname']}-";
            }
            else {
                $_match['lastname'] = "-{$_match['lastname']}-";
            }

            echo " #{$_match['id']}: {$_match['firstname']} {$_match['lastname']}\n";
        }
    }
    else {
        echo " No matches.\n";
    }
}
?>

A more generalized function, to pull out the matching soundex word from a strings could look like:

<?php
/**
 * Attempts to find the first word in the $heystack that is a soundex
 * match for the $needle.
 */
function find_soundex_match($heystack, $needle) {
    $words = explode(' ', $heystack);
    $needle_soundex = soundex($needle);
    foreach($words as $_word) {
        if(soundex($_word) == $needle_soundex) {
            return $_word;
        }
    }
    return false;
}
?>

Which, if I am understanding it correctly, could be used in your previously posted code as:

foreach ($find_array as $term_to_highlight){
    foreach ($result as $key => $result_string){
        $match_to_highlight = find_soundex_match($result_string, $term_to_highlight);
        $result[$key]=highlight_stuff($result_string, $match_to_highlight);
    }
}

This wouldn't be as efficient tho, as the more targeted code in the first snippet.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文