Grabbing x number of words before and after a given keyword?

Posted on 2024-09-18 05:20:30

How can I grab [x] words before and after a given keyword in a string in PHP? I am trying to tokenize the results of a MySQL query tailored to the keyword so that each result can be shown as a snippet.


Comments (1)

过潦 2024-09-25 05:20:30
$string = 'This is a test string to see how to grab words from an arbitrary sentence. It\'s a little hacky (as you can see from the results) - but generally speaking, it works.';

echo $string,'<br />';

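function getWords($string,$word,$before=0,$after=0) {
    // Split the string into a list of words
    $stringWords = str_word_count($string,1);
    // array_search() is case-sensitive and returns false when the word is absent
    $myWordPos = array_search($word,$stringWords);
    if ($myWordPos === false)
        return array();

    // Clamp $before so the slice never starts before the first word
    if (($myWordPos-$before) < 0)
        $before = $myWordPos;
    // array_slice() stops at the end of the array, so $after needs no clamping
    return array_slice($stringWords,$myWordPos-$before,$before+$after+1);
}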

var_dump(getWords($string,'test',2,1));
echo '<br />';
var_dump(getWords($string,'this',2,1));
echo '<br />';
var_dump(getWords($string,'sentence',1,3));
echo '<br />';
var_dump(getWords($string,'little',2,2));
echo '<br />';
var_dump(getWords($string,'you',2,2));
echo '<br />';
var_dump(getWords($string,'results',2,2));
echo '<br />';
var_dump(getWords($string,'works',2,2));

echo '<hr />';


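function getWords2($string,$word,$before=0,$after=0) {
    $stringWords = str_word_count($string,1);
    // array_search() is case-sensitive and returns false when the word is absent
    $myWordPos = array_search($word,$stringWords);
    if ($myWordPos === false)
        return '';
    // Byte offset of each word within the original string
    $stringWordsPos = array_keys(str_word_count($string,2));

    // Clamp both ends of the window to the bounds of the word list
    if (($myWordPos-$before) < 0)
        $before = $myWordPos;
    if (($myWordPos+$after) >= count($stringWords))
        $after = count($stringWords) - $myWordPos - 1;
    $startPos = $stringWordsPos[$myWordPos-$before];
    $endPos = $stringWordsPos[$myWordPos+$after] + strlen($stringWords[$myWordPos+$after]);

    // Cut from the original string so punctuation between the words is kept
    return substr($string,$startPos,$endPos-$startPos);
}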

echo '[',getWords2($string,'test',2,1),']<br />';
echo '[',getWords2($string,'this',2,1),']<br />';
echo '[',getWords2($string,'sentence',1,3),']<br />';
echo '[',getWords2($string,'little',2,2),']<br />';
echo '[',getWords2($string,'you',2,2),']<br />';
echo '[',getWords2($string,'results',2,2),']<br />';
echo '[',getWords2($string,'works',1,3),']<br />';

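But what do you want to happen if the word appears multiple times? Or if the word doesn't appear in the string?

EDIT

Extended version of getWords2 to return up to a set number of occurrences of the keyword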
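Both questions depend on first knowing where the keyword occurs. As a minimal sketch (findWordPositions is a hypothetical helper, not part of the functions above, and it assumes ASCII text), the lookup below ignores case and returns every matching index, so an empty array signals that the word is absent:

```php
<?php
// Hypothetical helper: list every index at which $word occurs, ignoring case.
function findWordPositions(string $string, string $word): array {
    $words = str_word_count($string, 1);
    // Lower-case both sides so 'this' also matches 'This' (ASCII text assumed)
    $lowerWords = array_map('strtolower', $words);
    // With a search value, array_keys() returns the keys of all matching entries
    return array_keys($lowerWords, strtolower($word), true);
}

$sample = 'This is a test. Is this a test?';
var_dump(findWordPositions($sample, 'this'));   // one index per occurrence
var_dump(findWordPositions($sample, 'absent')); // empty array when not found
```

The caller can then decide whether to use only the first index, all of them, or to skip the snippet when the array is empty.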

$string = 'PHP is a widely-used general-purpose scripting language that is especially suited for Web development. The current version of PHP is 5.3.3, released on July 22, 2010. The online manual for PHP is an excellent resource for the language syntax and has an extensive list of the built-in and extension functions. Most extensions can be found in PECL. PEAR contains a plethora of community supplied classes. PHP is often paired with the MySQL relational database.';

echo $string,'<br />';

function getWords3($string,$word,$before=0,$after=0,$maxFoundCount=1) {
    $stringWords = str_word_count($string,1);
    // Byte offset of each word within the original string
    $stringWordsPos = array_keys(str_word_count($string,2));
    $lastPos = count($stringWords) - 1;

    $foundInstances = array();
    // With a search value, array_keys() returns every index at which $word occurs
    $positions = array_slice(array_keys($stringWords,$word),0,$maxFoundCount);
    foreach ($positions as $myWordPos) {
        // Clamp the window to the bounds of the word list
        $firstPos = max(0,$myWordPos-$before);
        $finalPos = min($lastPos,$myWordPos+$after);
        $startPos = $stringWordsPos[$firstPos];
        $endPos = $stringWordsPos[$finalPos] + strlen($stringWords[$finalPos]);

        $foundInstances[] = substr($string,$startPos,$endPos-$startPos);
    }
    // Empty array when the word does not appear at all
    return $foundInstances;
}

var_dump(getWords3($string,'PHP',2,2,3));
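
str_word_count() only recognises letters, apostrophes and hyphens, and it is not multibyte-aware, so text from a MySQL table that contains digits or accented characters may be split in unexpected places. The sketch below applies the same windowing idea with a Unicode-aware regular expression. It is an illustration under assumptions, not a drop-in replacement: snippetsAround and its parameter names are hypothetical, the input is assumed to be valid UTF-8, and the definition of a "word" (letters, digits, apostrophes, hyphens) is a choice made here.

```php
<?php
// Hypothetical helper: Unicode-aware variant of the before/after window.
function snippetsAround(string $text, string $word, int $before = 0, int $after = 0, int $max = 1): array {
    // Capture each word together with its byte offset in $text
    if (!preg_match_all("/[\\p{L}\\p{N}'-]+/u", $text, $m, PREG_OFFSET_CAPTURE)) {
        return [];
    }
    $tokens = $m[0];                 // each entry is [word, byteOffset]
    $last = count($tokens) - 1;
    // Whole-token, case-insensitive comparison against the keyword
    $needle = '/^' . preg_quote($word, '/') . '$/iu';
    $snippets = [];
    foreach ($tokens as $i => $token) {
        if (count($snippets) >= $max) {
            break;
        }
        if (!preg_match($needle, $token[0])) {
            continue;
        }
        // Clamp the window to the first and last token
        $first = max(0, $i - $before);
        $final = min($last, $i + $after);
        $start = $tokens[$first][1];
        $stop = $tokens[$final][1] + strlen($tokens[$final][0]);
        // Offsets are in bytes, which is what substr() expects
        $snippets[] = substr($text, $start, $stop - $start);
    }
    return $snippets;
}

$text = 'Das Café ist schön. Im Café gibt es Kaffee.';
var_dump(snippetsAround($text, 'café', 1, 1, 2));
```

Because the snippets are cut from the original string, punctuation between the words is preserved, as in getWords2.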