Illegal characters in Lucene text search

Posted on 2024-10-13 04:47:57

In my IndexController I have:

    public function buildAction()
    {
        $index = Zend_Search_Lucene::create(APPLICATION_PATH . '/indexes');

        foreach ($this->pages as $p) {
            $doc = new Zend_Search_Lucene_Document();

            $doc->addField(Zend_Search_Lucene_Field::unIndexed('page_id', $p['page_id']));
            $doc->addField(Zend_Search_Lucene_Field::text('page_name', $p['page_name']));
            $doc->addField(Zend_Search_Lucene_Field::text('page_headline', $p['page_headline']));
            $doc->addField(Zend_Search_Lucene_Field::text('page_content', $p['page_content']));

            $index->addDocument($doc);
        }

        $index->optimize();
        $this->view->indexSize = $index->numDocs();
    }

and I am getting these notices:

[Tue Jan 18 16:23:32 2011] [error] [client 127.0.0.1] PHP Notice:  iconv(): Detected an illegal character in input string in /usr/share/php/libzend-framework-php/Zend/Search/Lucene/Analysis/Analyzer/Common/Text.php on line 58
[Tue Jan 18 16:23:32 2011] [error] [client 127.0.0.1] PHP Notice:  iconv(): Detected an illegal character in input string in /usr/share/php/libzend-framework-php/Zend/Search/Lucene/Field.php on line 222
[Tue Jan 18 16:23:32 2011] [error] [client 127.0.0.1] PHP Notice:  iconv(): Detected an illegal character in input string in /usr/share/php/libzend-framework-php/Zend/Search/Lucene/Analysis/Analyzer/Common/Text.php on line 58
[Tue Jan 18 16:23:32 2011] [error] [client 127.0.0.1] PHP Notice:  iconv(): Detected an illegal character in input string in /usr/share/php/libzend-framework-php/Zend/Search/Lucene/Field.php on line 222
[Tue Jan 18 16:23:32 2011] [error] [client 127.0.0.1] PHP Notice:  iconv(): Detected an illegal character in input string in /usr/share/php/libzend-framework-php/Zend/Search/Lucene/Analysis/Analyzer/Common/Text.php on line 58
[Tue Jan 18 16:23:32 2011] [error] [client 127.0.0.1] PHP Notice:  iconv(): Detected an illegal character in input string in /usr/share/php/libzend-framework-php/Zend/Search/Lucene/Field.php on line 222

The variable $this->pages contains an array of text copied from Wikipedia. I believe the notices are triggered by characters such as — (an em dash, not a hyphen) and ö. I found a similar question, "Lucene foreign chars problem", but it doesn't explain which change to make where. I would be grateful to know what to change and where, along with a brief explanation.

Update: my iconv configuration:

 iconv support          enabled
 iconv implementation   glibc
 iconv library version  2.12.1 


Comments (2)

轮廓§ 2024-10-20 04:47:57

Try adding this to your bootstrap:

Zend_Search_Lucene_Search_QueryParser::setDefaultEncoding('utf-8');
Zend_Search_Lucene_Analysis_Analyzer::setDefault(
    new Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8_CaseInsensitive()
);
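
To see why an encoding setting matters here, the notice itself can be reproduced outside Zend Framework. The sketch below is only an illustration under an assumption the thread does not confirm: that the Wikipedia text is UTF-8 but reaches iconv() labeled with a single-byte encoding. The sample string and variable names are made up for the demonstration.

```php
<?php
// "Schrödinger — cat" written as raw UTF-8 bytes:
// ö is two bytes (C3 B6) and the em dash is three (E2 80 94).
$text = "Schr\xC3\xB6dinger \xE2\x80\x94 cat";

// Declaring the input as ASCII makes every byte above 0x7F illegal, so
// iconv() raises "Detected an illegal character in input string" and
// returns false. The @ operator only hides the notice in this demo.
$asAscii = @iconv('ASCII', 'UTF-8', $text);

// Declaring the encoding the bytes really use succeeds and returns
// the text unchanged.
$asUtf8 = iconv('UTF-8', 'UTF-8', $text);

var_dump($asAscii === false);
var_dump($asUtf8 === $text);
```

If that assumption holds for your data, telling the library that the text is UTF-8, as the settings above do for the query parser and the analyzer, is what removes the mismatch.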

若相惜即相离 2024-10-20 04:47:57

In addition to the bootstrap code, pass the encoding as the third parameter when creating text fields:

$doc->addField(Zend_Search_Lucene_Field::text('page_name', $p['page_name'], 'UTF-8'));
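
Passing 'UTF-8' only helps if the value really is UTF-8. If the source of $this->pages is not guaranteed to be, one option is to normalize each string first. The helper below is a hypothetical sketch, not part of Zend Framework: the name ensureUtf8 and the ISO-8859-1 fallback are assumptions you would adapt to your own data.

```php
<?php
/**
 * Hypothetical helper (not a Zend Framework API): return $text as valid UTF-8.
 * $fallback is an ASSUMED source encoding for text that is not already UTF-8.
 */
function ensureUtf8($text, $fallback = 'ISO-8859-1')
{
    // With the /u modifier, preg_match() returns 1 only when the subject
    // is valid UTF-8; on invalid input it returns false.
    if (preg_match('//u', $text) === 1) {
        return $text;
    }

    // Otherwise reinterpret the bytes using the fallback encoding.
    $converted = iconv($fallback, 'UTF-8', $text);
    if ($converted === false) {
        throw new InvalidArgumentException(
            "Cannot convert text from $fallback to UTF-8"
        );
    }

    return $converted;
}
```

In buildAction() each value could then go through the helper before the field is created, for example Zend_Search_Lucene_Field::text('page_content', ensureUtf8($p['page_content']), 'UTF-8').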