Lucene and highlighted text font problem

Posted 2024-12-21 08:02:54

I'm using the Lucene Highlighter, and it works. Here is my code:

    StringBuffer sb = new StringBuffer();
    for (int t = 0; t < fields.length; t++) {
        SimpleHTMLFormatter formatter = new SimpleHTMLFormatter(
                "<span class=\"highlight\">", "</span>");
        Highlighter highlighter = new Highlighter(formatter,
                new QueryScorer(parser.parse(queryString)));

        if (d.get(fields[t]) != null) {
            hilites = highlighter.getBestFragments(analyzer, fields[t],
                    d.get(fields[t]), 3);
            int l = hilites.length;
            // System.out.println("hilites length: "+l);
            if (l > 0) {
                for (int x = 0; x < l; x++) {
                    sb.append(hilites[x]).append("...");
                }
            }
        }
    }

The problem is that the characters in my search results/highlighted text are garbled. Is this due to missing fonts?

Here is my highlighted text:

**on Educational Materials ~ ATS Job Board ""OR~C'C" .. III DUES United States Full... ? SL[I!," Full Memberhsip - Domestic membership is for residents residing in the United States. Dues...**

Notice the funky text!

Any help would be greatly appreciated.

Comments (1)

念﹏祤嫣 2024-12-28 08:02:54

The 'Garbled Text Problem' is probably not related to Lucene but to XML encoding. Did you set the 'contentType' to "text/html;charset=UTF-8"?
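
To illustrate why a charset mismatch, rather than a missing font, can produce garbled characters, here is a minimal sketch using only the Java standard library. The class name `CharsetMismatchDemo`, the helper `roundTrip`, and the sample string are hypothetical and not taken from the question; the sketch only assumes that the page bytes are written in one charset and interpreted in another.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    /**
     * Encodes text with one charset and decodes the resulting bytes with
     * another, which is what happens when a page is written as UTF-8 but
     * the declared (or defaulted) content type says something else.
     */
    static String roundTrip(String text, Charset writtenAs, Charset readAs) {
        byte[] bytes = text.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        // Hypothetical sample text containing a non-ASCII character (e with acute accent).
        String fragment = "r\u00e9sum\u00e9";

        // Same charset on both sides: the text survives unchanged.
        String matched = roundTrip(fragment,
                StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        // Written as UTF-8 but read as ISO-8859-1: each accented character
        // turns into two unrelated characters.
        String mismatched = roundTrip(fragment,
                StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println("matched:    " + matched.equals(fragment));
        System.out.println("mismatched: " + mismatched.equals(fragment));
    }
}
```

If the highlighted fragments are served from a servlet, declaring the charset explicitly (for example with `response.setContentType("text/html;charset=UTF-8")` before obtaining the writer) keeps both sides in agreement. Whether that is the cause here depends on how the page is actually produced, which the question does not show.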
