How to use the SynonymAnalyzer in Lucene.Net

Posted 2024-12-11 14:04:33

I am referencing Lucene.Net.SynonymEngine.dll in my application.

I am having trouble using its types, such as SynonymAnalyzer and ISynonymEngine.

I have tried

SynonymAnalyzer syn = new SynonymAnalyzer(ISynonymEngine engine);

and

Analyzer a = new SynonymAnalyzer(ISynonymEngine engine);

but neither works. Can anyone help?
Thank you in advance.


Comments (3)

無心 2024-12-18 14:04:33
public class SynonymAnalyzer : Analyzer
{
    public ISynonymEngine SynonymEngine { get; private set; }

    public SynonymAnalyzer(ISynonymEngine engine)
    {
        SynonymEngine = engine;
    }

    public override TokenStream TokenStream(string fieldName, System.IO.TextReader reader)
    {
        //create the tokenizer
        TokenStream result = new StandardTokenizer(reader);

        //add in filters
        // first normalize the StandardTokenizer
        result = new StandardFilter(result); 

        // makes sure everything is lower case
        result = new LowerCaseFilter(result);

        // use the default list of Stop Words, provided by the StopAnalyzer class.
        result = new StopFilter(result, StopAnalyzer.ENGLISH_STOP_WORDS); 

        // injects the synonyms. 
        result = new SynonymFilter(result, SynonymEngine); 

        //return the built token stream.
        return result;
    }
}
心房的律动 2024-12-18 14:04:33

You can create your analyzer like this:

SynonymAnalyzer sa = new SynonymAnalyzer(new XmlSynonymEngine(yourXmlFilesPath)); 

But first, create an XML file that defines your synonym groups:

<?xml version="1.0" encoding="utf-8" ?>
<synonyms>
  <group>
    <syn>fast</syn>
    <syn>quick</syn>
    <syn>rapid</syn>
  </group>

  <group>
    <syn>slow</syn>
    <syn>decrease</syn>
  </group>

  <group>
    <syn>google</syn>
    <syn>search</syn>
  </group>

  <group>
    <syn>check</syn>
    <syn>lookup</syn>
    <syn>look</syn>
  </group>

</synonyms>
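
For anyone wondering how an engine could read a file like this, here is a minimal sketch. It is not the source of the XmlSynonymEngine class from the DLL, which may behave differently. The class name SimpleXmlSynonymEngine and the FromFile helper are made up for illustration, and the ISynonymEngine interface is redeclared locally on the assumption that it has the single GetSynonyms method shown elsewhere in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Assumption: redeclared here so the sketch compiles on its own; in a real
// project you would use the interface from Lucene.Net.SynonymEngine instead.
public interface ISynonymEngine
{
    IEnumerable<string> GetSynonyms(string word);
}

// Hypothetical engine that loads <group>/<syn> elements into a lookup table.
public class SimpleXmlSynonymEngine : ISynonymEngine
{
    private readonly Dictionary<string, List<string>> _groups =
        new Dictionary<string, List<string>>(StringComparer.OrdinalIgnoreCase);

    public SimpleXmlSynonymEngine(XDocument doc)
    {
        foreach (XElement group in doc.Root.Elements("group"))
        {
            List<string> words = group.Elements("syn")
                                      .Select(e => e.Value.Trim())
                                      .ToList();

            // Every word in a group maps to the whole group. If a word appears
            // in several groups, the last group wins in this sketch.
            foreach (string word in words)
                _groups[word] = words;
        }
    }

    // Convenience helper for loading from disk (hypothetical).
    public static SimpleXmlSynonymEngine FromFile(string path)
    {
        return new SimpleXmlSynonymEngine(XDocument.Load(path));
    }

    public IEnumerable<string> GetSynonyms(string word)
    {
        List<string> words;
        if (!_groups.TryGetValue(word, out words))
            return Enumerable.Empty<string>();

        // Return the other members of the group, not the word itself.
        return words.Where(w =>
            !string.Equals(w, word, StringComparison.OrdinalIgnoreCase));
    }
}
```

With the file above, asking this sketch for the synonyms of "quick" returns "fast" and "rapid". It would be passed to the analyzer the same way, e.g. new SynonymAnalyzer(SimpleXmlSynonymEngine.FromFile(yourXmlFilesPath)).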

------ EDIT ---------

Alternatively, here is a minimal implementation of ISynonymEngine that you can pass to the analyzer:

public class MySynonyms : Lucene.Net.SynonymEngine.ISynonymEngine
{
    public IEnumerable<string> GetSynonyms(string word)
    {
        if (word == "quick") return  new List<string>{"fast"};
        return new List<string>();
    }
}
SynonymAnalyzer sa = new SynonymAnalyzer(new MySynonyms());
蓝咒 2024-12-18 14:04:33

SynonymFilter C# class

public class SynonymFilter : TokenFilter
{
    public ISynonymEngine SynonymEngine { get; private set; }       
    private Queue<string> splittedQueue = new Queue<string>();        
    private readonly ITermAttribute _termAttr;
    private readonly IPositionIncrementAttribute _posAttr;
    private readonly ITypeAttribute _typeAttr;
    private State currentState;

    public SynonymFilter(TokenStream input, ISynonymEngine synonymEngine)
        : base(input)
    {
        if (synonymEngine == null)
            throw new ArgumentNullException("synonymEngine");

        SynonymEngine = synonymEngine;
        _termAttr = AddAttribute<ITermAttribute>();
        _posAttr = AddAttribute<IPositionIncrementAttribute>();
        _typeAttr = AddAttribute<ITypeAttribute>();
    }


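    public override bool IncrementToken()
    {
        // Emit any queued synonyms first, stacked on the original token's position.
        if (splittedQueue.Count > 0)
        {
            string splitted = splittedQueue.Dequeue();
            RestoreState(currentState);
            _termAttr.SetTermBuffer(splitted);
            _posAttr.PositionIncrement = 0;
            return true;
        }

        if (!input.IncrementToken())
            return false;

        var currentTerm = new string(_termAttr.TermBuffer(), 0, _termAttr.TermLength());
        IEnumerable<string> synonyms = SynonymEngine.GetSynonyms(currentTerm);

        // No synonyms: the current token is still valid, so keep the stream going
        // (returning false here would end the stream early).
        if (synonyms == null)
            return true;

        foreach (string syn in synonyms)
        {
            if (!currentTerm.Equals(syn))
                splittedQueue.Enqueue(syn);
        }

        // Capture the original token's state so RestoreState has something to
        // restore when the queued synonyms are emitted.
        if (splittedQueue.Count > 0)
            currentState = CaptureState();

        return true;
    }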
}
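
To make it clearer what the filter does to the token stream, here is a small stand-alone sketch that mimics the same idea without any Lucene types. Everything in it (SynonymInjectionDemo, Expand, the delegate standing in for the engine) is hypothetical and only for illustration: each original token advances the position by 1, and each injected synonym gets a position increment of 0 so it sits at the same position as the word it came from.

```csharp
using System;
using System.Collections.Generic;

public static class SynonymInjectionDemo
{
    // Yields (term, positionIncrement) pairs in the order a synonym-injecting
    // filter would emit them.
    public static IEnumerable<KeyValuePair<string, int>> Expand(
        IEnumerable<string> tokens,
        Func<string, IEnumerable<string>> getSynonyms)
    {
        foreach (string token in tokens)
        {
            // The original token always moves one position forward.
            yield return new KeyValuePair<string, int>(token, 1);

            IEnumerable<string> synonyms = getSynonyms(token);
            if (synonyms == null)
                continue;

            foreach (string syn in synonyms)
            {
                // Synonyms share the original token's position.
                if (syn != token)
                    yield return new KeyValuePair<string, int>(syn, 0);
            }
        }
    }

    public static void Main()
    {
        // Stand-in for an ISynonymEngine: "quick" has the synonym "fast".
        Func<string, IEnumerable<string>> engine = word =>
            word == "quick" ? new List<string> { "fast" } : new List<string>();

        foreach (var pair in Expand(new[] { "quick", "fox" }, engine))
            Console.WriteLine(pair.Key + " +" + pair.Value);
    }
}
```

Because the synonym shares a position with the original word, a phrase query such as "fast fox" can match text that was indexed as "quick fox".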