Exact phrase search using Lucene.NET

Stack Overflow user

Asked 2009-05-11 18:26:23

2 answers · 13.4K views · 0 followers · 11 votes

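I'm having trouble searching for an exact phrase using Lucene.NET 2.0.0.4.

For example, I'm searching for "scope attribute sets the variable" (including the quotes) but get no matches, even though I have confirmed 100% that the phrase exists.

Can anyone suggest where I'm going wrong? Does Lucene.NET even support this? As usual, the API documentation isn't much help, and the few CodeProject articles I've read don't specifically cover this.

The index is created with the following code: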

Directory dir = Lucene.Net.Store.FSDirectory.GetDirectory("Index", true);

Analyzer analyzer = new Lucene.Net.Analysis.SimpleAnalyzer();

IndexWriter indexWriter = new Lucene.Net.Index.IndexWriter(dir, analyzer,true);

//create a document, add in a single field
Lucene.Net.Documents.Document doc = new Lucene.Net.Documents.Document();

Lucene.Net.Documents.Field fldContent = new Lucene.Net.Documents.Field(
    "content", File.ReadAllText(@"Documents\100.txt"),
    Lucene.Net.Documents.Field.Store.YES,
    Lucene.Net.Documents.Field.Index.TOKENIZED);

doc.Add(fldContent);

//write the document to the index
indexWriter.AddDocument(doc);

I then search for a phrase using the following:

//state the file location of the index
Directory dir = Lucene.Net.Store.FSDirectory.GetDirectory("Index", false);

//create an index searcher that will perform the search
IndexSearcher searcher = new Lucene.Net.Search.IndexSearcher(dir);

QueryParser qp = new QueryParser("content", new SimpleAnalyzer());

// txtSearch.Text  Contains a phrase such as "this is a phrase" 
Query q=qp.Parse(txtSearch.Text);  


//execute the query
Lucene.Net.Search.Hits hits = searcher.Search(q);

The target document is about 7MB of plain text.

I have seen this previous question, but I don't want a proximity search, only an exact phrase search.

lucene

lucene.net

2 Answers

Stack Overflow user

Accepted answer

Posted 2009-05-11 21:18:44

You haven't enabled term positions. Creating the field as follows should resolve your problem.

Lucene.Net.Documents.Field fldContent =
    new Lucene.Net.Documents.Field(
        "content",
        File.ReadAllText(@"Documents\100.txt"),
        Lucene.Net.Documents.Field.Store.YES,
        Lucene.Net.Documents.Field.Index.TOKENIZED,
        Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS);

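13 votes

Stack Overflow user

Posted 2009-06-24 12:21:41

Shashikant Kore is correct with his answer; you need to enable term positions...

However, I would recommend not storing the document's text in the field unless you absolutely need it returned to you in the search results... Setting the store to 'NO' may help reduce the size of your index.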

Lucene.Net.Documents.Field fldContent =
    new Lucene.Net.Documents.Field(
        "content",
        File.ReadAllText(@"Documents\100.txt"),
        Lucene.Net.Documents.Field.Store.NO,
        Lucene.Net.Documents.Field.Index.TOKENIZED,
        Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS);
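
For reference, the exact (non-proximity) phrase search the question asks for can also be built directly as a PhraseQuery instead of going through QueryParser. The following is only a minimal sketch, not a confirmed fix. It assumes the Java-style method names used by Lucene.NET 2.x (`Add`, `SetSlop`), so check them against your version. The class `ExactPhraseSearch` and the helper `BuildExactPhraseQuery` are hypothetical names introduced here for illustration. The tokenizing loop only approximates what SimpleAnalyzer does (lowercase, split on non-letters); it is not the analyzer itself.

```csharp
using System.Text;
using Lucene.Net.Index;
using Lucene.Net.Search;

public static class ExactPhraseSearch
{
    // Hypothetical helper: builds a PhraseQuery whose terms must appear
    // adjacent and in order (slop 0), i.e. an exact phrase match.
    public static PhraseQuery BuildExactPhraseQuery(string field, string phrase)
    {
        PhraseQuery query = new PhraseQuery();
        query.SetSlop(0); // 0 means no gaps or reordering; set explicitly for clarity

        // Assumption: approximate SimpleAnalyzer by lowercasing the input
        // and treating every run of letters as one term.
        StringBuilder token = new StringBuilder();
        foreach (char c in phrase.ToLowerInvariant())
        {
            if (char.IsLetter(c))
            {
                token.Append(c);
            }
            else if (token.Length > 0)
            {
                query.Add(new Term(field, token.ToString()));
                token.Length = 0; // reset the buffer for the next term
            }
        }
        if (token.Length > 0)
        {
            query.Add(new Term(field, token.ToString()));
        }
        return query;
    }
}
```

With the searcher from the question, this could be used as `Hits hits = searcher.Search(ExactPhraseSearch.BuildExactPhraseQuery("content", "scope attribute sets the variable"));`. The terms passed to the query have to match what the analyzer wrote at index time, which is why the helper lowercases the input.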