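I have my search working nicely, but it does return stale results. My site is a lot like NerdDinner, where events in the past become irrelevant.
I'm currently indexing like this
Note: my examples are in VB.NET, but I don't mind if answers are given in C#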
Public Function AddIndex(ByVal searchableEvent As [Event]) As Boolean Implements ILuceneService.AddIndex
    Dim writer As New IndexWriter(luceneDirectory, New StandardAnalyzer(), False)
    Dim doc As Document = New Document
    doc.Add(New Field("id", searchableEvent.ID, Field.Store.YES, Field.Index.UN_TOKENIZED))
    doc.Add(New Field("fullText", FullTextBuilder(searchableEvent), Field.Store.YES, Field.Index.TOKENIZED))
    doc.Add(New Field("user", If(searchableEvent.User.UserName = Nothing,
                                 "User" & searchableEvent.User.ID,
                                 searchableEvent.User.UserName),
                      Field.Store.YES,
                      Field.Index.TOKENIZED))
    doc.Add(New Field("title", searchableEvent.Title, Field.Store.YES, Field.Index.TOKENIZED))
    doc.Add(New Field("location", searchableEvent.Location.Name, Field.Store.YES, Field.Index.TOKENIZED))
    doc.Add(New Field("date", searchableEvent.EventDate, Field.Store.YES, Field.Index.UN_TOKENIZED))
    writer.AddDocument(doc)
    writer.Optimize()
    writer.Close()
    Return True
End Function
Notice that I have a "date" field that stores the event date.
My search then looks like this
''# code omitted
Dim reader As IndexReader = IndexReader.Open(luceneDirectory)
Dim searcher As IndexSearcher = New IndexSearcher(reader)
Dim parser As QueryParser = New QueryParser("fullText", New StandardAnalyzer())
Dim query As Query = parser.Parse(q.ToLower)
''# We're using 10,000 as the maximum number of results to return
''# because I have a feeling that we'll never reach that full amount
''# anyways. And if we do, who in their right mind is going to page
''# through all of the results?
Dim topDocs As TopDocs = searcher.Search(query, Nothing, 10000)
Dim doc As Document = Nothing
''# loop through the topDocs and grab the appropriate 10 results based
''# on the submitted page number
While i <= last AndAlso i < topDocs.totalHits
    doc = searcher.Doc(topDocs.scoreDocs(i).doc)
    IDList.Add(doc.[Get]("id"))
    i += 1
End While
''# code omitted
I did try the following, but to no avail (it throws a NullReferenceException).
While i <= last AndAlso i < topDocs.totalHits
    If Date.Parse(doc.[Get]("date")) >= Date.Today Then
        doc = searcher.Doc(topDocs.scoreDocs(i).doc)
        IDList.Add(doc.[Get]("id"))
        i += 1
    End If
End While
I also found the following documentation, but I can't make heads or tails of it
http://lucene.apache.org/java/1_4_3/api/org/apache/lucene/search/DateFilter.html
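Posted on 2010-12-30 11:05:03
You're linking to the API documentation for Lucene 1.4.3. Lucene.Net is currently at 2.9.2. I think an upgrade is due.
First of all, you're using Store.YES a lot. Stored fields make the index larger, which can become a performance problem. Your date problem can easily be solved by storing dates as strings in the "yyyyMMddHHmmssfff" format (that is very high resolution, down to the millisecond). You may want to lower the resolution so that fewer distinct terms are created and the index stays smaller.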
var dateValue = DateTools.DateToString(searchableEvent.EventDate, DateTools.Resolution.MILLISECOND);
doc.Add(new Field("date", dateValue, Field.Store.YES, Field.Index.NOT_ANALYZED));
Then apply a filter to the search (the second parameter, where you currently pass Nothing/null).
var dateValue = DateTools.DateToString(DateTime.Now, DateTools.Resolution.MILLISECOND);
var filter = FieldCacheRangeFilter.NewStringRange("date",
lowerVal: dateValue, includeLower: true,
upperVal: null, includeUpper: false);
var topDocs = searcher.Search(query, filter, 10000);
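To see why a string range filter works on these values: the strings are fixed-width with the most significant part first, so ordinal string order matches chronological order. The sketch below does not call Lucene.Net at all. `ToMillisecondString`, `ToDayString` and `IsOnOrAfter` are hypothetical helpers that only imitate the string layout and an inclusive lower bound with plain .NET calls, and the sample dates are made up.

```csharp
using System;
using System.Globalization;

public static class SortableDates
{
    // Hypothetical helper: imitates the millisecond-resolution layout.
    public static string ToMillisecondString(DateTime value)
    {
        return value.ToString("yyyyMMddHHmmssfff", CultureInfo.InvariantCulture);
    }

    // Hypothetical helper: day resolution, so far fewer distinct values.
    public static string ToDayString(DateTime value)
    {
        return value.ToString("yyyyMMdd", CultureInfo.InvariantCulture);
    }

    // Inclusive lower bound, open upper bound, compared as plain strings.
    public static bool IsOnOrAfter(string indexedValue, string lowerBound)
    {
        return string.CompareOrdinal(indexedValue, lowerBound) >= 0;
    }

    public static void Main()
    {
        var past = new DateTime(2010, 11, 2, 20, 49, 16);
        var earlierToday = new DateTime(2010, 12, 30, 9, 0, 0);
        var now = new DateTime(2010, 12, 30, 11, 5, 3);
        var future = new DateTime(2011, 1, 15, 19, 0, 0);

        string lower = ToMillisecondString(now);
        Console.WriteLine(ToMillisecondString(past));                               // 20101102204916000
        Console.WriteLine(IsOnOrAfter(ToMillisecondString(past), lower));           // False
        Console.WriteLine(IsOnOrAfter(ToMillisecondString(future), lower));         // True

        // At millisecond resolution an event from earlier today is already excluded.
        Console.WriteLine(IsOnOrAfter(ToMillisecondString(earlierToday), lower));   // False
        // At day resolution (on both sides) it is still included.
        Console.WriteLine(IsOnOrAfter(ToDayString(earlierToday), ToDayString(now))); // True
    }
}
```

If events from earlier in the current day should still show up, as in the question's `Date.Today` comparison, using the same coarser resolution for both the indexed value and the lower bound is one way to get that behaviour.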
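You could also do this with a RangeQuery combined with your regular query in a BooleanQuery, but that would also affect scoring (which is calculated from the query, not from the filter). For simplicity you may also want to avoid modifying the query, so that you know exactly which query is executed.
Posted on 2010-12-30 10:59:48
You can combine multiple queries with a BooleanQuery. Since Lucene only searches text, note that the date field in the index must be written with the most significant part of the date first, i.e. in ISO 8601 format ("2010-11-02T20:49:16.000000+00:00").
Example: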
Lucene.Net.Index.Term searchTerm = new Lucene.Net.Index.Term("fullText", searchTerms);
Lucene.Net.Index.Term dateRange = new Lucene.Net.Index.Term("date", "2010*");
Lucene.Net.Search.Query termQuery = new Lucene.Net.Search.TermQuery(searchTerm);
Lucene.Net.Search.Query dateRangeQuery = new Lucene.Net.Search.WildcardQuery(dateRange);
Lucene.Net.Search.BooleanQuery query = new Lucene.Net.Search.BooleanQuery();
query.Add(termQuery, BooleanClause.Occur.MUST);
query.Add(dateRangeQuery, BooleanClause.Occur.MUST);
Or, if the wildcard isn't precise enough, you can add a RangeQuery instead:
Lucene.Net.Search.Query termQuery = new Lucene.Net.Search.TermQuery(searchTerm);
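// Range bounds are compared as literal strings; wildcards are not expanded here,
// so plain date prefixes are used as the lower and upper terms.
Lucene.Net.Index.Term date1 = new Lucene.Net.Index.Term("date", "2010-11-02");
Lucene.Net.Index.Term date2 = new Lucene.Net.Index.Term("date", "2010-11-03");
Lucene.Net.Search.Query dateRangeQuery = new Lucene.Net.Search.RangeQuery(date1, date2, true);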
Lucene.Net.Search.BooleanQuery query = new Lucene.Net.Search.BooleanQuery();
query.Add(termQuery, BooleanClause.Occur.MUST);
query.Add(dateRangeQuery, BooleanClause.Occur.MUST);
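Two details of range matching on ISO 8601 strings are easy to miss, so here is a small sketch that uses no Lucene.Net calls. `InInclusiveRange` is a hypothetical stand-in that imitates an inclusive term range with ordinal string comparison, and the sample values are made up. It shows that a date-only upper bound excludes values that carry a time part, and that values written with different UTC offsets no longer sort chronologically.

```csharp
using System;
using System.Globalization;

public static class IsoRangeDemo
{
    // Hypothetical stand-in for an inclusive term range: plain ordinal comparison.
    public static bool InInclusiveRange(string term, string lower, string upper)
    {
        return string.CompareOrdinal(term, lower) >= 0
            && string.CompareOrdinal(term, upper) <= 0;
    }

    public static void Main()
    {
        // A value on 2 November lies between the two date-only bounds.
        Console.WriteLine(InInclusiveRange("2010-11-02T20:49:16", "2010-11-02", "2010-11-03")); // True
        // A value on 3 November is longer than the upper bound, so it sorts after it.
        Console.WriteLine(InInclusiveRange("2010-11-03T08:00:00", "2010-11-02", "2010-11-03")); // False

        // Mixed offsets: a is the earlier instant (22:30 UTC) but sorts after b (23:30 UTC).
        string a = "2010-11-03T00:30:00+02:00";
        string b = "2010-11-02T23:30:00+00:00";
        Console.WriteLine(string.CompareOrdinal(a, b) > 0); // True
        Console.WriteLine(DateTimeOffset.Parse(a, CultureInfo.InvariantCulture)
                          < DateTimeOffset.Parse(b, CultureInfo.InvariantCulture)); // True
    }
}
```

Under these assumptions, converting every date to a single offset (for example UTC) before indexing keeps string order and time order in agreement.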
https://stackoverflow.com/questions/4565303