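Before 2019 I wrote Go, and in 2020 I started writing Rust. Over the last two months I have become very interested in Flutter. Last year, at a tech conference, I learned that the client core of ByteDance's Feishu is a cross-platform SDK written in Rust, while the UX layer uses each platform's native APIs. That made me want to build something with Rust as the cross-platform core and Flutter as the cross-platform UX layer, which led to the Flutter full-text search plugin described below.
This is a Flutter plugin I built on top of Tantivy, a full-text search library written in Rust. It supports iOS and Android.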
http://github.com/yiv/full_search
Full Text Search
This is a full-text search Flutter plugin built on Tantivy.
Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust.
Schema
The schema describes the type (string, text, u64) of a field as well as how it should be handled.
Schema field type supported:
Query Terms
Operators like AND, OR, TO MUST BE in UPPERCASE.
simple terms: e.g. Barack Obama is simply tokenized using tantivy's SimpleTokenizer, becoming ["barack", "obama"]. The terms are then searched within the default fields of the query parser.
e.g. If body and title are default fields, our example terms are ["title:barack", "body:barack", "title:obama", "body:obama"]. By default, all tokenized and indexed fields are default fields.
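The tokenize-then-expand step described above can be sketched in plain Rust. This is only an illustration using the standard library, not Tantivy's implementation: the function names simple_tokenize and expand_to_default_fields are hypothetical, and the splitting rule (lowercase, break on any non-alphanumeric character) is an assumption that approximates the behavior described here.

```rust
/// Lowercase the input and split it on every non-alphanumeric character.
/// Assumption: this approximates a simple tokenizer plus lowercasing.
fn simple_tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|token| !token.is_empty())
        .map(|token| token.to_lowercase())
        .collect()
}

/// Pair every token with every default field, written as "field:token".
/// Tokens are the outer loop, so all fields for one token come first.
fn expand_to_default_fields(tokens: &[String], default_fields: &[&str]) -> Vec<String> {
    let mut terms = Vec::new();
    for token in tokens {
        for field in default_fields {
            terms.push(format!("{}:{}", field, token));
        }
    }
    terms
}
```

Calling simple_tokenize("Barack Obama") and then expand_to_default_fields with the fields title and body yields the four field-qualified terms listed in the example above.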
multiple terms are handled as an OR: any document containing at least one of the terms will go through the scoring.
This behavior is slower, but it is not a bad idea if the user is sorting by relevance: the user typically just scans through the first few documents in order of decreasing relevance and stops when the documents are not relevant anymore.
boolean operators: AND, OR. AND takes precedence over OR, so that a AND b OR c is interpreted as (a AND b) OR c.
In addition to the boolean operators, a term can be prefixed with - or +. These two prefixes are sufficient to express all queries that use boolean operators. For instance x AND y OR z can be written ((+x +y) z). They can also express "required optional" queries: (+x y) matches the same document set as simply x, but y helps refine the score.
negative terms: By prefixing a term with a -, the term can be excluded from the search. This is useful for disambiguating a query. e.g. apple -fruit
must terms: By prefixing a term with a +, the term can be made required for the search.
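To make the + and - semantics concrete, here is a minimal sketch of matching a flat (non-nested) query against the set of terms in a document. It is an illustration under stated assumptions, not the plugin's or Tantivy's code: the Occur enum and the functions parse_flat_query, terms_of and matches are defined locally for this example, scoring is left out, and a query made only of negative terms is assumed to match nothing.

```rust
use std::collections::HashSet;

/// How a clause takes part in matching (types local to this sketch).
#[derive(Clone, Copy, PartialEq)]
enum Occur {
    Must,    // written +term
    Should,  // written term
    MustNot, // written -term
}

/// Split a flat query such as "apple -fruit" into (Occur, term) clauses.
fn parse_flat_query(query: &str) -> Vec<(Occur, String)> {
    query
        .split_whitespace()
        .map(|raw| {
            if let Some(term) = raw.strip_prefix('+') {
                (Occur::Must, term.to_lowercase())
            } else if let Some(term) = raw.strip_prefix('-') {
                (Occur::MustNot, term.to_lowercase())
            } else {
                (Occur::Should, raw.to_lowercase())
            }
        })
        .collect()
}

/// Build the lowercase term set of a whitespace-separated document.
fn terms_of(text: &str) -> HashSet<String> {
    text.split_whitespace().map(|w| w.to_lowercase()).collect()
}

/// A document matches when no MustNot term is present, every Must term
/// is present, and, if there is no Must clause, at least one Should
/// term is present.
fn matches(clauses: &[(Occur, String)], doc_terms: &HashSet<String>) -> bool {
    let mut any_must = false;
    let mut any_should_hit = false;
    for (occur, term) in clauses {
        let present = doc_terms.contains(term);
        match occur {
            Occur::MustNot if present => return false,
            Occur::Must if !present => return false,
            Occur::Must => any_must = true,
            Occur::Should if present => any_should_hit = true,
            _ => {}
        }
    }
    any_must || any_should_hit
}
```

With these definitions, the query apple -fruit matches a document containing "apple iphone" but not one containing "apple fruit pie", and +x y matches exactly the documents that contain x, whether or not they contain y.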
phrase terms: Quoted terms become phrase searches on fields that have positions indexed. e.g., title:"Barack Obama" will only find documents that have "barack" immediately followed by "obama".
range terms: Range searches can be done by specifying the start and end bound. These can be inclusive or exclusive. e.g., title:[a TO c} will find all documents whose title contains a word lexicographically between a and c (inclusive lower bound, exclusive upper bound). Inclusive bounds are [], exclusive are {}.
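The bracket notation maps naturally onto inclusive and exclusive bounds. The following sketch parses a range such as [a TO c} with the standard library's Bound type and checks terms against it by plain string comparison. The functions parse_range and in_range are hypothetical names for this illustration, and the parsing rules (single-character brackets, " TO " as separator) are assumptions rather than the query parser's actual grammar.

```rust
use std::ops::Bound;

/// Parse "[a TO c}" style ranges: [ and ] are inclusive, { and } exclusive.
/// Returns None when the brackets or the " TO " separator are missing.
fn parse_range(range: &str) -> Option<(Bound<String>, Bound<String>)> {
    let mut chars = range.trim().chars();
    let open = chars.next()?;
    let close = chars.next_back()?;
    // What remains between the two brackets.
    let (low, high) = chars.as_str().split_once(" TO ")?;
    let (low, high) = (low.trim().to_string(), high.trim().to_string());
    let lower = match open {
        '[' => Bound::Included(low),
        '{' => Bound::Excluded(low),
        _ => return None,
    };
    let upper = match close {
        ']' => Bound::Included(high),
        '}' => Bound::Excluded(high),
        _ => return None,
    };
    Some((lower, upper))
}

/// Check a term against the bounds using lexicographic string ordering.
fn in_range(term: &str, bounds: &(Bound<String>, Bound<String>)) -> bool {
    let above_lower = match &bounds.0 {
        Bound::Included(low) => term >= low.as_str(),
        Bound::Excluded(low) => term > low.as_str(),
        Bound::Unbounded => true,
    };
    let below_upper = match &bounds.1 {
        Bound::Included(high) => term <= high.as_str(),
        Bound::Excluded(high) => term < high.as_str(),
        Bound::Unbounded => true,
    };
    above_lower && below_upper
}
```

For [a TO c}, the terms a and banana fall inside the range, while c is excluded by the exclusive upper bound, and so is any longer word starting with c, because it sorts after c.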
date values: The query parser supports RFC 3339 formatted dates, for example "2002-10-02T15:00:00.05Z".
all docs query: A plain * will match all documents in the index.
Need to know
For now, each SearchEngine handles only one kind of data type. This is not the right approach and needs to be changed.
Under the hood, this plugin is built on a static or shared native library written in Rust, and it uses Dart FFI to call the Rust code.
Documents can't be deleted by a text field; please use a string field.
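As an illustration of the Dart FFI note above, this is roughly what the Rust side of such a bridge can look like: a function exported with the C ABI that takes a NUL-terminated string. The function name demo_count_query_terms and its behavior are hypothetical and are not part of this plugin's API. On the Dart side, dart:ffi would typically load the library and look the symbol up by name.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Hypothetical exported function: counts whitespace-separated terms in a
/// NUL-terminated UTF-8 string passed across the FFI boundary.
/// Returns -1 for a null pointer or invalid UTF-8.
///
/// The unmangled name is what lets a foreign caller find the symbol.
/// Toolchains older than Rust 1.82 spell this attribute #[no_mangle].
#[unsafe(no_mangle)]
pub extern "C" fn demo_count_query_terms(query: *const c_char) -> i64 {
    if query.is_null() {
        return -1;
    }
    // Safety: the caller must pass a valid NUL-terminated string that
    // stays alive for the duration of this call.
    let text = unsafe { CStr::from_ptr(query) };
    match text.to_str() {
        Ok(s) => s.split_whitespace().count() as i64,
        Err(_) => -1,
    }
}
```

Keeping the boundary to plain C types (pointers and fixed-width integers) is what allows the same Rust code to be compiled as a static library for iOS and a shared library for Android.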
Example
/// jsonEncode comes from dart:convert; getApplicationDocumentsDirectory
/// comes from the path_provider package.
/// Create an instance of SearchEngine and set it up.
final engine = SearchEngine();
SearchEngine.setup();
/// Get the path used to store the index files.
final _path = (await getApplicationDocumentsDirectory()).path;
/// Define the schema of the data to be indexed.
final _schema = r'{"id": "i64", "timestamp": "date", "content": "text"}';
/// Open an existing index or create a new one on the device.
engine.openOrCreate(_path, _schema);
/// Encode the data object as a JSON string.
final _doc = jsonEncode(dataObject);
/// Index and store the data.
await engine.index(_doc);
/// Give the query keywords and the fields to search on.
final res = await engine.search('关键字 关键词', ['content'], 1, 10);
/// Remove a specific document by giving a u64 field and its value.
/// Documents can't be deleted by a text field; please use a string field.
await engine.deleteByU64('id', 141906710246850560);