
gremlin 其实是一个逐级过滤的运行机制,比如下面的一个简单的gremlin查询语句:
g.V().hasLabel("label").has("prop","value")运行原理就是:
当数据量很大时,这个代价非常大,因此需要做查询优化。
hugegraph 的优化方案是,HugeGraphStepStrategy 中将has条件提取出来,然后走索引优化,减少读取的数据量。
TraversalUtil.extractHasContainer:
public static void extractHasContainer(HugeGraphStep<?, ?> newStep,
Traversal.Admin<?, ?> traversal) {
Step<?, ?> step = newStep;
do {
step = step.getNextStep();
if (step instanceof HasStep) {
HasContainerHolder holder = (HasContainerHolder) step;
for (HasContainer has : holder.getHasContainers()) {
if (!GraphStep.processHasContainerIds(newStep, has)) {
newStep.addHasContainer(has);
}
}
TraversalHelper.copyLabels(step, step.getPreviousStep(), false);
traversal.removeStep(step);
}
} while (step instanceof HasStep || step instanceof NoOpBarrierStep);
}hugegraph 通过IndexLabel 来定义索引类型,描述索引的约束信息。
g.V().has("city", "北京")查询"city属性值是北京"的全部顶点
g.V().has ("city", "北京").has('street', '中关村街道')查询"city属性值是北京且street属性值是中关村"的全部顶点,或者g.V() .has("city", "北京")查询"city属性值是北京"的全部顶点
secondary index的查询都是基于"是"或者"相等"的查询条件,不支持"部分匹配"
g.V().has("age", P.gt(18))查询"age属性值大于18"的顶点。除了P.gt()以外,还支持P.gte(), P.lte(), P.lt(), P.eq(), P.between(), P.inside()和P.outside()等g.V().has("address", Text .contains('大厦')查询"address属性中包含大厦"的全部顶点
search index的查询是基于"是"或者"包含"的查询条件
g.V().has ("city", "北京").has("age", P.between(18, 30))查询"city属性是北京且年龄大于等于18小于30"的全部顶点
摘录自 https://hugegraph.github.io/hugegraph-doc/clients/hugegraph-client.html
Secondary 和 Range是最常用的索引。
我们通过源代码来分析索引存储过程。 核心代码在GraphIndexTransaction.updateIndex函数里:
/**
* Update index(user properties) of vertex or edge
* @param ilId the id of index label
* @param element the properties owner
* @param removed remove or add index
*/
protected void updateIndex(Id ilId, HugeElement element, boolean removed) {
SchemaTransaction schema = this.params().schemaTransaction();
IndexLabel indexLabel = schema.getIndexLabel(ilId);
E.checkArgument(indexLabel != null,
"Not exist index label with id '%s'", ilId);
// Collect property values of index fields
List<Object> allPropValues = new ArrayList<>();
int fieldsNum = indexLabel.indexFields().size();
int firstNullField = fieldsNum;
for (Id fieldId : indexLabel.indexFields()) {
HugeProperty<Object> property = element.getProperty(fieldId);
if (property == null) {
E.checkState(hasNullableProp(element, fieldId),
"Non-null property '%s' is null for '%s'",
this.graph().propertyKey(fieldId) , element);
if (firstNullField == fieldsNum) {
firstNullField = allPropValues.size();
}
allPropValues.add(INDEX_SYM_NULL);
} else {
E.checkArgument(!INDEX_SYM_NULL.equals(property.value()),
"Illegal value of index property: '%s'",
INDEX_SYM_NULL);
allPropValues.add(property.value());
}
}
if (firstNullField == 0 && !indexLabel.indexType().isUnique()) {
// The property value of first index field is null
return;
}
// Not build index for record with nullable field (except unique index)
List<Object> propValues = allPropValues.subList(0, firstNullField);
// Expired time
long expiredTime = element.expiredTime();
// Update index for each index type
switch (indexLabel.indexType()) {
case RANGE_INT:
case RANGE_FLOAT:
case RANGE_LONG:
case RANGE_DOUBLE:
E.checkState(propValues.size() == 1,
"Expect only one property in range index");
Object value = NumericUtil.convertToNumber(propValues.get(0));
this.updateIndex(indexLabel, value, element.id(),
expiredTime, removed);
break;
case SEARCH:
E.checkState(propValues.size() == 1,
"Expect only one property in search index");
value = propValues.get(0);
Set<String> words = this.segmentWords(value.toString());
for (String word : words) {
this.updateIndex(indexLabel, word, element.id(),
expiredTime, removed);
}
break;
case SECONDARY:
// Secondary index maybe include multi prefix index
for (int i = 0, n = propValues.size(); i < n; i++) {
List<Object> prefixValues = propValues.subList(0, i + 1);
// prefixValues is list or set , should create index for
// each item
if(prefixValues.get(0) instanceof Collection) {
for (Object propValue :
(Collection<Object>) prefixValues.get(0)) {
value = escapeIndexValueIfNeeded(propValue.toString());
this.updateIndex(indexLabel, value, element.id(),
expiredTime, removed);
}
}else {
value = ConditionQuery.concatValues(prefixValues);
value = escapeIndexValueIfNeeded((String) value);
this.updateIndex(indexLabel, value, element.id(),
expiredTime, removed);
}
}
break;
case SHARD:
value = ConditionQuery.concatValues(propValues);
value = escapeIndexValueIfNeeded((String) value);
this.updateIndex(indexLabel, value, element.id(),
expiredTime, removed);
break;
case UNIQUE:
value = ConditionQuery.concatValues(allPropValues);
assert !value.equals("");
Id id = element.id();
// TODO: add lock for updating unique index
if (!removed && this.existUniqueValue(indexLabel, value, id)) {
throw new IllegalArgumentException(String.format(
"Unique constraint %s conflict is found for %s",
indexLabel, element));
}
this.updateIndex(indexLabel, value, element.id(),
expiredTime, removed);
break;
default:
throw new AssertionError(String.format(
"Unknown index type '%s'", indexLabel.indexType()));
}
}当用户的查询语义是:某属性值大于、小于、大于等于、小于等于、等于某个界限,或者属性值属于某个区间时,适合使用范围索引。比如:“年龄”、“价格”、“得分”等取值比较连续的属性。
范围索引处理方式如下:
E.checkState(propValues.size() == 1,
"Expect only one property in range index");
Object value = NumericUtil.convertToNumber(propValues.get(0));
this.updateIndex(indexLabel, value, element.id(),
expiredTime, removed);updateIndex 代码:
private void updateIndex(IndexLabel indexLabel, Object propValue,
Id elementId, long expiredTime, boolean removed) {
HugeIndex index = new HugeIndex(this.graph(), indexLabel);
index.fieldValues(propValue);
index.elementIds(elementId, expiredTime);
if (removed) {
this.doEliminate(this.serializer.writeIndex(index));
} else {
this.doAppend(this.serializer.writeIndex(index));
}
}这里我们来探索Serializer是如何做的,比如Binary:
Id id = index.id();
HugeType type = index.type();
byte[] value = null;
if (!type.isNumericIndex() && indexIdLengthExceedLimit(id)) {
id = index.hashId();
// Save field-values as column value if the key is a hash string
value = StringEncoding.encode(index.fieldValues().toString());
}
entry = newBackendEntry(type, id);
entry.column(this.formatIndexName(index), value);
entry.subId(index.elementId());
if (index.hasTtl()) {
entry.ttl(index.ttl());
}索引的id:
public static Id formatIndexId(HugeType type, Id indexLabelId,
Object fieldValues) {
if (type.isStringIndex()) {
String value = "";
if (fieldValues instanceof Id) {
value = IdGenerator.asStoredString((Id) fieldValues);
} else if (fieldValues != null) {
value = fieldValues.toString();
}
/*
* Modify order between index label and field-values to put the
* index label in front(hugegraph-1317)
*/
String strIndexLabelId = IdGenerator.asStoredString(indexLabelId);
return SplicingIdGenerator.splicing(strIndexLabelId, value);
} else {
assert type.isRangeIndex();
int length = type.isRange4Index() ? 4 : 8;
BytesBuffer buffer = BytesBuffer.allocate(4 + length);
buffer.writeInt(SchemaElement.schemaId(indexLabelId));
if (fieldValues != null) {
E.checkState(fieldValues instanceof Number,
"Field value of range index must be number:" +
" %s", fieldValues.getClass().getSimpleName());
byte[] bytes = number2bytes((Number) fieldValues);
buffer.write(bytes);
}
return buffer.asId();
}
}protected byte[] formatIndexName(HugeIndex index) {
BytesBuffer buffer;
Id elemId = index.elementId();
if (!this.indexWithIdPrefix) {
int idLen = 1 + elemId.length();
buffer = BytesBuffer.allocate(idLen);
} else {
Id indexId = index.id();
HugeType type = index.type();
if (!type.isNumericIndex() && indexIdLengthExceedLimit(indexId)) {
indexId = index.hashId();
}
int idLen = 1 + elemId.length() + 1 + indexId.length();
buffer = BytesBuffer.allocate(idLen);
// Write index-id
buffer.writeIndexId(indexId, type);
}
// Write element-id
buffer.writeId(elemId);
// Write expired time if needed
if (index.hasTtl()) {
buffer.writeVLong(index.expiredTime());
}
return buffer.bytes();
}formatIndexName 决定了column name:
最后写入存储后端时,
@Override
public void insert(Session session, BackendEntry entry) {
assert !entry.columns().isEmpty();
for (BackendColumn col : entry.columns()) {
assert entry.belongToMe(col) : entry;
session.put(this.table(), col.name, col.value);
}
}对于range 索引,key的前缀是Int的indexLabelId,中间是索引值的bytes,后缀是elementid,因此range索引天然是有序的。
存储结构:
index_label_id | field_values | element_ids对于二级索引,也是:
indexLabelId | fieldValues | element_idsfield_values: 属性的值,可以是单个属性,也可以是多个属性拼接而成index_label_id: 索引标签的Idelement_ids: 顶点或边的Id查询要从GraphTransaction的query开始分析,针对ConditionQuery条件查询,会调用optimizeQueries优化查询。
public QueryResults<BackendEntry> query(Query query) {
if (!(query instanceof ConditionQuery)) {
LOG.debug("Query{final:{}}", query);
return super.query(query);
}
QueryList<BackendEntry> queries = this.optimizeQueries(query,
super::query);
LOG.debug("{}", queries);
return queries.empty() ? QueryResults.empty() :
queries.fetch(this.pageSize);
}optimizeQueries 会将condtion query flatten展开(比如in查询,展开成多个查询),然后针对每个cq做查询。
针对每个cq,会调用indexQuery走索引查询。
protected <R> QueryList<R> optimizeQueries(Query query,
QueryResults.Fetcher<R> fetcher) {
QueryList<R> queries = new QueryList<>(query, fetcher);
for (ConditionQuery cq: ConditionQueryFlatten.flatten(
(ConditionQuery) query)) {
// Optimize by sysprop
Query q = this.optimizeQuery(cq);
/*
* NOTE: There are two possibilities for this query:
* 1.sysprop-query, which would not be empty.
* 2.index-query result(ids after optimization), which may be empty.
*/
if (q == null) {
queries.add(this.indexQuery(cq), this.batchSize);
} else if (!q.empty()) {
queries.add(q);
}
}
return queries;
}索引查询,核心代码在 GraphIndexTransaction.queryIndex
@Watched(prefix = "index")
public IdHolderList queryIndex(ConditionQuery query) {
// Index query must have been flattened in Graph tx
query.checkFlattened();
// NOTE: Currently we can't support filter changes in memory
if (this.hasUpdate()) {
throw new HugeException("Can't do index query when " +
"there are changes in transaction");
}
// Can't query by index and by non-label sysprop at the same time
List<Condition> conds = query.syspropConditions();
if (conds.size() > 1 ||
(conds.size() == 1 && !query.containsCondition(HugeKeys.LABEL))) {
throw new HugeException("Can't do index query with %s and %s",
conds, query.userpropConditions());
}
// Query by index
query.optimized(OptimizedType.INDEX);
if (query.allSysprop() && conds.size() == 1 &&
query.containsCondition(HugeKeys.LABEL)) {
// Query only by label
return this.queryByLabel(query);
} else {
// Query by userprops (or userprops + label)
return this.queryByUserprop(query);
}
}会先做一些检查,然后判断是否有属性条件,如果没有则直接查询对应label,否则走queryByUserprop,根据属性值查询结果。
@Watched(prefix = "index")
private IdHolderList queryByUserprop(ConditionQuery query) {
// Get user applied label or collect all qualified labels with
// related index labels
Set<MatchedIndex> indexes = this.collectMatchedIndexes(query);
if (indexes.isEmpty()) {
Id label = query.condition(HugeKeys.LABEL);
throw noIndexException(this.graph(), query, label);
}
// Value type of Condition not matched
boolean paging = query.paging();
if (!validQueryConditionValues(this.graph(), query)) {
return IdHolderList.empty(paging);
}
// Do index query
IdHolderList holders = new IdHolderList(paging);
for (MatchedIndex index : indexes) {
for (IndexLabel il : index.indexLabels()) {
validateIndexLabel(il);
}
if (paging && index.indexLabels().size() > 1) {
throw new NotSupportException("joint index query in paging");
}
if (index.containsSearchIndex()) {
// Do search-index query
holders.addAll(this.doSearchIndex(query, index));
} else {
// Do secondary-index, range-index or shard-index query
IndexQueries queries = index.constructIndexQueries(query);
assert !paging || queries.size() <= 1;
IdHolder holder = this.doSingleOrJointIndex(queries);
holders.add(holder);
}
/*
* NOTE: need to skip the offset if offset > 0, but can't handle
* it here because the query may a sub-query after flatten,
* so the offset will be handle in QueryList.IndexQuery
*
* TODO: finish early here if records exceeds required limit with
* FixedIdHolder.
*/
}
return holders;
}queryByUserprop 会先查询出匹配的索引(collectMatchedIndexes),如果没匹配到索引,就会报错。
如果匹配到多个索引,依次查询,如果是search索引,走doSearchIndex,反之先constructIndexQueries,然后doSingleOrJointIndex。
搜索索引,之所以特殊处理,因为要分词:
@Watched(prefix = "index")
private IdHolderList doSearchIndex(ConditionQuery query,
MatchedIndex index) {
query = this.constructSearchQuery(query, index);
// Sorted by matched count
IdHolderList holders = new SortByCountIdHolderList(query.paging());
List<ConditionQuery> flatten = ConditionQueryFlatten.flatten(query);
for (ConditionQuery q : flatten) {
if (!q.noLimit() && flatten.size() > 1) {
// Increase limit for union operation
increaseLimit(q);
}
IndexQueries queries = index.constructIndexQueries(q);
assert !query.paging() || queries.size() <= 1;
IdHolder holder = this.doSingleOrJointIndex(queries);
// NOTE: ids will be merged into one IdHolder if not in paging
holders.add(holder);
}
return holders;
}private ConditionQuery constructSearchQuery(ConditionQuery query,
MatchedIndex index) {
ConditionQuery originQuery = query;
Set<Id> indexFields = new HashSet<>();
// Convert has(key, text) to has(key, textContainsAny(word1, word2))
for (IndexLabel il : index.indexLabels()) {
if (il.indexType() != IndexType.SEARCH) {
continue;
}
Id indexField = il.indexField();
String fieldValue = (String) query.userpropValue(indexField);
Set<String> words = this.segmentWords(fieldValue);
indexFields.add(indexField);
query = query.copy();
query.unsetCondition(indexField);
query.query(Condition.textContainsAny(indexField, words));
}
// Register results filter to compare property value and search text
query.registerResultsFilter(elem -> {
for (Condition cond : originQuery.conditions()) {
Object key = cond.isRelation() ? ((Relation) cond).key() : null;
if (key instanceof Id && indexFields.contains(key)) {
// This is an index field of search index
Id field = (Id) key;
assert elem != null;
HugeProperty<?> property = elem.getProperty(field);
String propValue = propertyValueToString(property.value());
String fieldValue = (String) originQuery.userpropValue(field);
if (this.matchSearchIndexWords(propValue, fieldValue)) {
continue;
}
return false;
}
if (!cond.test(elem)) {
return false;
}
}
return true;
});
return query;
}普通索引,也是先构造索引查询:
ublic IndexQueries constructIndexQueries(ConditionQuery query) {
// Condition query => Index Queries
if (this.indexLabels().size() == 1) {
/*
* Query by single index or composite index
*/
IndexLabel il = this.indexLabels().iterator().next();
ConditionQuery indexQuery = constructQuery(query, il);
assert indexQuery != null;
return IndexQueries.of(il, indexQuery);
} else {
/*
* Query by joint indexes
*/
IndexQueries queries = buildJointIndexesQueries(query, this);
assert !queries.isEmpty();
return queries;
}
}如果只匹配到一个索引,直接走这个索引,最简单的情况,
如果匹配到多个索引,这个时候要走联合查询了(buildJointIndexesQueries)
最后,通过doSingleOrJointIndex来获取结果:
@Watched(prefix = "index")
private IdHolder doSingleOrJointIndex(IndexQueries queries) {
if (queries.size() == 1) {
return this.doSingleOrCompositeIndex(queries);
} else {
return this.doJointIndex(queries);
}
}如果queries.size > 1,代表要走联合索引。但是一般db一次查询通常直走一个索引,hugegraph也差不多:
@Watched(prefix = "index")
private IdHolder doJointIndex(IndexQueries queries) {
if (queries.oomRisk()) {
LOG.warn("There is OOM risk if the joint operation is based on a " +
"large amount of data, please use single index + filter " +
"instead of joint index: {}", queries.rootQuery());
}
// All queries are joined with AND
Set<Id> intersectIds = null;
boolean filtering = false;
IdHolder resultHolder = null;
for (Map.Entry<IndexLabel, ConditionQuery> e : queries.entrySet()) {
IndexLabel indexLabel = e.getKey();
ConditionQuery query = e.getValue();
assert !query.paging();
if (!query.noLimit() && queries.size() > 1) {
// Unset limit for intersection operation
query.limit(Query.NO_LIMIT);
}
/*
* Try to query by joint indexes:
* 1 If there is any index exceeded the threshold, transform into
* partial index query, then filter after back-table.
* 1.1 Return the holder of the first index that not exceeded the
* threshold if there exists one index, this holder will be used
* as the only query condition.
* 1.2 Return the holder of the first index if all indexes exceeded
* the threshold.
* 2 Else intersect holders for all indexes, and return intersection
* ids of all indexes.
*/
IdHolder holder = this.doIndexQuery(indexLabel, query);
if (resultHolder == null) {
resultHolder = holder;
}
assert this.indexIntersectThresh > 0; // default value is 1000
Set<Id> ids = ((BatchIdHolder) holder).peekNext(
this.indexIntersectThresh).ids();
if (ids.size() >= this.indexIntersectThresh) {
// Transform into filtering
filtering = true;
query.optimized(OptimizedType.INDEX_FILTER);
} else if (filtering) {
assert ids.size() < this.indexIntersectThresh;
resultHolder = holder;
break;
} else {
if (intersectIds == null) {
intersectIds = ids;
} else {
CollectionUtil.intersectWithModify(intersectIds, ids);
}
if (intersectIds.isEmpty()) {
break;
}
}
}
if (filtering) {
return resultHolder;
} else {
assert intersectIds != null;
return new FixedIdHolder(queries.asJointQuery(), intersectIds);
}
}读取到id后,就是根据id读取结果,过滤结果了。
关键代码在AbstractTransaction:
@Watched(prefix = "tx")
public QueryResults<BackendEntry> query(Query query) {
LOG.debug("Transaction query: {}", query);
/*
* NOTE: it's dangerous if an IdQuery/ConditionQuery is empty
* check if the query is empty and its class is not the Query itself
*/
if (query.empty() && !query.getClass().equals(Query.class)) {
throw new BackendException("Query without any id or condition");
}
Query squery = this.serializer.writeQuery(query);
// Do rate limit if needed
RateLimiter rateLimiter = this.graph.readRateLimiter();
if (rateLimiter != null && query.resultType().isGraph()) {
double time = rateLimiter.acquire(1);
if (time > 0) {
LOG.debug("Waited for {}s to query", time);
}
BackendEntryIterator.checkInterrupted();
}
this.beforeRead();
try {
return new QueryResults<>(this.store.query(squery), query);
} finally {
this.afterRead(); // TODO: not complete the iteration currently
}
}逐级往下,核心代码在writeQueryCondition:
@Override
protected Query writeQueryCondition(Query query) {
HugeType type = query.resultType();
if (!type.isIndex()) {
return query;
}
ConditionQuery cq = (ConditionQuery) query;
if (type.isNumericIndex()) {
// Convert range-index/shard-index query to id range query
return this.writeRangeIndexQuery(cq);
} else {
assert type.isSearchIndex() || type.isSecondaryIndex() ||
type.isUniqueIndex();
// Convert secondary-index or search-index query to id query
return this.writeStringIndexQuery(cq);
}
}如果是rangeindex 索引,会转换为scan indexlabelid:start - indexlabelid:end 的查询
private Query writeRangeIndexQuery(ConditionQuery query) {
Id index = query.condition(HugeKeys.INDEX_LABEL_ID);
E.checkArgument(index != null, "Please specify the index label");
List<Condition> fields = query.syspropConditions(HugeKeys.FIELD_VALUES);
E.checkArgument(!fields.isEmpty(),
"Please specify the index field values");
HugeType type = query.resultType();
Id start = null;
if (query.paging() && !query.page().isEmpty()) {
byte[] position = PageState.fromString(query.page()).position();
start = new BinaryId(position, null);
}
RangeConditions range = new RangeConditions(fields);
if (range.keyEq() != null) {
Id id = formatIndexId(type, index, range.keyEq(), true);
if (start == null) {
return new IdPrefixQuery(query, id);
}
E.checkArgument(Bytes.compare(start.asBytes(), id.asBytes()) >= 0,
"Invalid page out of lower bound");
return new IdPrefixQuery(query, start, id);
}
Object keyMin = range.keyMin();
Object keyMax = range.keyMax();
boolean keyMinEq = range.keyMinEq();
boolean keyMaxEq = range.keyMaxEq();
if (keyMin == null) {
E.checkArgument(keyMax != null,
"Please specify at least one condition");
// Set keyMin to min value
keyMin = NumericUtil.minValueOf(keyMax.getClass());
keyMinEq = true;
}
Id min = formatIndexId(type, index, keyMin, false);
if (!keyMinEq) {
/*
* Increase 1 to keyMin, index GT query is a scan with GT prefix,
* inclusiveStart=false will also match index started with keyMin
*/
increaseOne(min.asBytes());
keyMinEq = true;
}
if (start == null) {
start = min;
} else {
E.checkArgument(Bytes.compare(start.asBytes(), min.asBytes()) >= 0,
"Invalid page out of lower bound");
}
if (keyMax == null) {
keyMax = NumericUtil.maxValueOf(keyMin.getClass());
keyMaxEq = true;
}
Id max = formatIndexId(type, index, keyMax, false);
if (keyMaxEq) {
keyMaxEq = false;
increaseOne(max.asBytes());
}
return new IdRangeQuery(query, start, keyMinEq, max, keyMaxEq);
}如果是其他索引,则转换为前缀匹配查询:
private Query writeStringIndexQuery(ConditionQuery query) {
E.checkArgument(query.allSysprop() &&
query.conditions().size() == 2,
"There should be two conditions: " +
"INDEX_LABEL_ID and FIELD_VALUES" +
"in secondary index query");
Id index = query.condition(HugeKeys.INDEX_LABEL_ID);
Object key = query.condition(HugeKeys.FIELD_VALUES);
E.checkArgument(index != null, "Please specify the index label");
E.checkArgument(key != null, "Please specify the index key");
Id prefix = formatIndexId(query.resultType(), index, key, true);
return prefixQuery(query, prefix);
}查询到rocksdb后端的时候:
protected BackendColumnIterator queryBy(Session session, Query query) {
// Query all
if (query.empty()) {
return this.queryAll(session, query);
}
// Query by prefix
if (query instanceof IdPrefixQuery) {
IdPrefixQuery pq = (IdPrefixQuery) query;
return this.queryByPrefix(session, pq);
}
// Query by range
if (query instanceof IdRangeQuery) {
IdRangeQuery rq = (IdRangeQuery) query;
return this.queryByRange(session, rq);
}
// Query by id
if (query.conditions().isEmpty()) {
assert !query.ids().isEmpty();
// NOTE: this will lead to lazy create rocksdb iterator
return new BackendColumnIteratorWrapper(new FlatMapperIterator<>(
query.ids().iterator(), id -> this.queryById(session, id)
));
}
// Query by condition (or condition + id)
ConditionQuery cq = (ConditionQuery) query;
return this.queryByCond(session, cq);
}前缀查询:
protected BackendColumnIterator queryByPrefix(Session session,
IdPrefixQuery query) {
int type = query.inclusiveStart() ?
Session.SCAN_GTE_BEGIN : Session.SCAN_GT_BEGIN;
type |= Session.SCAN_PREFIX_END;
return session.scan(this.table(), query.start().asBytes(),
query.prefix().asBytes(), type);
}range查询:
protected BackendColumnIterator queryByRange(Session session,
IdRangeQuery query) {
byte[] start = query.start().asBytes();
byte[] end = query.end() == null ? null : query.end().asBytes();
int type = query.inclusiveStart() ?
Session.SCAN_GTE_BEGIN : Session.SCAN_GT_BEGIN;
if (end != null) {
type |= query.inclusiveEnd() ?
Session.SCAN_LTE_END : Session.SCAN_LT_END;
}
return session.scan(this.table(), start, end, type);
}查询后,在BinarySerializer中,通过readIndex还原为index:
@Override
public HugeIndex readIndex(HugeGraph graph, ConditionQuery query,
BackendEntry bytesEntry) {
if (bytesEntry == null) {
return null;
}
BinaryBackendEntry entry = this.convertEntry(bytesEntry);
// NOTE: index id without length prefix
byte[] bytes = entry.id().asBytes();
HugeIndex index = HugeIndex.parseIndexId(graph, entry.type(), bytes);
Object fieldValues = null;
if (!index.type().isRangeIndex()) {
fieldValues = query.condition(HugeKeys.FIELD_VALUES);
if (!index.fieldValues().equals(fieldValues)) {
// Update field-values for hashed or encoded index-id
index.fieldValues(fieldValues);
}
}
this.parseIndexName(graph, query, entry, index, fieldValues);
return index;
}parseIndexId 和parseIndexName 是存储的decode操作,代码类似,一个存,一个读。
这里提一个问题,要对符合条件的结果做全局排序怎么优化?
比如,我们需要按更新时间(update_time)排序,当没有其他条件时,可以将排序转换为update_time>0 的查询,因为range索引默认是有序的,从小到大(详见上面的存储结构分析)。
如果要倒序怎么办?
但是,这种查询,在有其他条件时就无效了,详见doJointIndex,这种情况如何优化了?
我们下期再聊。