【背景】
最近有项目需求用到多键索引,Mongodb中字段值支持多键索引主要包括嵌套文档、数组以及数组嵌套文档.例如联系包括手机、固定电话、邮箱、微信、QQ等,对于字段值存储类型不一样,决定创建多键索引也不一样同时性能也存在差异,例如数组值(包括数组以及数组文档),创建多键索引时会为数组中每个元素都创建索引键,如数组中元素特别多,相应索引也会特别大,创建多键索引或者组合索引时最多只支持一个数组值.
嵌套文档:“telephone”:{"cellphone":"0211234567","mobilephone":13888888888}
数组:“telephone”:["0211234567",13888888888]
数组文档:“联系”:[“telephone”:{"cellphone":"0211234567","mobilephone":13888888888}]
【数组值创建并高效使用索引】
1、集合中文档信息
备注:ratings字段值是数组.
xiaoxu:PRIMARY> db.survey.findOne({});
{
"_id" : ObjectId("5f2ff576eb7de181ebe814f9"),
"item" : {
"name" : "Katie",
"manufactured" : 16
},
"ratings" : [
16,
116
],
"user_id" : 16,
"boolean" : true,
"added_at" : ISODate("2020-08-09T13:09:10.791Z"),
"number" : 662
}
xiaoxu:PRIMARY>
2、创建多键索引 db.survey.createIndex({"ratings":1},{background:1});
备注:创建多键索引不需要显示关键字,如字段值中包括数组值或者嵌套文档情况,这个存储引擎自动创建为多键盘索引,如4.2版本之前不加background:1,则前台创建索引,加DB级别排他锁(大表后果很严重),4.2开始加不加background:1都是online创建索引.
3、范围查找ratings
逻辑:
1、查询ratings数组中存在至少1个元素同时满足大于等于3且小于等于6【类似and逻辑 ,此时多键索引边界可以合并为【【3,6】】】--
db.survey.find({ ratings:{ $elemMatch: { $gte: 3, $lte: 6}}} )
2、查询ratings数组中存在至少1个元素大于等于3且至少1元素小于等于6或者存在一个元素同时满足大于等于3且小于等于6【类似or逻辑,【【3,+∞】】or【【-∞,6】】,此时执行计划只有使用or的一边,使用大于等于3然后过来另外一个值是否满足小于等于6或者反过来】
db.survey.find({ ratings:{ $gte: 3, $lte: 6} )
【具体执行计划】
db.survey.find({ ratings:{ $elemMatch: { $gte: 3, $lte: 6}}} )
xiaoxu:PRIMARY> db.survey.find( { ratings : { $elemMatch: { $gte: 3, $lte: 6 } } } ).explain("executionStats").executionStats;
{
"executionSuccess" : true,
"nReturned" : 4,
"executionTimeMillis" : 0,
"totalKeysExamined" : 4,
"totalDocsExamined" : 4,
"executionStages" : {
"stage" : "FETCH",
"filter" : {
"ratings" : {
"$elemMatch" : {
"$lte" : 6,
"$gte" : 3
}
}
},
"nReturned" : 4,
"executionTimeMillisEstimate" : 0,
"works" : 5,
"advanced" : 4,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"docsExamined" : 4,
"alreadyHasObj" : 0,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : 4,
"executionTimeMillisEstimate" : 0,
"works" : 5,
"advanced" : 4,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"keyPattern" : {
"ratings" : 1
},
"indexName" : "ratings_1",
"isMultiKey" : true,
"multiKeyPaths" : {
"ratings" : [
"ratings"
]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"ratings" : [
"[3.0, 6.0]"
]
},
"keysExamined" : 4,
"seeks" : 1,
"dupsTested" : 4,
"dupsDropped" : 0,
"seenInvalidated" : 0
}
}
}
db.survey.find({ ratings:{ $gte: 3, $lte: 6} )
xiaoxu:PRIMARY> db.survey.find( { ratings : { $gte: 3, $lte: 6 } } ).explain("executionStats").executionStats;
{
"executionSuccess" : true,
"nReturned" : 7,
"executionTimeMillis" : 0,
"totalKeysExamined" : 7,
"totalDocsExamined" : 7,
"executionStages" : {
"stage" : "FETCH",
"filter" : {
"ratings" : {
"$gte" : 3
}
},
"nReturned" : 7,
"executionTimeMillisEstimate" : 0,
"works" : 9,
"advanced" : 7,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"docsExamined" : 7,
"alreadyHasObj" : 0,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : 7,
"executionTimeMillisEstimate" : 0,
"works" : 8,
"advanced" : 7,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"keyPattern" : {
"ratings" : 1
},
"indexName" : "ratings_1",
"isMultiKey" : true,
"multiKeyPaths" : {
"ratings" : [
"ratings"
]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"ratings" : [
"[-inf.0, 6.0]"
]
},
"keysExamined" : 7,
"seeks" : 1,
"dupsTested" : 7,
"dupsDropped" : 0,
"seenInvalidated" : 0
}
}
}
xiaoxu:PRIMARY>
4、等值查询ratings
逻辑:
1、查询数组值完全匹配100,200 2个元素且顺序是相同
备注:先匹配数组内任何元素满足100,然后返回文档判断是否满足整个数组,
如果100值特别多,100存在超过80%,最终结果集只有1个,那么回集合过滤匹配效果特别差
db.survey.find({ ratings:[100,200]})
备注:keysExamined=12,实际上等于100只有11个,多扫描一个索引key没有搞定
清楚
xiaoxu:PRIMARY> db.survey.find({ratings:100}).count()
11
xiaoxu:PRIMARY> db.survey.find({ratings:[100,200]}).explain("executionStats").executionStats;
{
"executionSuccess" : true,
"nReturned" : 5,
"executionTimeMillis" : 0,
"totalKeysExamined" : 12,
"totalDocsExamined" : 11,
"executionStages" : {
"stage" : "FETCH",
"filter" : {
"ratings" : {
"$eq" : [
100,
200
]
}
},
"nReturned" : 5,
"executionTimeMillisEstimate" : 0,
"works" : 13,
"advanced" : 5,
"needTime" : 7,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"docsExamined" : 11,
"alreadyHasObj" : 0,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : 11,
"executionTimeMillisEstimate" : 0,
"works" : 13,
"advanced" : 11,
"needTime" : 1,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"keyPattern" : {
"ratings" : 1
},
"indexName" : "ratings_1",
"isMultiKey" : true,
"multiKeyPaths" : {
"ratings" : [
"ratings"
]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"ratings" : [
"[100.0, 100.0]",
"[[ 100.0, 200.0 ], [ 100.0, 200.0 ]]"
]
},
"keysExamined" : 12,
"seeks" : 2,
"dupsTested" : 11,
"dupsDropped" : 0,
"seenInvalidated" : 0
}
}
}
如不需要考虑顺序,使用$all操作符
db.survey.find({ ratings:{$all:[100,200]}})
这个执行只需要判断另外一个元素是否等于200.而不是整合数组匹配,扫描刚好是11
个索引key
xiaoxu:PRIMARY> db.survey.find({ratings:{$all:[100,200]}}).explain("executionStats").executionStats;
{
"executionSuccess" : true,
"nReturned" : 5,
"executionTimeMillis" : 0,
"totalKeysExamined" : 11,
"totalDocsExamined" : 11,
"executionStages" : {
"stage" : "FETCH",
"filter" : {
"ratings" : {
"$eq" : 200
}
},
2、查询数组中至少一个元素等于100,效率还可以
db.survey.find({ ratings:100})
xiaoxu:PRIMARY> db.survey.find({ratings:100}).explain("executionStats").executionStats;
{
"executionSuccess" : true,
"nReturned" : 2,
"executionTimeMillis" : 0,
"totalKeysExamined" : 2,
"totalDocsExamined" : 2,
"executionStages" : {
"stage" : "FETCH",
"nReturned" : 2,
"executionTimeMillisEstimate" : 0,
"works" : 3,
"advanced" : 2,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"docsExamined" : 2,
"alreadyHasObj" : 0,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : 2,
"executionTimeMillisEstimate" : 0,
"works" : 3,
"advanced" : 2,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"keyPattern" : {
"ratings" : 1
},
"indexName" : "ratings_1",
"isMultiKey" : true,
"multiKeyPaths" : {
"ratings" : [
"ratings"
]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"ratings" : [
"[100.0, 100.0]"
]
},
"keysExamined" : 2,
"seeks" : 1,
"dupsTested" : 2,
"dupsDropped" : 0,
"seenInvalidated" : 0
}
}
}
5、通过数组长度查询数据
逻辑:
查询数组长度为1的对象,注意这个$size无法使用索引都是collscan.
db.survey.find({ratings:{$size:1}})
xiaoxu:PRIMARY> db.survey.find({ratings:{$size:1}}).
explain("executionStats").executionStats;
{
"executionSuccess" : true,
"nReturned" : 0,
"executionTimeMillis" : 718,
"totalKeysExamined" : 0,
"totalDocsExamined" : 1000019,
"executionStages" : {
"stage" : "COLLSCAN",
"filter" : {
"ratings" : {
"$size" : 1
}
},
"nReturned" : 0,
"executionTimeMillisEstimate" : 621,
"works" : 1000021,
"advanced" : 0,
"needTime" : 1000020,
"needYield" : 0,
"saveState" : 7813,
"restoreState" : 7813,
"isEOF" : 1,
"invalidates" : 0,
"direction" : "forward",
"docsExamined" : 1000019
}
}
xiaoxu:PRIMARY>
6、通过数组索引位置来查询数据
逻辑:
1、数组索引位置从0开始,当对数组创建索引时,使用索引位置查询时,是无法使用多键索引,必须创建单独索引,例如第二个元素位置, db.survey.createIndex({"ratings.1":1}).其他位置以此内推方式创建索引.
2、创建数组索引还是按照数组索引位置创建索引,根据业务实际需求,做到创建索引能够提升效率,而不是创建低效或者无用索引。
3、查询单个元素,此时索引则不是多键索引,就是单个标量值,标量表示是字符串或者数字,而不是数组或者嵌套文档.
xiaoxu:PRIMARY> db.survey.find({"ratings.0":100}).explain("executionStats").executionStats;
{
"executionSuccess" : true,
"nReturned" : 10,
"executionTimeMillis" : 498,
"totalKeysExamined" : 0,
"totalDocsExamined" : 1000019,
"executionStages" : {
"stage" : "COLLSCAN",
"filter" : {
"ratings.0" : {
"$eq" : 100
}
},
"nReturned" : 10,
"executionTimeMillisEstimate" : 450,
"works" : 1000021,
"advanced" : 10,
"needTime" : 1000010,
"needYield" : 0,
"saveState" : 7812,
"restoreState" : 7812,
"isEOF" : 1,
"invalidates" : 0,
"direction" : "forward",
"docsExamined" : 1000019
}
}
xiaoxu:PRIMARY>
xiaoxu:PRIMARY> db.survey.find({"ratings.1":100}).
explain("executionStats").executionStats;
{
"executionSuccess" : true,
"nReturned" : 1,
"executionTimeMillis" : 505,
"totalKeysExamined" : 0,
"totalDocsExamined" : 1000019,
"executionStages" : {
"stage" : "COLLSCAN",
"filter" : {
"ratings.1" : {
"$eq" : 100
}
},
"nReturned" : 1,
"executionTimeMillisEstimate" : 410,
"works" : 1000021,
"advanced" : 1,
"needTime" : 1000019,
"needYield" : 0,
"saveState" : 7812,
"restoreState" : 7812,
"isEOF" : 1,
"invalidates" : 0,
"direction" : "forward",
"docsExamined" : 1000019
}
}
xiaoxu:PRIMARY> db.survey.createIndex({"ratings.1":1})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 5,
"numIndexesAfter" : 6,
"ok" : 1
}
xiaoxu:PRIMARY> db.survey.find({"ratings.1":100}).explain("executionStats").executionStats;
{
"executionSuccess" : true,
"nReturned" : 1,
"executionTimeMillis" : 1,
"totalKeysExamined" : 1,
"totalDocsExamined" : 1,
"executionStages" : {
"stage" : "FETCH",
"nReturned" : 1,
"executionTimeMillisEstimate" : 0,
"works" : 2,
"advanced" : 1,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"docsExamined" : 1,
"alreadyHasObj" : 0,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : 1,
"executionTimeMillisEstimate" : 0,
"works" : 2,
"advanced" : 1,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"keyPattern" : {
"ratings.1" : 1
},
"indexName" : "ratings.1_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"ratings.1" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"ratings.1" : [
"[100.0, 100.0]"
]
},
"keysExamined" : 1,
"seeks" : 1,
"dupsTested" : 0,
"dupsDropped" : 0,
"seenInvalidated" : 0
}
}
}
本次主要介绍数组多键索引,还包括数组嵌套文档、嵌套文档等多键索引,本次内容主要来自官方文档,主要补充执行计划等信息,并没有深入研究,只是作为入门学习了解,希望对大家有帮助。