count(distinct xx) 优化
postgres=# CREATE TABLE t1(f1 serial not null unique,f2 text,f3 text,f4 text,f5 text,f6 text,f7 text,f8 text,f9 text,f10 text,f11 text,f12 text) distribute by shard(f1);NOTICE: Replica identity is needed for shard table, please add to this table through "alter table" command.CREATE TABLETime: 89.938 mspostgres=# insert into t1 select t,md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text) from generate_series(1,1000000) as t;INSERT 0 1000000Time: 14849.045 ms (00:14.849)postgres=# analyze t1;ANALYZETime: 1340.387 ms (00:01.340)postgres=# explain (verbose) select count(distinct f2) from t1;QUERY PLAN\\---------------------------------------------------------------------------------------------------------------------------------------Aggregate (cost=103320.00..103320.01 rows=1 width=8)Output: count(DISTINCT f2)-> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.00..100820.00 rows=1000000 width=33)Output: f2-> Seq Scan on public.t1 (cost=0.00..62720.00 rows=1000000 width=33)Output: f2(6 rows)Time: 0.748 mspostgres=# select count(distinct f2) from t1;count\\---------1000000(1 row)Time: 6274.684 ms (00:06.275)postgres=# select count(distinct f2) from t1 where f1 <10;count\\-------9(1 row)Time: 19.261 ms
如上 count(distinct f2) 发生在 cn 节点,对于 TP 类业务,需要操作的数据量少的情况下,性能开销没有问题,且比下推执行的性能开销还要小。
但对于一次要操作的数据量比较大的 AP 类业务,网络传输就会成为瓶颈,如下是改写后的执行计划:
postgres=# explain (verbose) select count(1) from (select f2 from t1 group by f2) as t ;QUERY PLAN\\-----------------------------------------------------------------------------------------------------------------------------------------------------------------Finalize Aggregate (cost=355600.70..355600.71 rows=1 width=8)Output: count(1)-> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=355600.69..355600.70 rows=1 width=0)Output: PARTIAL count(1)-> Partial Aggregate (cost=355500.69..355500.70 rows=1 width=8)Output: PARTIAL count(1)-> Group (cost=340500.69..345500.69 rows=1000000 width=33)Output: t1.f2Group Key: t1.f2-> Sort (cost=340500.69..343000.69 rows=1000000 width=0)Output: t1.f2Sort Key: t1.f2-> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=216192.84..226192.84 rows=1000000 width=0)Output: t1.f2Distribute results by S: f2-> Group (cost=216092.84..221092.84 rows=1000000 width=33)Output: t1.f2Group Key: t1.f2-> Sort (cost=216092.84..218592.84 rows=1000000 width=33)Output: t1.f2Sort Key: t1.f2-> Seq Scan on public.t1 (cost=0.00..62720.00 rows=1000000 width=33)Output: t1.f2(23 rows)
改写后,并行推到 dn 执行,此时查看执行的效果,可以看到对于大量数据计算的 AP 类业务,性能提高了5倍。
postgres=# select count(1) from (select f2 from t1 group by f2) as t ;count\\---------1000000(1 row)Time: 1328.431 ms (00:01.328)postgres=# select count(1) from (select f2 from t1 where f1<10 group by f2) as t ;count\\-------9(1 row)Time: 24.991 mspostgres=#
增大 work_mem 减少 io 访问
增大 work_mem 后,性能提高了40倍,因为 work_mem 足够放下 filter 的数据,不需要再做 Materialize 物化,filter 由原来的 subplan 变成了 hash subplan,直接在内存 hash 表中 filter,性能提升。
注意:
work_mem 默认不宜过大,建议在某个具体的查询语句中再根据需要进行调整即可。
postgres=# CREATE TABLE t1(f1 serial not null unique,f2 text,f3 text,f4 text,f5 text,f6 text,f7 text,f8 text,f9 text,f10 text,f11 text,f12 text) distribute by shard(f1);NOTICE: Replica identity is needed for shard table, please add to this table through "alter table" command.CREATE TABLETime: 70.545 mspostgres=# CREATE TABLE t2(f1 serial not null unique,f2 text,f3 text,f4 text,f5 text,f6 text,f7 text,f8 text,f9 text,f10 text,f11 text,f12 text) distribute by shard(f1);NOTICE: Replica identity is needed for shard table, please add to this table through "alter table" command.CREATE TABLETime: 61.913 mspostgres=# insert into t1 select t,md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text) from generate_series(1,1000) as t;INSERT 0 1000Time: 48.866 mspostgres=# insert into t2 select t,md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text) from generate_series(1,50000) as t;INSERT 0 50000Time: 792.858 mspostgres=# analyze t1;ANALYZETime: 175.946 mspostgres=# analyze t2;ANALYZETime: 318.802 mspostgres=#postgres=# explain select * from t1 where f2 not in (select f2 from t2);QUERY PLAN\\-----------------------------------------------------------------------------------------------------------------------------------------------Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=0.00..2076712.50 rows=500 width=367)-> Seq Scan on t1 (cost=0.00..2076712.50 rows=500 width=367)Filter: (NOT (SubPlan 1))SubPlan 1-> Materialize (cost=0.00..4028.00 rows=50000 width=33)-> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=0.00..3240.00 rows=50000 width=33)-> Seq Scan on t2 (cost=0.00..3240.00 rows=50000 width=33)(7 rows)Time: 0.916 mspostgres=# select * from t1 where f2 not in (select f2 from t2);f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 | f11 | f12----+----+----+----+----+----+----+----+----+-----+-----+-----(0 rows)Time: 4226.825 ms (00:04.227)postgres=# set work_mem to '8MB';SETTime: 0.289 mspostgres=# explain select * from t1 where f2 not in (select f2 from t2);QUERY PLAN\\-----------------------------------------------------------------------------------------------------------------------------------------Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=3365.00..3577.50 rows=500 width=367)-> Seq Scan on t1 (cost=3365.00..3577.50 rows=500 width=367)Filter: (NOT (hashed SubPlan 1))SubPlan 1-> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=0.00..3240.00 rows=50000 width=33)-> Seq Scan on t2 (cost=0.00..3240.00 rows=50000 width=33)(6 rows)Time: 0.890 mspostgres=# select * from t1 where f2 not in (select f2 from t2);f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 | f11 | f12----+----+----+----+----+----+----+----+----+-----+-----+-----(0 rows)Time: 105.249 mspostgres=#
not in 改写为 anti join
上文通过增大计算内存提高性能,但内存不可能无限扩大,如下通过改写语句来提高查询的性能。
postgres=# explain select * from t1 left outer join t2 on t1.f2 = t2.f2 where t2.f2 is null;QUERY PLAN\\-------------------------------------------------------------------------------------------------------------------------------------------------Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=6405.00..9260.75 rows=1 width=734)-> Hash Anti Join (cost=6405.00..9260.75 rows=1 width=734)Hash Cond: (t1.f2 = t2.f2)-> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.00..682.00 rows=1000 width=367)Distribute results by S: f2-> Seq Scan on t1 (cost=0.00..210.00 rows=1000 width=367)-> Hash (cost=21940.00..21940.00 rows=50000 width=367)-> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.00..21940.00 rows=50000 width=367)Distribute results by S: f2-> Seq Scan on t2 (cost=0.00..3240.00 rows=50000 width=367)(10 rows)Time: 1.047 mspostgres=# select * from t1 left outer join t2 on t1.f2 = t2.f2 where t2.f2 is null;f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 | f11 | f12 | f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 | f11 | f12----+----+----+----+----+----+----+----+----+-----+-----+-----+----+----+----+----+----+----+----+----+----+-----+-----+-----(0 rows)Time: 107.233 mspostgres=#
也可以修改 not exists:
postgres=# explain select * from t1 where not exists( select 1 from t2 where t1.f2=t2.f2);QUERY PLAN\\-----------------------------------------------------------------------------------------------------------------------------------------------Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=3865.00..4078.75 rows=1 width=367)-> Hash Anti Join (cost=3865.00..4078.75 rows=1 width=367)Hash Cond: (t1.f2 = t2.f2)-> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.00..682.00 rows=1000 width=367)Distribute results by S: f2-> Seq Scan on t1 (cost=0.00..210.00 rows=1000 width=367)-> Hash (cost=5240.00..5240.00 rows=50000 width=33)-> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.00..5240.00 rows=50000 width=33)Distribute results by S: f2-> Seq Scan on t2 (cost=0.00..3240.00 rows=50000 width=33)(10 rows)Time: 0.974 mspostgres=# select * from t1 where not exists( select 1 from t2 where t1.f2=t2.f2);f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 | f11 | f12----+----+----+----+----+----+----+----+----+-----+-----+-----(0 rows)Time: 42.944 mspostgres=#
分布 key join+limit 优化
数据准备:
postgres=# CREATE TABLE t1(f1 serial not null unique,f2 text,f3 text,f4 text,f5 text,f6 text,f7 text,f8 text,f9 text,f10 text,f11 text,f12 text) distribute by shard(f1);NOTICE: Replica identity is needed for shard table, please add to this table through "alter table" command.CREATE TABLEpostgres=# CREATE TABLE t2(f1 serial not null unique,f2 text,f3 text,f4 text,f5 text,f6 text,f7 text,f8 text,f9 text,f10 text,f11 text,f12 text) distribute by shard(f1);NOTICE: Replica identity is needed for shard table, please add to this table through "alter table" command.CREATE TABLEpostgres=# insert into t1 select t,md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text) from generate_series(1,1000000) as t;INSERT 0 1000000postgres=# insert into t2 select t,md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text) from generate_series(1,1000000) as t;INSERT 0 1000000postgres=# analyze t1;ANALYZEpostgres=# analyze t2;ANALYZEpostgres=#postgres=# \\timingTiming is on.postgres=# explain select t1.* from t1,t2 where t1.f1=t2.f1 limit 10;QUERY PLAN\\----------------------------------------------------------------------------------------------------------------------------------------------Limit (cost=0.25..1.65 rows=10 width=367)-> Merge Join (cost=0.25..140446.26 rows=1000000 width=367)Merge Cond: (t1.f1 = t2.f1)-> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.12..434823.13 rows=1000000 width=367)-> Index Scan using t1_f1_key on t1 (cost=0.12..62723.13 rows=1000000 width=367)-> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.12..71823.13 rows=1000000 width=4)-> Index Only Scan using t2_f1_key on t2 (cost=0.12..62723.13 rows=1000000 width=4)(7 rows)Time: 1.372 mspostgres=# explain analyze select t1.* from t1,t2 where t1.f1=t2.f1 limit 10;QUERY PLAN\\--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Limit (cost=0.25..1.65 rows=10 width=367) (actual time=2675.437..2948.199 rows=10 loops=1)-> Merge Join (cost=0.25..140446.26 rows=1000000 width=367) (actual time=2675.431..2675.508 rows=10 loops=1)Merge Cond: (t1.f1 = t2.f1)-> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.12..434823.13 rows=1000000 width=367) (actual time=1.661..1.704 rows=10 loops=1)-> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.12..71823.13 rows=1000000 width=4) (actual time=2673.761..2673.783 rows=10 loops=1)Planning time: 0.358 msExecution time: 2973.948 ms(7 rows)Time: 2976.008 ms (00:02.976)postgres=#
以上执行计划是在 cn 上执行,merge join 需要把要 join 的数据拉回 cn 再排序,然后再 join,这里主切的开销在于网络,优化方法是让语句推下去计算,如下所示,两者相差150倍的性能,一般情况下,如果需要拉取大量的数据回 cn 计算,则下推执行的效率会更好。
postgres=# set prefer_olap to on;SETTime: 0.291 mspostgres=# explain select t1.* from t1,t2 where t1.f1=t2.f1 limit 10;QUERY PLAN\\--------------------------------------------------------------------------------------------------------------------------------Limit (cost=100.25..101.70 rows=10 width=367)-> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.25..101.70 rows=10 width=367)-> Limit (cost=0.25..1.65 rows=10 width=367)-> Merge Join (cost=0.25..140446.26 rows=1000000 width=367)Merge Cond: (t1.f1 = t2.f1)-> Index Scan using t1_f1_key on t1 (cost=0.12..62723.13 rows=1000000 width=367)-> Index Only Scan using t2_f1_key on t2 (cost=0.12..62723.13 rows=1000000 width=4)(7 rows)Time: 1.061 mspostgres=# explain analyze select t1.* from t1,t2 where t1.f1=t2.f1 limit 10;QUERY PLAN\\---------------------------------------------------------------------------------------------------------------------------------------------------------------------------Limit (cost=100.25..101.70 rows=10 width=367) (actual time=1.527..3.899 rows=10 loops=1)-> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.25..101.70 rows=10 width=367) (actual time=1.525..1.529 rows=10 loops=1)Planning time: 0.360 msExecution time: 18.193 ms(4 rows)Time: 19.921 ms
非分布 key join 使用 hash join 性能一般更好
为提高 TP 类业务查询的性能,经常需要对一些字段建立索引,使用有索引字段 join 时,系统往往也会使用 Merge Cond 和 nestloop。
mydb=# CREATE TABLE t1(f1 serial not null,f2 text,f3 text,f4 text,f5 text,f6 text,f7 text,f8 text,f9 text,f10 text,f11 text,f12 text) distribute by shard(f1);NOTICE: Replica identity is needed for shard table, please add to this table through "alter table" command.CREATE TABLETime: 481.042 msmydb=# create index t1_f1_idx on t1(f2);CREATE INDEXTime: 85.521 msmydb=# CREATE TABLE t2(f1 serial not null,f2 text,f3 text,f4 text,f5 text,f6 text,f7 text,f8 text,f9 text,f10 text,f11 text,f12 text) distribute by shard(f1);NOTICE: Replica identity is needed for shard table, please add to this table through "alter table" command.CREATE TABLETime: 75.973 msmydb=# create index t2_f1_idx on t2(f2);CREATE INDEXTime: 29.890 msmydb=# insert into t1 select t,md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text) from generate_series(1,1000000) as t;INSERT 0 1000000Time: 16450.623 ms (00:16.451)mydb=# insert into t2 select t,md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text) from generate_series(1,1000000) as t;INSERT 0 1000000Time: 17218.738 ms (00:17.219)mydb=# analyze t1;ANALYZETime: 2219.341 ms (00:02.219)mydb=# analyze t2;ANALYZETime: 1649.506 ms (00:01.650)mydb=#--merge joinmydb=# explain select t1.* from t1,t2 where t1.f2=t2.f2 limit 10;QUERY PLAN\\-------------------------------------------------------------------------------------------------------------------------Limit (cost=100.25..102.78 rows=10 width=367)-> Remote Subquery Scan on all (dn001,dn002) (cost=100.25..102.78 rows=10 width=367)-> Limit (cost=0.25..2.73 rows=10 width=367)-> Merge Join (cost=0.25..248056.80 rows=1000000 width=367)Merge Cond: (t1.f2 = t2.f2)-> Remote Subquery Scan on all (dn001,dn002) (cost=100.12..487380.85 rows=1000000 width=367)Distribute results by S: f2-> Index Scan using t1_f1_idx on t1 (cost=0.12..115280.85 rows=1000000 width=367)-> Materialize (cost=100.12..155875.95 rows=1000000 width=33)-> Remote Subquery Scan on all (dn001,dn002) (cost=100.12..153375.95 rows=1000000 width=33)Distribute results by S: f2-> Index Only Scan using t2_f1_idx on t2 (cost=0.12..115275.95 rows=1000000 width=33)(12 rows)Time: 4.183 msmydb=# explain analyze select t1.* from t1,t2 where t1.f2=t2.f2 limit 10;QUERY PLAN\\-------------------------------------------------------------------------------------------------------------------------------------------Limit (cost=100.25..102.78 rows=10 width=367) (actual time=6555.346..6556.296 rows=10 loops=1)-> Remote Subquery Scan on all (dn001,dn002) (cost=100.25..102.78 rows=10 width=367) (actual time=6555.343..6555.349 rows=10 loops=1)Planning time: 0.473 msExecution time: 6569.828 ms(4 rows)Time: 6614.439 ms (00:06.614)--nested loopmydb=# set enable_mergejoin to off;SETTime: 0.422 msmydb=# explain select t1.* from t1,t2 where t1.f2=t2.f2 limit 10;QUERY PLAN\\--------------------------------------------------------------------------------------------------------------------Limit (cost=100.12..103.57 rows=10 width=367)-> Remote Subquery Scan on all (dn001,dn002) (cost=100.12..103.57 rows=10 width=367)-> Limit (cost=0.12..3.52 rows=10 width=367)-> Nested Loop (cost=0.12..339232.00 rows=1000000 width=367)-> Remote Subquery Scan on all (dn001,dn002) (cost=100.00..434740.00 rows=1000000 width=367)Distribute results by S: f2-> Seq Scan on t1 (cost=0.00..62640.00 rows=1000000 width=367)-> Materialize (cost=100.12..100.31 rows=1 width=33)-> Remote Subquery Scan on all (dn001,dn002) (cost=100.12..100.30 rows=1 width=33)Distribute results by S: f2-> Index Only Scan using t2_f1_idx on t2 (cost=0.12..0.27 rows=1 width=33)Index Cond: (f2 = t1.f2)(12 rows)Time: 1.033 msmydb=# explain analyze select t1.* from t1,t2 where t1.f2=t2.f2 limit 10;QUERY PLAN\\-------------------------------------------------------------------------------------------------------------------------------------------Limit (cost=100.12..103.57 rows=10 width=367) (actual time=5608.326..5609.571 rows=10 loops=1)-> Remote Subquery Scan on all (dn001,dn002) (cost=100.12..103.57 rows=10 width=367) (actual time=5608.323..5608.349 rows=10 loops=1)Planning time: 0.347 msExecution time: 5669.901 ms(4 rows)Time: 5672.584 ms (00:05.673)mydb=# set enable_nestloop to off;SETTime: 0.436 msmydb=# explain select t1.* from t1,t2 where t1.f2=t2.f2 limit 10;QUERY PLAN\\-------------------------------------------------------------------------------------------------------------------------Limit (cost=85983.00..85984.94 rows=10 width=367)-> Remote Subquery Scan on all (dn001,dn002) (cost=85983.00..85984.94 rows=10 width=367)-> Limit (cost=85883.00..85884.89 rows=10 width=367)-> Hash Join (cost=85883.00..274580.00 rows=1000000 width=367)Hash Cond: (t1.f2 = t2.f2)-> Remote Subquery Scan on all (dn001,dn002) (cost=100.00..434740.00 rows=1000000 width=367)Distribute results by S: f2-> Seq Scan on t1 (cost=0.00..62640.00 rows=1000000 width=367)-> Hash (cost=100740.00..100740.00 rows=1000000 width=33)-> Remote Subquery Scan on all (dn001,dn002) (cost=100.00..100740.00 rows=1000000 width=33)Distribute results by S: f2-> Seq Scan on t2 (cost=0.00..62640.00 rows=1000000 width=33)(12 rows)Time: 1.141 msmydb=# explain analyze select t1.* from t1,t2 where t1.f2=t2.f2 limit 10;QUERY PLAN\\-----------------------------------------------------------------------------------------------------------------------------------------------Limit (cost=85983.00..85984.94 rows=10 width=367) (actual time=1083.691..1085.962 rows=10 loops=1)-> Remote Subquery Scan on all (dn001,dn002) (cost=85983.00..85984.94 rows=10 width=367) (actual time=1083.688..1083.699 rows=10 loops=1)Planning time: 0.530 msExecution time: 1108.830 ms(4 rows)Time: 1117.713 ms (00:01.118)mydb=#
exists 优化
exists 在数据量比较大情况下,一般使用的是 Semi Join ,在 work_mem 足够大的情况下使用的是 hash join,性能会更好。如下,性能提升大约一倍。
postgres=# show work_mem;work_mem\\----------4MB(1 row)Time: 0.298 mspostgres=# explain select count(1) from t1 where exists(select 1 from t2 where t2.t1_f1=t1.f1);QUERY PLAN\\-----------------------------------------------------------------------------------------------------------------------Finalize Aggregate (cost=242218.32..242218.33 rows=1 width=8)-> Remote Subquery Scan on all (dn001,dn002) (cost=242218.30..242218.32 rows=1 width=0)-> Partial Aggregate (cost=242118.30..242118.31 rows=1 width=8)-> Hash Semi Join (cost=110248.00..242118.30 rows=505421 width=0)Hash Cond: (t1.f1 = t2.t1_f1)-> Seq Scan on t1 (cost=0.00..17420.00 rows=1000000 width=4)-> Hash (cost=79340.00..79340.00 rows=3000000 width=4)-> Remote Subquery Scan on all (dn001,dn002) (cost=100.00..79340.00 rows=3000000 width=4)Distribute results by S: t1_f1-> Seq Scan on t2 (cost=0.00..52240.00 rows=3000000 width=4)(10 rows)Time: 1.091 mspostgres=# select count(1) from t1 where exists(select 1 from t2 where t2.t1_f1=t1.f1);count\\--------500000(1 row)Time: 3779.401 ms (00:03.779)postgres=# set work_mem to '128MB';SETTime: 0.368 mspostgres=# explain select count(1) from t1 where exists(select 1 from t2 where t2.t1_f1=t1.f1);QUERY PLAN\\------------------------------------------------------------------------------------------------------------------Finalize Aggregate (cost=101763.76..101763.77 rows=1 width=8)-> Remote Subquery Scan on all (dn001,dn002) (cost=101763.75..101763.76 rows=1 width=0)-> Partial Aggregate (cost=101663.75..101663.76 rows=1 width=8)-> Hash Join (cost=89660.00..101663.75 rows=505421 width=0)Hash Cond: (t2.t1_f1 = t1.f1)-> Remote Subquery Scan on all (dn001,dn002) (cost=59840.00..69443.00 rows=505421 width=4)Distribute results by S: t1_f1-> HashAggregate (cost=59740.00..64794.21 rows=505421 width=4)Group Key: t2.t1_f1-> Seq Scan on t2 (cost=0.00..52240.00 rows=3000000 width=4)-> Hash (cost=17420.00..17420.00 rows=1000000 width=4)-> Seq Scan on t1 (cost=0.00..17420.00 rows=1000000 width=4)(12 rows)Time: 4.739 mspostgres=# select count(1) from t1 where exists(select 1 from t2 where t2.t1_f1=t1.f1);count\\--------500000(1 row)Time: 1942.037 ms (00:01.942)postgres=#