查询的T-SQL版本
让我们设置一个包含一些数据的简单表:
DROP TABLE IF EXISTS #OrdersTable
CREATE TABLE #OrdersTable
(
Id int,
Custid int
);
INSERT INTO #OrdersTable (Id, Custid) VALUES (1, 71);
INSERT INTO #OrdersTable (Id, Custid) VALUES (2, 71);
INSERT INTO #OrdersTable (Id, Custid) VALUES (3, 71);
INSERT INTO #OrdersTable (Id, Custid) VALUES (4, 72);
INSERT INTO #OrdersTable (Id, Custid) VALUES (5, 72);
INSERT INTO #OrdersTable (Id, Custid) VALUES (6, 72);
INSERT INTO #OrdersTable (Id, Custid) VALUES (7, 73);
INSERT INTO #OrdersTable (Id, Custid) VALUES (8, 74);
INSERT INTO #OrdersTable (Id, Custid) VALUES (9, 74);
在本例中,客户71和72各有3个订单。客户73有1个订单。客户74有2个订单。
假设我们想知道订单量最大的客户。
以下查询:
SELECT TOP (1) WITH TIES Custid
FROM #OrdersTable
GROUP BY Custid
ORDER BY COUNT(*) DESC;
结果如下:
Custid
-----------
71
72
(2 rows affected)
向EF核心的朴素转换
给定以下类:
public class Order
{
public int Id { get; set; }
public int Custid { get; set; }
}
和以下数据:
var OrdersTable = new List<Order>()
{
new Order(){ Id = 1, Custid = 71},
new Order(){ Id = 2, Custid = 71},
new Order(){ Id = 3, Custid = 71},
new Order(){ Id = 4, Custid = 72},
new Order(){ Id = 5, Custid = 72},
new Order(){ Id = 6, Custid = 72},
new Order(){ Id = 7, Custid = 73},
new Order(){ Id = 8, Custid = 74},
new Order(){ Id = 9, Custid = 74},
};
以下是查询到EF Core的简单转换:
var n = OrdersTable.GroupBy(order => order.Custid).Select(grouping => grouping.Count()).Max();
var custids = OrdersTable.GroupBy(order => order.Custid).Where(grouping => grouping.Count() == n).Select(grouping => grouping.Key);
使用以下命令显示数据:
foreach (var custid in custids)
Console.WriteLine(custid);
我们得到:
71
72
问题
让我们把T-SQL和EF核心版本放在一起。T-SQL:
SELECT TOP (1) WITH TIES Custid
FROM #OrdersTable
GROUP BY Custid
ORDER BY COUNT(*) DESC;
EF核心:
var n = OrdersTable.GroupBy(order => order.Custid).Select(grouping => grouping.Count()).Max();
var custids = OrdersTable.GroupBy(order => order.Custid).Where(grouping => grouping.Count() == n).Select(grouping => grouping.Key);
我的问题是,在EF Core中有没有更有效的方法来实现这个查询?
完整的程序
演示上述查询的完整C#控制台程序:
using System;
using System.Collections.Generic;
using System.Linq;
namespace EfCoreTop1Ties
{
public class Order
{
public int Id { get; set; }
public int Custid { get; set; }
}
internal class Program
{
static void Main(string[] args)
{
var OrdersTable = new List<Order>()
{
new Order(){ Id = 1, Custid = 71},
new Order(){ Id = 2, Custid = 71},
new Order(){ Id = 3, Custid = 71},
new Order(){ Id = 4, Custid = 72},
new Order(){ Id = 5, Custid = 72},
new Order(){ Id = 6, Custid = 72},
new Order(){ Id = 7, Custid = 73},
new Order(){ Id = 8, Custid = 74},
new Order(){ Id = 9, Custid = 74},
};
var n = OrdersTable.GroupBy(order => order.Custid).Select(grouping => grouping.Count()).Max();
var custids = OrdersTable.GroupBy(order => order.Custid).Where(grouping => grouping.Count() == n).Select(grouping => grouping.Key);
foreach (var custid in custids)
Console.WriteLine(custid);
}
}
}
发布于 2021-11-13 19:10:40
您应该能够通过一个查询做到这一点:
var n = OrdersTable
.GroupBy(order => order.Custid)
.Select(grouping => new { CustomerId = grouping.Key, OrderCount = grouping.Count() })
.OrderByDescending(g => g.OrderCount)
.ToList();
var maxCount = n.First().OrderCount;
var custIds = n.Where(g => g.OrderCount == maxCount)
.Select(g => g.CustomerId)
.ToList();
在OrdersTable是DbContext DBSet的情况下,这将导致对数据库的1次查询。匹配该计数的返回项的检查是在内存中根据结果完成的,而不需要返回。
如果您正在处理一个特别大的数据集,您可以考虑一些合理的假设,例如,如果有数千个客户,那么最大的订单数可能是100还是1000。
var n = OrdersTable
.GroupBy(order => order.Custid)
.Select(grouping => new { CustomerId = grouping.Key, OrderCount = grouping.Count() })
.OrderByDescending(g => g.OrderCount)
.Take(100)
.ToList();
if (n.All(g => g.OrderCount == n.First().OrderCount)
// Redo query with larger threshold.
如果所有返回的行恰好具有相同的顺序计数,它将再次运行查询。您可能希望以不同的方式处理这种情况,例如,当所有客户都有0个订单时。(如果在这种情况下,您有选择要比较的客户的标准,例如按城市/州等)
由于有一个非常大的数据表要覆盖,另一个选择是查询最大计数,前提是您的实体配置了导航属性,以便您的客户实体可以与其Orders集合相关:
var maxOrderCount = dbContext.Customers
.OrderByDescending(x => x.Orders.Count)
.Select(x => x.Orders.Count)
.First();
var customers = dbContext.Customers
.Where(x => x.Orders.Count == maxOrderCount)
.ToList();
如果您只需要客户ID,则在ToList()
之前添加.Select(x => x.CustomerId)
。它运行两个查询,但它们是相当简单的查询,并且只返回所需的数据,而不是可能返回所有客户ID/数据。
https://stackoverflow.com/questions/69959838
复制相似问题