如何创建大小均匀的组

我有一张桌子,按日期提供汽车信息。

car_id date  1 20160101 2 20160102 1 20160103 2 20160104 1 20160105 1 20160106 1 20160107 1 20160108 1 20160109 1 20160110 

如果汽车有<= 5条记录,则只有一组。 CASE car_id = 2

如果汽车有<= 10条记录,那么将有两个均匀分组。 CASE car_id = 1按日期CASE car_id = 1

  • 20160101 – 20160106第GROUP 1
  • 20160107 – 20160110 GROUP 2

如果汽车有’> 10’那么将有三个均匀分布的组。

渴望输出。

 car_id date group_id 1 20160101 1 1 20160103 1 1 20160105 1 1 20160106 1 ------------------------ 1 20160107 2 1 20160108 2 1 20160109 2 1 20160110 2 ------------------------ 2 20160102 1 2 20160104 1 

我尝试使用ntile()但不能使组号为dinamic。

SQL小提琴演示

 SELECT car_id, "date", ntile(3) over (partition by car_id order by "date") as group_id FROM Table1 

如果可以直接在C# LINQ上完成加分点,否则我在postgres上创建一个函数。

aditional info我将显示不同颜色(组)的历史汽车信息所以具有少量数据的汽车将以单一颜色显示。 最大颜色数量为3。

你可以使用ntile

 SELECT car_id, "date", ntile(CASE WHEN c <= 5 THEN 1 WHEN c <= 10 THEN 2 ELSE 3 END) OVER (PARTITION BY car_id ORDER BY "date") AS group_id FROM (SELECT car_id, "date",COUNT(*) OVER(PARTITION BY car_id) AS c FROM Table1) AS s 

SqlFiddleDemo

输出:

 ╔════════╦══════════╦══════════╗ ║ car_id ║ date ║ group_id ║ ╠════════╬══════════╬══════════╣ ║ 1 ║ 20160101 ║ 1 ║ ║ 1 ║ 20160103 ║ 1 ║ ║ 1 ║ 20160105 ║ 1 ║ ║ 1 ║ 20160106 ║ 1 ║ ║ 1 ║ 20160107 ║ 2 ║ ║ 1 ║ 20160108 ║ 2 ║ ║ 1 ║ 20160109 ║ 2 ║ ║ 1 ║ 20160110 ║ 2 ║ ║ 2 ║ 20160102 ║ 1 ║ ║ 2 ║ 20160104 ║ 1 ║ ╚════════╩══════════╩══════════╝ 

我会使用row_number()count()手动计算组:

 select t1.*, (case when cnt <= 5 then 1 when car_id * 2 <= cnt then 1 else 2 end) as grp from (select t1.*, row_number() over (partition by car_id order by date) as seqnum, count(*) over (partition by car_id) as cnt from table1 ) t where cnt <= 10 order by car_id, grp, date; 

SQL小提琴演示

正如戈登所说。

  • 首先计算每个car_id的日期数
  • 然后根据cnt分配我想要的组数
  • 然后使用groups来进行ntile(grp)分析函数

 WITH car_dates_count as ( select t1.*, count(*) over (partition by car_id) as cnt from table1 t1 ), car_groups as ( select cdc.*, (case when cnt <= 5 then 1 when cnt <= 10 then 2 else 3 end) as grp from car_dates_count cdc ) SELECT *, ntile(grp) over (partition by car_id order by "date") as group_id FROM car_groups; 

OUTPUT

 | car_id | date | cnt | grp | group_id | |--------|----------|-----|-----|----------| | 1 | 20160101 | 9 | 2 | 1 | \ | 1 | 20160102 | 9 | 2 | 1 | | | 1 | 20160103 | 9 | 2 | 1 | | | 1 | 20160104 | 9 | 2 | 1 | | | 1 | 20160105 | 9 | 2 | 1 | |=> (cnt 9 <= 10) Mean two groups | 1 | 20160106 | 9 | 2 | 2 | | | 1 | 20160107 | 9 | 2 | 2 | | | 1 | 20160108 | 9 | 2 | 2 | | | 1 | 20160109 | 9 | 2 | 2 | / -------------------------------------------- | 2 | 20160101 | 5 | 1 | 1 | \ | 2 | 20160102 | 5 | 1 | 1 | | | 2 | 20160103 | 5 | 1 | 1 | |=> (cnt 5 <= 5) Mean one group | 2 | 20160104 | 5 | 1 | 1 | | | 2 | 20160105 | 5 | 1 | 1 | / -------------------------------------------- | 3 | 20160101 | 16 | 3 | 1 | \ | 3 | 20160102 | 16 | 3 | 1 | | | 3 | 20160103 | 16 | 3 | 1 | | | 3 | 20160104 | 16 | 3 | 1 | | | 3 | 20160105 | 16 | 3 | 1 | | | 3 | 20160106 | 16 | 3 | 1 | | | 3 | 20160107 | 16 | 3 | 2 | | | 3 | 20160108 | 16 | 3 | 2 | |=> (cnt 16 > 10) Mean three groups | 3 | 20160109 | 16 | 3 | 2 | | | 3 | 20160110 | 16 | 3 | 2 | | | 3 | 20160111 | 16 | 3 | 2 | | | 3 | 20160112 | 16 | 3 | 3 | | | 3 | 20160113 | 16 | 3 | 3 | | | 3 | 20160114 | 16 | 3 | 3 | | | 3 | 20160115 | 16 | 3 | 3 | | | 3 | 20160116 | 16 | 3 | 3 | / 

假设您已按如下方式定义Car类,则可以使用Linq执行此操作。

 public class Car { public int car_id; public DateTime date; // additional info } 

我们需要两次应用分组,一次是在car_id上, car_id一次是在均匀分割为多个组时。 我更喜欢使用Linq

  var grouped = cars.GroupBy(c=>c.car_id) .Select(c => new { car_grp_id = c.Key, splits = c.Select((s,i)=> new { grp_id = i/(c.Count() <= 5 ? 5 :(c.Count() %2 ==0)? c.Count() /2 : (c.Count() /3 +1)), item = s }), }) .Select(s=> new { grouponcars = s.splits.GroupBy(g=>g.grp_id) .Select(x=> new { group_id = x.Key, cars = x.Select(y=>y.item) }) }) .ToList(); 

产量

 groupid : 0 - Car_Id : 1 - Date -1/1/2016 12:00:00 AM groupid : 0 - Car_Id : 1 - Date -1/3/2016 12:00:00 AM groupid : 0 - Car_Id : 1 - Date -1/5/2016 12:00:00 AM groupid : 0 - Car_Id : 1 - Date -1/6/2016 12:00:00 AM ---------------------------------------- groupid : 1 - Car_Id : 1 - Date -1/7/2016 12:00:00 AM groupid : 1 - Car_Id : 1 - Date -1/8/2016 12:00:00 AM groupid : 1 - Car_Id : 1 - Date -1/9/2016 12:00:00 AM groupid : 1 - Car_Id : 1 - Date -1/10/2016 12:00:00 AM ---------------------------------------- groupid : 0 - Car_Id : 2 - Date -1/2/2016 12:00:00 AM groupid : 0 - Car_Id : 2 - Date -1/4/2016 12:00:00 AM 

检查此Demo