
Download free books at BookBooN.com
An Introduction to Relational Database Theory
 
134 
Building on The Foundation
Definition of SUMMARIZE PER 
 
SUMMARIZE r1 PER ( r2 ) ADD ( sum1 AS a1, …, sumn AS an ), where: 
x  r1 and r2 are relations such that r1 JOIN r2 is defined,  
x  a1, …, an are attribute names not used in the heading of r1, and 
x  sum1, …, sumn are summaries 
is equivalent to 
EXTEND r2 ADD ( xsum1 AS a1, …, xsumn AS an ) 
where each of xsum1, …, xsumn is an aggregate operator invocation such that: 
x  its relation operand is given by  
(RELATION { TUPLE { b1 b1, …, bm bm } } COMPOSE r1 }, 
where b1, …, bm are the attributes of r2 
x  the aggregate operator and the remaining operands, if any, are as 
specified in the corresponding summaries sum1, …, sumn. 
 
We can also use SUMMARIZE PER to obtain the result of Example 5.8, the average mark for each exam, 
as shown in Example 5.10. As in Example 5.8, we must restrict ourselves to the CourseId values 
appearing in EXAM_MARK. 
 
Example 5.10: Average mark for each exam, using SUMMARIZE … PER … 
SUMMARIZE EXAM_MARK PER ( EXAM_MARK { CourseId } ) 
         ADD ( AVG ( mark ) AS AvgMark ) 
 
Here the relation being summarized is the same as the relation providing the PER values. That is very 
commonly the case in practice, sufficiently so to perhaps warrant a further shorthand, and Tutorial D does 
in fact provide one in the form of SUMMARIZE … BY …, as illustrated in Example 5.11. 
 
Example 5.11: Average mark for each exam, using SUMMARIZE … BY … 
SUMMARIZE EXAM_MARK BY { CourseId } 
          ADD ( AVG ( mark ) AS AvgMark ) 
In case the shorthands offered by SUMMARIZE … BY … hardly seem worth the addition to the language, 
consider that the SUMMARIZE operand can be a relation expression of any complexity. To be sure, writing 
it out twice is not much of a problem these days, thanks to copy-and-paste, but reading it twice and 
noticing the special case might be rather burdensome, both for human and computer (for Example 5.10 is 
likely to take significantly longer than Example 5.11 to compute unless the DBMS can notice that the PER 
operand in 5.10 is a projection of the SUMMARIZE operand).