Przygody z oprogramowaniem
  • Start
  • Szkolenia
    • Szkolenia otwarte
    • Katalog szkoleń
  • Usługi
    • Konsulting
    • Mentoring
    • Research & Development
  • Blog
  • Wiedza
    • Strefa wiedzy
    • BFsharp
    • SaaS
  • Klienci
  • Kontakt
0

Conditional Aggregation

23 March, 2007-SQL

Problem

  • We want to count rows based on two distinct criteria.
  • We want to calculate 2 or more aggreagates over overlapping sets.

Setup script

USE tempdb;
CREATE TABLE t1
(
     Id INT IDENTITY PRIMARY KEY,
     Date datetime,
     Number int,
     Flag tinyint,
);

BEGIN TRAN;
SET NOCOUNT ON;
DECLARE @i INT;
SET @i = 0;
WHILE ( @i 

Sample 1

Return row count where Number3 = 1 and row count where Number3 = 2.
The simpliest solution is to write

SELECT COUNT(*) FROM t1 WHERE Flag = 1;
SELECT COUNT(*) FROM t1 WHERE Flag = 2;

However, this approach scans the base table 2x times, yielding excessive IO – 2x 324 page reads on my machine.

Sample 2

Sum Number column for the following time spans: after April, 2008; after April 2009; after April 2010.

SELECT SUM(Number) FROM t1 WHERE Date > '2008-04-30';
SELECT SUM(Number) FROM t1 WHERE Date > '2009-04-30';
SELECT SUM(Number) FROM t1 WHERE Date > '2010-04-30';

The optimalizer used index seek this time, however still the same data are scaned multiple times (193 + 118 + 44 page reads).

Solution

Our aim is to scan the required rows only once, regardless of how many aggregates we want to calculate.
What would happen if we query the data and conditionaly include them in aggregates?

Sample 1 – solution

SELECT COUNT(CASE WHEN Flag = 1 THEN 1 END) as 'Flag1',
     COUNT(CASE WHEN Flag = 2 THEN 1 END) as 'Flag2'
FROM t1

Only one table scan – 324 page reads.

Sample 2 – solution

SELECT SUM(CASE WHEN Date > '2008-04-30' THEN Number END),
     SUM(CASE WHEN Date > '2009-04-30' THEN Number END),
     SUM(CASE WHEN Date > '2010-04-30' THEN Number END)
FROM t1;

Now, data is scaned once with 274 page reads. However, we can optimalize it further. All we need to do is to restrict the set to the rows needed by the largest date range.

SELECT SUM(CASE WHEN Date > '2008-04-30' THEN Number END),
     SUM(CASE WHEN Date > '2009-04-30' THEN Number END),
     SUM(CASE WHEN Date > '2010-04-30' THEN Number END)
FROM t1
WHERE Date > '2008-04-30';

Great – 193 page reads.

Sumary

Conditional aggregation can be used to:

  • optimalize overlapping aggregates which normally would require multiple data passes.
  • non-overlaping aggregates grouped by a column with low selectivity.
  • create custom aggregates.
Czytaj dalej | Dyskutuj
0

Set-based string concatenation

4 January, 2007-SQL

Today, one of my colleagues asked me how to concat numbers from one column into a single string not using cursors. Motivated by his question I decided to write my first Sql-related post.

I’ll present 2 or 3 methods to do this, but first create our sample table:

use tempdb;
create table test
(
    number int
);

insert test values (1);
insert test values (2);
insert test values (3);
insert test values (11);
insert test values (6);

1. First method requires a variable to which the numbers will be appended.

declare @text varchar(100);
set @text = '';

select @text = @text + ',' + cast( number as varchar(11) )
from test
order by number;

select stuff( @text, 1, 1, '' );

We must initialize it with an empty string, otherwise we’ll get null ( unless CONCAT_NULL_YIELDS_NULL is off ). Then we simply remove the preceding comma and our string is ready.

2. We can use coalesce as well

declare @text varchar(100);

select @text = coalesce( @text + ',', '' ) + cast( number as varchar(11) )
from test
order by number;

select @text;

3. Both of the presented techniques works in sql 2000, however in sql 2005 we can make use of the new FOR XML PATH clause:

select stuff(
    (select ',' + cast( number as varchar(11) ) as 'text()'
     from test
     order by number
     for xml path('')
), 1, 1, '' );

We specify row element to be empty ( path(”) ) and the number to be treated as a text node. At the end we only need to remove the first comma. Thanks to this trick we’ll get what we need.

Czytaj dalej | Dyskutuj
  1 2

Kategorie

Architecture BFsharp Blog Business Framework C# CqRS DDD Debugging DSL EntityFramework Formula JavaScript Linq NHibernate SaaS Silverlight SQL Visual Studio WPF Wzorce

O mnie


RSS Feed

© macmichal.pl 2011 Wszystkie prawa zastrzeżone