Sql Excel : Subqueries for Handling Summaries

The most typical place for a subquery is as a replacement for a table in the FROM
clause. After all, the source is a table and a query essentially returns a table, so it
makes a lot of sense to combine queries in this way. From the dataflow
perspective, this use of subqueries is simply to replace one of the sources with a
series of dataflow nodes.
Consider the question: How many zip codes in each state have a population
density greater than the average zip code population density in the state? The
population density is the population divided by the land area, which is in the
column landsqmi.
Lets think about the different data elements needed to answer the question. The
comparison is to the average zip code population density within a state, which is
easily calculated:
SELECT zc.stab, AVG(totpop / landsqmi) as avgpopdensity
FROM ZipCensus zc
WHERE zc.landsqmi > 0
GROUP BY zc.stab
Next, the idea is to combine this information with the original zip code
information in the FROM clause:
SELECT zc.stab, COUNT(*) as numzips,
SUM(CASE WHEN zc.popdensity > zcsum.avgpopdensity
THEN 1 ELSE 0 END) as numdenser
FROM (SELECT zc.*, totpop / landsqmi as popdensity
FROM ZipCensus zc
WHERE zc.landsqmi > 0
) zc JOIN
(SELECT zc.stab, AVG(totpop / landsqmi) as avgpopdensity
FROM ZipCensus zc
WHERE zc.landsqmi > 0
GROUP BY zc.stab) zcsum
ON zc.stab = zcsum.stab
GROUP BY zc.stab

An interesting observation is that the population density of each state is not the
same as the average of the population densities for all the zip codes in the state.
That is, the preceding question is different from: How many zip codes in each
state have a population density greater than the state s population density? The
state s population density would be calculated in zcsum as:
SUM(totpop) / SUM(landsqmi) as statepopdensity
There is a relationship between these two densities. The zip code average gives
each zip code a weight of 1, no matter how big in area or population. The state
average is the weighted average of the zip codes by the land area of the zip codes.
The proportion of zip codes that are denser than the average zip code varies from
about 4% of the zip codes in North Dakota to about 35% in Florida. Never are half
the zip codes denser than the average, although this is theoretically possible. The
density where half the zip codes are denser and half less dense is the median
density rather than the average or average of averages. Averages, average of
averages, and medians are different from each other