841.TXT
=====================================================================
Multi-Level Grouping with SELECT
=====================================================================
PRODUCT: R:BASE VERSION: 5.5 or Higher
=====================================================================
CATALOG: Programming in R:BASE AREA : SELECT
=====================================================================
Database queries don't always start off by sounding difficult. Many
times the query sounds fairly simple, even trivial, and then you find
yourself spending hours, or even days, trying to come up with that
special SELECT command to return the desired data. Not every database
query can be solved using a single SELECT command. Sometimes the
solution can be found easier and faster by using a view.
One database problem that sounds simple but isn't, is when you need
to do multiple level grouping of data. Often this type of problem
involves counting data or putting some type of a qualifier on a group
of data. For example, in a mailing application, you may need to look
at the number of pieces to be mailed for a particular zipcode and
route combination; if there are more than ten pieces for a group then
special handling is required. Maybe you want to know which customers
have placed orders from at least three different sales reps. Or, a
utility company charges different rates depending on the usage of
electricity. You need to generate a list of customers who have been
charged at least two different rates in a month. None of these
problems sound difficult when stated like this, but when you try to
build a query, you find out that the problem is more difficult than
it sounds. The difficulty with building the query is that what really
needs to be done is to use a GROUP BY clause on the results of a GROUP
BY clause. This is easily solved by using views.
Let's look more closely at the power utility application. We have a
table, usage, with account and usage information, and a table of rate
information, rates. Here's some sample data:
The usage table:
Account# ServiceDate HoursUsed
-------- ----------- ----------
AA-0001 08/09/96 22
AA-0002 08/10/96 13
AA-0001 08/11/96 16
AA-0002 08/11/96 11
AA-0001 08/12/96 18
AA-0003 08/10/96 9
AA-0003 08/09/96 7
The rates table:
MinHours MaxHours RatePerHour
---------- ---------- ---------------
0 10 $15.00
10 20 $14.50
20 50 $14.00
50 100 $13.00
100 1000000 $12.00
First, we need to join the two tables and add the rate per hour
information to the rest of the account and usage information. We can
use a view to do this. The view then has all the relevant
information. This is what our view definition would be:
CREATE VIEW rates_view AS +
SELECT Account#, ServiceDate, HoursUsed, RatePerHour +
FROM usage, rates +
WHERE HoursUsed BETWEEN MinHours AND MaxHours
The data contained in the view now includes the rate each account
is charged per day depending on usage. Because this is a small
amount of data, we can look at the data and see that only account
#AA-0001 has been charged more than one rate in August.
Account# ServiceDate HoursUsed RatePerHour
-------- ----------- ---------- ---------------
AA-0001 08/09/96 22 $14.00
AA-0001 08/11/96 16 $14.50
AA-0001 08/12/96 18 $14.50
AA-0002 08/10/96 13 $14.50
AA-0002 08/11/96 11 $14.50
AA-0003 08/09/96 7 $15.00
AA-0003 08/10/96 9 $15.00
But assuming we have hundreds, or even thousands, of rows, what
command will give us the information we need? Using the GROUP BY
option of the SELECT command, we can group the data in the view by
account number and rate.
SELECT Account#, RatePerHour FROM rates_view +
GROUP BY Account#, RatePerHour
Account# RatePerHour
-------- ---------------
AA-0001 $14.00
AA-0001 $14.50
AA-0002 $14.50
AA-0003 $15.00
This shows us the accounts that are charged more than one rate, but
the results include all accounts. We only want to display the
accounts charged more than one rate. Up to this point the query is
fairly straightforward. Most users with some knowledge of SQL would
be able to generate this list. The difficulty comes when trying to
restrict the list to just accounts charged more than one rate. Any
number of HAVING clauses and sub-SELECTS can be added to the above
SELECT command, but turning the SELECT command into a view makes the
problem simple. Use this command to generate the view:
CREATE VIEW Rates_View2 AS +
SELECT Account#, RatePerHour FROM rates_view +
GROUP BY Account#, RatePerHour
One feature of the GROUP BY option of SELECT is the HAVING clause. The
HAVING clause is a way to specify conditions on groups, much like the
WHERE clause specifies conditions on rows. The HAVING clause counts
the rows in a group, for example, and only returns data where there
is more than one row in the group. We can't use a HAVING clause on the
original SELECT command because each group is for a single rate;
remember the data is grouped by both account number and rate. What we
can do is take the results of this SELECT (now a view) and group the
data just by account number.
SELECT Account#, COUNT(*) FROM Rates_View2 +
GROUP BY Account#
Account# COUNT (*)
-------- ----------
AA-0001 2
AA-0002 1
AA-0003 1
Note that only the account charged more than one rate, AA-0001, has
more than one row in the view data and has a count greater than one.
Adding a HAVING clause onto this SELECT returns just the data we
need.
SELECT Account# FROM Rates_View2 +
GROUP BY Account# HAVING COUNT(*) > 1
account#
--------
AA-0001
Only the one account that was charged two different rates is
returned. By using sequences of views we are able to get the
information we need from the database quickly and easily. Once we
can retrieve the account numbers, the SELECT command can be used in
a sub-SELECT with other R:BASE commands to edit data or print
reports.