841.TXT
     =====================================================================
     Multi-Level Grouping with SELECT
     =====================================================================
     PRODUCT:  R:BASE                   VERSION:  5.5 or Higher
     =====================================================================
     CATALOG:  Programming in R:BASE    AREA   :  SELECT
     =====================================================================
 
     Database queries don't always start off by sounding difficult. Many
     times the query sounds fairly simple, even trivial, and then you find
     yourself spending hours, or even days, trying to come up with that
     special SELECT command to return the desired data. Not every database
     query can be solved using a single SELECT command. Sometimes the
     solution can be found easier and faster by using a view.
 
     One database problem that sounds simple but isn't, is when you need
     to do multiple level grouping of data. Often this type of problem
     involves counting data or putting some type of a qualifier on a group
     of data. For example, in a mailing application, you may need to look
     at the number of pieces to be mailed for a particular zipcode and
     route combination; if there are more than ten pieces for a group then
     special handling is required. Maybe you want to know which customers
     have placed orders from at least three different sales reps. Or, a
     utility company charges different rates depending on the usage of
     electricity. You need to generate a list of customers who have been
     charged at least two different rates in a month. None of these
     problems sound difficult when stated like this, but when you try to
     build a query, you find out that the problem is more difficult than
     it sounds. The difficulty with building the query is that what really
     needs to be done is to use a GROUP BY clause on the results of a GROUP
     BY clause. This is easily solved by using views.
 
     Let's look more closely at the power utility application. We have a
     table, usage, with account and usage information, and a table of rate
     information, rates. Here's some sample data:
 
     The usage table:
 
      Account# ServiceDate HoursUsed
      -------- ----------- ----------
      AA-0001  08/09/96         22
      AA-0002  08/10/96         13
      AA-0001  08/11/96         16
      AA-0002  08/11/96         11
      AA-0001  08/12/96         18
      AA-0003  08/10/96          9
      AA-0003  08/09/96          7
 
     The rates table:
 
      MinHours   MaxHours   RatePerHour
      ---------- ---------- ---------------
               0         10          $15.00
              10         20          $14.50
              20         50          $14.00
              50        100          $13.00
             100    1000000          $12.00
 
     First, we need to join the two tables and add the rate per hour
     information to the rest of the account and usage information. We can
     use a view to do this. The view then has all the relevant
     information. This is what our view definition would be:
 
     CREATE VIEW rates_view AS +
     SELECT Account#, ServiceDate, HoursUsed, RatePerHour +
     FROM usage, rates +
     WHERE HoursUsed BETWEEN MinHours AND MaxHours
 
     The data contained in the view now includes the rate each account
     is charged per day depending on usage. Because this is a small
     amount of data, we can look at the data and see that only account
     #AA-0001 has been charged more than one rate in August.
 
      Account# ServiceDate HoursUsed  RatePerHour
      -------- ----------- ---------- ---------------
      AA-0001  08/09/96         22          $14.00
      AA-0001  08/11/96         16          $14.50
      AA-0001  08/12/96         18          $14.50
      AA-0002  08/10/96         13          $14.50
      AA-0002  08/11/96         11          $14.50
      AA-0003  08/09/96          7          $15.00
      AA-0003  08/10/96          9          $15.00
 
     But assuming we have hundreds, or even thousands, of rows, what
     command will give us the information we need? Using the GROUP BY
     option of the SELECT command, we can group the data in the view by
     account number and rate. 
 
     SELECT Account#, RatePerHour FROM rates_view +
     GROUP BY Account#, RatePerHour
 
     Account# RatePerHour
     -------- ---------------
     AA-0001           $14.00
     AA-0001           $14.50
     AA-0002           $14.50
     AA-0003           $15.00
 
     This shows us the accounts that are charged more than one rate, but
     the results include all accounts. We only want to display the
     accounts charged more than one rate. Up to this point the query is
     fairly straightforward. Most users with some knowledge of SQL would
     be able to generate this list. The difficulty comes when trying to
     restrict the list to just accounts charged more than one rate. Any
     number of HAVING clauses and sub-SELECTS can be added to the above
     SELECT command, but turning the SELECT command into a view makes the
     problem simple. Use this command to generate the view:
 
     CREATE VIEW Rates_View2 AS +
     SELECT Account#, RatePerHour FROM rates_view +
     GROUP BY Account#, RatePerHour
 
    One feature of the GROUP BY option of SELECT is the HAVING clause. The
    HAVING clause is a way to specify conditions on groups, much like the
    WHERE clause specifies conditions on rows. The HAVING clause counts
    the rows in a group, for example, and only returns data where there
    is more than one row in the group. We can't use a HAVING clause on the
    original SELECT command because each group is for a single rate;
    remember the data is grouped by both account number and rate. What we
    can do is take the results of this SELECT (now a view) and group the
    data just by account number.
 
     SELECT Account#, COUNT(*) FROM Rates_View2 +
     GROUP BY Account#
 
      Account# COUNT (*)
      -------- ----------
      AA-0001           2
      AA-0002           1
      AA-0003           1
 
     Note that only the account charged more than one rate, AA-0001, has
     more than one row in the view data and has a count greater than one.
     Adding a HAVING clause onto this SELECT returns just the data we
     need.
 
     SELECT Account# FROM Rates_View2 +
     GROUP BY Account# HAVING COUNT(*) > 1
 
      account#
      --------
      AA-0001
 
     Only the one account that was charged two different rates is
     returned. By using sequences of views we are able to get the
     information we need from the database quickly and easily. Once we
     can retrieve the account numbers, the SELECT command can be used in
     a sub-SELECT with other R:BASE commands to edit data or print
     reports.