DOCUMENT #680
     =======================================================================
     BASIC STATISTICS:  COUNT, AVERAGE, SUM, MINIMUM, MAXIMUM
     =======================================================================
     PRODUCT:  R:BASE                   VERSION :  3.1 or Higher
     =======================================================================
     CATALOG:  Programming In R:BASE    AREA    :  Other
     =======================================================================
 
 
     R:BASE contains over 70 SuperMath functions for computations in addition 
     to the simple addition, subtraction, multiplication and division oper-
     ations. However, the SuperMath functions don't apply when doing a count, 
     average, sum, minimum, or maximum on a group of rows from a table.
 
     The SuperMath functions LAVG, LMIN, and LMAX work across a row, i.e. 
     they compute the average, minimum and maximum column values for a 
     group of columns from a given row, they do not return the average, 
     minimum or maximum value for a group of rows from a given column. Use 
     the AVG, MIN or MAX function with the SELECT or COMPUTE command to 
     return the value for a column. The LAVG, LMIN and LMAX SuperMath 
     functions are not interchangeable with the AVG, MIN and MAX functions. 
     The SELECT and COMPUTE commands also have COUNT and SUM functions.
 
 
     In addition to the functions used with SELECT and COMPUTE, count, 
     average, sum, minimum, and maximum are calculated using a row-by-row 
     processing method and the R:BASE programming language. A program does 
     row-by-row processing using DECLARE CURSOR or performs the calculations 
     with one command using SELECT or COMPUTE. Note that COMPUTE does provide 
     additional calculations not available with SELECT: standard deviation 
     and variance.
 
     A report is processed row-by-row; expressions are evaluated with each 
     row that is read from the driving table or view. A report uses basically 
     the same procedure to calculate the count, average, sum, minimum, or m
     aximum as in a program. Functions in the SELECT command can also be used 
     in reports, but are evaluated on every row, not just once. This can 
     slow down a report. The COMPUTE command functions are not available in 
     reports.
 
     
     Before you start
     ================
     Before you start, you need to know a little bit about your data. Does 
     the column you're doing computations on contain NULL values? A NULL 
     value is how R:BASE represents no data entered in a field. The SELECT 
     and COMPUTE functions do not include NULL values in the computation, 
     but with row-by-row processing methods you need to consider NULLs and 
     explicitly exclude them, if necessary, to return a correct result.
 
     Also, when doing row-by-row processing, you must SET ZERO ON. The 
     setting of ZERO tells R:BASE how to treat NULLs in calculations. If 
     ZERO is OFF, then calculations involving NULLs return NULL. When ZERO 
     is ON, NULLs are treated as if they were 0. SET ZERO ON to ensure totals 
     accumulate correctly and comparisons evaluate properly.
 
     The count, average, sum, minimum, and maximum described below work on 
     INTEGER, REAL, DOUBLE or CURRENCY datatype columns. They can be done on 
     all the rows in a table or on a group of rows by adding an appropriate 
     WHERE clause to the command. In a report, rows are grouped by using 
     breakpoints. 
 
 
 
     COUNT
     =====
     
     Using one command
     -----------------
     SELECT COUNT(*) FROM tblname
     - counts all the rows in a table, including NULL values
     - same as COMPUTE ROWS FROM tblname
     - can be used in a report, is evaluated for every row
 
     
     SELECT COUNT(colname) FROM tblname
     - counts all the rows where the specified column is not NULL
     - store the result in a variable
     SELECT COUNT(colname) INTO varname FROM tblname
     - same as COMPUTE COUNT colname FROM tblname
     - can be used in a report, is evaluated for every row; format is similar 
       to a lookup expression
     varname INTEGER = COUNT(colname) IN tblname WHERE linkcol = linkcol
 
 
     Using row-by-row processing
     ---------------------------
     SET VAR vcount INTEGER = (.vcount + 1)
     - often called a counter variable
     - used in a program in WHILE or other looping structure
     - counts the number of rows processed
     - increments itself by one as each row is read
     - counts all rows, including NULLs
     - must use SET ZERO ON to accumulate the count
     - in a report is used without the SET VAR
     vcount INTEGER = (.vcount + 1)
 
     
     Conditional count
     -----------------
     SET VAR vcount INTEGER = (IFEQ(name,value,(.vcount+1),.vcount))
     - compares name and value, if they are equal the count is incremented 
       by one; name and value can be columns (reports only), variables or 
       constants
     - NULL values are considered equal
     - function can be nested if name and value are non-TEXT datatypes
     - use in a report without the SET VAR 
     vcount INTEGER = (IFEQ(name,value,(.vcount+1),.vcount))
 
 
     SUM
     ===
 
     Using one command
     -----------------
     SELECT SUM(colname) FROM tblname
     - calculates the sum of all non-NULL values in the specified column
     - store the result in a variable
     SELECT SUM(colname) INTO varname FROM tblname
     - variable is automatically datatyped based on the column
     - same as COMPUTE SUM colname FROM tblname
     - can be used in a report, is evaluated for every row, format is 
       similar to a lookup expression
       varname = SUM(colname) IN tblname WHERE linkcol = linkcol
 
     
     Using row-by-row processing
     ---------------------------
     SET VAR vtotal  = (.vtotal + value)
     - must initialize the variables
     - often called an accumulator variable or a running total
     - value can be a column (reports only), variable or constant
     - adds value to the variable vtotal on a row-by-row basis
     - must use SET ZERO ON to accumulate the total
     - use in a report without the SET VAR
     vtotal  = (.vtotal + value)
 
     Conditional sum
     ---------------
     SET VAR vtotal  = (IFEQ(name, value1,(.vtotal + value2),
     .vtotal))
     - compares name and value1, if they are equal, adds value2 to the 
       variable vtotal
     - name, value1 and value2 can be columns (reports only), variables or 
       constants
     - function can be nested if name and value1 are non-TEXT datatypes
     - use in a report without the SET VAR 
     vtotal  = (IFEQ(name, value1,(.vtotal + value2),.vtotal))
 
 
     AVERAGE
     =======
 
     Using one command
     -----------------
     SELECT AVG(colname) FROM tblname 
     - calculates the average of all non-NULL values in the specified column
     - store the result in a variable
     SELECT AVG(colname) INTO varname FROM tblname
     - variable is automatically datatyped based on the column
     - same as COMPUTE AVG colname FROM tblname
     - can be used in a report, is evaluated for every row, format is similar 
       to a lookup expression
     varname = AVG(colname) IN tblname WHERE linkcol = linkcol
 
     
     Using row-by-row processing
     ---------------------------
     An average is a sum divided by a count. First count the rows, then sum 
     the rows, then divide the sum by the count. The sum and the count can 
     be the result of SELECT functions, of row-by-row processing, or of a 
     conditional operation. If there are NULLs in the data, be sure to 
     exclude them from the count. If you don't, the average will be wrong. 
     They will be counted and included in the average as zero values.
 
     SELECT COUNT(*) INTO var_count FROM tblname
     SELECT SUM(colname) INTO var_sum FROM tblname
     SET VAR var_avg = (.var_sum/.var_count)
 
 
     MINIMUM
     =======
 
     Using one command
     -----------------
     SELECT MIN(colname) FROM tblname
     - calculates the minimum value of all non-NULL values in the specified 
       column
     - store the result in a variable
     SELECT MIN(colname) INTO varname FROM tblname
     - variable is automatically datatyped based on the column
     - same as COMPUTE MIN colname FROM tblname
     - can be used in a report, is evaluated for every row, format is similar 
       to a lookup expression
     variable = MIN(colname) IN tblname WHERE linkcol = linkcol
 
     Using row-by-row processing
     ---------------------------
     In a program, you get the first row from the table, then compare that 
     value with each succeeding row. If the new value is less, it becomes 
     the minimum. For example,
 
       SET VAR vmin = colname IN tblname WHERE LIMIT=1
       DECLARE c1 CURSOR FOR SELECT colname FROM tblname
       OPEN c1
       FETCH c1 INTO vcol ind1
       WHILE SQLCODE <> 100 THEN
         SET VAR vmin = (IFLT(.vcol,.vmin,.vcol,.vmin))
         FETCH c1 INTO vcol ind1
       ENDW
 
     In a report, the process is similar, but because you can only set 
     variables, the logic is a little bit different. Use a counter variable 
     to determine the initial value; if the counter is 1, it is the first 
     row. Then, compare each row, replacing the minimum if the new value is 
     smaller. If computing the minimum for each group, reset the variable 
     vcount at the breakpoint.
 
       vcount INTEGER = (.vcount + 1)
       vmin  = (IFEQ(.vcount,1,colname,(IFLT(colname,.vmin,
       colname,.vmin)) ))
 
 
     MAXIMUM
     =======
 
     SELECT MAX(colname) FROM tblname
     - calculates the maximum of all non-NULL values in the specified column
     - store the result in a variable
     SELECT MAX(colname) INTO varname FROM tblname
     - variable is automatically datatyped based on the column
     - same as COMPUTE MAX colname FROM tblname
     - can be used in a report, is evaluated every row, format is similar to a 
lookup expression
     varname = MAX(colname) IN tblname WHERE linkcol = linkcol
 
     Using row-by-row processing
     ---------------------------
     In a program you get the first value from the table, then compare that 
     with each succeeding row. If the new value is greater, it becomes the 
     maximum For example,
 
       SET VAR vmax = colname IN tblname WHERE LIMIT=1
       DECLARE c1 CURSOR FOR SELECT colname FROM tblname
       OPEN c1
       FETCH c1 INTO vcol ind1
       WHILE SQLCODE <> 100 THEN
         SET VAR vmax = (IFGT(.vcol,.vmax,.vcol,.vmax))
         FETCH c1 INTO vcol ind1
       ENDW
 
     In a report, the process is similar, but because you can only set 
     variables, the logic is a little bit different. Use a counter variable 
     to determine the initial value, if the counter is 1, it is the first 
     row. Then compare each row, replacing the maximum if the new value is 
     greater. If computing the maximum for each group, reset the variable 
     vcount at the breakpoint.
 
       vcount INTEGER = (.vcount + 1)
       vmax  = (IFEQ(.vcount,1,colname,(IFGT(colname,.vmax,
       colname,.vmax)) ))