820.TXT
     =====================================================================
     Optimizing Application Code
     =====================================================================
     PRODUCT:  R:BASE                   VERSION:  4.5+ and above
     =====================================================================
     CATALOG:  General Inforation       AREA   :  Memory Management
     =====================================================================
 
     Because the most obvious or logical approach to a programming task is 
     not necessarily the fastest, optimizing application code is a 
     necessary skill. This article gives some specific examples of 
     different ways to accomplish the same tasks and shows the speed 
     differences between them. It also provides some general techniques for 
     improving the performance of your applications.
 
     How can you tell what is the fastest way to accomplish a programming 
     task? The only way to tell for sure is to try each command or set of 
     commands, time it, and compare the times. Often a sequence of commands 
     will be fast on one database but slow on another because the structure 
     (tables and columns) and data of the databases are different. Speed 
     can vary from database to database and from machine to machine.
 
     Times given below for different programming techniques are for 
     comparison purposes only. Actual times that you will see using the 
     same methods will vary on the basis of your hardware and your 
     database.
 
     Timing Code
     Use the following commands to find the execution time for a section of 
     application code: 
 
     SET TIME FORMAT HH:MM:SS.sss
     SET VAR vstart = .#time
     -- commands to time go here
     SET VAR vend = .#time
     SET VAR vdiff = (.vend - .vstart)
     SET VAR vtime = (RTIME(0,0,0,.vdiff))
 
     The result, stored in the variable vtime, is in an easy to read hours, 
     minutes, and seconds format.
 
     When checking the performance of a SELECT or a PRINT command, set the 
     output to a temporary file and set LINES to 0. This tests the time to 
     select all the data for the command. For example:
 
     SET LINES 0
     SET TIME FORMAT HH:MM:SS.sss
     SET VAR vstart = .#time
     OUTPUT temp.dat
       PRINT invoices
     OUTPUT SCREEN
     SET VAR vend = .#time
     SET VAR vdiff = (.vend - .vstart)
     SET VAR vtime = (RTIME(0,0,0,.vdiff))
     SET LINES 20
 
     To test how long it takes to retrieve a single row of data, use the 
     WHERE clause, WHERE LIMIT = 1. The LIMIT keyword can be combined with 
     other WHERE clause conditions to limit the number of rows retrieved. 
     For example:
 
     SELECT * FROM transmaster WHERE transdate +
       BETWEEN 1/1/95 AND 6/31/95 AND LIMIT=1
 
     This command retrieves the first row that meets the WHERE clause 
     condition.
 
     Use the Most Current Version of R:BASE Microrim is continually 
     optimizing R:BASE and each new release is faster than the previous 
     one. In addition to performance enhancements included in a new 
     release, there are feature enhancements that enable you to modify 
     and reduce your program code for further performance increases.
 
     An application written in R:BASE 3.1, for example, used DECLARE 
     CURSOR to update a column in one table with values from another 
     table. This process took 0:00:07.14. 
 
     DECLARE C1 CURSOR FOR SELECT invid,sum(tprice) FROM transx +
       GROUP BY invid
     OPEN c1
     FETCH c1 INTO vinvid ind1,vprice ind2
     WHILE sqlcode <> 100 THEN
       UPDATE invoice SET invtotal = .vprice WHERE invid=.vinvid
       FETCH c1 INTO vinvid ind1,vprice ind2
     ENDW
 
     R:BASE 4.0 added the multi-table option to the UPDATE command. 
     Instead of a DECLARE CURSOR, you now can do the update using a view 
     and one UPDATE command. By keeping abreast of feature enhancements in 
     new versions and updating application code, the process time was 
     reduced almost in half to 0:00:04.29.
 
     CREATE VIEW v_trans (invid, amount) AS SELECT invid, SUM(tprice) +
       FROM transx GROUP BY invid
     UPDATE invoice SET invtotal = amount FROM invoice ,v_trans T2 +
       WHERE invoice .invid = T2.invid
 
     Think Globally Instead of Row-by-Row
     One of the best ways to achieve significant speed improvements is to 
     use a global command rather than a row-by- row command. Instead of 
     doing row-by-row processing using a DECLARE CURSOR, for example, try 
     using a single UPDATE or INSERT command, which takes advantage of the 
     SQL select capabilities of R:BASE.
 
     For example, suppose you want to create a table to hold information 
     about how many of each product a customer has purchased. The most 
     logical approach might be to set up a cursor to go through the detail 
     table and count the rows for each customer, product combination and 
     then insert that data into the new table. On a table with 3500 rows, 
     processing all the rows took 1:15:00 (one hour, 15 minutes). By 
     replacing the DECLARE CURSOR loop with a single insert command, the 
     time required to process all the rows was cut to 0:09:00 (nine 
     minutes).
 
     Compare the two pieces of code:
 
     1. The DECLARE CURSOR code. Commands are repeated for each customer, 
     product combination.
 
     DECLARE c1 CURSOR FOR SELECT cust#, partno +
        FROM cproddet GROUP BY cust# ,partno
     OPEN c1
     FETCH c1 INTO vcust ind1, vpartno ind2
     WHILE SQLCODE <> 100 THEN
       SELECT COUNT(*) INTO vcount FROM cproddet +
         WHERE cust# = .vcust AND partno = .vpartno
      INSERT INTO custprod (cust#, partno, copies) +
         VALUES (.vcust, .vpartno, .vcount)
      FETCH c1 INTO vcust ind1, vpartno ind2
     ENDW
     DROP CURSOR c1
 
     2. The SQL code. A single command is used instead of a series of 
     repeated commands. One command is almost always faster than repeating 
     a group of commands.
 
     INSERT INTO custprod (cust#, partno, copies) +
       SELECT cust#, partno, count(*) +
       FROM cproddet GROUP BY cust# ,partno
 
     Reduce the Number of Expressions in Reports
     Because all report expressions are processed for each row of data read 
     from the driving table or view of the report, reducing the number of 
     expressions increases the printing speed of a report. An alternative 
     to reducing the number of expressions is to use the section evaluation 
     feature added in R:BASE 4.5 Plus!. This feature allows you to indicate 
     the section, F1, for example, at which an expression is evaluated. The 
     expression is then not evaluated at every row, but only when that 
     section is processed. Either method can reduce printing time almost by 
     half.
 
     If you have many lookups in a report, basing the report on a view 
     instead of on a table will usually be faster. A view combines all the 
     data just once; a lookup needs to find the data for each row in the 
     driving table, even if the lookup is printing only on a header line. 
     A sample report, based on a table with 8000 rows and containing 14 
     lookup expressions (24 total expressions), took 0:20:45.54 to print.
 
     As an alternative, create a four-table view and base the report on the 
     view. Remove the lookup expressions from the report, leaving only 10 
     expressions to be processed each row. Printing the report by using a 
     view instead of lookup expressions takes only 0:15:16.98, a definite 
     performance improvement.
 
     Using the report section evaluation feature introduced in 4.5 Plus! is 
     even better. The original report with 14 lookups prints in only 
     0:12:57.41 when the lookup expressions are specified to evaluate only 
     at the section (header or footer) where they are located. The report 
     based on the view does not have a significant performance increase 
     using the section evaluation as most of the expressions are summing 
     data and must be evaluated at the detail section. The performance 
     increase you see depends on your report design and data.
 
     R:BASE does not require a global variable to be equated to a report 
     variable; you can locate global variables on the report or reference 
     them directly in an expression. For example, you don't need to 
     define RPTDATE = . #DATE and then locate RPTDATE; just locate #DATE 
     directly. Set constant variables outside the report. They are 
     evaluated only once instead of for every row.
 
     Abbreviate Commands to Four Characters
     R:BASE parses command lines into 4-byte tokens_the fewer tokens to 
     read, the faster the command executes. Abbreviating commands to 
     three characters does not provide any additional performance.
 
     Combine Commands Whenever Possible
     Many R:BASE commands allow access to more than one column or variable 
     and don't require using separate commands for each. This reduces the 
     number of times R:BASE must parse and read a command. The fewer 
     commands to read, the faster R:BASE executes. The most frequently used 
     commands that fit into this category are SET VAR, UPDATE, SELECT, 
     COMPUTE, and WRITE.
 
     For example, assign data types to variables and set the variables to 
     values in one command, like this: 
 
     SET VAR v1 TEXT = 'abcd', v2 INTEGER = 100, +
             v3 DATE = .#date
 
     Make Cursors Faster
     Sometimes a task can be accomplished only by stepping through each row 
     in a table. If you need to use a DECLARE CURSOR routine, make it as 
     fast as possible by using the following tips:
 
       Include all columns and computations in the DECLARE CURSOR statement 
       so you can FETCH the values once instead of doing SET VAR commands 
       repeatedly in the WHILE loop. Take advantage of SQL SELECT when 
       doing your DECLARE CURSOR. This can reduce the number of times the 
       commands in the WHILE loop must be repeated, thus improving 
       performance. Compare the following two code excerpts:
 
     1.This example using the SUM function directly in the DECLARE CURSOR 
     statement took about three minutes on an 8,000-row transaction table:
 
     DECLARE C1 CURSOR FOR SELECT invid, SUM(tprice) +
       FROM transx GROUP BY invid
     OPEN c1
     FETCH c1 INTO vinvid ind1, vprice ind2
     WHILE sqlcode <> 100 THEN
       UPDATE invoice SET invtotal = .vprice +
         WHERE invid = .vinvid
       FETCH c1 INTO vinvid ind1, vprice ind2
     ENDW
 
     2.By contrast, doing the SUM within the WHILE loop for each row took 
     about 12 minutes, or four times as long. 
 
     SET VAR vtotprice CURR = 0
     SET VAR voldinv = invid IN transx WHERE LIMIT=1
     DECLARE C1 CURSOR FOR SELECT invid, tprice +
       FROM transx ORDER BY invid
     OPEN c1
     FETCH c1 INTO vinvid ind1, vprice ind2
     WHILE sqlcode <> 100 THEN
       SET VAR vtotprice = (.vtotprice + .vprice)
       IF vinvid <> voldinv THEN
         UPDATE invoice SET invtotal = .vtotprice
           WHERE invid = .voldinv
         SET VAR vtotprice = 0
       ENDIF
       SET VAR voldinv = .vinvid
       FETCH c1 INTO vinvid ind1, vprice ind2
     ENDW
 
       Fetch the index columns into variables and then use WHERE 
       colname=.varname instead of WHERE CURRENT OF cursor. Using an 
       indexed where clause is slightly faster.
 
       When nesting cursors, do all the DECLAREs at the top of the file. 
       Nest the OPEN and FETCH commands, not the DECLARE CURSOR.
 
       When nesting cursors, use the RESET option on the OPEN cursor 
       command. R:BASE does not need to reprocess the entire query when 
       you use RESET, instead, R:BASE just re-evaluates the WHERE clause.
 
     Try Both Sub-Selects and Multi-Table Selects
     In R:BASE 4.5 and higher, a multi-table select uses indexes to join 
     the tables, a correlated sub-select uses indexes to restrict the list 
     of values to be compared within the sub-select, and a plain 
     sub-select uses a temporary index created by R:BASE. Because the 
     index is already created, a multi-table select or a correlated 
     sub-select is slightly faster than a plain sub-select. The amount of 
     data selected, number of matches, and number of unique matches all 
     affect the performance.
 
     To illustrate, compare the results obtained by using the different 
     techniques on two tables, each with 1,000 unique rows (each row in 
     table1 has only one match in table2).
 
     1.A multi-table select, shown below, took 0:00:01.54.
 
     SELECT * FROM table1,table2 +
       WHERE table1.linkcol=table2.linkcol
 
     2.A correlated sub-select, which looked at every row in table1 but 
     needed to look at only one row in table2 for each row in table1, took 
     0:00:01.37. The code is shown below:
 
     SELECT collist FROM table1 WHERE linkcol IN +
      (SELECT linkcol FROM table2 +
       WHERE table2.linkcol=table1.linkcol)
 
     3.A plain sub-select, using a temporary index built by R:BASE when 
     the command executed took slightly longer, 0:00:2.31. This example 
     code, is shown below:
 
     SELECT collist FROM table1 WHERE linkcol IN +
     (SELECT linkcol FROM table2)
 
     SELECT is a command in which your data can affect the performance. 
     The results you see and the method you find that works the best 
     depends on the number of rows and distribution of data in the tables 
     you are combining. Also, the different commands retrieve different 
     columns of data for display.
 
     Use the MICRORIM variables
     In R:BASE 4.5, optimization variables were added to allow application 
     developers more control over the R:BASE environment. The ones most 
     commonly used allow you to increase the default buffers for reading 
     in data and indexes. The variables MICRORIM_F2MAXBLK and 
     MICRORIM_TMPMAXBLK increase the number of buffers for reading in 
     data. The variables MICRORIM_F3MAXBLK and MICRORIM_BTMAXBLK increase 
     the number of buffers for reading in indexes.
 
     To use the variables, set them to the desired number of buffers before 
     connecting to a database. Usually you set all four variables. For 
     example:
 
     SET VAR MICRORIM_F2MAXBLK = 64, +
             MICRORIM_TMPMAXBLK = 64, +
             MICRORIM_F3MAXBLK = 256, +
             MICRORIM_F2MAXBLK = 256
     CONNECT concomp
 
     In the above example of joining two 1,000 row tables, the performance 
     of the plain sub-select was significantly improved by setting the 
     MICRORIM variables. The multi-table select and the correlated 
     sub-select are already fairly well optimized with their use of 
     indexes and the performance improvement is not as great. Experiment 
     with using these variables to see if they will increase performance of 
     your application.
 
     For more information about the MICRORIM variables refer to the R:BASE 
     4.5 Plus! Startup & New Features Guide, Inside R:BASE, or R:BASE 5.1 
     on-line Help (Reference Topics, Optimizing R:BASE 4.5 and Higher) 
     Databases, Cache Memory Buffers).
 
     Practice Smart Indexing
     In a WHERE clause, R:BASE uses only one index (except when joining 
     tables). If more than one indexed column is referenced in a WHERE 
     clause, R:BASE looks for what it considers the "best" index_the one 
     that places the greatest restriction on the number of rows returned. 
     An index on a column that contains unique values is always better 
     than an index on a column that can have duplicate values. R:BASE also 
     looks to see what operator is being used. R:BASE first looks for an 
     indexed column using =, then checks for IS NULL, and finally looks at 
     BETWEEN.
 
     The following example uses the index on indexcol2 because it uses = 
     and R:BASE assumes that comparison restricts the query to fewer rows 
     than the BETWEEN:
 
     WHERE indexcol1 BETWEEN .var1 AND .var2 +
       AND indexcol2 = .var3
 
     The next example uses the index on indexcol2 because both conditions 
     use = and indexcol2 is an index with unique data and indexcol1 is an 
     index on a column with duplicate data. The unique index is the "best" 
     index.
 
     WHERE indexcol1 =.var1 AND indexcol2 = .var2
 
     This example won't use indexes because the conditions are joined with 
     OR. R:BASE does not use indexes to retrieve data if conditions are 
     joined with OR, or if the indexed column is being compared to an 
     expression.
 
     WHERE indexcol1 =.var1 OR indexcol2 = .var2
 
     You can force R:BASE to use a particular index by placing the 
     comparison value in parentheses, making it an expression.
 
     WHERE indexcol1 =.var1 AND indexcol2 = (.var2)
 
     By knowing which indexed column R:BASE will use in a WHERE clause, you 
     can structure your conditions so that R:BASE uses the most unique 
     (and thus most helpful) index. Knowing your data helps you to modify 
     commands and gain speed. 
 
     Use Multi-Column Indexes
     If you routinely use the same set of columns in the WHERE clause, 
     consider defining them as a multi-column index rather than as separate 
     indexes. A multi-column index is faster when a set of columns is used 
     in a sort clause or in a WHERE clause. While the columns in a 
     multi-column index can be used separately, the index is most 
     effective when all the columns in the index are used. For example, if 
     you always retrieve records using a job# and a po#, define a 
     multi-column index for the two columns, rather than two, single-column 
     indexes.
 
     Drop Indexes to Load Data
     If you are loading or inserting many rows of data at one time, it can 
     be faster to drop the indexes, load the data, and then rebuild the 
     indexes. It is much faster for R:BASE to build the indexes in one 
     chunk rather than having them updated with each row as it is loaded.
 
     Be Creative
     R:BASE almost always offers two or more ways of doing something. Look 
     for an alternative method for accomplishing a task_it just might be 
     considerably faster.
 
     Review Your Code
     Did you write that piece of code three months ago, six months ago, or 
     perhaps longer ago still? As you learn more about R:BASE, about your 
     data, and about how people are using the database and your 
     application, you'll easily recognize areas in your code that can be 
     modified for performance improvements. You'll also want to incorporate
     new techniques and new features from the current version of R:BASE.
 
     Have someone else review your code. Someone else might see areas for 
     improvement and suggest other techniques that you miss because you are 
     so familiar with your work. It is true that two heads are better than 
     one.
 
     Other Sources
     These are just some of the techniques you can use to optimize your 
     code to achieve the maximum speed potential of R:BASE. Some of these 
     techniques provide dramatic improvements in speed; others provide 
     smaller and less noticeable speed improvements. Taken together, these
     techniques can significantly increase the overall speed of your 
     application. Remember that the times given here are relative_the 
     actual times you see vary on the basis of your database (structure 
     and data) and your hardware.
 
     You can find more information about optimization techniques in Inside 
     R:BASE (available from the Microrim Sales Department, 1-800-628-6990), 
     in various issues of the R:BASE Exchange, in the R:BASE Reference 
     Manual, and in on-line Help.