814.TXT
=====================================================================
Selecting Random Data
=====================================================================
PRODUCT: R:BASE VERSION: ALL
=====================================================================
CATALOG: General Information AREA : Data Manipulation
=====================================================================
The easiest way to select a group of random records from a table is to
use the MOD function. When records are selected using the MOD function
in a WHERE clause, you select a truly random set of data from
throughout the data table. Both old and new records are automatically
included and you can easily generate different numbers of random
records.
The MOD function calculates a remainder. To use the MOD function for
selecting random data, you need a unique INTEGER column in the table,
such as an ID number. If the ID number column is not an INTEGER data
type, add an INTEGER autonumber column to the table. To select random
rows of data, you select rows where the ID number is evenly divisible
by another number_the MOD is 0. The smaller the number you divide by,
the more records you select.
For example, the following command selects rows where the ID number is
evenly divisible by 3:
SELECT * FROM CUSTOMER WHERE (MOD(custid,3)) = 0
The most records that can be selected using the MOD function is when
two is used as the divisor. The expression (MOD(custid,2)) = 0 returns
approximately half the rows of data from a sequentially numbered
column.
To select fewer records, increase the number. For example:
SELECT * FROM CUSTOMER WHERE (MOD(custid,9)) = 0
To determine the number of records selected, use the aggregate
function COUNT:
SELECT COUNT(*) FROM CUSTOMER +
WHERE (MOD(custid,9)) = 0
The MOD function can be used with any command that accepts a WHERE
clause; you can select data, print a report, browse a view, or
perform calculations on the selected set of data. You can even add
additional WHERE clause conditions to the query. For example, to send
a mailing to a random group of customers in a particular state use the
following command:
LBLPRINT mlabel4 +
WHERE (MOD(custid,3)) = 0 AND custstate = 'CA'
When you use the MOD function in a WHERE clause, the data is selected,
but there is no record of which ID numbers were chosen. Use the same
WHERE clause on an INSERT command to store the ID number of the
randomly selected records in a table for future reference.
An Example
Let's look at how using the MOD function to select random records can
be applied to a situation where promotional mailings are sent to
groups of customers_every month a mailing is sent to customers who
haven't ordered anything in at least 30 days. Two tables are needed to
store information about the mailings: mailers_information about the
mailer, and mailed_information about when and to whom the mailer was
sent.
The mailers table has the following columns:
Column Data Description
Name Type
MailID INTEGER An autonumber column uniquely
identifying each promotional mailer.
Mail_Code TEXT 8 An internal company code used to
identify the mailer. This code is
printed on the address label and links
with the orders table.
Descr NOTE A description of the mailer.
Number_Sen INTEGER The number of customers sent this
t particular mailing.
One row of data is entered in the mailers table for each promotional
mailer. Depending on the type of mailings, additional columns can be
added to the mailers table, such as columns for storing a promotion
start and end date.
The mailed table has the following columns:
Column Data Description
Name Type
MailID INTEGER Identifies the mailer that was sent,
links with the mailers table.
Custid INTEGER Identifies the customer the mailer was
sent to, links with the customer
table.
Mail_Date DATE The date the mailer was sent.
Many rows of data are entered in the mailed table for each mailer; one
row corresponds to each customer who was sent the mailer. Using this
table, you can determine the effectiveness of your mailers.
When a mailer is prepared, the mailers table is loaded with a row of
data. The Number_Sent column is left empty until the mailer is
actually sent. When a mailer is sent, labels are printed using the MOD
function in the WHERE clause to identify a random group of customers,
the customer ID numbers are inserted into the mailed table, and the
mailers table is updated with the number sent. The same WHERE clause
is used on each command.
*(mailprt.cmd)
-- print the labels for a random group of
-- customers who haven't purchased anything in
-- at least 30 days
LBLPRINT mlabel +
WHERE (MOD(custid,3)) = 0 AND custid IN +
(SELECT custid FROM orders WHERE (#DATE-30) > +
(SELECT MAX(orderdate) FROM orders T1 WHERE +
T1.custid = orders.custid) )
-- insert rows into the mailed table
INSERT INTO mailed +
SELECT custid, .#DATE, 'L1201-3' FROM customer +
WHERE (MOD(custid,3)) = 0 AND custid IN +
(SELECT custid FROM orders WHERE (#DATE-30) > +
(SELECT MAX(orderdate) FROM orders T1 WHERE +
T1.custid = orders.custid) )
-- count the number of customers selected
SELECT COUNT(*) INTO vnum i1 FROM CUSTOMER +
WHERE (MOD(custid,3)) = 0 AND custid IN +
(SELECT custid FROM orders WHERE (#DATE-30) > +
(SELECT MAX(orderdate) FROM orders T1 WHERE +
T1.custid = orders.custid) )
-- update the mailers table with the number sent
UPDATE mailers SET number sent = .vnum +
WHERE mail_code = 'L1201-3'
To look for customers who have received more than one mailing, use the
following command:
BROWSE * FROM customer +
WHERE custid IN +
(SELECT custid FROM mailed +
GROUP BY custid HAVING COUNT(*) > 1)
To see if customers who received mailings then placed an order, check
the data in the orders table using the following command.
BROWSE * FROM orders +
WHERE custid IN (SELECT custid FROM mailed)