""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
BUILDING CONNECTED SOLUTIONS
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
PRODUCT : R:BASE VERSION : 3.1
CATEGORY : PROGRAMMING SUBCATEGORY : TOOLS
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
From Michael Mayer, founder of Mayer Labs, Ltd., 24063 Western Ave.,
Park Forest, IL 60466. Michael develops high performance R:BASE
applications specializing in the Programmer Interface (PI). He is a
member of the Steering Committee of the Chicago Area R:BASE User
Group. He can be reached at 708-481-4536.
About two years ago, a client contacted me for help with an unusual
business problem. The client explained that they had a UNIX system in
their factory that was controlling dozens of different analog and
digital I/O (input/output) devices. He went on to say that their UNIX
control system was working satisfactorily, but there were two small
problems. The UNIX system didn't have a way to permanently store the
data, and they could not create the exact report they desired. He
pointed at a box of several hundred 1.2 megabyte diskettes on the
floor. He mentioned that they had more boxes back at the factory just
like that one. We both looked down at the box and shook our heads.
The Client's Needs
""""""""""""""""""
Specifically, they needed to be able to capture the data permanently,
and to be able to generate real-time reports from the data at any
time. Sounded simple enough, so I asked, "Tell me, just how much data
are we talking about each day?" The response was "Between 1.5 and 2
megabytes a day, right now, but we plan on adding four more UNIX
systems and quadrupling our production capacity by the end of the
year." They wanted to feed all of this data into a single computer for
archival and reporting purposes. The total amount of data to be
managed would be in the range of 2.5 to 3 gigabytes a year!
Next, I asked, "And what would you like to do with all this data?" The
response was that they simply wanted to go back to the data and create
this one special report. He handed me a 70-page real-time printout
that was well beyond even the capabilities of the Extended Report
Writer (XRW).
Report Requirements
"""""""""""""""""""
The report required two separate columns, linked throughout with the
common thread of time stamps. The left side of the report carried what
we call "process data" and the right side of the report carried "alarm
data." It's sort of an action/reaction type of report. They needed to
be synchronized in real time, but the two columns of data were coming
from separate sources, so they may not have the exact same time
stamps. Further, each report would need to consist of a combination of
production phases, 34 in total, all of which could be called in any
order at any time during the report. Data was gathered from each input
source at variable rates, ranging from 15 seconds to 15 minutes.
Another requirement was that the person requesting the report would
identify the increments that were to be reported, such as every one,
two, five, 10, or 15 minutes.
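One way to reconcile streams that sample at different rates is to carry
each source's latest reading forward to every report boundary. The C
sketch below shows the core lookup; the function name and layout are
illustrative, not code from the application:

```c
#include <stddef.h>

/* Return the index of the latest sample taken at or before a report
 * boundary, or -1 if no sample exists yet. `times` holds sample time
 * stamps in seconds, sorted ascending. Illustrative helper only. */
long slice_sample(const long *times, size_t n, long boundary)
{
    long best = -1;
    for (size_t i = 0; i < n && times[i] <= boundary; i++)
        best = (long)i;
    return best;
}
```

Running this once per one-, two-, five-, 10-, or 15-minute boundary
gives each column a value for every time slice, even when the two
sources never produced identical time stamps.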
As the conversation went on, many other major issues came out, like
the fact that the report had to be completely printed in under 15
minutes from the time they started it. Also, each report would require
that some 250,000 rows of data be loaded and processed in order to
print. The only real questions in their minds were whether I could do
it and how long it would take, as they needed it last month. Eyeing the
box of disks on the floor, I had a few more questions of my own; for
example, how many more times in my life will I hear that line?
There were other technical requirements, but the point here is that to
the average person, this clearly is NOT an R:BASE database
application. For that matter, it's probably impossible to handle it
all on a PC in any sort of reasonable manner. As systems grow, there
will be more and more times that we as developers will be faced with
problems that require creative solutions that stretch our imaginations
and capabilities. This was definitely one of them.
The Solution
""""""""""""
To solve the problem, I broke it down into manageable segments. First,
I needed to decide how to store gigabytes of data. The thought of
trying to sort a multi-gigabyte database on a PC gave me the chills,
let alone the backup, restoration, and integrity issues that such a
large database would entail.
My solution was to use a WORM (Write Once Read Many) optical disk
drive. WORM drives are great for applications that have to store and
report on a lot of data but don't need to have the data changed or
edited. You write the data on the WORM drive once, and then you can
read it as many times as you need for a report as long as you don't
try to change what's already there. WORM drives are designed to handle
large quantities of data with speed.
Storing Data on the WORM
""""""""""""""""""""""""
I knew I couldn't store the data on the WORM disks in the traditional
R:BASE file format. Too many problems would exist when data from
multiple databases would have to be joined at report time. Storing
data on the WORM disk would have to be handled in daily batches. I
chose to write this data to disk in ASCII format after loading each
batch of UNIX data.
The R:BASE application generates a unique file name based on the date
(for example, 11-21-90.DAT). We use these .DAT files to generate the
"process data"--the left side of the report. We handle the "alarm
data" in a similar fashion, using the date as the file name but giving
the "alarm data" files a .PLM extension (for process alarm).
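Generating those names is a small formatting job in C. The helper below
is an illustrative sketch of the naming scheme, not the application's
actual routine:

```c
#include <stdio.h>

/* Build a daily batch file name (MM-DD-YY plus extension) from a date,
 * e.g. 11-21-90 with "DAT" yields "11-21-90.DAT". Sketch only. */
void batch_name(int month, int day, int year, const char *ext, char out[16])
{
    sprintf(out, "%02d-%02d-%02d.%s", month, day, year % 100, ext);
}
```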
Next Step of the Solution
"""""""""""""""""""""""""
At this point, we could store unlimited data, but we still needed a
little more information on each WORM disk. Because each WORM disk
could only store 470 megabytes per side (940 megabytes total), it was
possible that a report would require that the user flip or change
disks in order to get all of the required files for the report.
To address this issue, I wrote a simple program in the C programming
language. The C routine writes a VOLUME ID name on each side of the
WORM disk whenever a new WORM disk is formatted. The VOLUME ID
identifies each side of each disk.
The C routine specifies the main UNIX input source, the year, and the
quarter in a serialized format. Another C routine prompts the user to
put the correct WORM disk in, and then verifies that it's the correct
disk and disk side before loading data at report time.
The VOLUME ID label provides the control method. The C program prompts
the user when a new WORM disk needs to be formatted and places the
appropriate VOLUME ID label name on the new disk.
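The serialized layout of the VOLUME ID wasn't published, so the sketch
below assumes one plausible format (input source, two-digit year,
quarter, disk side) just to show the idea; DOS volume labels allow up
to 11 characters:

```c
#include <stdio.h>
#include <string.h>

/* Compose a VOLUME ID for one side of a WORM disk. The exact format
 * used in the real system is not known; this layout is an assumption. */
void volume_id(int source, int year, int quarter, char side, char out[12])
{
    sprintf(out, "U%d%02dQ%d%c", source, year % 100, quarter, side);
}

/* Verify the user inserted the disk (and side) a report needs. */
int volume_ok(const char *inserted, const char *wanted)
{
    return strcmp(inserted, wanted) == 0;
}
```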
Now we have practically unlimited storage and control. We know exactly
where any day of data is on any WORM disk. All five UNIX systems feed
into this one DOS system.
R:BASE's Part
"""""""""""""
R:BASE has two primary jobs: create the final report and hold the data
while waiting for a complete day of information to exist before
writing a new WORM disk file.
The UNIX systems feed new data to the DOS system: diskettes holding
several files of ASCII data arrive on a daily basis. We load
these from the disks into R:BASE by using a Programmer Interface (PI)
routine I wrote. Microrim's PI allows programmers like me to write a
routine in C and give the routine direct access to the database.
The PI routine I wrote has a CRC (cyclic redundancy check) function to
verify that R:BASE received the same data that the UNIX system sent.
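The principle is simple: both ends compute a checksum over the batch
and compare the results. The article doesn't say which CRC variant the
PI routine used; the bitwise CRC-16 below (the common 0xA001 reflected
polynomial) stands in for it:

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16 over a buffer. If the value computed after the
 * transfer matches the value the sender computed, the data arrived
 * intact. Stand-in implementation, not the original PI routine. */
uint16_t crc16(const unsigned char *data, size_t len)
{
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ 0xA001)
                            : (uint16_t)(crc >> 1);
    }
    return crc;
}
```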
Once the R:BASE database holds a complete day of information, the
R:BASE application creates the new file name, unloads the data to that
file on the hard disk, prompts the user to insert the correct WORM
disk, and then copies the file onto the WORM disk. This method works
perfectly. We ended up using only a 60 megabyte hard disk on the DOS
computer.
From File to Database
"""""""""""""""""""""
Next, I needed to find a way to load 250,000 rows into the R:BASE
database in three minutes so R:BASE could create the report. That's
over 1,000 records a second. I couldn't use the LOAD command because I
needed exceptional speed, so I wrote another fast C routine and used
Microrim's PI to get direct access to the ASCII files and the R:BASE
database.
After careful study of the PI, I came up with a lightning fast program
that met the loading specifications. The program loads some 65,000
records per minute into an R:BASE database.
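The generic technique behind that kind of speed is to read the file
into memory in large chunks and walk the buffer once, instead of paying
per-field fscanf() overhead. The fragment below sketches only the
parsing half; the actual row inserts went through Microrim's PI and are
not shown:

```c
#include <stdlib.h>

/* Walk a newline-delimited ASCII buffer once, parsing one integer
 * field per line. Here we just count and sum the records; the real
 * loader would hand each parsed row to the database instead. */
long parse_batch(const char *buf, long *sum_out)
{
    long rows = 0, sum = 0;
    const char *p = buf;
    while (*p) {
        char *end;
        long v = strtol(p, &end, 10);
        if (end == p)            /* no digits: stop at malformed data */
            break;
        sum += v;
        rows++;
        for (p = end; *p == '\n' || *p == '\r'; p++)
            ;                    /* skip to the next record */
    }
    if (sum_out)
        *sum_out = sum;
    return rows;
}
```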
The PI is great for any application where you want to do a single
function (like loading data) extremely fast. I have written many
programs with the PI, and despite its quirks, there's no better way
to achieve the fastest possible program speed with complete device and
user interface control.
Process Flow
""""""""""""
The process flow now looks like this:
    +-------------+
    |  UNIX Data  |
    |  Diskettes  |
    +-------------+
           |
           |  Daily Upload
           v
    +-------------+    Archive     +-------------+
    |   R:BASE    | ------------>  |  WORM Disk  |
    |  Databases  | <------------  |   Archive   |
    +-------------+    Report      +-------------+
           |          Creation
           v
      To Printer
Report Generation
"""""""""""""""""
Generating the report was the most complex part of the problem. First,
R:BASE prompts for the days when a report cycle started and stopped.
At report time, R:BASE translates these dates into file names to be
loaded from the WORM disk with the PI routine. For example, report
cycle number 9001-2 contained a start date of 11-20-90 and an end date
of 11-22-90. In R:BASE, I translated each date value into a file name
to load from the WORM disk.
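In C, the same date-to-file-name translation might look like the sketch
below, which expands a start and end date into the list of daily .DAT
files to load. The function is illustrative (the application did this
inside its R:BASE command files); mktime() renormalizes an out-of-range
day field, so month and year boundaries come free:

```c
#include <stdio.h>
#include <time.h>

/* Expand a report cycle's start/end dates into daily WORM file names,
 * e.g. 11-20-90 through 11-22-90 yields 11-20-90.DAT, 11-21-90.DAT,
 * and 11-22-90.DAT. Returns the number of names written. Sketch only. */
int cycle_files(int sm, int sd, int sy, int em, int ed, int ey,
                char names[][16], int max)
{
    struct tm t = {0};
    t.tm_mon = sm - 1; t.tm_mday = sd; t.tm_year = sy - 1900;
    t.tm_hour = 12;   /* midday avoids DST edge cases */
    int n = 0;
    for (;;) {
        if (mktime(&t) == (time_t)-1 || n >= max)
            break;
        sprintf(names[n++], "%02d-%02d-%02d.DAT",
                t.tm_mon + 1, t.tm_mday, (t.tm_year + 1900) % 100);
        if (t.tm_mon + 1 == em && t.tm_mday == ed && t.tm_year + 1900 == ey)
            break;
        t.tm_mday++;  /* mktime() renormalizes on the next pass */
    }
    return n;
}
```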
Next, R:BASE asks the person using the application for their
preferences as to the period of time slices to report and whether they
want temperature data displayed in Celsius or Fahrenheit degrees
(another minor detail that came out later in the development process).
Then the PI routine loads the R:BASE database with the rows and
columns from the ASCII files on WORM disk required to create the
report, converting the data on the fly during the loading process.
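Unit conversion is a natural fit for the load step: convert once, and
none of the 34 report modules ever has to branch on the user's
preference. A trivial sketch (illustrative, not the PI routine itself):

```c
/* Return the temperature in the user's preferred unit. Data arrives
 * in Celsius; convert during the load if Fahrenheit was requested. */
double report_temp(double celsius, int want_fahrenheit)
{
    return want_fahrenheit ? celsius * 9.0 / 5.0 + 32.0 : celsius;
}
```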
Now the data is in the R:BASE database. The next step is to give some
structure to the real time data. The report generator consists of 34
dynamic modules--one for each of the 34 phases. Any of the 34 modules
could be called at any time during the report generation process.
The data dictates which module is called. Certain rows in the data
contain flags (marks or notifications) that indicate when the phase
changes. When a flag is encountered, one of three situations applies:
o A module change just happened--the phase just changed.
o A module change is about to happen--the phase is about to change.
o An error occurred in the flag sensor, in which case the flag
should be ignored.
I didn't know which of the three was true, so I wrote an R:BASE
program that searches backward and forward in the data around that
point to determine exactly when the phase change occurred and what the
change was. Each phase is defined by a phase number (1 through 34),
starting date, starting time, ending time, and ending date. R:BASE
loads these values into a FLAG table in the R:BASE database as new
flags are encountered in the data stream.
If the R:BASE program finds that a phase change wasn't valid, its
row is removed from the flag table.
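The backward/forward search can be sketched as follows; the data layout
and function are hypothetical stand-ins for the R:BASE program
described above:

```c
/* Decide what a flag really means by examining the rows around it.
 * `phases` holds the phase number observed on each row (values > 0
 * are valid readings). If the phase before the flag differs from the
 * phase after it, return the new phase; if both sides agree, the
 * flag came from a sensor error, so return -1 to ignore it. */
int resolve_flag(const int *phases, int n, int flag, int window)
{
    int before = -1, after = -1;
    for (int i = flag - 1; i >= 0 && i >= flag - window; i--)
        if (phases[i] > 0) { before = phases[i]; break; }
    for (int i = flag + 1; i < n && i <= flag + window; i++)
        if (phases[i] > 0) { after = phases[i]; break; }
    if (after == -1 || after == before)
        return -1;   /* no visible change: ignore the flag */
    return after;    /* the phase the process just entered */
}
```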
Each R:BASE module has a number (1.CMD through 34.CMD) corresponding
to the phase number stored in the flag table. For example, when the
data indicates that we're now at phase 1, this means that R:BASE
should run 1.CMD, passing that phase's starting date, starting time,
ending time, and ending date into 1.CMD as parameters.
To run the correct module, I go row by row through the flag table,
copying its values into variables. Then I append .CMD onto the end
of the phase number to create the name of the correct program. Now, I
can run the program 1.CMD passing the start and stop values as
parameters.
Here's a simplified example using an R:BASE 3.1 DECLARE CURSOR
structure to demonstrate how I do this:
*( Example DECLARE CURSOR structure )
DECLARE cur1 CURSOR FOR SELECT +
  phase, sdate, stime, etime, edate +
  FROM flagtab
OPEN cur1
FETCH cur1 INTO v_phase IND i1, +
  v_sdate IND i2, v_stime IND i3, +
  v_etime IND i4, v_edate IND i5
WHILE SQLCODE <> 100 THEN
  SET VAR v_phaset TEXT = +
    (CTXT(.v_phase) + '.CMD')
  RUN &v_phaset USING +
    .v_sdate, .v_stime, .v_etime, .v_edate
  FETCH cur1 INTO v_phase IND i1, +
    v_sdate IND i2, v_stime IND i3, +
    v_etime IND i4, v_edate IND i5
ENDWHILE
DROP CURSOR cur1
Actually, this is only part of the report. In addition, introductory,
summary, and sign-off sheets are generated.
Conclusion
""""""""""
I used the strengths of R:BASE where I could and the speed and control
of the C programming language and Microrim's PI where it was required.
All of this combined produced a solution that met all my client's
requirements and was completed in about a month.
Now, almost two years later, the system still runs smoothly and has
grown to be almost four gigabytes in size. Those boxes containing
hundreds of diskettes are gone now. They've been replaced by four WORM
disks.
Here's an R:BASE system that doesn't store the majority of its data in
R:BASE format or even on a hard disk for that matter. Future systems
will undoubtedly take this concept further, storing data in many
different locations and formats. System solutions are limited only by
our imaginations.