========================================================================
WORKING WITH TEXT STRINGS USING R:BASE FUNCTIONS
========================================================================
PRODUCT : R:BASE VERSION : 3.1 or Higher
CATEGORY : PROGRAMMING SUBCATEGORY : STRING MANIPULATION
========================================================================
From William B. Driskell, 6536 20th Ave. N.E., Seattle, WA 98115. Bill is
a marine biologist, a computer consultant, an active participant in the
Seattle-area R:BASE user's group, and a frequent contributor to the R:BASE
EXCHANGE.
In computer jargon, a string is a sequence of characters that is handled
as one chunk of data. R:BASE provides string functions to manipulate
strings in various ways: you can peek inside, extract portions, or even
move pieces around.
In R:BASE, a string can be stored in a TEXT column or variable or it can
be a quote-enclosed set of characters. R:BASE string functions can handle
all three - variables, columns, and values - interchangeably, with only
two restrictions:
<> If you use a column name in a string function, R:BASE must be able
to determine the table. Column names work if you use the string
function in a computed column expression, a form expression, a
report expression, or a SELECT command expression all places
where R:BASE knows the table. But you can't use a column name in
string manipulation functions in a SET VAR command because the
table is unknown.
<> Expressions are not allowed inside string functions.
String Manipulation Functions
=============================
These are all the string manipulation functions available in R:BASE 3.1C:
<> (SFIL(chr,nchar)) to fill a string (CHR) with a specified number of
characters (NCHAR).
<> (SPUT(text,string,pos)) to put a string of characters (STRING) into
another string (TEXT) beginning at a specified position number (POS)
in the destination string (TEXT).
<> (SGET(text,nchar,pos)) to get a specified number of characters
(NCHAR) out of a string (TEXT) beginning at a specified position
number (POS).
<> (SLEN(text)) to find out how many characters are in a string (TEXT).
<> (SSUB(text,n)) to pull the nth piece out of a string (TEXT)
when each piece is separated from the next by a comma.
<> (SLOC(text,string)) to locate a substring (STRING) inside another
string (TEXT). If the substring (STRING) is found, SLOC returns a
number representing the substring's starting position in the search
string (TEXT).
<> (CTR(text,width)) to center a string (TEXT) in a specified number of
characters (WIDTH).
<> (SMOVE(text,pos1,nchar,string,pos2)) to place a piece of one string
(TEXT) into another string (STRING). The substring is defined by a
specified starting position (POS1) in the source string (TEXT) along
with a specified number of characters (NCHAR). R:BASE places that
substring in the destination string (STRING) beginning at a specified
position number (POS2).
<> (LJS(text,width)) to left justify a string (TEXT) in a specified
number of characters (WIDTH).
<> (RJS(text,width)) to right justify a string (TEXT) in a specified
number of characters (WIDTH).
<> (FORMAT(value,'picture-format')) to use picture formats to format a
string (VALUE).
<> (STRIM(text)) to trim blanks from a string (TEXT).
<> (ULC(text)) to change all the characters in a string (TEXT) to
lowercase.
<> (LUC(text)) to change all the characters in a string (TEXT) to
uppercase.
<> (ICAP1(text)) to make the first character in the string uppercase
and all other characters lowercase.
<> (ICAP2(text)) to make the first character of each word in the string
uppercase and all other characters lowercase.
Text Conversion Functions
=========================
R:BASE also includes several conversion functions that either return text
or accept text parameters:
<> (ICHAR(chr)) returns the ASCII code for a character (CHR)
<> (CTXT(arg)) changes a value (ARG), no matter what its data type, into
a text string
<> (CHAR(integer)) looks up the character defined by the specified ASCII
code (INTEGER)
<> (INT(text)) changes a non-decimal number (TEXT) that is currently
defined as TEXT into an INTEGER
<> (NINT(text)) rounds a decimal number (TEXT) that is currently defined
as TEXT to the nearest INTEGER and returns an INTEGER
<> (FLOAT(text)) changes a number (TEXT) that is currently defined as
TEXT into the same number defined as a DOUBLE data type
<> (IHASH(text,width)) returns a hash value for a specified number (WIDTH)
of characters in a text string (TEXT) beginning with the first character
Building a String
=================
You can build a string in several ways. Each of the following commands
builds a string:
SET VAR vstring1 = ('explicit (literal) characters')
SET VAR vstring2 = ('letters' & 'more letters')
SET VAR vstring3 = (CTXT(#DATE))
SELECT textcolname INTO vstring4 FROM tblname WHERE...
In the second example, the ampersand (&) pastes (concatenates) two strings
together so that the result has a blank space between the two parts. If
you don't want any space between the two parts, use the plus sign (+)
instead of the ampersand (&). In the third example, the CTXT conversion
function changes the current date into a text string. In the fourth
example, because the variable is loaded from a query, the WHERE clause
must be specific enough to find just one match or R:BASE will display
this error message: -ERROR- Too many rows returned.
You can also use the SFIL function, which creates a string of any desired
length using just one character, to build a string:
SET VAR vstring = (SFIL(.vchar,.vlength))
Locating & Measuring Strings
============================
Two primary questions exist for any string:
<> How long is it?
<> What's inside it?
Use SLEN like this to find the length of a string:
SET VAR vlen INTEGER = (SLEN(.vstring))
Use SLOC like this to find the starting location of one string embedded
in another string:
SET VAR vpos INTEGER = (SLOC(.vhaystack,.vneedle))
This SLOC example takes the value stored in vneedle and looks for
that string in the value stored in vhaystack. If the search is successful,
the function returns the starting location in vhaystack. If no match is
found, the function returns a zero. For example, if vneedle is equal to
'tiny needle' and vhaystack is equal to 'big haystack,' the function will
return a zero. But if vneedle is equal to 'hay' and vhaystack is equal to
'big haystack,' the function will return a five because 'hay' begins with
the fifth character in 'big haystack.'
The following example uses SLOC and SLEN to test a Social Security
number's format. It produces a list of Social Security numbers that are
missing hyphens, given that the correct format is 123-45-6789.
SELECT ssn FROM employee WHERE (SLOC(ssn,'-')) = 0 OR (SLEN(ssn)) < 11
Building a Table of Contents
============================
By combining the SLEN and the SFIL functions, you can build a table-of-
contents format in which each line has, for example, an item followed by
a variable number of periods followed by a related currency value. Here's
an example:
SET VAR vaddlen = (40 - SLEN(.vitem))
SET VAR vdots = (SFIL('.',.vaddlen))
SET VAR voutput = (.vitem & .vdots & .vcost)
WRITE .voutput
This produces a list that looks like this:
Arthur A. Aardvark ..................... $237.57
Bettsy Boop ............................ $199.95
Parsing a String
================
The word parsing means separating into elemental parts. For example, you
can parse a name field by splitting it into the first, middle, and last
names.
For example, the following routine extracts a person's last name from the
full name by using SLOC to locate the last blank space and SGET to extract
the desired segment. The arguments in SGET tell it which string to use,
how many characters to extract, and where to begin extracting.
SET VAR vname TEXT = ('Arthur A. Aardvark')
-- Locate first blank in the name.
SET VAR vloc INTEGER = (SLOC(.vname,'~'))
SET VAR vlen INTEGER = (SLEN(.vname))
-- Trim the name & locate next blank.
WHILE vloc > 0 THEN
SET VAR vlen = (.vlen - .vloc)
SET VAR vloc = (.vloc + 1)
SET VAR vname = (SGET(.vname,.vlen,.vloc))
SET VAR vloc = (SLOC(.vname,' '))
ENDWHILE
When the routine finishes, the variable vname contains the last name
(Aardvark).
You could use a similar routine to parse a telephone number by looking
for the embedded hyphen locations or parse an address by looking for the
commas.
Using SPUT as a Paste Editor
============================
The SPUT function puts characters into a string at specified locations,
overwriting the previous characters in the process. Its functionality
is best described as a paste editor. But it can also assist in parsing.
For example, you can change the above routine to use SPUT to place an at
symbol (@) on top of the blanks to allow the search to proceed to the
next blank.
-- Locate first blank in the name
SET VAR vloc INTEGER = (SLOC(.vname,'~'))
-- Fill it with @ & locate next blank.
WHILE vloc > 0 THEN
SET VAR vname = (SPUT(.vname,'@',.vloc))
SET VAR vlttrloc = (.vloc + 1)
SET VAR vloc = (SLOC(.vname,' '))
ENDWHILE
-- Extract the last name.
SET VAR vlen = (SLEN(.vname) - .vlttrloc + 1)
SET VAR vlastname = (SGET(.vname,.vlen,.vlttrloc))
Using SSUB to Parse a String
============================
You can also use the SSUB function to parse strings. SSUB extracts items
in a comma-delimited string based on the item's numbered position in the
list. In the next example, the commas in vaddr define a list of four
items:
SET VAR vaddr = ('15395 SE 30th Place,Bellevue,WA,98007')
SET VAR vcity TEXT = (SSUB(.vaddr,2))
SET VAR vstate TEXT = (SSUB(.vaddr,3))
SET VAR vzip TEXT = (SSUB(.vaddr,4))
Now, for example, the variable vstate will contain WA. The variable vzip
will contain the TEXT zip code, but you can convert it into a numeric
value by using the new conversion capabilities of R:BASE 3.1B's FLOAT,
INT, or NINT functions.
The SSUB function is very handy for parsing out the selections from a
two-column menu or from a checkbox menu.
Moving From One String to Another
=================================
The SMOVE function is an extract-and-merge command, combining the
functions of SGET and SPUT. Consider the following example:
SET VAR vdate=(CTXT(.#DATE)), +
vproj# = ('64-951.00-SPD'), +
vjobid = (SMOVE(.vdate,7,2,.vproj#,8))
The variable vjobid contains 64-951.91-SPD because two characters (the
seventh and eighth) of the year were moved into vproj# starting
in position number eight and vjobid was set to the result.
Formatting Strings
==================
R:BASE provides two groups of string formatting functions: the
capitalization functions and the justification functions.
The Capitalization Functions
============================
The capitalization functions cover most case conversion needs, as the
following table shows:
SET VAR vstring = ('Arthur A. Aardvark')
Function Result
ULC(.vstring) 'arthur a. aardvark'
LUC(.vstring) 'ARTHUR A. AARDVARK'
ICAP1(.vstring) 'Arthur a. aardvark'
ICAP2(.vstring) 'Arthur A. Aardvark'
Names like O'Toole or McDonald, which have embedded capitals, present
special problems. ICAP2 won't work on names such as these, but you can
use what I call brute-force ASCII manipulation - subtracting 32 from the
ASCII value to force the capitalization. Here's a routine that deals with
names containing an apostrophe by using a variety of functions.
SET QUOTE="
*( In the next command, the item to the)
*( right of the = is a single quotation)
*( mark enclosed in double quotation marks.)
SET VAR vappos TEXT = "'"
SET VAR vloc = (SLOC(.vname,.vappos) + 1)
IF vloc > 0 THEN
SET VAR vcap = (SGET(.vname,1,.vloc))
SET VAR vcap = (LUC(.vcap))
SET VAR .vname = (SPUT(.vname,.vcap,.vloc))
ENDIF
SET QUOTE='
Justification Functions
=======================
The justification functions control placement of a string within a
prescribed field length. These functions can be particularly handy for
aligning columns on a decimal point, centering titles, or precisely
placing a currency amount on a bank check.
Function Result
RJS(.vstring,26) 'Arthur A. Aardvark'
LJS(.vstring,26) 'Arthur A. Aardvark'
CTR(.vstring,26) 'Arthur A. Aardvark'
With R:BASE 3.1C, you can use the new FORMAT function to format strings.
FORMAT works like the WRITE command's USING format. For example, you can
use the FORMAT function in commands like SELECT to align DOUBLE values on
a decimal point or to place commas in INTEGER values.
The final string function is STRIM. It trims trailing blanks from a text
string. R:BASE currently automatically trims trailing blanks, so you won't
need to use STRIM; it's a remnant from previous versions and products.
These string functions can make your job easier and your output more
attractive.