Snobol - StriNg Oriented SymBOlic Language

Introduction.


So what is this SNOBOL thing anyway? SNOBOL stands for StriNg Oriented symBOlic Language. SNOBOL was developed during the 1960s at Bell Laboratories by Griswold, Farber and Polonsky. [1] It was specifically designed to work with strings of text. [2] SNOBOL is an imperative language with powerful string manipulation and pattern matching facilities. [3]
While there are several versions of SNOBOL and its descendants in circulation, this article focuses on SNOBOL4. SNOBOL4 is dynamically typed: a variable is not declared with a fixed type, and the type of its value is determined and checked at runtime. [3] While this requires more overhead on the part of the interpreter, it gives the programmer more flexibility and makes programs easier to write.
I have been using Phil Budne's "Macro SNOBOL4 in C" interpreter, which can be downloaded from http://www.snobol4.org/csnobol4/ or from ftp://ftp.ultimate.com/snobol. A compiled version of SNOBOL called SPITBOL is also available (http://www.snobol4.com/spitbol360), but it is not discussed in this article.


Syntax.


The basic syntax of a SNOBOL4 program is a series of statements followed by the "END" label. Each statement has up to three parts: an optional label, the statement body, and an optional goto (gotos and labels are explained further on in this article). A statement without a label must begin with at least one blank or tab, because anything that starts in the first column is read as a label.
label statementBody :(gotoThisLabel)
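
As a rough illustration of the statement layout described above, the following sketch is a small complete program. The variable name "greeting" and the labels "start" and "done" are made up for this example, and it uses the OUTPUT variable, which is covered in the Input and Output section.

```snobol
* Comment lines start with an asterisk in the first column.
* Statements without a label start with at least one blank.
        greeting = 'hello'
* "start" is a label; the goto at the end of the line jumps to "done".
start   OUTPUT = greeting               :(done)
        OUTPUT = 'this line is skipped'
done    OUTPUT = 'finished'
END
```

Run under an interpreter, this should write 'hello' and then 'finished'; the middle statement is never reached because of the unconditional goto.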

Variable names can include letters, digits and the period ("."), but every name must begin with a letter. [2] No limit is placed on the length of a variable name; it is up to the programmer to use common sense when naming variables.

My.Variable
Number69

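Note that whether variable names are case-sensitive depends on the interpreter's case-folding setting. With case folding turned off, each of the following is a valid name and they are three separate, distinct variables. Many implementations, however, fold lowercase letters in names to uppercase by default, in which case all three refer to the same variable, so check the setting of the interpreter you are using.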

ThisVariable
thisvariable
THISVARIABLE

Strings are the basic data structure of SNOBOL. A string is assigned to a variable with the syntax <name> = '<string>', where <name> is a valid variable name, "=" is the assignment operator (set off by blanks), and <string> is any string of characters surrounded by matching quotes (' or ").

        myString = 'The text in the string'
        anotherString = "this new string"

A variable can also be assigned the value stored in another variable. In the example below, the second assignment statement copies the value stored in "thisString" into "thatString".

        thisString = 'tra la le'
        thatString = thisString

A variable can also be set to the null (empty) string by writing the assignment operator after the variable name and leaving the right-hand side of the assignment blank. [2]

        thisString =

Strings can be concatenated, that is, joined end to end to form new strings. Concatenation is written as a sequence of variables or string literals separated by blanks; no explicit operator is needed. The following example concatenates the strings stored in "string1" and "string2" with the literals 'jack be ' and 'quick' and assigns the result, 'jack be nimble jack be quick', to the variable "string3".

        string1 = 'jack '
        string2 = 'be nimble '
        string3 = string1 string2 'jack be ' 'quick'
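
To see the result of the concatenation, the fragment above can be wrapped into a small program. This is only a sketch: the variable "empty" is a name invented for the example, and OUTPUT is described in the Input and Output section.

```snobol
        string1 = 'jack '
        string2 = 'be nimble '
        string3 = string1 string2 'jack be ' 'quick'
* Writes: jack be nimble jack be quick
        OUTPUT = string3
* A variable holding the null string adds nothing when concatenated.
        empty =
* Writes: []
        OUTPUT = '[' empty ']'
END
```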



Pattern Matching.


Simple pattern matching searches the contents of one variable, called the subject, for another string (or the contents of another variable), called the pattern. A pattern matching statement is written as the subject followed by the pattern, separated by a blank. The match succeeds if the pattern occurs in the subject and fails otherwise, and that success or failure can be used to make decisions. [2] The example below searches "subjectString" first for the contents of "patternString" and then for the literal 'this is'; both matches succeed.

        subjectString = ' this is my string '
        patternString = 'string'
        subjectString patternString
        subjectString 'this is'

A matched pattern can also be replaced within the subject. A replacement statement replaces only the first match it finds, so in the example below only the first occurrence of 'jack' in "string3" becomes 'jill', leaving 'jill be nimble jack be quick'. To replace every occurrence, the statement has to be repeated, for example with a success goto that loops back to the same statement until the match fails (gotos are explained below). The &ANCHOR keyword does not control how many occurrences are matched; it controls where a match may begin. When &ANCHOR is 0 (the default), the pattern may match anywhere in the subject; when it is nonzero, the pattern must match starting at the first character of the subject.

        &ANCHOR = 0
        string3 = 'jack be nimble jack be quick'
        string3 'jack' = 'jill'
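
Replacing every occurrence takes a loop. The sketch below is one way to do it, assuming the replacement text does not itself contain the pattern (otherwise the loop would never end); the label "again" is a name chosen for this example.

```snobol
        string3 = 'jack be nimble jack be quick'
* Repeat the replacement for as long as the match succeeds.
again   string3 'jack' = 'jill'         :S(again)
* Writes: jill be nimble jill be quick
        OUTPUT = string3
END
```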

Labels may be composed of any characters, but they must begin with a letter, they MUST start in the first character position of a line, and they must be separated from the rest of the statement by at least one blank. (Note: END is a special label that marks the end of a SNOBOL program and must come last. [2])

thislabel stringVar = 'statement'

There are three different kinds of goto. Every goto field is introduced by a colon (":"). An unconditional goto is written ":(labelName)" and transfers to the label regardless of whether the statement before it succeeded or failed. A success goto is written ":S(labelName)"; the program goes to the label only if the statement succeeds. A failure goto is written ":F(labelName)"; the program goes to the label only if the statement fails. A success goto and a failure goto can be combined in one statement, as in ":S(labelA)F(labelB)". When no goto applies, execution continues with the next statement.

        thisVar = 'goto'
        thisVar 'goto' :S(success)
success thisVar 'failure' :F(failure)
failure thisVar 'trueOrFalse' :(unconditional)
unconditional
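
A common use of the success and failure gotos is to branch on the outcome of a pattern match. The following is a minimal sketch; the labels "found" and "missing" are hypothetical names, and the match relies on the default unanchored mode.

```snobol
        subjectString = ' this is my string '
* Branch on whether 'string' occurs anywhere in the subject.
        subjectString 'string'          :S(found)F(missing)
found   OUTPUT = 'pattern found'        :(END)
missing OUTPUT = 'pattern not found'
END
```

Here the match succeeds, so the program writes 'pattern found' and the goto to END stops it before the "missing" statement is reached.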

Beyond simple pattern matching, there are built-in functions that let the programmer do more advanced pattern matching and string manipulation. The "LEN(n)" function matches any n characters; in the first statement below it matches the first 4 characters of the subject, leaving the cursor just after them, at character position n + 1. Using the "." operator, the matched substring can be extracted and assigned to another variable once the match succeeds, as in the second statement, which assigns the first 4 characters of "someString" to "aVariable".

        someString LEN(4)
        someString LEN(4) . aVariable
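
The fragment above leaves "someString" undefined. A complete version might look like the sketch below, where the sample value 'jack be nimble' is an assumption borrowed from the other examples in this article.

```snobol
        someString = 'jack be nimble'
* LEN(4) matches the first four characters; "." stores them in aVariable.
        someString LEN(4) . aVariable
* Writes: jack
        OUTPUT = aVariable
END
```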

The "BREAK(string)" function matches all the characters up to, but not including, the first occurrence of any character in "string". More than one break character can be specified; for example, "BREAK(',.')" matches all characters up to but not including a comma or a period. In the example below, "BREAK('n')" matches 'jack be ', stopping just before the 'n' of 'nimble'.

        myVar = 'jack be nimble'
        myVar BREAK('n')

The "SPAN(string)" function matches an uninterrupted run of one or more of the characters in "string", in any order. In the following example, "SPAN('bkcaj ')" matches the first 6 characters of myVar, 'jack b', because each of them is contained in the span string; the match stops at 'e', which is not.

        myVar = 'jack be nimble'
        myVar SPAN('bkcaj ')
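
BREAK and SPAN are often combined with the "." operator to pick words out of a line. The sketch below shows one possible way to extract the first two words; the variable names "firstWord" and "secondWord" are invented for the example.

```snobol
        myVar = 'jack be nimble'
* BREAK(' ') takes the characters up to the next blank,
* SPAN(' ') then skips over the run of blanks.
        myVar BREAK(' ') . firstWord SPAN(' ') BREAK(' ') . secondWord
* Writes: jack
        OUTPUT = firstWord
* Writes: be
        OUTPUT = secondWord
END
```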



Input and Output.



Input and output are accomplished through two special variables, INPUT and OUTPUT. Each time the value of INPUT is used (as in "var = INPUT"), the next line of input is read and supplied as its value. A statement that reads a line succeeds, but once the end of input (EOF) is reached the statement fails. Using these facts, we can process every line of input until the end of the input stream. The following example reads all the lines of input (until EOF) and concatenates them into one string.

loop varIn = varIn INPUT :S(loop)F(END)
END

OUTPUT works similarly but, as you might expect, in the opposite direction. When a string is assigned to OUTPUT, the string is written as a line of output. To output a blank line, assign the null string to OUTPUT. The following example reads each line of input and writes it out followed by a blank line.

loop    varIn = INPUT                   :F(END)
        OUTPUT = varIn
        OUTPUT =                        :(loop)
END
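
Input, output, pattern matching and gotos can be combined into a simple filter. The following sketch copies to the output only those input lines that contain 'jack'; the search string and the variable name "line" are arbitrary choices made for this example.

```snobol
* Read a line; stop at end of input.
loop    line = INPUT                    :F(END)
* Skip lines that do not contain the pattern.
        line 'jack'                     :F(loop)
* Write the matching line and go back for the next one.
        OUTPUT = line                   :(loop)
END
```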




Sources.
  1. Farber, Griswold, and Polonsky. "SNOBOL, A String Manipulation Language." 1964.
  2. Hockey, Susan M. SNOBOL programming for the humanities. 1985.
  3. "The SNOBOL Programming Language." University of Michigan. 1996.