TclX adds several character and string functions that make easy work of many operations that would be inconvenient in core Tcl.
The function cequal compares two strings:
    cequal strA strB
It returns 1 if the two strings are identical, and 0 if they are not. This is a shorter syntax than string compare, and the result is more intuitive (though string compare is modelled on the C strcmp, many programmers find strcmp confusing at first, and cequal is more "sensible").
    tcl>cequal "This" "That"
    0
    tcl>cequal "This" "This"
    1
    tcl>if {[cequal $strA $strB]} { . . .
cequal also gets you around a well-known "gotcha" in Tcl expressions: if a string happens to conform to the Tcl syntax for a numeric quantity, the normal equality operator (==) interprets it as a number, and finds it equal to any other string that represents the same number. Thus:
    tcl>set str1 "0x7"
    tcl>set str2 "007"
    tcl>if {$str1 == $str2} {echo they are the same}
    they are the same
To overcome this in standard Tcl you have to resort to string compare.
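The same gotcha can be reproduced without TclX. The following is only a sketch in core Tcl, assuming a reasonably recent interpreter (8.4 or later) that provides string equal and the eq operator; the variable names are illustrative. It uses "7" rather than "007" so that the result does not depend on how the interpreter treats a leading zero.

```tcl
# Core-Tcl comparison of numeric-looking strings (no TclX required).
set str1 "0x7"
set str2 "7"

# == tries a numeric comparison first, so both operands are read as 7.
set looseEqual [expr {$str1 == $str2}]

# string equal always compares character by character.
set strictEqual [string equal $str1 $str2]

# The eq operator does the same inside an expression.
set eqOperator [expr {$str1 eq $str2}]

puts "== gives $looseEqual, string equal gives $strictEqual, eq gives $eqOperator"
# prints: == gives 1, string equal gives 0, eq gives 0
```

Like cequal, string equal and eq return a plain 1 or 0, so they drop straight into an if condition.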
The cindex function does character-wise indexing into strings. Thus,
    cindex string indexExpr
returns the character indexed by the indexExpr. For example,
    tcl>cindex Hello 1
    e
Note that indexing in Tcl generally starts at 0.
clength gets the length in characters of a string:
    tcl>clength Hello
    5
If you need to extract substrings by character indices,
    crange string ind1Expr ind2Expr
returns the range of characters from index ind1Expr through index ind2Expr:
    tcl>crange "Hello World" 2 7
    llo Wo
csubstr does almost the same thing as crange, but by start position and length, so
    csubstr string indExpr lenExpr
returns a range of characters starting at the index indExpr and lenExpr long. If the values of these arguments take csubstr beyond the end of the string, it simply returns what it can get.
    tcl>csubstr "Hello World" 4 5
    o Wor
    tcl>csubstr "Hello World" 9 10
    ld
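For comparison, here is a rough core-Tcl sketch of the same indexing operations. It is not how TclX implements them; substrByLength is a hypothetical helper written for this illustration, built on the core string range command.

```tcl
# Hypothetical helper: a core-Tcl stand-in for csubstr (start index + length).
proc substrByLength {str start len} {
    # string range takes inclusive first and last indices, and clips
    # a last index that runs past the end of the string.
    return [string range $str $start [expr {$start + $len - 1}]]
}

set s "Hello World"
puts [string index $s 1]        ;# e
puts [string length "Hello"]    ;# 5
puts [string range $s 2 7]      ;# llo Wo
puts [substrByLength $s 4 5]    ;# o Wor
puts [substrByLength $s 9 10]   ;# ld
```

The only arithmetic involved is turning a start and a length into an inclusive last index, which is exactly the off-by-one bookkeeping that csubstr saves you from writing.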
One common application that can be tedious and repetitive to code is the parsing of ASCII input. With its list functions and string functions, Tcl and TclX are remarkably useful for parsing. TclX includes the ctoken function specifically for this purpose:
    ctoken strVar sepString
This parses a token out of a character string. The string to parse is contained in the variable strVar, and the string sepString contains all the valid separator characters for tokens in the target string. The first token is returned and the contents of strVar are modified to contain only the remainder of the input string following the extracted token:
    tcl>set sepString ~_
    tcl>set parse "_~This~is_a~string__to_parse~for~tokens"
    tcl>ctoken parse $sepString
    This
    tcl>echo $parse
    ~is_a~string__to_parse~for~tokens
    tcl>ctoken parse $sepString
    is
    tcl>ctoken parse $sepString
    a
(and so on). ctoken ignores any leading separators. ctoken is basically a more intelligent split, with the addition of the "eat token and shorten string" step that one would otherwise have to code by hand. ctoken is a close analogue of the C library routine strtok.
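To make the "eat token and shorten string" step concrete, here is a sketch of what one would otherwise code by hand in core Tcl. nextToken is a hypothetical stand-in written for this illustration, not TclX's implementation; it follows the behaviour described above (skip leading separators, return the token, leave the remainder in the caller's variable).

```tcl
# Hypothetical core-Tcl stand-in for ctoken: returns the next token and
# leaves the remainder of the input in the caller's variable.
proc nextToken {strVar separators} {
    upvar 1 $strVar str
    # Skip leading separator characters.
    set str [string trimleft $str $separators]
    # Find where the token ends: the first separator, or end of string.
    set end [string length $str]
    for {set i 0} {$i < $end} {incr i} {
        if {[string first [string index $str $i] $separators] >= 0} {
            set end $i
            break
        }
    }
    set token [string range $str 0 [expr {$end - 1}]]
    set str [string range $str $end end]
    return $token
}

# Pull every token out of the string, one call at a time.
set parse "_~This~is_a~string__to_parse~for~tokens"
set tokens {}
while {[set tok [nextToken parse "~_"]] ne ""} {
    lappend tokens $tok
}
puts $tokens   ;# This is a string to parse for tokens
```

The loop stops when an empty token comes back, which here means the input has been used up.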
Parsing problems often involve the validation or "typing" of input tokens. TclX provides ctype to address this:
    ctype [-failindex var] charClass string
returns 1 if every character in the string is of the specified charClass, and 0 if any character is not. It also returns 0 if the string is of zero length. If the -failindex option and a variable name are provided, then the index of the first character to fail the test for membership in charClass is returned in the variable.
    tcl>set str 87654h890
    tcl>ctype digit $str
    0
    tcl>ctype -failindex where digit $str
    0
    tcl>echo $where
    5
    tcl>echo [cindex $str $where]
    h
Other character classes include alnum, alpha, ascii, cntrl, lower, upper, space, etc. ctype does more than just test strings for type; it can also be used to convert decimal ASCII values to characters, and vice versa:
    tcl>ctype ord e
    101
    tcl>ctype char 101
    e
Eventually every programmer needs this conversion. It's one more wheel that the TclX user doesn't have to re-invent.
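For readers without TclX to hand, the checks and conversions above can be approximated in core Tcl. This is a sketch only, assuming an interpreter that provides string is with the -strict and -failindex options; the variable names are illustrative.

```tcl
set str 87654h890

# string is reports 1 only if every character belongs to the class.
# -strict makes an empty string fail, and -failindex stores the index
# of the first offending character.
set allDigits [string is digit -strict -failindex where $str]
puts "all digits: $allDigits, first failure at index $where ([string index $str $where])"
# prints: all digits: 0, first failure at index 5 (h)

# Character to code and back again.
scan e %c code              ;# code is now 101
set char [format %c 101]    ;# char is now e
puts "$code $char"          ;# 101 e
```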
TclX has yet more "fun with strings" in its bag of tricks: replicate and translit.
    replicate string times
simply returns a string constructed of times replications of the string string:
    tcl>replicate a 10
    aaaaaaaaaa
    tcl>replicate ab 10
    abababababababababab
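A typical use for this kind of command is drawing rules under headings in text output. The sketch below uses the core string repeat command as a stand-in, so it runs without TclX; the title text is made up for the example.

```tcl
# Underline a heading with a rule of matching length.
set title "Report"
set rule [string repeat = [string length $title]]
puts $title
puts $rule                      ;# ======
puts [string repeat ab 10]      ;# abababababababababab
```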
The Unix tr command is mirrored in
    translit inrange outrange string
which translates characters in string, changing any char in the range inrange to its corresponding char in outrange. You could use this as an alternative version of string toupper:
    tcl>set str "Hello World"
    tcl>translit a-z A-Z $str
    HELLO WORLD
or you could do some simple-minded data obfuscation:
    tcl>translit a-z b-za abc
    bcd
    tcl>translit a-z m-zA-L abcpqr
    mnoBCD
    tcl>translit m-zA-L a-z mnoBCD
    abcpqr
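To show what is going on in these translations, here is one way the range-to-range mapping could be sketched in core Tcl on top of string map. expandRanges and translate are hypothetical helpers written for this illustration, not TclX's implementation, and the sketch assumes that the two range specifications expand to the same number of characters.

```tcl
# Hypothetical helper: expand a translit-style spec such as "m-zA-L"
# into a list of individual characters.
proc expandRanges {spec} {
    set chars {}
    set n [string length $spec]
    for {set i 0} {$i < $n} {incr i} {
        set c [string index $spec $i]
        if {$i + 2 < $n && [string index $spec [expr {$i + 1}]] eq "-"} {
            # "x-y": add every character from x through y.
            scan $c %c lo
            scan [string index $spec [expr {$i + 2}]] %c hi
            for {set code $lo} {$code <= $hi} {incr code} {
                lappend chars [format %c $code]
            }
            incr i 2
        } else {
            lappend chars $c
        }
    }
    return $chars
}

# Hypothetical stand-in for translit: pair up the two expanded ranges
# and let string map do the substitution in a single pass.
proc translate {inrange outrange str} {
    set map {}
    foreach from [expandRanges $inrange] to [expandRanges $outrange] {
        lappend map $from $to
    }
    return [string map $map $str]
}

puts [translate a-z A-Z "Hello World"]   ;# HELLO WORLD
puts [translate a-z m-zA-L abcpqr]       ;# mnoBCD
puts [translate m-zA-L a-z mnoBCD]       ;# abcpqr
```

Because string map replaces each position once and never rescans its own output, a rotation such as a-z to b-za does not cascade.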
The string expand function,
    cexpand string
expands all backslash sequences in string to their actual character values.
    tcl>set str "This is a square bracket \\\[ in a string"
    tcl>echo $str
    This is a square bracket \[ in a string
    tcl>cexpand $str
    This is a square bracket [ in a string
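A roughly equivalent effect can be had in core Tcl with subst, provided command and variable substitution are switched off so that only backslash sequences are processed. This is a sketch of that approach, not a statement about how cexpand is implemented.

```tcl
set str "This is a square bracket \\\[ in a string"
puts $str
# prints: This is a square bracket \[ in a string

# -nocommands and -novariables leave only backslash substitution active,
# so a bare [ or $ in the data is not evaluated.
set expanded [subst -nocommands -novariables $str]
puts $expanded
# prints: This is a square bracket [ in a string
```

Leaving the two flags off would let subst run any bracketed command embedded in the string, which is rarely what you want when the string comes from user input.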
Of these functions, I have found clength, crange, and ctype the most essential; when parsing user input they are invaluable. Tcl can be called essentially a string processing language, since its variables are typeless; the more powerful the string parsing and manipulation commands in your toolbox, the better you can exploit Tcl's "everything's a string" philosophy.