70.1 Introduction to string processing
70.2 Definitions for input and output
70.3 Definitions for characters
70.4 Definitions for strings
70.1 Introduction to string processing
stringproc.lisp extends Maxima's capabilities for working with strings and adds some useful functions for file input and output.
For questions and bug reports, please mail van.nek at arcor.de.
Load stringproc.lisp by typing load("stringproc");
In Maxima a string is easily constructed by typing "text".
Note that Maxima-strings are not Lisp-strings and vice versa.
Tests can be done with stringp and lstringp, respectively.
If for some reason you have a value that is a Lisp-string, for example the result of the Maxima function sconcat, you can convert it with sunlisp.
(%i1) load("stringproc")$
(%i2) m: "text";
(%o2) text
(%i3) [stringp(m),lstringp(m)];
(%o3) [true, false]
(%i4) l: sconcat("text");
(%o4) text
(%i5) [stringp(l),lstringp(l)];
(%o5) [false, true]
(%i6) stringp( sunlisp(l) );
(%o6) true
All functions in stringproc.lisp that return strings return Maxima-strings.
Characters are introduced as Maxima-strings of length 1.
Of course, these are not Lisp-characters.
Tests can be done with charp (or lcharp for Lisp-characters), and conversion from Lisp to Maxima with cunlisp.
(%i1) load("stringproc")$
(%i2) c: "e";
(%o2) e
(%i3) [charp(c),lcharp(c)];
(%o3) [true, false]
(%i4) supcase(c);
(%o4) E
(%i5) charp(%);
(%o5) true
Again, all functions in stringproc.lisp that return characters return Maxima-characters.
Because the introduced characters are strings of length 1, many string functions can also be applied to characters.
As seen above, supcase is one example.
It is important to know that the first character in a Maxima-string is at position 1.
This matches the convention that the first element of a Maxima-list is at position 1 as well.
See the definitions of charat and charlist for examples.
In applications, string functions are often used when working with files.
You will find some useful stream and print functions in stringproc.lisp.
The following example shows some of the functions introduced here at work.
Example:
openw returns an output stream to a file; printf then allows formatted writing to this file. See printf for details.
(%i1) load("stringproc")$
(%i2) s: openw("E:/file.txt");
(%o2) #<output stream E:/file.txt>
(%i3) for n:0 thru 10 do printf( s, "~d ", fib(n) );
(%o3) done
(%i4) printf( s, "~%~d ~f ~a ~a ~f ~e ~a~%", 42,1.234,sqrt(2),%pi,1.0e-2,1.0e-2,1.0b-2 );
(%o4) false
(%i5) close(s);
(%o5) true
After closing the stream you can open the file again, this time for input.
readline returns an entire line as one string. The stringproc package offers many functions for manipulating such strings. Tokenizing can be done with split or tokens.
(%i6) s: openr("E:/file.txt");
(%o6) #<input stream E:/file.txt>
(%i7) readline(s);
(%o7) 0 1 1 2 3 5 8 13 21 34 55
(%i8) line: readline(s);
(%o8) 42 1.234 sqrt(2) %pi 0.01 1.0E-2 1.0b-2
(%i9) list: tokens(line);
(%o9) [42, 1.234, sqrt(2), %pi, 0.01, 1.0E-2, 1.0b-2]
(%i10) map( parsetoken, list );
(%o10) [42, 1.234, false, false, 0.01, 0.01, false]
parsetoken only parses integers and floating point numbers. Parsing symbols or bigfloats requires parse_string, which can be loaded from eval_string.lisp.
(%i11) load("eval_string")$
(%i12) map( parse_string, list );
(%o12) [42, 1.234, sqrt(2), %pi, 0.01, 0.01, 1.0b-2]
(%i13) float(%);
(%o13) [42.0, 1.234, 1.414213562373095, 3.141592653589793, 0.01, 0.01, 0.01]
(%i14) readline(s);
(%o14) false
(%i15) close(s)$
readline returns false when the end of the file is reached.
70.2 Definitions for input and output
Example:
(%i1) load("stringproc")$
(%i2) s: openw("E:/file.txt");
(%o2) #<output stream E:/file.txt>
(%i3) control: "~2tAn atom: ~20t~a~%~2tand a list: ~20t~{~r ~}~%~2tand an integer: ~20t~d~%"$
(%i4) printf( s,control, 'true,[1,2,3],42 );
(%o4) false
(%i5) close(s);
(%o5) true
(%i6) s: openr("E:/file.txt");
(%o6) #<input stream E:/file.txt>
(%i7) while stringp( tmp:readline(s) ) do print(tmp)$
  An atom:          true
  and a list:       one two three
  and an integer:   42
(%i8) close(s)$
Function: close (stream)
Closes stream and returns true if stream had been open.
Function: flength (stream)
Returns the number of elements in stream.
Function: fposition (stream)
Function: fposition (stream, pos)
Returns the current position in stream if pos is not given.
If pos is given, fposition sets the position in stream.
pos has to be a positive number; the first element in stream is at position 1.
Function: freshline ()
Function: freshline (stream)
Writes a new line (to stream) if the position is not at the beginning of a line.
See also newline.
Function: newline ()
Function: newline (stream)
Writes a new line (to stream).
See sprint for an example of using newline().
Note that there are some cases where newline() does not work as expected.
Function: opena (file)
Returns an output stream to file.
If an existing file is opened, opena appends elements at the end of file.
Function: openr (file)
Returns an input stream to file. If file does not exist, it will be created.
Function: openw (file)
Returns an output stream to file.
If file does not exist, it will be created.
If an existing file is opened, openw destructively modifies file.
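The difference between openw and opena can be seen in a short round trip. The following is a minimal sketch, not taken from the package itself; the file name stringproc_demo.txt is an assumption chosen for illustration, and the file is written to the current working directory.

```maxima
load("stringproc")$

/* Hypothetical file name, chosen only for this illustration. */
file: "stringproc_demo.txt"$

/* openw starts from an empty file. */
s: openw(file)$
printf(s, "first line~%")$
close(s)$

/* opena keeps the existing contents and writes after them. */
s: opena(file)$
printf(s, "second line~%")$
close(s)$

/* Read everything back; the third readline hits the end of the file. */
s: openr(file)$
line_1: readline(s);
line_2: readline(s);
line_3: readline(s);
close(s)$
```

With the behavior described above, line_1 and line_2 hold the two lines in the order they were written, and line_3 is false because the end of the file has been reached.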
Function: printf (dest, string)
Function: printf (dest, string, expr_1, ..., expr_n)
Makes the Common Lisp function FORMAT available in Maxima. (From gcl.info: "format produces formatted output by outputting the characters of control-string string and observing that a tilde introduces a directive. The character after the tilde, possibly preceded by prefix parameters and modifiers, specifies what kind of formatting is desired. Most directives use one or more elements of args to create their output.")
The following description and the examples give an idea of how to use printf.
See a Lisp reference for more information.
~%       new line
~&       fresh line
~t       tab
~$       monetary
~d       decimal integer
~b       binary integer
~o       octal integer
~x       hexadecimal integer
~br      base-b integer
~r       spell an integer
~p       plural
~f       floating point
~e       scientific notation
~g       ~f or ~e, depending upon magnitude
~a       as printed by Maxima function print
~s       strings enclosed in "double quotes"
~~       ~
~<       justification, ~> terminates
~(       case conversion, ~) terminates
~[       selection, ~] terminates
~{       iteration, ~} terminates
Please note that there is no format specifier for bigfloats. However, bigfloats can simply be printed by using the ~a directive.
~s prints strings enclosed in "double quotes"; you can avoid this by using ~a.
Note that the selection directive ~[ is zero-indexed.
Also note that some directives do not work in Maxima. For example, ~:[ fails.
(%i1) load("stringproc")$
(%i2) printf( false, "~a ~a ~4f ~a ~@r", "String",sym,bound,sqrt(12),144), bound = 1.234;
(%o2) String sym 1.23 2*sqrt(3) CXLIV
(%i3) printf( false,"~{~a ~}",["one",2,"THREE"] );
(%o3) one 2 THREE
(%i4) printf( true,"~{~{~9,1f ~}~%~}",mat ), mat = args( matrix([1.1,2,3.33],[4,5,6],[7,8.88,9]) )$
      1.1       2.0       3.3
      4.0       5.0       6.0
      7.0       8.9       9.0
(%i5) control: "~:(~r~) bird~p ~[is~;are~] singing."$
(%i6) printf( false,control, n,n,if n=1 then 0 else 1 ), n=2;
(%o6) Two birds are singing.
If dest is a stream or true, then printf returns false.
Otherwise, printf returns a string containing the output.
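Two of the points above, the zero-based selection directive and the dependence of the return value on dest, can be illustrated with a minimal sketch. The control string and the variable names are made up for this illustration.

```maxima
load("stringproc")$

/* ~[ ... ~] selects a clause by a zero-based index: 0 picks the first clause. */
control: "~[zero~;one~;two~]"$
first_clause: printf(false, control, 0);    /* "zero" */
second_clause: printf(false, control, 1);   /* "one" */

/* With dest = true the text goes to standard output and printf returns false. */
result: printf(true, "~a~%", "printed, not returned");
```

Using false as dest is convenient when the formatted text is needed as a value, for example to pass it on to other string functions.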
Function: readline (stream)
Returns a string containing the characters from the current position in stream up to the end of the line, or false if the end of the file is encountered.
Function: sprint (expr_1, ..., expr_n)
Evaluates and displays its arguments one after the other on a single line, starting at the leftmost position.
Numbers are printed with the '-' sign directly next to the number, and line length is disregarded. newline(), which is available after loading stringproc.lisp, can be useful if you wish to insert intermediate line breaks.
(%i1) for n:0 thru 22 do sprint( fib(n) )$
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711
(%i2) load("stringproc")$
(%i3) for n:0 thru 22 do ( sprint(fib(n)), if mod(n,10)=9 then newline() )$
0 1 1 2 3 5 8 13 21 34
55 89 144 233 377 610 987 1597 2584 4181
6765 10946 17711
70.3 Definitions for characters
Function: alphacharp (char)
Returns true if char is an alphabetic character.
Function: alphanumericp (char)
Returns true if char is an alphabetic character or a digit.
Function: ascii (int)
Returns the character corresponding to the ASCII number int. ( -1 < int < 256 )
(%i1) load("stringproc")$
(%i2) for n from 0 thru 255 do ( tmp: ascii(n), if alphacharp(tmp) then sprint(tmp), if n=96 then newline() )$
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
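Function: cequal (char_1, char_2)
Returns true if char_1 and char_2 are the same.
Function: cequalignore (char_1, char_2)
Like cequal but ignores case.
Function: cgreaterp (char_1, char_2)
Returns true if the ASCII number of char_1 is greater than the number of char_2.
Function: cgreaterpignore (char_1, char_2)
Like cgreaterp but ignores case.
Function: charp (obj)
Returns true if obj is a Maxima-character.
See the introduction for an example.
Function: cint (char)
Returns the ASCII number of char.
Function: clessp (char_1, char_2)
Returns true if the ASCII number of char_1 is less than the number of char_2.
Function: clesspignore (char_1, char_2)
Like clessp but ignores case.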
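The character comparisons above are all based on ASCII numbers. The following minimal sketch, with variable names chosen only for illustration, shows how cint, ascii and the case-sensitive and case-insensitive tests relate to each other.

```maxima
load("stringproc")$

code: cint("A");                              /* 65, the ASCII number of "A" */
back: ascii(code);                            /* converts the number back to "A" */

same: cequal("a", "A");                       /* false: the case differs */
same_ignoring_case: cequalignore("a", "A");   /* true */

less: clessp("A", "a");                       /* true: 65 < 97 */
less_ignoring_case: clesspignore("A", "a");   /* false: equal when case is ignored */
```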
Function: constituent (char)
Returns true if char is a graphic character and not the space character.
A graphic character is a character one can see, plus the space character.
(constituent is defined by Paul Graham, ANSI Common Lisp, 1996, page 67.)
(%i1) load("stringproc")$
(%i2) for n from 0 thru 255 do ( tmp: ascii(n), if constituent(tmp) then sprint(tmp) )$
! " # % ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~
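Function: cunlisp (lisp_char)
Converts a Lisp-character into a Maxima-character. (You won't need it.)
Function: digitcharp (char)
Returns true if char is a digit.
Function: lcharp (obj)
Returns true if obj is a Lisp-character.
(You won't need it.)
Function: lowercasep (char)
Returns true if char is a lowercase character.
Variable: newline
The newline character.
Variable: space
The space character.
Variable: tab
The tab character.
Function: uppercasep (char)
Returns true if char is an uppercase character.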
70.4 Definitions for strings
Function: sunlisp (lisp_string)
Converts a Lisp-string into a Maxima-string. (In general you won't need it.)
Function: lstringp (obj)
Returns true if obj is a Lisp-string.
(In general you won't need it.)
Function: stringp (obj)
Returns true if obj is a Maxima-string.
See the introduction for an example.
Function: charat (string, n)
Returns the n-th character of string. The first character in string is returned with n = 1.
(%i1) load("stringproc")$
(%i2) charat("Lisp",1);
(%o2) L
Function: charlist (string)
Returns the list of all characters in string.
(%i1) load("stringproc")$
(%i2) charlist("Lisp");
(%o2) [L, i, s, p]
(%i3) %[1];
(%o3) L
Function: parsetoken (string)
parsetoken converts the first token in string to the corresponding number, or returns false if the number cannot be determined.
The delimiter set for tokenizing is {space, comma, semicolon, tab, newline}.
(%i1) load("stringproc")$
(%i2) 2*parsetoken("1.234 5.678");
(%o2) 2.468
For parsing you can also use the function parse_string. See the description in the file 'share\contrib\eval_string.lisp'.
Function: sconc (expr_1, ..., expr_n)
Evaluates its arguments and concatenates them into a string.
sconc is like sconcat but returns a Maxima-string.
(%i1) load("stringproc")$
(%i2) sconc("xx[",3,"]:",expand((x+y)^3));
(%o2) xx[3]:y^3+3*x*y^2+3*x^2*y+x^3
(%i3) stringp(%);
(%o3) true
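Function: scopy (string)
Returns a copy of string as a new string.
Function: sdowncase (string)
Function: sdowncase (string, start)
Function: sdowncase (string, start, end)
Like supcase, but uppercase characters are converted to lowercase.
Function: sequal (string_1, string_2)
Returns true if string_1 and string_2 are the same length and contain the same characters.
Function: sequalignore (string_1, string_2)
Like sequal but ignores case.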
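A minimal sketch of the two equality tests, together with sdowncase as an alternative way to compare strings regardless of case. The variable names are chosen only for illustration.

```maxima
load("stringproc")$

s: "Maxima"$

exact: sequal(s, "maxima");            /* false: "M" and "m" differ */
relaxed: sequalignore(s, "maxima");    /* true */

/* Normalizing the case first gives the same answer as sequalignore. */
lowered: sdowncase(s);
after_lowering: sequal(lowered, "maxima");
```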
Function: sexplode (string)
sexplode is an alias for the function charlist.
Function: simplode (list)
Function: simplode (list, delim)
simplode takes a list of expressions and concatenates them into a string.
If no delimiter delim is given, simplode behaves like sconc and uses no delimiter.
delim can be any string.
(%i1) load("stringproc")$
(%i2) simplode(["xx[",3,"]:",expand((x+y)^3)]);
(%o2) xx[3]:y^3+3*x*y^2+3*x^2*y+x^3
(%i3) simplode( sexplode("stars")," * " );
(%o3) s * t * a * r * s
(%i4) simplode( ["One","more","coffee."]," " );
(%o4) One more coffee.
Function: sinsert (seq, string, pos)
Returns a string that is the concatenation of substring (string, 1, pos - 1), the string seq and substring (string, pos).
Note that the first character in string is at position 1.
(%i1) load("stringproc")$
(%i2) s: "A submarine."$
(%i3) sconc( substring(s,1,3),"yellow ",substring(s,3) );
(%o3) A yellow submarine.
(%i4) sinsert("hollow ",s,3);
(%o4) A hollow submarine.
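Function: sinvertcase (string)
Function: sinvertcase (string, start)
Function: sinvertcase (string, start, end)
Returns string except that the case of each character from position start to end is inverted. If end is not given, all characters from start to the end of string are replaced.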
(%i1) load("stringproc")$
(%i2) sinvertcase("sInvertCase");
(%o2) SiNVERTcASE
Function: slength (string)
Returns the number of characters in string.
Function: smake (num, char)
Returns a new string consisting of num copies of the character char.
(%i1) load("stringproc")$
(%i2) smake(3,"w");
(%o2) www
Function: smismatch (string_1, string_2)
Function: smismatch (string_1, string_2, test)
Returns the position of the first character of string_1 at which string_1 and string_2 differ, or false.
The default test function for matching is sequal.
If smismatch should ignore case, use sequalignore as test.
(%i1) load("stringproc")$
(%i2) smismatch("seven","seventh");
(%o2) 6
Function: split (string)
Function: split (string, delim)
Function: split (string, delim, multiple)
Returns the list of all tokens in string.
Each token is an unparsed string.
split uses delim as delimiter.
If delim is not given, the space character is the default delimiter.
multiple is a boolean variable which is true by default.
In that case, multiple consecutive delimiters are read as one.
This is useful if tabs are saved as multiple space characters.
If multiple is set to false, each delimiter is counted separately.
(%i1) load("stringproc")$
(%i2) split("1.2 2.3 3.4 4.5");
(%o2) [1.2, 2.3, 3.4, 4.5]
(%i3) split("first;;third;fourth",";",false);
(%o3) [first, , third, fourth]
Function: sposition (char, string)
Returns the position of the first character in string which matches char.
The first character in string is at position 1.
For matching characters ignoring case, see ssearch.
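As a minimal sketch of the position convention, the value returned by sposition can be passed directly to charat, because both count from 1. The variable names are chosen only for illustration.

```maxima
load("stringproc")$

pos: sposition("x", "Maxima");   /* 3: "x" is the third character */
ch: charat("Maxima", pos);       /* the same position yields "x" again */
```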
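Function: sremove (seq, string)
Function: sremove (seq, string, test)
Function: sremove (seq, string, test, start)
Function: sremove (seq, string, test, start, end)
Returns a string like string but without all substrings matching seq.
The default test function for matching is sequal.
If sremove should ignore case while searching for seq, use sequalignore as test.
Use start and end to limit searching.
Note that the first character in string is at position 1.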
(%i1) load("stringproc")$
(%i2) sremove("n't","I don't like coffee.");
(%o2) I do like coffee.
(%i3) sremove ("DO ",%,'sequalignore);
(%o3) I like coffee.
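Function: sremovefirst (seq, string)
Function: sremovefirst (seq, string, test)
Function: sremovefirst (seq, string, test, start)
Function: sremovefirst (seq, string, test, start, end)
Like sremove except that only the first substring that matches seq is removed.
Function: sreverse (string)
Returns a string with all the characters of string in reverse order.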
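A minimal sketch contrasting sremove with sremovefirst, followed by sreverse. The sample strings and variable names are chosen only for illustration.

```maxima
load("stringproc")$

all_removed: sremove("n", "banana");          /* every "n" is removed: baaa */
first_removed: sremovefirst("n", "banana");   /* only the first "n": baana */

reversed: sreverse("Lisp");                   /* psiL */
```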
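Function: ssearch (seq, string)
Function: ssearch (seq, string, test)
Function: ssearch (seq, string, test, start)
Function: ssearch (seq, string, test, start, end)
Returns the position of the first substring of string that matches the string seq.
The default test function for matching is sequal.
If ssearch should ignore case, use sequalignore as test.
Use start and end to limit searching.
Note that the first character in string is at position 1.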
(%i1) ssearch("~s","~{~S ~}~%",'sequalignore);
(%o1) 4
Function: ssort (string)
Function: ssort (string, test)
Returns a string that contains all characters from string in an order such that there are no two successive characters c and d for which test (c, d) is false and test (d, c) is true.
The default test function for sorting is clessp.
The set of test functions is {clessp, clesspignore, cgreaterp, cgreaterpignore, cequal, cequalignore}.
(%i1) load("stringproc")$
(%i2) ssort("I don't like Mondays.");
(%o2) '.IMaddeiklnnoosty
(%i3) ssort("I don't like Mondays.",'cgreaterpignore);
(%o3) ytsoonnMlkIiedda.'
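Function: ssubst (new, old, string)
Function: ssubst (new, old, string, test)
Function: ssubst (new, old, string, test, start)
Function: ssubst (new, old, string, test, start, end)
Returns a string like string except that all substrings matching old are replaced by new.
old and new need not be of the same length.
The default test function for matching is sequal.
If ssubst should ignore case while searching for old, use sequalignore as test.
Use start and end to limit searching.
Note that the first character in string is at position 1.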
(%i1) load("stringproc")$
(%i2) ssubst("like","hate","I hate Thai food. I hate green tea.");
(%o2) I like Thai food. I like green tea.
(%i3) ssubst("Indian","thai",%,'sequalignore,8,12);
(%o3) I like Indian food. I like green tea.
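Function: ssubstfirst (new, old, string)
Function: ssubstfirst (new, old, string, test)
Function: ssubstfirst (new, old, string, test, start)
Function: ssubstfirst (new, old, string, test, start, end)
Like ssubst except that only the first substring that matches old is replaced.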
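A minimal sketch of replacing all matches versus only the first one, reusing the sentence from the ssubst example. The variable names are chosen only for illustration.

```maxima
load("stringproc")$

s: "I hate Thai food. I hate green tea."$

all_replaced: ssubst("like", "hate", s);          /* both occurrences replaced */
first_replaced: ssubstfirst("like", "hate", s);   /* only the first occurrence replaced */
```

Note the argument order: the replacement comes first, then the text to search for, then the string to work on.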
Function: strim (seq, string)
Returns a string like string, but with all characters that appear in seq removed from both ends.
(%i1) load("stringproc")$
(%i2) "/* comment */"$
(%i3) strim(" /*",%);
(%o3) comment
(%i4) slength(%);
(%o4) 7
Function: striml (seq, string)
Like strim except that only the left end of string is trimmed.
Function: strimr (seq, string)
Like strim except that only the right end of string is trimmed.
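A minimal sketch comparing the three trimming functions on the string from the strim example. seq is treated as a set of characters (here space, slash and asterisk), not as a substring. The variable names are chosen only for illustration.

```maxima
load("stringproc")$

s: "/* comment */"$

both: strim(" /*", s);     /* trimmed at both ends: comment */
left: striml(" /*", s);    /* only the left end is trimmed */
right: strimr(" /*", s);   /* only the right end is trimmed */
```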
Function: substring (string, start)
Function: substring (string, start, end)
Returns the substring of string beginning at position start and ending at position end. The character at position end is not included. If end is not given, the substring contains the rest of the string. Note that the first character in string is at position 1.
(%i1) load("stringproc")$
(%i2) substring("substring",4);
(%o2) string
(%i3) substring(%,4,6);
(%o3) in
Function: supcase (string)
Function: supcase (string, start)
Function: supcase (string, start, end)
Returns string except that lowercase characters from position start to end are replaced by the corresponding uppercase ones. If end is not given, all lowercase characters from start to the end of string are replaced.
(%i1) load("stringproc")$
(%i2) supcase("english",1,2);
(%o2) English
Function: tokens (string)
Function: tokens (string, test)
Returns a list of tokens which have been extracted from string.
The tokens are substrings whose characters satisfy a certain test function.
If test is not given, constituent is used as the default test.
The set of test functions is {constituent, alphacharp, digitcharp, lowercasep, uppercasep, charp, characterp, alphanumericp}.
(The Lisp version of tokens was written by Paul Graham, ANSI Common Lisp, 1996, page 67.)
(%i1) load("stringproc")$
(%i2) tokens("24 October 2005");
(%o2) [24, October, 2005]
(%i3) tokens("05-10-24",'digitcharp);
(%o3) [05, 10, 24]
(%i4) map(parsetoken,%);
(%o4) [5, 10, 24]
This document was generated by Robert Dodier on December, 21 2006 using texi2html 1.76.