<HTML><HEAD><TITLE>Tcl Built-In Commands - string manual page</TITLE></HEAD><BODY>
<DL>
<DD><A HREF="string.htm#M2" NAME="L1104">NAME</A>
<DL><DD>string - Manipulate strings</DL>
<DD><A HREF="string.htm#M3" NAME="L1105">SYNOPSIS</A>
<DL>
<DD><B>string </B><I>option arg </I>?<I>arg ...?</I>
</DL>
<DD><A HREF="string.htm#M4" NAME="L1106">DESCRIPTION</A>
<DL>
<DD><A HREF="string.htm#M5" NAME="L1107"><B>string bytelength </B><I>string</I></A>
<DD><A HREF="string.htm#M6" NAME="L1108"><B>string compare</B> ?<B>-nocase</B>? ?<B>-length int</B>? <I>string1 string2</I></A>
<DD><A HREF="string.htm#M7" NAME="L1109"><B>string equal</B> ?<B>-nocase</B>? ?<B>-length int</B>? <I>string1 string2</I></A>
<DD><A HREF="string.htm#M8" NAME="L1110"><B>string first </B><I>string1 string2</I> ?<I>startIndex</I>?</A>
<DD><A HREF="string.htm#M9" NAME="L1111"><B>string index </B><I>string charIndex</I></A>
<DL>
<DD><A HREF="string.htm#M10" NAME="L1112"><I>integer</I></A>
<DD><A HREF="string.htm#M11" NAME="L1113"><B>end</B></A>
<DD><A HREF="string.htm#M12" NAME="L1114"><B>end-</B><I>integer</I></A>
</DL>
<DD><A HREF="string.htm#M13" NAME="L1115"><B>string is </B><I>class</I> ?<B>-strict</B>? ?<B>-failindex </B><I>varname</I>? <I>string</I></A>
<DL>
<DD><A HREF="string.htm#M14" NAME="L1116"><B>alnum</B></A>
<DD><A HREF="string.htm#M15" NAME="L1117"><B>alpha</B></A>
<DD><A HREF="string.htm#M16" NAME="L1118"><B>ascii</B></A>
<DD><A HREF="string.htm#M17" NAME="L1119"><B>boolean</B></A>
<DD><A HREF="string.htm#M18" NAME="L1120"><B>control</B></A>
<DD><A HREF="string.htm#M19" NAME="L1121"><B>digit</B></A>
<DD><A HREF="string.htm#M20" NAME="L1122"><B>double</B></A>
<DD><A HREF="string.htm#M21" NAME="L1123"><B>false</B></A>
<DD><A HREF="string.htm#M22" NAME="L1124"><B>graph</B></A>
<DD><A HREF="string.htm#M23" NAME="L1125"><B>integer</B></A>
<DD><A HREF="string.htm#M24" NAME="L1126"><B>lower</B></A>
<DD><A HREF="string.htm#M25" NAME="L1127"><B>print</B></A>
<DD><A HREF="string.htm#M26" NAME="L1128"><B>punct</B></A>
<DD><A HREF="string.htm#M27" NAME="L1129"><B>space</B></A>
<DD><A HREF="string.htm#M28" NAME="L1130"><B>true</B></A>
<DD><A HREF="string.htm#M29" NAME="L1131"><B>upper</B></A>
<DD><A HREF="string.htm#M30" NAME="L1132"><B>wordchar</B></A>
<DD><A HREF="string.htm#M31" NAME="L1133"><B>xdigit</B></A>
</DL>
<DD><A HREF="string.htm#M32" NAME="L1134"><B>string last </B><I>string1 string2</I> ?<I>startIndex</I>?</A>
<DD><A HREF="string.htm#M33" NAME="L1135"><B>string length </B><I>string</I></A>
<DD><A HREF="string.htm#M34" NAME="L1136"><B>string map</B> ?<B>-nocase</B>? <I>charMap string</I></A>
<DD><A HREF="string.htm#M35" NAME="L1137"><B>string match</B> ?<B>-nocase</B>? <I>pattern</I> <I>string</I></A>
<DL>
<DD><A HREF="string.htm#M36" NAME="L1138"><B>*</B></A>
<DD><A HREF="string.htm#M37" NAME="L1139"><B>?</B></A>
<DD><A HREF="string.htm#M38" NAME="L1140"><B>[</B><I>chars</I><B>]</B></A>
<DD><A HREF="string.htm#M39" NAME="L1141"><B>&#92;</B><I>x</I></A>
</DL>
<DD><A HREF="string.htm#M40" NAME="L1142"><B>string range </B><I>string first last</I></A>
<DD><A HREF="string.htm#M41" NAME="L1143"><B>string repeat </B><I>string count</I></A>
<DD><A HREF="string.htm#M42" NAME="L1144"><B>string replace </B><I>string first last</I> ?<I>newstring</I>?</A>
<DD><A HREF="string.htm#M43" NAME="L1145"><B>string tolower </B><I>string</I> ?<I>first</I>? ?<I>last</I>?</A>
<DD><A HREF="string.htm#M44" NAME="L1146"><B>string totitle </B><I>string</I> ?<I>first</I>? ?<I>last</I>?</A>
<DD><A HREF="string.htm#M45" NAME="L1147"><B>string toupper </B><I>string</I> ?<I>first</I>? ?<I>last</I>?</A>
<DD><A HREF="string.htm#M46" NAME="L1148"><B>string trim </B><I>string</I> ?<I>chars</I>?</A>
<DD><A HREF="string.htm#M47" NAME="L1149"><B>string trimleft </B><I>string</I> ?<I>chars</I>?</A>
<DD><A HREF="string.htm#M48" NAME="L1150"><B>string trimright </B><I>string</I> ?<I>chars</I>?</A>
<DD><A HREF="string.htm#M49" NAME="L1151"><B>string wordend </B><I>string charIndex</I></A>
<DD><A HREF="string.htm#M50" NAME="L1152"><B>string wordstart </B><I>string charIndex</I></A>
</DL>
<DD><A HREF="string.htm#M51" NAME="L1153">SEE ALSO</A>
<DD><A HREF="string.htm#M52" NAME="L1154">KEYWORDS</A>
</DL><HR>
<H3><A NAME="M2">NAME</A></H3>
string - Manipulate strings
<H3><A NAME="M3">SYNOPSIS</A></H3>
<B>string </B><I>option arg </I>?<I>arg ...?</I><BR>
<H3><A NAME="M4">DESCRIPTION</A></H3>
Performs one of several string operations, depending on <I>option</I>.
The legal <I>option</I>s (which may be abbreviated) are:
<P>
<DL>
<P><DT><A NAME="M5"><B>string bytelength </B><I>string</I></A><DD>
Returns a decimal string giving the number of bytes used to represent
<I>string</I> in memory. Because UTF-8 uses one to three bytes to
represent Unicode characters, the byte length will not be the same as
the character length in general. The cases where a script cares about
the byte length are rare. In almost all cases, you should use the
<B>string length</B> operation. Refer to the <B><A HREF="../TkLib/Utf.htm">Tcl_NumUtfChars</A></B>
manual entry for more details on the UTF-8 representation.
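<P>As a minimal sketch of the difference between the two lengths, assuming
a Tcl release that provides <B>string bytelength</B> and stores strings
internally as UTF-8 (the variable names are illustrative only):
<PRE>
# "cafe" with an accented final letter: four characters
set s "caf\u00e9"
set chars [string length $s]      ;# 4 characters
set bytes [string bytelength $s]  ;# 5 bytes, the accented letter takes two
</PRE>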
<P><DT><A NAME="M6"><B>string compare</B> ?<B>-nocase</B>? ?<B>-length int</B>? <I>string1 string2</I></A><DD>
Perform a character-by-character comparison of strings <I>string1</I> and
<I>string2</I>. Returns
-1, 0, or 1, depending on whether <I>string1</I> is lexicographically
less than, equal to, or greater than <I>string2</I>.
If <B>-length</B> is specified, then only the first <I>length</I> characters
are used in the comparison. If <B>-length</B> is negative, it is
ignored. If <B>-nocase</B> is specified, then the strings are
compared in a case-insensitive manner.
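<P>The following sketch illustrates the three possible results and the
effect of the two switches; the variable names are chosen here for
illustration only:
<PRE>
set less  [string compare apple banana]                ;# -1
set more  [string compare banana apple]                ;# 1
set same  [string compare -nocase ABC abc]             ;# 0
set pref  [string compare -length 3 abcdef abcxyz]     ;# 0, only "abc" is compared
</PRE>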
<P><DT><A NAME="M7"><B>string equal</B> ?<B>-nocase</B>? ?<B>-length int</B>? <I>string1 string2</I></A><DD>
Perform a character-by-character comparison of strings
<I>string1</I> and <I>string2</I>. Returns 1 if <I>string1</I> and
<I>string2</I> are identical, or 0 when not. If <B>-length</B> is
specified, then only the first <I>length</I> characters are used in the
comparison. If <B>-length</B> is negative, it is ignored. If
<B>-nocase</B> is specified, then the strings are compared in a
case-insensitive manner.
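<P>A short sketch of typical uses (the variable names are illustrative,
not part of the command):
<PRE>
set e1 [string equal abc abc]                          ;# 1
set e2 [string equal abc abd]                          ;# 0
set e3 [string equal -nocase Tcl TCL]                  ;# 1
set e4 [string equal -length 4 string.htm strict]      ;# 1, both start with "stri"
</PRE>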
<P><DT><A NAME="M8"><B>string first </B><I>string1 string2</I> ?<I>startIndex</I>?</A><DD>
Search <I>string2</I> for a sequence of characters that exactly match
the characters in <I>string1</I>. If found, return the index of the
first character in the first such match within <I>string2</I>. If not
found, return -1.
If <I>startIndex</I> is specified (in any of the forms accepted by the
<B>index</B> method), then the search is constrained to start with the
character in <I>string2</I> specified by the index. For example,
<PRE><B>string first a 0a23456789abcdef 5</B></PRE>
will return <B>10</B>, but
<PRE><B>string first a 0123456789abcdef 11</B></PRE>
will return <B>-1</B>.
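<P>One possible use of <I>startIndex</I> is to collect every occurrence of
a substring by restarting the search just past the previous match. This is
only a sketch; the variables <B>text</B>, <B>positions</B>, <B>start</B>
and <B>i</B> are names invented for the example:
<PRE>
set text a-b-c-d
set positions {}
set start 0
# Keep searching until string first reports no further match (-1)
while {[set i [string first - $text $start]] != -1} {
    lappend positions $i
    set start [expr {$i + 1}]
}
# positions now holds the list: 1 3 5
</PRE>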
<P><DT><A NAME="M9"><B>string index </B><I>string charIndex</I></A><DD>
Returns the <I>charIndex</I>'th character of the <I>string</I>
argument. A <I>charIndex</I> of 0 corresponds to the first
character of the string.
<I>charIndex</I> may be specified as
follows:
<P>
<DL>
<P><DT><A NAME="M10"><I>integer</I></A><DD>
The character at this integer index.
<P><DT><A NAME="M11"><B>end</B></A><DD>
The last char of the string.
<P><DT><A NAME="M12"><B>end-</B><I>integer</I></A><DD>
The last char of the string minus the specified integer
offset (e.g. <B>end-1</B> would refer to the &quot;c&quot; in &quot;abcd&quot;).
</DL><P>If <I>charIndex</I> is less than 0 or greater than
or equal to the length of the string then an empty string is
returned.<DL>
<P></DL>
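<P>The index forms above can be sketched as follows (variable names are
illustrative only):
<PRE>
set s abcd
set first [string index $s 0]       ;# a
set last  [string index $s end]     ;# d
set prev  [string index $s end-1]   ;# c
set none  [string index $s 4]       ;# empty string, index is out of range
</PRE>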
<P><DT><A NAME="M13"><B>string is </B><I>class</I> ?<B>-strict</B>? ?<B>-failindex </B><I>varname</I>? <I>string</I></A><DD>
Returns 1 if <I>string</I> is a valid member of the specified character
class, otherwise returns 0. If <B>-strict</B> is specified, then an
empty string returns 0, otherwise an empty string will return 1 on
any class. If <B>-failindex</B> is specified, then if the function
returns 0, the index in the string where the class was no longer valid
will be stored in the variable named <I>varname</I>. The <I>varname</I>
will not be set if the function returns 1. The following character classes
are recognized (the class name can be abbreviated):
<P>
<DL>
<P><DT><A NAME="M14"><B>alnum</B></A><DD>
Any Unicode alphabet or digit character.
<P><DT><A NAME="M15"><B>alpha</B></A><DD>
Any Unicode alphabet character.
<P><DT><A NAME="M16"><B>ascii</B></A><DD>
Any character with a value less than &#92;u0080 (those that
are in the 7-bit ascii range).
<P><DT><A NAME="M17"><B>boolean</B></A><DD>
Any of the forms allowed to <B><A HREF="../TkLib/GetInt.htm">Tcl_GetBoolean</A></B>.
<P><DT><A NAME="M18"><B>control</B></A><DD>
Any Unicode control character.
<P><DT><A NAME="M19"><B>digit</B></A><DD>
Any Unicode digit character. Note that this includes characters
outside of the [0-9] range.
<P><DT><A NAME="M20"><B>double</B></A><DD>
Any of the valid forms for a double in Tcl, with optional surrounding
whitespace. In case of under/overflow in the value, 0 is returned
and the <I>varname</I> will contain -1.
<P><DT><A NAME="M21"><B>false</B></A><DD>
Any of the forms allowed to <B><A HREF="../TkLib/GetInt.htm">Tcl_GetBoolean</A></B> where the value is false.
<P><DT><A NAME="M22"><B>graph</B></A><DD>
Any Unicode printing character, except space.
<P><DT><A NAME="M23"><B>integer</B></A><DD>
Any of the valid forms for an integer in Tcl, with optional surrounding
whitespace. In case of under/overflow in the value, 0 is returned
and the <I>varname</I> will contain -1.
<P><DT><A NAME="M24"><B>lower</B></A><DD>
Any Unicode lower case alphabet character.
<P><DT><A NAME="M25"><B>print</B></A><DD>
Any Unicode printing character, including space.
<P><DT><A NAME="M26"><B>punct</B></A><DD>
Any Unicode punctuation character.
<P><DT><A NAME="M27"><B>space</B></A><DD>
Any Unicode space character.
<P><DT><A NAME="M28"><B>true</B></A><DD>
Any of the forms allowed to <B><A HREF="../TkLib/GetInt.htm">Tcl_GetBoolean</A></B> where the value is true.
<P><DT><A NAME="M29"><B>upper</B></A><DD>
Any upper case alphabet character in the Unicode character set.
<P><DT><A NAME="M30"><B>wordchar</B></A><DD>
Any Unicode word character, that is, any alphanumeric character
or any Unicode connector punctuation character (e.g. underscore).
<P><DT><A NAME="M31"><B>xdigit</B></A><DD>
Any hexadecimal digit character ([0-9A-Fa-f]).
</DL><P>In the case of <B>boolean</B>, <B>true</B> and <B>false</B>, if the
function returns 0, then <I>varname</I> is always set to 0,
due to the varied nature of a valid boolean value.<DL>
<P></DL>
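<P>A brief sketch of <B>-strict</B> and <B>-failindex</B> in use; the
variable names, including <B>pos</B>, are chosen for this example:
<PRE>
set ok     [string is digit 12345]                     ;# 1
set empty  [string is digit ""]                        ;# 1, empty string passes by default
set strict [string is digit -strict ""]                ;# 0, -strict rejects the empty string
set bad    [string is alpha -failindex pos abc1def]    ;# 0, and pos is set to 3
</PRE>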
<P><DT><A NAME="M32"><B>string last </B><I>string1 string2</I> ?<I>startIndex</I>?</A><DD>
Search <I>string2</I> for a sequence of characters that exactly match
the characters in <I>string1</I>. If found, return the index of the
first character in the last such match within <I>string2</I>. If there
is no match, then return -1.
If <I>startIndex</I> is specified (in any of the forms accepted by the
<B>index</B> method), then only the characters in <I>string2</I> at or before the
specified <I>startIndex</I> will be considered by the search. For example,
<PRE><B>string last a 0a23456789abcdef 15</B></PRE>
will return <B>10</B>, but
<PRE><B>string last a 0a23456789abcdef 9</B></PRE>
will return <B>1</B>.
<P><DT><A NAME="M33"><B>string length </B><I>string</I></A><DD>
Returns a decimal string giving the number of characters in
<I>string</I>. Note that this is not necessarily the same as the
number of bytes used to store the string.
<P><DT><A NAME="M34"><B>string map</B> ?<B>-nocase</B>? <I>charMap string</I></A><DD>
Replaces characters in <I>string</I> based on the key-value pairs in
<I>charMap</I>. <I>charMap</I> is a list of <I>key value key value</I> ...
as in the form returned by <B><A HREF="../TkCmd/array.htm">array get</A></B>. Each instance of a
key in the string will be replaced with its corresponding value. If
<B>-nocase</B> is specified, then matching is done without regard to
case differences. Both <I>key</I> and <I>value</I> may be multiple
characters. Replacement is done in an ordered manner, so the key appearing
first in the list will be checked first, and so on. <I>string</I> is
only iterated over once, so earlier key replacements will have no
effect for later key matches. For example,
<PRE><B>string map {abc 1 ab 2 a 3 1 0} 1abcaababcabababc</B></PRE>
will return the string <B>01321221</B>.
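<P>Because the string is scanned only once, two keys can be swapped in a
single call without one replacement undoing the other. A small sketch
(variable names are illustrative):
<PRE>
set swapped [string map {cat dog dog cat} "cat chases dog"]   ;# dog chases cat
set folded  [string map -nocase {hello bye} "Hello HELLO"]    ;# bye bye
</PRE>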
<P><DT><A NAME="M35"><B>string match</B> ?<B>-nocase</B>? <I>pattern</I> <I>string</I></A><DD>
See if <I>pattern</I> matches <I>string</I>; return 1 if it does, 0
if it doesn't.
If <B>-nocase</B> is specified, then the pattern attempts to match
against the string in a case insensitive manner.
For the two strings to match, their contents
must be identical except that the following special sequences
may appear in <I>pattern</I>:
<P>
<DL>
<P><DT><A NAME="M36"><B>*</B></A><DD>
Matches any sequence of characters in <I>string</I>,
including a null string.
<P><DT><A NAME="M37"><B>?</B></A><DD>
Matches any single character in <I>string</I>.
<P><DT><A NAME="M38"><B>[</B><I>chars</I><B>]</B></A><DD>
Matches any character in the set given by <I>chars</I>. If a sequence
of the form
<I>x</I><B>-</B><I>y</I> appears in <I>chars</I>, then any character
between <I>x</I> and <I>y</I>, inclusive, will match.
When used with <B>-nocase</B>, the end points of the range are converted
to lower case first. For example, {[A-z]} matches '_' when matching
case-sensitively (since '_' falls between 'Z' and 'a'), but with <B>-nocase</B>
it is treated like {[A-Za-z]}, which is probably what was intended in the
first place.
<P><DT><A NAME="M39"><B>&#92;</B><I>x</I></A><DD>
Matches the single character <I>x</I>. This provides a way of
avoiding the special interpretation of the characters
<B>*?[]&#92;</B> in <I>pattern</I>.
<P></DL>
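<P>The special sequences can be sketched as follows. Patterns that contain
brackets or backslashes are enclosed in braces so that the Tcl parser does
not interpret them first; the variable names are illustrative:
<PRE>
set m1 [string match *.tcl main.tcl]                   ;# 1
set m2 [string match {file?.txt} file1.txt]            ;# 1
set m3 [string match {[a-c]*} banana]                  ;# 1
set m4 [string match -nocase HELLO* "hello world"]     ;# 1
set m5 [string match {\*} *]                           ;# 1, escaped star matches a literal star
set m6 [string match {\*} a]                           ;# 0
</PRE>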
<P><DT><A NAME="M40"><B>string range </B><I>string first last</I></A><DD>
Returns a range of consecutive characters from <I>string</I>, starting
with the character whose index is <I>first</I> and ending with the
character whose index is <I>last</I>. An index of 0 refers to the
first character of the string. <I>first</I> and <I>last</I> may be
specified as for the <B>index</B> method.
If <I>first</I> is less than zero then it is treated as if it were zero, and
if <I>last</I> is greater than or equal to the length of the string then
it is treated as if it were <B>end</B>. If <I>first</I> is greater than
<I>last</I> then an empty string is returned.
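<P>As a sketch, the following extracts parts of a file name; the name
<B>archive.tar.gz</B> and the variable names are made up for the example:
<PRE>
set s archive.tar.gz
set dot   [string last . $s]                           ;# 11
set ext   [string range $s [expr {$dot + 1}] end]      ;# gz
set head  [string range $s 0 6]                        ;# archive
set all   [string range $s -5 100]                     ;# archive.tar.gz, indices are clipped
set empty [string range $s 5 2]                        ;# empty string, first is past last
</PRE>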
<P><DT><A NAME="M41"><B>string repeat </B><I>string count</I></A><DD>
Returns <I>string</I> repeated <I>count</I> number of times.
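<P>For instance (variable names are illustrative):
<PRE>
set rule   [string repeat = 10]                        ;# ==========
set triple [string repeat ab 3]                        ;# ababab
</PRE>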
<P><DT><A NAME="M42"><B>string replace </B><I>string first last</I> ?<I>newstring</I>?</A><DD>
Removes a range of consecutive characters from <I>string</I>, starting
with the character whose index is <I>first</I> and ending with the
character whose index is <I>last</I>. An index of 0 refers to the
first character of the string. <I>first</I> and <I>last</I> may be
specified as for the <B>index</B> method. If <I>newstring</I> is
specified, then it is placed in the removed character range.
If <I>first</I> is less than zero then it is treated as if it were zero, and
if <I>last</I> is greater than or equal to the length of the string then
it is treated as if it were <B>end</B>. If <I>first</I> is greater than
<I>last</I> or the length of the initial string, or <I>last</I> is less
than 0, then the initial string is returned untouched.
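<P>A short sketch of replacement, deletion, and the untouched case
(variable names are illustrative):
<PRE>
set s "Hello World"
set r1 [string replace $s 6 10 Tcl]                    ;# Hello Tcl
set r2 [string replace $s 5 end]                       ;# Hello, the range is simply removed
set r3 [string replace $s 8 3 x]                       ;# Hello World, first is past last
</PRE>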
<P><DT><A NAME="M43"><B>string tolower </B><I>string</I> ?<I>first</I>? ?<I>last</I>?</A><DD>
Returns a value equal to <I>string</I> except that all upper (or title) case
letters have been converted to lower case. If <I>first</I> is specified, it
refers to the first char index in the string to start modifying. If
<I>last</I> is specified, it refers to the char index in the string to stop
at (inclusive). <I>first</I> and <I>last</I> may be
specified as for the <B>index</B> method.
<P><DT><A NAME="M44"><B>string totitle </B><I>string</I> ?<I>first</I>? ?<I>last</I>?</A><DD>
Returns a value equal to <I>string</I> except that the first character
in <I>string</I> is converted to its Unicode title case variant (or upper
case if there is no title case variant) and the rest of the string is
converted to lower case. If <I>first</I> is specified, it
refers to the first char index in the string to start modifying. If
<I>last</I> is specified, it refers to the char index in the string to stop
at (inclusive). <I>first</I> and <I>last</I> may be
specified as for the <B>index</B> method.
<P><DT><A NAME="M45"><B>string toupper </B><I>string</I> ?<I>first</I>? ?<I>last</I>?</A><DD>
Returns a value equal to <I>string</I> except that all lower (or title) case
letters have been converted to upper case. If <I>first</I> is specified, it
refers to the first char index in the string to start modifying. If
<I>last</I> is specified, it refers to the char index in the string to stop
at (inclusive). <I>first</I> and <I>last</I> may be specified as for the
<B>index</B> method.
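<P>The three case-conversion options can be compared side by side in a
small sketch (variable names are illustrative):
<PRE>
set lo [string tolower "Hello World"]                  ;# hello world
set up [string toupper "Hello World" 0 4]              ;# HELLO World, only indices 0 to 4 change
set ti [string totitle "hELLO wORLD"]                  ;# Hello world
</PRE>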
<P><DT><A NAME="M46"><B>string trim </B><I>string</I> ?<I>chars</I>?</A><DD>
Returns a value equal to <I>string</I> except that any leading
or trailing characters from the set given by <I>chars</I> are
removed.
If <I>chars</I> is not specified then white space is removed
(spaces, tabs, newlines, and carriage returns).
<P><DT><A NAME="M47"><B>string trimleft </B><I>string</I> ?<I>chars</I>?</A><DD>
Returns a value equal to <I>string</I> except that any
leading characters from the set given by <I>chars</I> are
removed.
If <I>chars</I> is not specified then white space is removed
(spaces, tabs, newlines, and carriage returns).
<P><DT><A NAME="M48"><B>string trimright </B><I>string</I> ?<I>chars</I>?</A><DD>
Returns a value equal to <I>string</I> except that any
trailing characters from the set given by <I>chars</I> are
removed.
If <I>chars</I> is not specified then white space is removed
(spaces, tabs, newlines, and carriage returns).
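<P>The three trimming options can be sketched as follows. Note that
<I>chars</I> is treated as a set of characters, not as a prefix or suffix
string; the variable names are illustrative:
<PRE>
set t1 [string trim "  hello  "]                       ;# hello
set t2 [string trimleft 000123 0]                      ;# 123
set t3 [string trimright "path///" /]                  ;# path
set t4 [string trim xyhiyx xy]                         ;# hi, any mix of x and y is removed
</PRE>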
<P><DT><A NAME="M49"><B>string wordend </B><I>string charIndex</I></A><DD>
Returns the index of the character just after the last one in the word
containing character <I>charIndex</I> of <I>string</I>. <I>charIndex</I>
may be specified as for the <B>index</B> method. A word is
considered to be any contiguous range of alphanumeric (Unicode letters
or decimal digits) or underscore (Unicode connector punctuation)
characters, or any single character other than these.
<P><DT><A NAME="M50"><B>string wordstart </B><I>string charIndex</I></A><DD>
Returns the index of the first character in the word containing
character <I>charIndex</I> of <I>string</I>. <I>charIndex</I> may be
specified as for the <B>index</B> method. A word is considered to be any
contiguous range of alphanumeric (Unicode letters or decimal digits)
or underscore (Unicode connector punctuation) characters, or any
single character other than these.
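<P>Used together, <B>string wordstart</B> and <B>string wordend</B> can
pick out the word surrounding a given index. A minimal sketch, with
variable names invented for the example:
<PRE>
set s "foo bar_baz qux"
set ws   [string wordstart $s 6]                       ;# 4
set we   [string wordend $s 6]                         ;# 11, one past the end of the word
set word [string range $s $ws [expr {$we - 1}]]        ;# bar_baz
</PRE>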
<P></DL>
<H3><A NAME="M51">SEE ALSO</A></H3>
<B><A HREF="../TkCmd/expr.htm">expr</A></B>, <B><A HREF="../TkCmd/list.htm">list</A></B>
<H3><A NAME="M52">KEYWORDS</A></H3>
<A href="../Keywords/C.htm#case conversion">case conversion</A>, <A href="../Keywords/C.htm#compare">compare</A>, <A href="../Keywords/I.htm#index">index</A>, <A href="../Keywords/M.htm#match">match</A>, <A href="../Keywords/P.htm#pattern">pattern</A>, <A href="../Keywords/S.htm#string">string</A>, <A href="../Keywords/W.htm#word">word</A>, <A href="../Keywords/E.htm#equal">equal</A>, <A href="../Keywords/C.htm#ctype">ctype</A>
<HR><PRE>
<A HREF="../copyright.htm">Copyright</A> &#169; 1993 The Regents of the University of California.
<A HREF="../copyright.htm">Copyright</A> &#169; 1994-1996 Sun Microsystems, Inc.
<A HREF="../copyright.htm">Copyright</A> &#169; 1995-1997 Roger E. Critchlow Jr.</PRE>
</BODY></HTML>