<HTML><HEAD><TITLE>Tcl Built-In Commands - encoding manual page</TITLE></HEAD><BODY>
<DL>
<DD><A HREF="encoding.htm#M2" NAME="L180">NAME</A>
<DL><DD>encoding - Manipulate encodings</DL>
<DD><A HREF="encoding.htm#M3" NAME="L181">SYNOPSIS</A>
<DL>
<DD><B>encoding </B><I>option</I> ?<I>arg arg ...</I>?
</DL>
<DD><A HREF="encoding.htm#M4" NAME="L182">INTRODUCTION</A>
<DD><A HREF="encoding.htm#M5" NAME="L183">DESCRIPTION</A>
<DL>
<DD><A HREF="encoding.htm#M6" NAME="L184"><B>encoding convertfrom ?</B><I>encoding</I>? <I>data</I></A>
<DD><A HREF="encoding.htm#M7" NAME="L185"><B>encoding convertto ?</B><I>encoding</I>? <I>string</I></A>
<DD><A HREF="encoding.htm#M8" NAME="L186"><B>encoding names</B></A>
<DD><A HREF="encoding.htm#M9" NAME="L187"><B>encoding system</B> ?<I>encoding</I>?</A>
</DL>
<DD><A HREF="encoding.htm#M10" NAME="L188">EXAMPLE</A>
<DD><A HREF="encoding.htm#M11" NAME="L189">SEE ALSO</A>
<DD><A HREF="encoding.htm#M12" NAME="L190">KEYWORDS</A>
</DL><HR>
<H3><A NAME="M2">NAME</A></H3>
encoding - Manipulate encodings
<H3><A NAME="M3">SYNOPSIS</A></H3>
<B>encoding </B><I>option</I> ?<I>arg arg ...</I>?<BR>
<H3><A NAME="M4">INTRODUCTION</A></H3>
Strings in Tcl are encoded using 16-bit Unicode characters. Different
operating system interfaces or applications may generate strings in
other encodings such as Shift-JIS. The <B>encoding</B> command helps
to bridge the gap between Unicode and these other formats.
<H3><A NAME="M5">DESCRIPTION</A></H3>
Performs one of several encoding-related operations, depending on
<I>option</I>. The legal <I>option</I>s are:
<P>
<DL>
<P><DT><A NAME="M6"><B>encoding convertfrom ?</B><I>encoding</I>? <I>data</I></A><DD>
Convert <I>data</I> to Unicode from the specified <I>encoding</I>. The
characters in <I>data</I> are treated as binary data, where the lower
8 bits of each character are taken as a single byte. The resulting
sequence of bytes is treated as a string in the specified
<I>encoding</I>. If <I>encoding</I> is not specified, the current
system encoding is used.
<P><DT><A NAME="M7"><B>encoding convertto ?</B><I>encoding</I>? <I>string</I></A><DD>
Convert <I>string</I> from Unicode to the specified <I>encoding</I>.
The result is a sequence of bytes that represents the converted
string. Each byte is stored in the lower 8 bits of a Unicode
character. If <I>encoding</I> is not specified, the current
system encoding is used.
<P><DT><A NAME="M8"><B>encoding names</B></A><DD>
Returns a list containing the names of all of the encodings that are
currently available.
<P><DT><A NAME="M9"><B>encoding system</B> ?<I>encoding</I>?</A><DD>
Set the system encoding to <I>encoding</I>. If <I>encoding</I> is
omitted, the command returns the current system encoding. The
system encoding is used whenever Tcl passes strings to system calls.
<P></DL>
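<P>
The options above can be combined as in the following minimal sketch.
It assumes that the <B>iso8859-1</B> encoding is available; the
variable names (<I>original</I>, <I>bytes</I>, <I>restored</I>,
<I>known</I>, <I>current</I>) are chosen for illustration only and are
not part of the command.
<PRE># Illustrative sketch; all variable names are hypothetical.
set original "caf\u00E9"

# convertto yields a byte sequence, one byte per result character.
# In iso8859-1 the character U+00E9 is expected to become the byte 0xE9.
set bytes [encoding convertto iso8859-1 $original]

# convertfrom interprets those bytes in the named encoding and
# returns the corresponding Unicode string.
set restored [encoding convertfrom iso8859-1 $bytes]

# encoding names returns a list, so lsearch can test for membership.
set known [expr {[lsearch -exact [encoding names] iso8859-1] != -1}]

# With no argument, encoding system only reports the current setting;
# it does not change it.
set current [encoding system]</PRE>
Converting to an encoding and back with the same encoding name should
reproduce the original string, provided every character in the string
can be represented in that encoding.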
<H3><A NAME="M10">EXAMPLE</A></H3>
It is common practice to write script files using a text editor that
produces output in the euc-jp encoding, which represents ASCII
characters as single bytes and Japanese characters as two bytes. This
makes it easy to embed literal strings that correspond to non-ASCII
characters by simply typing the strings in place in the script.
However, because the <B><A HREF="../TkCmd/source.htm">source</A></B> command always reads files using the
ISO8859-1 encoding, Tcl will treat each byte in the file as a separate
character that maps to the 00 page in Unicode. The
resulting Tcl strings will not contain the expected Japanese
characters. Instead, they will contain a sequence of Latin-1
characters that correspond to the bytes of the original string. The
<B>encoding</B> command can be used to convert this string to the
expected Japanese Unicode characters. For example,
<PRE>set s [encoding convertfrom euc-jp "\xA4\xCF"]</PRE>
would return the Unicode string "\u306F", which is the Hiragana
letter HA.
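<P>
As a rough check of the conversion described above, the following
sketch inspects the code point of the result and converts it back to
bytes. It assumes that the <B>euc-jp</B> encoding is installed with
Tcl; the variable names (<I>misread</I>, <I>ha</I>, <I>codepoint</I>,
<I>hex</I>, <I>back</I>) are hypothetical.
<PRE># Illustrative sketch; assumes the euc-jp encoding is available.
# Two Latin-1 characters standing in for the two bytes of the
# euc-jp sequence, as they would appear after an ISO8859-1 read.
set misread "\xA4\xCF"

# Reinterpret the lower 8 bits of each character as euc-jp data.
set ha [encoding convertfrom euc-jp $misread]

# Extract the code point of the first character and format it in
# hexadecimal; per the text above this should be 306F.
scan $ha %c codepoint
set hex [format %04X $codepoint]

# Converting back to euc-jp should give the original two bytes.
set back [encoding convertto euc-jp $ha]</PRE>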
<H3><A NAME="M11">SEE ALSO</A></H3>
<B><A HREF="../TkLib/Encoding.htm">Tcl_GetEncoding</A></B>
<H3><A NAME="M12">KEYWORDS</A></H3>
<A href="../Keywords/E.htm#encoding">encoding</A>
<HR><PRE>
<A HREF="../copyright.htm">Copyright</A> © 1998 by Scriptics Corporation.
<A HREF="../copyright.htm">Copyright</A> © 1995-1997 Roger E. Critchlow Jr.</PRE>
</BODY></HTML>