projman/hlp/en/tcl/encoding.htm

<HTML><HEAD><TITLE>Tcl Built-In Commands - encoding manual page</TITLE></HEAD><BODY>
<DL>
<DD><A HREF="encoding.htm#M2" NAME="L180">NAME</A>
<DL><DD>encoding - Manipulate encodings</DL>
<DD><A HREF="encoding.htm#M3" NAME="L181">SYNOPSIS</A>
<DL>
<DD><B>encoding </B><I>option</I> ?<I>arg arg ...</I>?
</DL>
<DD><A HREF="encoding.htm#M4" NAME="L182">INTRODUCTION</A>
<DD><A HREF="encoding.htm#M5" NAME="L183">DESCRIPTION</A>
<DL>
<DD><A HREF="encoding.htm#M6" NAME="L184"><B>encoding convertfrom ?</B><I>encoding</I>? <I>data</I></A>
<DD><A HREF="encoding.htm#M7" NAME="L185"><B>encoding convertto ?</B><I>encoding</I>? <I>string</I></A>
<DD><A HREF="encoding.htm#M8" NAME="L186"><B>encoding names</B></A>
<DD><A HREF="encoding.htm#M9" NAME="L187"><B>encoding system</B> ?<I>encoding</I>?</A>
</DL>
<DD><A HREF="encoding.htm#M10" NAME="L188">EXAMPLE</A>
<DD><A HREF="encoding.htm#M11" NAME="L189">SEE ALSO</A>
<DD><A HREF="encoding.htm#M12" NAME="L190">KEYWORDS</A>
</DL><HR>
<H3><A NAME="M2">NAME</A></H3>
encoding - Manipulate encodings
<H3><A NAME="M3">SYNOPSIS</A></H3>
<B>encoding </B><I>option</I> ?<I>arg arg ...</I>?<BR>
<H3><A NAME="M4">INTRODUCTION</A></H3>
Strings in Tcl are encoded using 16-bit Unicode characters.  Different
operating system interfaces or applications may generate strings in
other encodings such as Shift-JIS.  The <B>encoding</B> command helps
to bridge the gap between Unicode and these other formats.

<H3><A NAME="M5">DESCRIPTION</A></H3>
Performs one of several encoding-related operations, depending on
<I>option</I>.  The legal <I>option</I>s are:
<P>
<DL>
<P><DT><A NAME="M6"><B>encoding convertfrom ?</B><I>encoding</I>? <I>data</I></A><DD>
Convert <I>data</I> to Unicode from the specified <I>encoding</I>.  The
characters in <I>data</I> are treated as binary data where the lower
8 bits of each character are taken as a single byte.  The resulting
sequence of bytes is treated as a string in the specified
<I>encoding</I>.  If <I>encoding</I> is not specified, the current
system encoding is used.
<P><DT><A NAME="M7"><B>encoding convertto ?</B><I>encoding</I>? <I>string</I></A><DD>
Convert <I>string</I> from Unicode to the specified <I>encoding</I>.
The result is a sequence of bytes that represents the converted
string.  Each byte is stored in the lower 8 bits of a Unicode
character.  If <I>encoding</I> is not specified, the current
system encoding is used.
<P><DT><A NAME="M8"><B>encoding names</B></A><DD>
Returns a list containing the names of all of the encodings that are
currently available. 
<P><DT><A NAME="M9"><B>encoding system</B> ?<I>encoding</I>?</A><DD>
Set the system encoding to <I>encoding</I>. If <I>encoding</I> is
omitted then the command returns the current system encoding.  The
system encoding is used whenever Tcl passes strings to system calls.

<P></DL>
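<P>
As a minimal sketch of how the options above fit together (the variable
names are illustrative, and it assumes that the <B>utf-8</B> encoding is
among those reported by <B>encoding names</B>):
<PRE>
# Query the current system encoding without changing it.
set sys [encoding system]

# Check that the encoding we want is available (assumed here: utf-8).
if {[lsearch -exact [encoding names] utf-8] == -1} {
    error "utf-8 encoding is not available"
}

# Convert a Unicode string to utf-8 bytes and back again.
set text "caf\u00E9"
set bytes [encoding convertto utf-8 $text]
set back [encoding convertfrom utf-8 $bytes]

puts [string length $text]    ;# 4 characters
puts [string length $bytes]   ;# 5 bytes: U+00E9 takes two bytes in utf-8
puts [string equal $text $back]
</PRE>
The byte count differs from the character count because
<B>encoding convertto</B> returns one element per encoded byte, not per
original character.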
<H3><A NAME="M10">EXAMPLE</A></H3>
It is common practice to write script files using a text editor that
produces output in the euc-jp encoding, which represents the ASCII
characters as single bytes and Japanese characters as two bytes.  This
makes it easy to embed literal strings that correspond to non-ASCII
characters by simply typing the strings in place in the script.
However, because the <B><A HREF="../TkCmd/source.htm">source</A></B> command always reads files using the
ISO8859-1 encoding, Tcl will treat each byte in the file as a separate
character that maps to the 00 page in Unicode.  The
resulting Tcl strings will not contain the expected Japanese
characters.  Instead, they will contain a sequence of Latin-1
characters that correspond to the bytes of the original string.  The
<B>encoding</B> command can be used to convert this string to the
expected Japanese Unicode characters.  For example,
<PRE>set s [encoding convertfrom euc-jp &quot;&#92;xA4&#92;xCF&quot;]</PRE>
would return the Unicode string &quot;&#92;u306F&quot;, which is the Hiragana
letter HA.
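<P>
The reverse direction can be sketched the same way.  Assuming the
<B>euc-jp</B> encoding is available, converting the Unicode character
back with <B>encoding convertto</B> should give the original two bytes
(the variable names here are illustrative):
<PRE>
# Decode the two euc-jp bytes into one Unicode character.
set s [encoding convertfrom euc-jp "\xA4\xCF"]

# Encode it again and show the resulting bytes as hexadecimal.
set bytes [encoding convertto euc-jp $s]
binary scan $bytes H* hex
puts $hex
</PRE>
This should print a4cf, the same two bytes that were passed to
<B>encoding convertfrom</B>.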

<H3><A NAME="M11">SEE ALSO</A></H3>
<B><A HREF="../TkLib/Encoding.htm">Tcl_GetEncoding</A></B>
<H3><A NAME="M12">KEYWORDS</A></H3>
<A href="../Keywords/E.htm#encoding">encoding</A>
<HR><PRE>
<A HREF="../copyright.htm">Copyright</A> &#169; 1998 by Scriptics Corporation.
<A HREF="../copyright.htm">Copyright</A> &#169; 1995-1997 Roger E. Critchlow Jr.</PRE>
</BODY></HTML>
Initial release 2015-10-19 13:27:31 +03:00