Tamil Script Code for Information Interchange
Latest revision as of 22:32, 2 August 2025
Tamil Script Code for Information Interchange (TSCII) is a coding scheme for representing the Tamil script. The lower 128 codepoints are plain ASCII, and the upper 128 codepoints are TSCII-specific. After many years of use on the Internet by private agreement only, it was registered with the IANA in 2007.[1]
TSCII encodes characters in visual (written) order, paralleling the Tamil typewriter. Unicode instead uses logical-order encoding for Tamil, following ISCII. This contrasts with Thai, for which Unicode adopted the visual-order encoding inherited from TIS-620.
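The difference between the two orderings can be illustrated with a small Python sketch. It works purely on Unicode characters and does not use any actual TSCII byte values; the function name `to_visual_order` is hypothetical, and the reordering is a simplification that assumes each vowel sign directly follows a single consonant.

```python
import unicodedata

# Tamil vowel signs that are written to the LEFT of the consonant,
# even though Unicode stores them AFTER it (logical order).
LEFT_SIGNS = {"\u0BC6", "\u0BC7", "\u0BC8"}  # signs E, EE, AI

# Two-part vowel signs: one part is written on each side of the consonant.
# The pairs follow the canonical decompositions defined by Unicode.
SPLIT_SIGNS = {
    "\u0BCA": ("\u0BC6", "\u0BBE"),  # sign O  = sign E  + sign AA
    "\u0BCB": ("\u0BC7", "\u0BBE"),  # sign OO = sign EE + sign AA
    "\u0BCC": ("\u0BC6", "\u0BD7"),  # sign AU = sign E  + AU length mark
}


def to_visual_order(text: str) -> str:
    """Rearrange logical-order Tamil text into written (visual) order.

    Simplified illustration: assumes the character just before a vowel
    sign is the consonant it attaches to.
    """
    out = []
    for ch in text:
        if ch in LEFT_SIGNS and out:
            # Move the sign in front of the preceding consonant.
            out.insert(len(out) - 1, ch)
        elif ch in SPLIT_SIGNS and out:
            left, right = SPLIT_SIGNS[ch]
            out.insert(len(out) - 1, left)  # left part before the consonant
            out.append(right)               # right part after it
        else:
            out.append(ch)
    return "".join(out)


# The syllable "ko": stored in Unicode as KA followed by VOWEL SIGN O.
logical = "\u0B95\u0BCA"
for ch in to_visual_order(logical):
    print(unicodedata.name(ch))
```

In logical order the syllable is two code points (consonant, then vowel sign); in visual order it becomes three glyph-like units in the sequence a typist would strike them, which is the arrangement a visual-order encoding stores.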
The government of Tamil Nadu endorses its own TAB/TAM standards for 8-bit encoding, and other, older encoding schemes can still be found on the web.
History
The need for a common encoding for Tamil was felt by members of various mailing-list-based forums in the mid-1990s, as multiple custom-encoded fonts were in use in those forums. Although some commercial encodings were more popular than others, the wider community did not accept them because of conflicting commercial interests. Most participants accepted Unicode as the future standard, but most desktop systems of the time could not yet handle Unicode for the Tamil language, so an interim 8-bit encoding was required.
A separate mailing list for discussing such encodings (webmasters@tamil.net) was created in 1997, starting with an email written by Dr. K. Kalyanasundaram to the popular Tamil author Sujatha, who headed the committee for standardization of the Tamil keyboard.[2] This forum quickly attracted enthusiastic participants from across the globe, including several prominent Tamil scholars. Archives of these discussions are maintained by INFITT.[3]
After TSCII was published, most members of the webmasters@tamil.net mailing list joined INFITT, a wider initiative to promote standardization and continued development in various areas of Tamil computing.
Codepage layout
Conversion tools
Text encoded in UTF-8 can be converted to TSCII using the GNU iconv tool as follows:
$ iconv -f utf-8 -t tscii hello.utf8 > hello.tscii
Conversion from TSCII to UTF-8 is done by swapping the encoding names given to the -f and -t flags.
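For scripted use, the same iconv invocation can be wrapped in a few lines of Python. This is a minimal sketch, not a reference implementation: the helper names `iconv_command` and `convert_file` are hypothetical, and it assumes an `iconv` binary on the PATH whose charset list includes TSCII (which can be checked with `iconv -l`).

```python
import subprocess


def iconv_command(src_encoding: str, dst_encoding: str, path: str) -> list:
    """Build the iconv argument list for converting the file at `path`."""
    return ["iconv", "-f", src_encoding, "-t", dst_encoding, path]


def convert_file(src_encoding: str, dst_encoding: str,
                 in_path: str, out_path: str) -> None:
    """Run iconv and write the converted bytes to `out_path`.

    Raises subprocess.CalledProcessError if iconv reports a failure,
    for example when a character has no equivalent in the target encoding.
    """
    with open(out_path, "wb") as out:
        subprocess.run(
            iconv_command(src_encoding, dst_encoding, in_path),
            stdout=out,
            check=True,
        )


# Example calls (file names are placeholders):
# convert_file("utf-8", "tscii", "hello.utf8", "hello.tscii")
# convert_file("tscii", "utf-8", "hello.tscii", "hello.roundtrip.utf8")
```

The second example call shows the reverse direction: only the two encoding names change places.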
See also
- TACE16 (Tamil All Character Encoding)
- Clip font
- Tamil keyboard
- தமிழ் 99
- InScript
- Tamil (Unicode block)
- Tamil blogosphere
References
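1. "Character set name: TSCII (TAMIL SCRIPT CODE FOR INFORMATION INTERCHANGE)". IANA. https://www.iana.org/assignments/charset-reg/TSCII
2. "A proposal for font encoding scheme for tamil". http://www.infitt.org/tscii/archives/msg00001.html
3. "Tamil Discussion at webmasters@tamil.net". http://www.infitt.org/tscii/archives/maillist.html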
External links
- TSCII Start Page
- Unicode Technical Note #15 Text conversion From TSCII 1.7 to Unicode
- INFITT (International Forum for Information Technology in Tamil)
- TSCII to Unicode Online & Webpage Conversion
- Padma – Mozilla extension for transforming TSCII to Unicode (archived 2019-10-01)
- The free etext collection at Project Madurai uses the TSCII encoding, but has already started to provide Unicode versions.
- AnyTaFont2UTF8 – an open source project for all Tamil encoding/font mapping characters, maintained by Isaiyini Tamil Community