PHP Cross Reference of Unnamed Project

/lib/typo3/csconvtbl/ -> readme.txt (source)

   1  Charset Conversion Tables
   2  
   3  
   4  The files in this directory contain conversion tables from a number of charsets (corresponding to the filenames) to UNICODE.
   5  They are used by the TYPO3 class "t3lib_cs", which reads these tables when asked to convert a string from one charset to another.
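Because every table maps a local charset value to UNICODE, a conversion between two charsets can go through UNICODE as a pivot. The following is a minimal Python sketch of that idea for single-byte charsets only. It is not the t3lib_cs implementation: the function name `convert`, the "?" fallback, and the two tiny excerpt tables are assumptions made for illustration.

```python
from typing import Dict

# Tiny excerpts only (illustrative): local byte value -> Unicode code point.
ISO_8859_15 = {0x41: 0x0041, 0xA4: 0x20AC}   # "A", EURO SIGN
WINDOWS_1252 = {0x41: 0x0041, 0x80: 0x20AC}  # "A", EURO SIGN


def convert(data: bytes, source: Dict[int, int], target: Dict[int, int],
            fallback: int = 0x3F) -> bytes:
    """Convert single-byte text from one charset to another via Unicode.

    `fallback` (assumed here to be "?") replaces anything that is missing
    from either table.
    """
    # Invert the target table: Unicode code point -> local byte value.
    to_local = {code_point: local for local, code_point in target.items()}
    out = bytearray()
    for byte in data:
        code_point = source.get(byte)          # local -> Unicode
        out.append(to_local.get(code_point, fallback))  # Unicode -> local
    return bytes(out)


# The euro sign is 0xA4 in iso-8859-15 and 0x80 in windows-1252.
result = convert(b"A\xa4", ISO_8859_15, WINDOWS_1252)
```

Multibyte tables such as big5.tbl would need a byte-sequence-aware loop, which this sketch leaves out.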
   6  
   7  Conversion tables are reproduced from http://www.microsoft.com/globaldev/reference/cphome.mspx
   8  Found overview on this page: http://www.i18nguy.com/unicode/codepages.html
   9  A mirror of Czyborra's pages: http://aspell.net/charsets/
  10  
  11  Many further mapping tables are available here as well:
  12  http://www.unicode.org/Public/MAPPINGS/
  13  
  14  
  15  
  16  
  17  
  18  PARSING:
  19  The conversion table files are parsed line by line, extracting values according to either of these syntaxes:
  20  
  21  Syntax 1:
  22  [Local charset value, hex] = U+[UNICODE hex number] : [descriptive text, ignored]
  23  
  24  Example:
  25  A0 = U+00A0 : NO-BREAK SPACE
  26  (e.g. iso-8859-1.tbl)
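A line in Syntax 1 can be picked apart with a single regular expression. This Python sketch is only an illustration of the format described above; the name `parse_syntax_1` and the exact pattern are assumptions, not code taken from t3lib_cs.

```python
import re

# [local hex] = U+[unicode hex] : [description, ignored]
SYNTAX_1 = re.compile(r"^([0-9A-Fa-f]+)\s*=\s*U\+([0-9A-Fa-f]+)")


def parse_syntax_1(line):
    """Return (local value, Unicode code point), or None if the line does not match."""
    match = SYNTAX_1.match(line.strip())
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


pair = parse_syntax_1("A0 = U+00A0 : NO-BREAK SPACE")  # (0xA0, 0xA0)
```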
  27  
  28  Syntax 2:
  29  0x[Local charset value, hex]    0x[UNICODE hex number]        [descriptive text, ignored]
  30  
  31  Example:
  32  0xA0  0x00A0      NO-BREAK SPACE
  33  (e.g. big5.tbl)
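Syntax 2 can be handled the same way, with whitespace as the separator. Again a hedged Python sketch: `parse_syntax_2` and its pattern are illustrative assumptions. Lines that carry no second hex value simply fail to match and yield None here.

```python
import re

# 0x[local hex] <whitespace> 0x[unicode hex] <whitespace> [description, ignored]
SYNTAX_2 = re.compile(r"^0x([0-9A-Fa-f]+)\s+0x([0-9A-Fa-f]+)")


def parse_syntax_2(line):
    """Return (local value, Unicode code point), or None if the line does not match."""
    match = SYNTAX_2.match(line.strip())
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


single = parse_syntax_2("0xA0  0x00A0      NO-BREAK SPACE")        # (0xA0, 0xA0)
double = parse_syntax_2("0xA140\t0x3000\t# IDEOGRAPHIC SPACE")     # two-byte local value
```

The local value is kept as one integer, so a two-byte code such as 0xA140 stays a single table key.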
  34  
  35  
  36  Lines beginning with "#" or empty lines are ignored.
  37  The syntax is auto-detected based on the first line found in the file.
  38  Syntax 2 is taken directly from the files at http://www.unicode.org/Public/MAPPINGS/ so you can probably take any charmap found there, copy it into the csconvtbl/ folder, and it will be ready for use.
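The rules above (skip comments and empty lines, detect the syntax from the first remaining line, then extract the two hex values) can be sketched in Python as follows. How t3lib_cs performs the detection is not stated here, so the "starts with 0x" test, the name `parse_table`, and the choice to silently skip non-matching lines are all assumptions.

```python
import re
from typing import Dict, Iterable

PATTERNS = {
    1: re.compile(r"^([0-9A-Fa-f]+)\s*=\s*U\+([0-9A-Fa-f]+)"),   # Syntax 1
    2: re.compile(r"^0x([0-9A-Fa-f]+)\s+0x([0-9A-Fa-f]+)"),      # Syntax 2
}


def parse_table(lines: Iterable[str]) -> Dict[int, int]:
    """Build a {local value: Unicode code point} table from .tbl lines."""
    table: Dict[int, int] = {}
    syntax = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # comments and empty lines are ignored
        if syntax is None:
            # Auto-detect once, from the first meaningful line (assumed heuristic).
            syntax = 2 if line.startswith("0x") else 1
        match = PATTERNS[syntax].match(line)
        if match:
            table[int(match.group(1), 16)] = int(match.group(2), 16)
    return table


table_1 = parse_table(["# comment", "", "A0 = U+00A0 : NO-BREAK SPACE"])
table_2 = parse_table(["# comment", "0xA0\t0x00A0\t# NO-BREAK SPACE"])
```

For a file on disk, the same function could be fed an open file object, since iterating over it yields lines.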
  39  
  40  
  41  
  42  
  43  
  44  
  45  
  46  
  47  
  48  
  49  
  50  
  51  
  52  
  53  
  54  INDEX:
  55  
  56  iso-8859-1.tbl
  57  ISO Character Set 8859-1 (Latin 1)
  58  http://www.microsoft.com/globaldev/reference/iso/28591.htm
  59  
  60  iso-8859-2.tbl
  61  ISO Character Set 8859-2 (Latin 2)
  62  http://www.microsoft.com/globaldev/reference/iso/28592.htm
  63  
  64  iso-8859-3.tbl
  65  ISO Character Set 8859-3 (Latin 3)
  66  http://www.microsoft.com/globaldev/reference/iso/28593.htm
  67  
  68  iso-8859-4.tbl
  69  ISO Character Set 8859-4 (Baltic)
  70  http://www.microsoft.com/globaldev/reference/iso/28594.htm
  71  
  72  iso-8859-5.tbl
  73  ISO Character Set 8859-5 (Cyrillic)
  74  http://www.microsoft.com/globaldev/reference/iso/28595.htm
  75  
  76  iso-8859-6.tbl
  77  ISO Character Set 8859-6 (Arabic)
  78  http://www.microsoft.com/globaldev/reference/iso/28596.htm
  79  
  80  iso-8859-7.tbl
  81  ISO Character Set 8859-7 (Greek)
  82  http://www.microsoft.com/globaldev/reference/iso/28597.htm
  83  
  84  iso-8859-8.tbl
  85  ISO Character Set 8859-8 (Hebrew)
  86  http://www.microsoft.com/globaldev/reference/iso/28598.htm
  87  
  88  iso-8859-9.tbl
  89  ISO Character Set 8859-9 (Turkish)
  90  http://www.microsoft.com/globaldev/reference/iso/28599.htm
  91  
  92  iso-8859-10.tbl
  93  ISO Character Set 8859-10
  94  http://www.unicode.org/Public/MAPPINGS/ISO8859/8859-10.TXT
  95  
  96  iso-8859-11.tbl
  97  ISO Character Set 8859-11 (Thai)
  98  http://czyborra.com/charsets/iso8859.html
  99  http://aspell.net/charsets/iso8859.html
 100  http://www.unicode.org/Public/MAPPINGS/ISO8859/8859-11.TXT
 101  
 102  iso-8859-13.tbl
 103  ISO Character Set 8859-13 (Lithuanian)
 104  http://www.unicode.org/Public/MAPPINGS/ISO8859/8859-13.TXT
 105  
 106  iso-8859-14.tbl
 107  ISO Character Set 8859-14 (Celtic)
 108  http://www.unicode.org/Public/MAPPINGS/ISO8859/8859-14.TXT
 109  
 110  iso-8859-15.tbl
 111  ISO Character Set 8859-15 (Latin 9)
 112  http://www.microsoft.com/globaldev/reference/iso/28605.htm
 113  
 114  iso-8859-16.tbl
 115  ISO Character Set 8859-16 (Romanian)
 116  http://www.unicode.org/Public/MAPPINGS/ISO8859/8859-16.TXT
 117  
 118  windows-1250.tbl
 119  Microsoft Windows Codepage : 1250 (Central Europe)
 120  http://www.microsoft.com/globaldev/reference/sbcs/1250.htm
 121  
 122  windows-1251.tbl
 123  Microsoft Windows Codepage : 1251 (Cyrillic)
 124  http://www.microsoft.com/globaldev/reference/sbcs/1251.htm
 125  
 126  windows-1252.tbl
 127  Microsoft Windows Codepage : 1252 (Latin I)
 128  http://www.microsoft.com/globaldev/reference/sbcs/1252.htm
 129  
 130  windows-1253.tbl
 131  Microsoft Windows Codepage : 1253 (Greek)
 132  http://www.microsoft.com/globaldev/reference/sbcs/1253.htm
 133  
 134  windows-1254.tbl
 135  Microsoft Windows Codepage : 1254 (Turkish)
 136  http://www.microsoft.com/globaldev/reference/sbcs/1254.htm
 137  
 138  windows-1255.tbl
 139  Microsoft Windows Codepage : 1255 (Hebrew)
 140  http://www.microsoft.com/globaldev/reference/sbcs/1255.htm
 141  
 142  windows-1256.tbl
 143  Microsoft Windows Codepage : 1256 (Arabic)
 144  http://www.microsoft.com/globaldev/reference/sbcs/1256.htm
 145  
 146  windows-1257.tbl
 147  Microsoft Windows Codepage : 1257 (Baltic)
 148  http://www.microsoft.com/globaldev/reference/sbcs/1257.htm
 149  
 150  windows-1258.tbl
 151  Microsoft Windows Codepage : 1258 (Viet Nam)
 152  http://www.microsoft.com/globaldev/reference/sbcs/1258.htm
 153  
 154  windows-874.tbl
 155  Microsoft Windows Codepage : 874 (Thai)
 156  http://www.microsoft.com/globaldev/reference/sbcs/874.htm
 157  
 158  shift_jis.tbl
 159  Microsoft Windows Codepage : 932 (Japanese Shift-JIS)
 160  http://www.microsoft.com/globaldev/reference/dbcs/932.htm
 161  (Multibyte)
 162  
 163  gb2312.tbl
 164  Microsoft Windows Codepage : 936 (Simplified Chinese GBK)
 165  gb2312 936 Chinese Simplified (GB2312)
 166  gb_2312-80 936 Chinese Simplified (GB2312)
 167  http://www.microsoft.com/globaldev/reference/dbcs/936.htm
 168  (Multibyte)
 169  Note: this is a MS-specific superset of the real GB2312
 170  
 171  euc-kr.tbl
 172  Microsoft Windows Codepage : 949 (Korean EUC-KR)
 173  http://www.microsoft.com/globaldev/reference/dbcs/949.htm
 174  (Multibyte)
 175  Note: this is a MS-specific superset of the real EUC-KR
 176  
 177  big5.tbl
 178  Microsoft Windows Codepage : 950 (Traditional Chinese Big5)
 179  http://www.microsoft.com/globaldev/reference/dbcs/950.htm
 180  (Multibyte)
 181  Note: this is a MS-specific superset of the real Big5
 182  
 183  
 184  koi8-r.tbl
 185  Cyrillic (Russian)
 186  http://www.unicode.org/Public/MAPPINGS/VENDORS/MISC/KOI8-R.TXT
 187  


Generated: Thu Aug 11 10:00:09 2016 Cross-referenced by PHPXref 0.7.1