Translated and annotated by Jony Rosenne (rosennej@qsm.co.il), January 2001.
Translator Notes:
This standard defines the implementation of Hebrew in identification cards with integrated circuits. It is based on the international standards and specifications relating to "smart cards".
Identification cards with integrated circuits may contain various types of data, for example textual data such as name and address. Textual data must be represented in a uniform manner to allow interoperability between various applications.
The representation of the data in these cards is based on the ASN.1 standard.
1.1 This standard defines the uniform implementation of the Hebrew language in identification cards with integrated circuits (ICCs - Integrated Circuit Cards), the rendering of the Hebrew information in them in accordance with the relevant international standards and specifications, and the manner of presenting this information.
1.2 This standard is intended for users of cards that include textual data, and for users of similar technologies, such as contactless cards and integrated circuits that are not integrated in cards but in other devices.
| SI 1080.1 (1996) | Information Technology - Definition of Terms: Basic Terminology |
| SI 1080.4 (1996) | Information Technology - Definition of Terms: Data Organization |
| SI 1311 (1989) | Information Processing: ISO 8 bit coded character set for information interchange. [Note: This standard is equivalent to ISO 8859-8.] |
| SI 1489 (1992) | Architecture for implementation of the Hebrew language in telematic systems |
| ISO 7816-4:1995 | Information technology - Identification cards - Integrated circuit(s) cards with contacts - Part 4: Interindustry commands for interchange |
| ISO/IEC 7816-6:1996 | Identification cards - Integrated circuit(s) cards with contacts - Part 6: Interindustry data elements |
| ISO/IEC 8825:1990 | Information technology - Open Systems Interconnection - Specification of Basic Encoding Rules for Abstract Syntax Notation One (ASN.1) |
| ISO/IEC 8825-1:1995 | Information technology - Open Systems Interconnection - Specification of Basic Encoding Rules (BER), Canonical Encoding Rules (CER) and Distinguished Encoding Rules (DER) |
| ISO 8859-1:1987 | Information Processing - 8-bit Single-Byte Coded Graphic Character Sets - Part 1: Latin Alphabet No. 1 |
| ISO 8859-8:1988 | Information Processing - 8-bit Single-Byte Coded Graphic Character Sets - Part 8: Latin/Hebrew Alphabet |
| ISO/IEC 10646-1:1993 | Information Technology - Universal Multiple-Octet Coded Character Set (UCS) - Part 1: Architecture and Basic Multilingual Plane |
| Unicode 2.0 | The Unicode Standard, Version 2.0 (1996) |
Directional formatting codes: seven characters that guide the implicit bidirectional algorithm in special cases, supplementing its analysis of the properties of the text characters. These codes are defined in ISO/IEC 10646-1, Annex D. [Note: See this link for the directional formatting codes].
The data in the cards shall be encoded in ASN.1 format, as follows:
ISO 7816-4 specifies the TLV (Tag - Length - Value) method, defined by ISO 8825 and 8825-1. ISO/IEC 7816-6 specifies the encoding of standard data fields, such as name and address.
Note: ISO 7816-6 specifies that the name shall be encoded in the Latin script according to ISO 8859-1, and that for languages such as Hebrew, which use a different script, the data shall be transliterated according to a suitable standard (ISO is preparing ISO 259-3).
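[Note: The TLV layout can be illustrated with a minimal Python sketch. It is not taken from the standard. The helper names are hypothetical, the tag '5F20' for the cardholder name is an assumption to be checked against ISO/IEC 7816-6, and the sample name is a placeholder. Only the short and long definite length forms are handled.]

```python
def encode_length(n: int) -> bytes:
    """Encode a BER length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def encode_tlv(tag: bytes, value: bytes) -> bytes:
    """Concatenate the tag, length and value octets."""
    return tag + encode_length(len(value)) + value


def decode_tlv(data: bytes) -> tuple[bytes, bytes, bytes]:
    """Split one TLV object off the front of data: (tag, value, rest)."""
    i = 1
    if data[0] & 0x1F == 0x1F:      # multi-octet tag
        while data[i] & 0x80:       # bit 8 set: more tag octets follow
            i += 1
        i += 1
    tag = data[:i]
    first = data[i]
    i += 1
    if first < 0x80:                # short form
        length = first
    else:                           # long form: low 7 bits give the octet count
        count = first & 0x7F
        length = int.from_bytes(data[i:i + count], "big")
        i += count
    return tag, data[i:i + length], data[i + length:]


# Assumed tag for the cardholder name; placeholder Latin-script value.
CARDHOLDER_NAME_TAG = bytes.fromhex("5F20")
name_object = encode_tlv(CARDHOLDER_NAME_TAG, "ISRAEL ISRAELI".encode("latin-1"))
print(name_object.hex(" "))
```

The decoder returns the remaining octets as well, so that a sequence of data objects can be walked by calling it repeatedly.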
In accordance with these standards, Hebrew may be encoded in two ways:
For BMPString data the UCS is encoded as two octets (16 bits) per character. The Universal Character Set (UCS) is defined in ISO/IEC 10646-1, which is equivalent to Unicode.
The values 32 to 95 are assigned the same characters as the corresponding values of ISO 8859-1, which are also identical to US-ASCII and to SI 1311.
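[Note: As a rough illustration of the two-octet form, the following Python sketch encodes a Hebrew word as the value octets of a BMPString. The function names are hypothetical, and the use of Python's "utf-16-be" codec, with characters above U+FFFF rejected, is an assumption about how the two-octet form is best produced in that language.]

```python
def encode_bmp_string(text: str) -> bytes:
    """Encode text as two big-endian octets per character (BMP only)."""
    for ch in text:
        if ord(ch) > 0xFFFF:
            raise ValueError(f"U+{ord(ch):04X} is outside the Basic Multilingual Plane")
    return text.encode("utf-16-be")


def decode_bmp_string(octets: bytes) -> str:
    """Decode two-octet-per-character data back to text."""
    if len(octets) % 2:
        raise ValueError("BMPString data must have an even number of octets")
    return octets.decode("utf-16-be")


shalom = "\u05E9\u05DC\u05D5\u05DD"   # shin, lamed, vav, final mem
octets = encode_bmp_string(shalom)
print(octets.hex(" "))                # 05 e9 05 dc 05 d5 05 dd
```

The characters are stored in logical (typing) order; the display order is left to the bidirectional algorithm.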
The UCS includes the Hebrew letters and the directional formatting codes. [Note: See these links for the Hebrew letters and for the directional formatting codes].
The UCS requires implicit directionality. [Note: This means the Unicode bidirectional algorithm. For an up-to-date specification, see Unicode Standard Annex #9, The Bidirectional Algorithm.]
Notes:
1. The UCS is not single valued: it contains a large number of precomposed characters. A precomposed character is equivalent to the sequence of characters of which it is composed. For example, the UCS contains the Hebrew letters with Dagesh as precomposed characters. Each such character is equivalent to the letter itself followed by Dagesh.
2. This standard recommends against using the Hebrew precomposed characters for encoding the information.
3. If the application [also] requires Arabic, the use of the UCS is required.
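[Note: The equivalence described in note 1 can be demonstrated with Python's standard unicodedata module. This is only a sketch: the standard does not name a normalization form, so the use of NFD here is an assumption, and the function names are hypothetical. NFD also decomposes precomposed characters of other scripts, such as accented Latin letters.]

```python
import unicodedata

BET_WITH_DAGESH = "\uFB31"   # precomposed HEBREW LETTER BET WITH DAGESH
BET = "\u05D1"
DAGESH = "\u05BC"


def has_precomposed_hebrew(text: str) -> bool:
    """True if text contains a Hebrew presentation form (U+FB1D..U+FB4F)
    that has a canonical decomposition."""
    for ch in text:
        decomposition = unicodedata.decomposition(ch)
        if (0xFB1D <= ord(ch) <= 0xFB4F
                and decomposition
                and not decomposition.startswith("<")):
            return True
    return False


def decompose(text: str) -> str:
    """Replace precomposed characters with their equivalent sequences."""
    return unicodedata.normalize("NFD", text)


print(has_precomposed_hebrew(BET_WITH_DAGESH))        # True
print(decompose(BET_WITH_DAGESH) == BET + DAGESH)     # True
```

An application following note 2 could run such a check, or the decomposition itself, before writing text to the card.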
When the UCS is not required, for example when the application uses only Hebrew and Latin characters, the text shall be encoded according to SI 1311 using implicit directionality.
The data type HebrewString indicates the use of 8 bits per character according to ISO 8859-8, which is identical to SI 1311.
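[Note: The choice between the two encodings might be sketched in Python as follows. The function name and the returned type labels are hypothetical, and relying on Python's "iso8859_8" codec to decide whether the 8-bit encoding suffices is an assumption, not something this standard prescribes.]

```python
def encode_text_field(text: str) -> tuple[str, bytes]:
    """Return (type name, value octets), preferring the 8-bit encoding."""
    try:
        return "HebrewString", text.encode("iso8859_8")
    except UnicodeEncodeError:
        # Some character is outside ISO 8859-8 (Arabic, for example),
        # so fall back to two octets per character.
        return "BMPString", text.encode("utf-16-be")


kind, octets = encode_text_field("\u05E9\u05DC\u05D5\u05DD")   # shin, lamed, vav, final mem
print(kind, octets.hex(" "))    # HebrewString f9 ec e5 ed

kind, octets = encode_text_field("\u0633")                     # Arabic letter seen
print(kind, octets.hex(" "))    # BMPString 06 33
```

Text that mixes Hebrew with Latin letters and digits stays within the 8-bit encoding, one octet per character.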
The implementation of Hebrew in ICCs shall follow implicit directionality. Implicit directionality renders the text according to the directionality property of each character and the base directionality of the block.
The bidirectional algorithm is defined in SI 1489 [Note: This is equivalent to the Unicode bidirectional algorithm. For an up-to-date specification, see Unicode Standard Annex #9, The Bidirectional Algorithm].
The default base directionality for Hebrew data elements is right-to-left.
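[Note: The directionality property on which the implicit algorithm relies can be inspected with Python's standard unicodedata module. The sketch below only lists the property of each character; it does not implement the bidirectional algorithm, and the function name is hypothetical.]

```python
import unicodedata


def bidi_classes(text: str) -> list[str]:
    """Return the bidirectional class of each character, in logical order."""
    return [unicodedata.bidirectional(ch) for ch in text]


# Latin letter, Hebrew letter alef, space, digit, exclamation mark
sample = "A\u05D0 1!"
print(bidi_classes(sample))    # ['L', 'R', 'WS', 'EN', 'ON']
```

Strong characters (L and R) set the direction of their runs, while the digit, the space and the punctuation mark take their direction from the surrounding text and from the base directionality of the block.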
The UCS includes 7 directional formatting codes. [Note: See this link for the directional formatting codes]. The characters RLM and LRM are invisible characters that specify directionality and influence the behavior of neutral characters. It is recommended not to use the other directional formatting codes.
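[Note: An application wishing to follow this recommendation could scan text for the other five codes before storing it. The following Python sketch is one possible check; the function and constant names are hypothetical.]

```python
import unicodedata

LRM = "\u200E"   # LEFT-TO-RIGHT MARK
RLM = "\u200F"   # RIGHT-TO-LEFT MARK

# The five directional formatting codes whose use is not recommended.
DISCOURAGED_CODES = {
    "\u202A": "LRE",   # LEFT-TO-RIGHT EMBEDDING
    "\u202B": "RLE",   # RIGHT-TO-LEFT EMBEDDING
    "\u202C": "PDF",   # POP DIRECTIONAL FORMATTING
    "\u202D": "LRO",   # LEFT-TO-RIGHT OVERRIDE
    "\u202E": "RLO",   # RIGHT-TO-LEFT OVERRIDE
}


def find_discouraged_codes(text: str) -> list[tuple[int, str]]:
    """Return (index, abbreviation) for each discouraged code in text."""
    return [(i, DISCOURAGED_CODES[ch])
            for i, ch in enumerate(text) if ch in DISCOURAGED_CODES]


# RLM and LRM behave as invisible strong characters.
print(unicodedata.bidirectional(RLM), unicodedata.bidirectional(LRM))   # R L
print(find_discouraged_codes("\u05D0" + RLM + "!"))                     # []
print(find_discouraged_codes("\u202Eabc\u202C"))                        # [(0, 'RLO'), (4, 'PDF')]
```

Placing an RLM after a trailing neutral character, as in the second example, is the typical way to keep that character with the preceding Hebrew text.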
© 2001 Jonathan Rosenne. All rights reserved. Last modified January 5, 2001.
The latest version of this document resides at http://www.qsm.co.il/Hebrew/si4424e.htm
Please send your comments to Jonathan (Jony) Rosenne, rosennej@qsm.co.il