Text Strings
Strings outside the actual textual content of a page (e.g., bookmark names and document
information) are known as text strings. They are encoded using either PDFDocEncoding
or (in more recent documents) Unicode. PDFDocEncoding is based on the ISO Latin-1
encoding. It is documented fully in Annex D of ISO 32000-1:2008.
Text strings encoded as Unicode are distinguished by their first two bytes: these will
be 254 followed by 255. This is the Unicode byte-order mark U+FEFF, which indicates
the UTF-16BE encoding. This means a PDFDocEncoding string can't begin with þ (254)
followed by ÿ (255), but this is unlikely to occur in any reasonable circumstance.
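The byte-order-mark check described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete decoder: the function name decode_text_string is hypothetical, and because Python ships no PDFDocEncoding codec, the fallback branch uses Latin-1 as an approximation. That approximation is only safe for ASCII and the common accented Latin letters; a full implementation would need the mapping table from Annex D.

```python
def decode_text_string(raw: bytes) -> str:
    """Decode the bytes of a PDF text string (hypothetical helper)."""
    # Bytes 254, 255 (0xFE 0xFF) are the byte-order mark for UTF-16BE.
    if raw[:2] == b"\xfe\xff":
        return raw[2:].decode("utf-16-be")
    # ASSUMPTION: Latin-1 stands in for PDFDocEncoding here. The two
    # differ for some byte values, so this is an approximation only.
    return raw.decode("latin-1")
```

For example, the bytes 254, 255, 0, 72, 0, 105 decode to the two-character string Hi, while a plain ASCII byte string such as Title passes through the fallback branch unchanged.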
Dates
The creation and modification dates /CreationDate and /ModDate in the document
information dictionary are examples of the PDF date format, which encodes a date in a
string, including information about the time zone.
A date string has the format:
(D:YYYYMMDDHHmmSSOHH'mm')
where the parentheses indicate a string as usual. The other parts of the date are
summarised in Table 4-6.
Table 4-6. PDF date format constituents
Portion   Meaning
YYYY      The year, in four digits, e.g., 2008.
MM        The month, in two digits from 01 to 12.
DD        The day, in two digits from 01 to 31.
HH        The hour, in two digits from 00 to 23.
mm        The minute, in two digits from 00 to 59.
SS        The second, in two digits from 00 to 59.
O         The relationship of local time to Universal Time, either +, - or Z.
          + signifies local time is later than UT, - earlier, and Z equal to
          Universal Time.
HH'       The absolute value of the offset from Universal Time in hours, in two
          digits from 00 to 23.
mm'       The absolute value of the offset from Universal Time in minutes, in two
          digits from 00 to 59.
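To make the field layout concrete, here is a minimal Python sketch that parses a date string in the full format shown above into a timezone-aware datetime. The names parse_pdf_date and _PDF_DATE are hypothetical. The sketch assumes the surrounding parentheses have already been stripped and that every field up to the O marker is present; dates with trailing fields omitted are rejected rather than guessed at.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYYMMDDHHmmSS, then O (+, - or Z), then an optional HH'mm' offset.
_PDF_DATE = re.compile(
    r"D:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"
    r"([+\-Z])(?:(\d{2})'(\d{2})')?$"
)

def parse_pdf_date(text: str) -> datetime:
    """Parse a full-form PDF date string (hypothetical helper)."""
    m = _PDF_DATE.match(text)
    if m is None:
        raise ValueError(f"not a full-form PDF date: {text!r}")
    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])
    sign, off_h, off_m = m.group(7), m.group(8), m.group(9)
    # The offset fields hold absolute values; the O marker gives the sign.
    offset = timedelta(hours=int(off_h or 0), minutes=int(off_m or 0))
    if sign == "-":
        offset = -offset
    elif sign == "Z":
        offset = timedelta(0)  # Z means local time equals Universal Time
    return datetime(year, month, day, hour, minute, second,
                    tzinfo=timezone(offset))
```

For instance, parse_pdf_date("D:20080615143000+01'00'") yields a datetime whose isoformat() is 2008-06-15T14:30:00+01:00. Out-of-range fields such as month 13 are left to the datetime constructor, which raises ValueError.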