diff --git a/pdf/css/remark.css b/pdf/css/remark.css index cdb2ed8842bffa685744b4bfafb09d6e8daa68d2..26103fd18845e22a47da32e429390cf8b3b87761 100644 --- a/pdf/css/remark.css +++ b/pdf/css/remark.css @@ -36,3 +36,7 @@ table thead th { border-bottom: 2px solid #dee2e6; vertical-align: bottom; } + +.space-3 { + margin-top: 5em +} \ No newline at end of file diff --git a/pdf/image.pdf b/pdf/image.pdf new file mode 100644 index 0000000000000000000000000000000000000000..169b26612de0a638e0c6b9f0522fd90c39557824 Binary files /dev/null and b/pdf/image.pdf differ diff --git a/pdf/pdf.html b/pdf/pdf.html index 20288e2710f8206a90d7e00aba871223c4733287..590d013198aa7619394fa1a9adbda2dc9d007c13 100644 --- a/pdf/pdf.html +++ b/pdf/pdf.html @@ -56,18 +56,16 @@ Acrobat Distiller (network version): 2495$ --- -# Releases - -| | Year | Industry | Notable features | -| --- | :--- | :------- | :--------------- | -| 1.0 | 1993 | Adobe | **text**, **images**, **pages**, **hyperlinks**, bookmarks | -| 1.1 | 1994 | Tax | passwords, encryption, device-independent color | -| 1.2 | 1996 | Printing | radio buttons, checkboxes, forms incl. 
import/export, mouse events, sound, **unicode**, color features | -| 1.3 | 2000 | Printing | digital signatures, color spaces, JavaScript, embedded file streams, image utilities, **CIDFonts**, prepress support | -| 1.4 | 2001 | | RC4 > 40bits, transparency, better forms, metadata, accessibility, page boundaries, printer marks, predefined CMaps | -| 1.5 | 2003 | | jpeg, multimedia playback, better forms, public key encryption, permissions, view/hide layers, slideshow | -| 1.6 | 2004 | | 3D, **OpenType**, SOAP over http, public key encryption improvements, color spaces | -| 1.7 | 2006 | | 3D improvements, public key encryption improvements | +| | Year | Industry | Notable features | +| ---- | :--- | :------- | :--------------- | +| v1.0 | 1993 | Adobe | **text**, **images**, **pages**, **hyperlinks**, bookmarks | +| v1.1 | 1994 | Tax | passwords, encryption, device-independent color | +| v1.2 | 1996 | Printing | radio buttons, checkboxes, forms incl. import/export, mouse events, sound, **unicode**, color features | +| v1.3 | 2000 | Printing | digital signatures, color spaces, JavaScript, embedded file streams, image utilities, **CIDFonts**, prepress support | +| v1.4 | 2001 | | RC4 > 40bits, transparency, better forms, metadata, accessibility, page boundaries, printer marks, predefined CMaps | +| v1.5 | 2003 | | jpeg, multimedia playback, better forms, public key encryption, permissions, view/hide layers, slideshow | +| v1.6 | 2004 | | 3D, **OpenType**, SOAP over http, public key encryption improvements, color spaces | +| v1.7 | 2006 | | 3D improvements, public key encryption improvements | *Version 1.7 is ISO 32000-1:2008.* @@ -110,18 +108,175 @@ PDF/A Subversions: # Why is it so popular? Single purpose, which is archived: -**It display content equally on all devices.** +**It displays content equally on all devices.** -Comparison to web: -- Need to increase readability on smaller screens (font size, colors, spacing, ...) 
-- Need to handle different aspect ratios -- Optimize for less powerful devices -- [Can I Use](https://caniuse.com/) tries to track browser inconsistencies +<div class="space-3"></div> + +Comparison: How webpages deal with different devices: +- Adapt font size, colors, spacing, ... to screen size +- Adapt layout to aspect ratio / screen size +- Remove or add elements depending on end device +- Test on end devices and/or use resources such as [Can I Use](https://caniuse.com/) + +In short: It's painful and slow. + +--- + +class: center, middle + +# The file format + +--- + +# Tokens + +PDF is a text format. You can open any PDF in your text editor! + +Tokens: +```html +0 % Numbers +(Hello) % Strings +5 0 R % References (5 for the object number, 0 for its generation number, R for reference) +[2 0 R] % Arrays +<</Key /Value>> % Dictionaries +/Image % Names (any two with the same content are "equal") +``` + +Out of these tokens, the higher-level objects are composed: +- Dictionaries +- Streams --- -# The PDF format +# Structure (1/2) + +Header (asserts PDF version and whether binary data is contained) +```html +%PDF-1.7 +%���� +``` + +Body (contains the actual content) +```html +1 0 obj +<</Type /Catalog /Pages 2 0 R>> +endobj +2 0 obj +<</Type /Pages /Kids [3 0 R] /Count 1>> +endobj +% ... +``` + +--- + +# Structure (2/2) + +Cross-Reference Table (CRT) (contains the byte offset of each object) +```html +xref +0 8 +0000000000 65535 f +0000000015 00000 n +0000000062 00000 n +% ... 
+``` + +Trailer (contains size of CRT and reader start points) +```html +trailer +<</Size 8 /Root 1 0 R /Info 7 0 R>> +startxref +574 +%%EOF +``` + +--- + +# Body + +```html +1 0 obj +<</Type /Catalog /Pages 2 0 R>> +endobj +2 0 obj +<</Type /Pages /Kids [3 0 R] /Count 1>> +endobj +3 0 obj +<</Type /Page /Parent 2 0 R /Resources 4 0 R /MediaBox [0 0 210 297] /Contents [6 0 R]>> +endobj +4 0 obj +<</Font <</F 5 0 R>> /ProcSet [/PDF /Text]>> +endobj +5 0 obj +<</Type /Font /Subtype /Type1 /BaseFont /Helvetica /Encoding /WinAnsiEncoding>> +endobj +6 0 obj +<</Length 59>> +stream +BT 1 0 0 1 22 20 cm 1 w /F 12 Tf 14.4 TL (Hello world)Tj ET +endstream +endobj +``` + +--- + +# Content + +Text (`w` = line width, `Tf` = font, `TL` = leading, `Tj` = text) +```html +stream +BT 1 0 0 1 22 20 cm 1 w /F 12 Tf 14.4 TL (Hello world)Tj ET +endstream +``` + +Drawing (`RG` = line color, `rg` = background color, `re` = rectangle, `b` = painting mode) +```html +stream +1 0 0 1 40 20 cm 0.5 w 0.68 0.98 0.94 RG 0.67 0.8 0.73 rg 0 0 20 30 re b +endstream +``` + +--- + +# Image + +Stream with binary image data + +```html +5 0 obj +<</Length 570 /Type /XObject /Subtype /Image /Width 25 /Height 16 /Filter /DCTDecode /BitsPerComponent 8 /ColorSpace /DeviceRGB>> +stream +�����JFIF���������#�#�#�#�%�#�'�+�+�'�6�;�4�;�6�P�J�C�C�J�P�z�W�]�W�]�W�z���s���s�s���s����������������%��������%S +S�oo������#�#�#�#�%�#�'�+�+�'�6�;�4�;�6�P�J�C�C�J�P�z�W�]�W�]�W�z���s���s�s���s����������������%��������%S +S�oo����������"��������������������������̊ �N�������������������������k���������������������������"������������"CRSb�����?��`W�F�X�6�G�Qy&IXFĪ��V�to������������������A���?�zе]������������������Q���?�S����� +endstream +endobj +``` + +Print image (`Do` = print referenced image) +```html +6 0 obj +<</Length 28>> +stream +20 0 0 20 20 20 cm 1 w /I Do +endstream +endobj +``` + +--- + +# Supporting unicode + +`WinAnsiEncoding` is a single byte encoding; so are the other default encodings. 
+ +What if I need a character not contained in these standard encodings (like Chinese characters)? + +Steps: +- Embed a font that supports the character as a stream +- Declare the mapping from text encoding (PDF) to glyphs (font) +- Declare metadata (character widths, ...) +=> Requires deep knowledge of fonts --- diff --git a/pdf/rectangle.pdf b/pdf/rectangle.pdf new file mode 100644 index 0000000000000000000000000000000000000000..fe60870ce15953e0bdad6fa74d2bd9f36fa2bd80 --- /dev/null +++ b/pdf/rectangle.pdf @@ -0,0 +1,44 @@ +%PDF-1.7 +%���� +1 0 obj +<</Type /Catalog /Pages 2 0 R>> +endobj +2 0 obj +<</Type /Pages /Kids [3 0 R] /Count 1>> +endobj +3 0 obj +<</Type /Page /Parent 2 0 R /Resources 4 0 R /MediaBox [0 0 210 297] /Contents [5 0 R 6 0 R]>> +endobj +4 0 obj +<</ProcSet [/PDF]>> +endobj +5 0 obj +<</Length 72>> +stream +1 0 0 1 40 20 cm 0.5 w 0.68 0.98 0.94 RG 0.67 0.8 0.73 rg 0 0 20 30 re b +endstream +endobj +6 0 obj +<</Length 31>> +stream +1 0 0 1 -30 0 cm 0 0 20 30 re b +endstream +endobj +7 0 obj +<</Creator (famoser/pdf-generator) /CreationDate (D:20210328202704+02'00)>> +endobj +xref +0 8 +0000000000 65535 f +0000000015 00000 n +0000000062 00000 n +0000000117 00000 n +0000000227 00000 n +0000000262 00000 n +0000000382 00000 n +0000000461 00000 n +trailer +<</Size 8 /Root 1 0 R /Info 7 0 R>> +startxref +552 +%%EOF \ No newline at end of file diff --git a/pdf/special_characters.pdf b/pdf/special_characters.pdf new file mode 100644 index 0000000000000000000000000000000000000000..69ccdadfed0b08bee305a10ac458e72117b0ed13 Binary files /dev/null and b/pdf/special_characters.pdf differ diff --git a/pdf/text.pdf b/pdf/text.pdf new file mode 100644 index 0000000000000000000000000000000000000000..9dfae31938199e188568b7ce497b30563c5b05d2 --- /dev/null +++ b/pdf/text.pdf @@ -0,0 +1,41 @@ +%PDF-1.7 +%���� +1 0 obj +<</Type /Catalog /Pages 2 0 R>> +endobj +2 0 obj +<</Type /Pages /Kids [3 0 R] /Count 1>> +endobj +3 0 obj +<</Type /Page /Parent 2 0 R /Resources 4 0 R 
/MediaBox [0 0 210 297] /Contents [6 0 R]>> +endobj +4 0 obj +<</Font <</F 5 0 R>> /ProcSet [/PDF /Text]>> +endobj +5 0 obj +<</Type /Font /Subtype /Type1 /BaseFont /Helvetica /Encoding /WinAnsiEncoding>> +endobj +6 0 obj +<</Length 59>> +stream +BT 1 0 0 1 22 20 cm 1 w /F 12 Tf 14.4 TL (Hello world)Tj ET +endstream +endobj +7 0 obj +<</Creator (famoser/pdf-generator) /CreationDate (D:20210328193015+02'00)>> +endobj +xref +0 8 +0000000000 65535 f +0000000015 00000 n +0000000062 00000 n +0000000117 00000 n +0000000221 00000 n +0000000281 00000 n +0000000376 00000 n +0000000483 00000 n +trailer +<</Size 8 /Root 1 0 R /Info 7 0 R>> +startxref +574 +%%EOF \ No newline at end of file
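
The cross-reference tables in the sample files record the byte offset at which each `N 0 obj` starts, and `startxref` records where the `xref` keyword itself begins. The following is a minimal Python sketch of how such offsets could be computed, not the actual logic of `famoser/pdf-generator`. The helper name `build_pdf` is hypothetical, and the four binary marker bytes in the header are an assumption (the diff only shows that four non-ASCII bytes follow the `%`).

```python
# Header: version line plus a comment line with four non-ASCII bytes.
# The exact marker bytes are an assumption; any bytes >= 128 signal binary content.
HEADER = b"%PDF-1.7\n%\xe2\xe3\xcf\xd3\n"


def build_pdf(bodies, root=1):
    """Serialize object bodies; return (pdf_bytes, byte offset of each object)."""
    out = bytearray(HEADER)
    offsets = []
    for number, body in enumerate(bodies, start=1):
        offsets.append(len(out))  # position where "N 0 obj" starts
        out += b"%d 0 obj\n%s\nendobj\n" % (number, body)

    xref_start = len(out)  # value written after "startxref"
    out += b"xref\n0 %d\n" % (len(bodies) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 is the head of the free list
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # fixed-width, 20 bytes per entry

    out += b"trailer\n<</Size %d /Root %d 0 R>>\n" % (len(bodies) + 1, root)
    out += b"startxref\n%d\n%%%%EOF" % xref_start
    return bytes(out), offsets


pdf, offsets = build_pdf([
    b"<</Type /Catalog /Pages 2 0 R>>",
    b"<</Type /Pages /Kids [3 0 R] /Count 1>>",
    b"<</Type /Page /Parent 2 0 R /MediaBox [0 0 210 297]>>",
])
print(offsets)  # [15, 62, 117]
```

The first three offsets match the `xref` entries in `text.pdf` and `rectangle.pdf`, because those files start with the same header and the same first two objects.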