diff --git a/pdf/css/remark.css b/pdf/css/remark.css
index b0cbe369c67f8649d4d39e2fadc6ed71c2ef0c77..a9c332cbe35f6007bc1b50bb9f20b848c503c4d0 100644
--- a/pdf/css/remark.css
+++ b/pdf/css/remark.css
@@ -44,4 +44,18 @@ table thead th {
 .full-width-pre > pre {
     margin-left: -4em;
     margin-right: -4em;
+}
+
+blockquote {
+    background: #dee2e6;
+    margin-left: 0;
+    padding: 0.5em;
+}
+
+blockquote > *:first-child {
+    margin-top: 0;
+}
+
+blockquote > *:last-child {
+    margin-bottom: 0
 }
\ No newline at end of file
diff --git a/pdf/pdf.html b/pdf/pdf.html
index a0a26287c74f3554b76ae0233cdf36ae4c58e280..cb331dc61321a4dd429c151a9fd3606d6c26b6f1 100644
--- a/pdf/pdf.html
+++ b/pdf/pdf.html
@@ -368,24 +368,35 @@ What about suficient?
 | Name | Purpose |
 | ---- | :--- |
 | cmap | character maps |
-| maxp | memory requirements (#characters, #points, ...)
-| head | global information (max bounding box, font style, ...)
-| OS/2 | windows font metrics
-| name | multilingual strings about font (copyright notice, font name, ...)
-| cvt | list of values for instructions
-| fpgm | program run upon first usage of font
-| gasp | "prefered rasterization on gray-capable devices"
-| prep | program run upon character drawing
-| GDEF | ligatures
-| GPOS | "glyph placement in sophisticated text layout"
-| GSUB | ligatures
-| hhea | sizing per glyph (ascender, decender, ...)
-| loca | offsets of glyph data blocks
-| hmtx | width of glyphs
-| glyf | list of glyph data blocks
+| maxp | memory requirements (#characters, #points, ...) |
+| head | global information (max bounding box, font style, ...) |
+| OS/2 | windows font metrics |
+| name | multilingual strings about font (copyright notice, font name, ...) |
+| cvt | list of values for instructions |
+| fpgm | program run upon first usage of font |
+| gasp | "preferred rasterization on gray-capable devices" |
+| prep | program run upon character drawing |
+| GDEF | ligatures |
+| GPOS | "glyph placement in sophisticated text layout" |
+| GSUB | ligatures |
+| hhea | sizing per glyph (ascender, descender, ...) |
+| loca | offsets of glyph data blocks |
+| hmtx | width of glyphs |
+| glyf | list of glyph data blocks |
 
 In short: Read out tables, patch them together differently, hope offsets work.
 
+---
+
+# Subsetting
+
+Fonts usually have 100s of glyphs.
+Do we need more than 60?
+
+
+
+
+
 ---
 
 # Bonus: How to extend ttf?
@@ -405,6 +416,124 @@ like colored glyphs (emojis!):
 
 class: center, middle
 
+# PDF Writers
+
+---
+
+# State of the art
+
+Overview:
+- PHP provides [TCPDF](https://github.com/tecnickcom/TCPDF) and [FPDF](https://github.com/Setasign/FPDF) (and "new" [tc-lib-pdf](https://github.com/tecnickcom/tc-lib-pdf)).
+- Python has [PyFPDF](https://github.com/reingart/pyfpdf), [PyPDF4](https://github.com/claird/PyPDF4).
+- Some action in the [Go](https://github.com/jung-kurt/gofpdf) and [JavaScript](https://github.com/MrRio/jsPDF/blob/master/src/jspdf.js) universe.
+
+Libraries are hard to use.
+
+```php
+private function printH3($text, $startX, $startY)
+{
+    $this->pdfDocument->SetXY($startX, $startY);
+    $this->pdfDocument->SetFontSize(22);
+    $this->pdfDocument->SetFont('opensans', 'b');
+    $this->pdfDocument->MultiCell(0, 0, $text, 0, 'L', false, 1);
+    $this->pdfDocument->Ln(1.6);
+}
+```
+
+Usually HTML & CSS to PDF "converters" are used -> spin up headless Chrome and use it to print the page.
+
+---
+
+# Architecture (TCPDF)
+
+Two big tasks:
+- Read (and possibly write) `.ttf`
+- Write `.pdf`
+
+Issues:
+- "One File approach": Leads to 24k (`pdf`) and 3k (`ttf`) lines of spaghetti code.
+- Code quality generally **very** low. [1](https://github.com/tecnickcom/TCPDF/pull/361/files) [2](https://github.com/tecnickcom/TCPDF/commit/978eb8c8247cc1069a2935784125596fc3507326) [3](https://github.com/tecnickcom/TCPDF/commit/14fd6779f320c2873f9deab00a9b77f2a657bc98)
+- API unintuitive, badly documented (`AddPage` vs `startPage` vs `addTOCPage`?)
+- Resulting code is untestable, hard to follow, large (e.g. [print table row](https://github.com/mangelio/web/blob/e6ae315bada04313505e87e513d17778f11df548/src/Service/Report/Pdf/Report.php#L465))
+
+> You cannot meaningfully improve upon this existing work.
+
+---
+
+# Target
+
+Implementation requirements:
+- Add content (text, images, drawings)
+- Within Layout (columns, tables, grid, ...)
+- Fully testable (what text is printed? which color does it have?)
+- No output technology specific details
+
+```php
+// create printer
+$printer = new Printer(new PdfPrinter());
+$printer->setColor("dark grey");
+
+// create layout
+$layout = new ColumnLayout($printer);
+$layout->setColumns(2);
+```
+
+Architecture:
+- `document-generator` provides printer & layouts (nothing PDF-specific!)
+- `pdf-generator` implements interfaces required by `document-generator`
+- `html-generator`, `email-generator`, `word-generator`, ...
+
+---
+
+# `document-generator`
+
+Concept:
+- abstracts PDF specific details (can also attach HTML / CSS export)
+- `Layout` (recursively!) defines columns, tables, grid, ...
+- `Printer` prints text and images within said layout
+
+```php
+/**
+ * @param string[] $rowContent
+ */
+private function printTableRow(TableRowLayoutInterface $row, PrintFactoryInterface $printFactory, array $rowContent)
+{
+    $printer = $printFactory->getPrinter($row);
+
+    $columnLength = \count($rowContent);
+    for ($i = 0; $i < $columnLength; ++$i) {
+        $row->setColumn($i);
+        $printer->printParagraph($rowContent[$i]);
+    }
+}
+```
+
+---
+
+# `pdf-generator`
+
+Concept:
+- abstract lower-level details continuously to tame complexity
+- `Printer` prints text, images & drawings
+
+Architecture:
+- `Frontend` (to attach to `document-generator`)
+- `IR` (pdf-agnostic `Printer` with colors, sizing, fonts, ...)
+- `Backend` (actually writes the PDF; 5 more layers of abstraction)
+
+```php
+$printer = new Printer(new Document());
+$printer->setCursor(new Cursor($xPosition, $yPosition, 1));
+$printer->printText('Hello World');
+$result = $printer->save();
+```
+
+... same architecture for `ttf`!
+
+---
+
+class: center, middle
+
 # Reflection
 
 ---
@@ -422,7 +551,7 @@ What about long term storage (PDF)?
 
 Better Long-Term storage?
 - Word / Excel -> nearly same problems as PDF
-- HTML / CSS as structure stays -> but the "web" changes rapidly
+- HTML / CSS as structure stays -> but writing a browser is even harder
 
 
 </textarea>