Page-level PDF assembly.
The JetsonPDF.Composition package does two things and does them losslessly: PageExtractor pulls a chosen subset of pages out of a PDF into a new file, and Merger concatenates whole PDFs into one. Both copy a page's content, resources, fonts, images, and annotations in their original encoded form; nothing is re-rendered or re-encoded.
Two static entry points.
Extract — PageExtractor.Extract(source, 1, 3, 5). Copy chosen 1-based pages into a new PDF. Order is preserved, so the same call reorders and duplicates pages too.
Merge — Merger.Merge(a, b, c). Concatenate documents in order, carrying over and de-colliding their outlines, named destinations, and AcroForm fields.
Lossless — both are a COS object-graph copy (ISO 32000-2 §7.3), not a render pass. Output quality is identical to the input; there is no generational loss from repeated extract/merge cycles.
Composition sits on top of the Reader and does not pull in the Writer. A merge or extract is an object-graph copy rather than a render pass, carried out in four steps:
Parse & resolve — the source is read by the Reader's file parser, which resolves the cross-reference table/stream and decrypts the file when a password is supplied.
Deep-copy each page — the page dictionary and everything reachable from it (content streams, /Resources, fonts, images, annotations) is copied into a fresh object table with all indirect references remapped. A dedup map keyed by (source, object-number) handles cycles and shared resources, so a font used by ten pages is copied once.
Materialize inherited attributes — /Resources, /MediaBox, /CropBox, and /Rotate are flattened onto each copied page (§7.7.3.4) before it is reparented under the new page tree; /Parent is dropped.
Write a fresh file — a new catalog, page tree, classic cross-reference table (§7.5.4), and trailer with a new /ID. The output PDF version is the maximum of the source versions.
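The deep-copy step above can be illustrated with a minimal sketch. It assumes a toy object model (CosLeaf, CosRef, CosDict, CosArray, and ObjectCopier are hypothetical names, not JetsonPDF types) and shows only the idea of remapping indirect references through a map keyed by (source, object-number). Dropping /Parent and copying stream data are left out.

```csharp
using System.Collections.Generic;

// Toy stand-ins for COS objects. Hypothetical types, not JetsonPDF's object model.
public abstract class CosValue { }

public sealed class CosLeaf : CosValue   // number, string, name, and so on
{
    public readonly object Value;
    public CosLeaf(object value) { Value = value; }
}

public sealed class CosRef : CosValue    // indirect reference "N 0 R"
{
    public readonly int ObjectNumber;
    public CosRef(int objectNumber) { ObjectNumber = objectNumber; }
}

public sealed class CosDict : CosValue
{
    public readonly Dictionary<string, CosValue> Entries = new Dictionary<string, CosValue>();
}

public sealed class CosArray : CosValue
{
    public readonly List<CosValue> Items = new List<CosValue>();
}

public sealed class ObjectCopier
{
    // (source id, source object number) -> object number in the output table
    private readonly Dictionary<(int Source, int ObjectNumber), int> _remap =
        new Dictionary<(int Source, int ObjectNumber), int>();
    private int _nextNumber = 1;

    // The fresh object table being built.
    public readonly Dictionary<int, CosValue> Output = new Dictionary<int, CosValue>();

    public CosRef CopyIndirect(int sourceId, IReadOnlyDictionary<int, CosValue> table, int objectNumber)
    {
        (int Source, int ObjectNumber) key = (sourceId, objectNumber);
        if (_remap.TryGetValue(key, out int existing))
            return new CosRef(existing);   // already copied, or copy in progress: reuse it

        int newNumber = _nextNumber++;
        _remap[key] = newNumber;           // register before recursing so cycles terminate
        Output[newNumber] = CopyValue(sourceId, table, table[objectNumber]);
        return new CosRef(newNumber);
    }

    private CosValue CopyValue(int sourceId, IReadOnlyDictionary<int, CosValue> table, CosValue value)
    {
        switch (value)
        {
            case CosRef reference:
                return CopyIndirect(sourceId, table, reference.ObjectNumber);
            case CosDict dict:
            {
                var copy = new CosDict();
                foreach (KeyValuePair<string, CosValue> entry in dict.Entries)
                    copy.Entries[entry.Key] = CopyValue(sourceId, table, entry.Value);
                return copy;
            }
            case CosArray array:
            {
                var copy = new CosArray();
                foreach (CosValue item in array.Items)
                    copy.Items.Add(CopyValue(sourceId, table, item));
                return copy;
            }
            default:
                return value;              // leaves are immutable in this sketch, so they are shared
        }
    }
}
```

Because the map entry is registered before the recursion starts, a page whose annotation points back at the page through /P resolves to the already-assigned number instead of recursing forever, and a font referenced from several pages is copied on first sight and reused afterwards.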
Targets — net8.0 / netstandard2.0 / net462. No native dependencies.
Stateless — both types are static and thread-safe; each call builds its own assembler over the input bytes.
In memory — input streams are read fully into memory before processing.
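The "materialize inherited attributes" step in the pipeline above can be sketched in the same spirit. This is an illustration under assumptions, not the library's code: PageNode and InheritedAttributes are hypothetical names, and attribute values are plain objects. For each of the four inheritable keys, the nearest ancestor that defines the key wins unless the page defines it itself.

```csharp
using System.Collections.Generic;

// Hypothetical page-tree node: a page or an intermediate /Pages node.
public sealed class PageNode
{
    public PageNode Parent;   // null at the root of the page tree
    public readonly Dictionary<string, object> Attributes = new Dictionary<string, object>();
}

public static class InheritedAttributes
{
    // The inheritable page attributes named in the step above.
    private static readonly string[] Inheritable = { "Resources", "MediaBox", "CropBox", "Rotate" };

    // Returns a standalone attribute set for the page, with no link back to its old parent.
    public static Dictionary<string, object> Flatten(PageNode page)
    {
        var flat = new Dictionary<string, object>(page.Attributes);
        foreach (string key in Inheritable)
        {
            if (flat.ContainsKey(key))
                continue;   // the page's own value takes precedence

            // Walk up the tree and take the nearest ancestor's value, if any.
            for (PageNode node = page.Parent; node != null; node = node.Parent)
            {
                if (node.Attributes.TryGetValue(key, out object value))
                {
                    flat[key] = value;
                    break;
                }
            }
        }
        return flat;
    }
}
```

Once the attributes are written onto the page itself, the page no longer depends on its original ancestors and can be reparented under a new page tree without changing how it renders.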
Quick start
Add the package to any .NET project:
dotnet add package JetsonPDF.Composition
Then extract or merge — the in-memory byte[] overloads are the core; file and stream overloads wrap them.
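using System.IO;
using JetsonPDF.Composition;
// Load the inputs; the file names here are placeholders
byte[] reportBytes = File.ReadAllBytes("report.pdf");
byte[] coverBytes = File.ReadAllBytes("cover.pdf");
byte[] bodyBytes = File.ReadAllBytes("body.pdf");
byte[] appendixBytes = File.ReadAllBytes("appendix.pdf");
// Pull pages 1, 3 and 5 out of the report into a new PDF
byte[] excerpt = PageExtractor.Extract(reportBytes, 1, 3, 5);
// Concatenate three PDFs into one
byte[] combined = Merger.Merge(coverBytes, bodyBytes, appendixBytes);
File.WriteAllBytes("excerpt.pdf", excerpt);
File.WriteAllBytes("combined.pdf", combined);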
Extract pages
Extracts a subset of pages from an existing PDF into a brand-new PDF. Page numbers are 1-based, and the output keeps them in the exact order you list — so the same call also reorders and duplicates pages.
using JetsonPDF.Composition;
// Single page
byte[] cover = PageExtractor.Extract(sourceBytes, 1);
// Several pages, in the order given
byte[] picked = PageExtractor.Extract(sourceBytes, 3, 1, 5);
// Reorder + duplicate: page 2, then page 1 twice
byte[] shuffled = PageExtractor.Extract(sourceBytes, 2, 1, 1);
// Inclusive 1-based range (pages 5..12)
byte[] chapter = PageExtractor.ExtractRange(sourceBytes, 5, 12);
// Encrypted source — decrypt with the password, output is not encrypted
byte[] unlocked = PageExtractor.Extract(sourceBytes, password: "secret", 1, 2);
File-to-file and stream-to-stream overloads avoid the manual read/write. Streams are never closed by the call, so disposing them remains the caller's job.
// File to file
PageExtractor.Extract("report.pdf", "summary.pdf", 1, 2, 10);
// Stream to stream
using var input = File.OpenRead("report.pdf");
using var output = File.Create("summary.pdf");
PageExtractor.Extract(input, output, 1, 2, 10);
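The page says the stream overloads wrap the byte[] core, read their input fully into memory, and never close a stream. A minimal sketch of such a wrapper follows, assuming a hypothetical StreamAdapter helper that takes the byte[] operation as a delegate. It is not the library's implementation, only one way the described behaviour could look.

```csharp
using System;
using System.IO;

public static class StreamAdapter
{
    // Runs a byte[] -> byte[] operation over streams without closing either stream.
    public static void Run(Stream input, Stream output, Func<byte[], byte[]> core)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        if (output == null) throw new ArgumentNullException(nameof(output));
        if (core == null) throw new ArgumentNullException(nameof(core));

        // Only the temporary buffer is disposed here; input and output stay open.
        using (var buffer = new MemoryStream())
        {
            input.CopyTo(buffer);   // reads from the current position to the end
            byte[] result = core(buffer.ToArray());
            output.Write(result, 0, result.Length);
        }
    }
}
```

With this shape the caller keeps full control of stream lifetimes, which is why the examples above open both streams with using declarations.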
ExtractRange(byte[] source, int firstPage, int lastPage) returns byte[]: the pages in the inclusive 1-based range.
Merge documents
Concatenates multiple PDFs into one, in the order supplied. Every page of every source is copied losslessly into a fresh page tree and catalog.
using System.IO;
using System.Linq;
using JetsonPDF.Composition;
// params overload
byte[] combined = Merger.Merge(firstBytes, secondBytes, thirdBytes);
// IEnumerable overload — merge a whole folder in name order
byte[] all = Merger.Merge(
Directory.EnumerateFiles("chapters", "*.pdf")
.OrderBy(p => p)
.Select(File.ReadAllBytes));
// File to file
Merger.Merge(new[] { "a.pdf", "b.pdf", "c.pdf" }, "combined.pdf");
// Stream to stream: open the inputs with using so they are disposed;
// the output stream is written but not closed
using var a = File.OpenRead("a.pdf");
using var b = File.OpenRead("b.pdf");
using var output = File.Create("combined.pdf");
Merger.Merge(new[] { a, b }, output);
Encrypted sources must be decrypted first. Merge has no password parameter and throws on an encrypted file it can't read. Extract each source with its password (which yields a decrypted byte[]), then merge the results.
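The decrypt-then-merge workflow described above could be wrapped in a small helper. The sketch below is deliberately independent of the exact JetsonPDF overloads: EncryptedMerge is a hypothetical name, and the unlock and merge steps are passed in as delegates. The comment shows one possible wiring, which assumes the page count of each encrypted source is already known and that Extract accepts an int[] for its page list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EncryptedMerge
{
    // sources: PDF bytes paired with a password, or null when the file is not encrypted.
    // unlock:  turns (encrypted bytes, password) into decrypted bytes.
    // merge:   concatenates decrypted documents in order.
    //
    // Possible wiring (an assumption, with a hypothetical pageCount per source):
    //   unlock: (bytes, pw) => PageExtractor.Extract(bytes, password: pw,
    //                              Enumerable.Range(1, pageCount).ToArray())
    //   merge:  parts => Merger.Merge(parts)
    public static byte[] UnlockThenMerge(
        IEnumerable<(byte[] Bytes, string Password)> sources,
        Func<byte[], string, byte[]> unlock,
        Func<IEnumerable<byte[]>, byte[]> merge)
    {
        if (sources == null) throw new ArgumentNullException(nameof(sources));
        if (unlock == null) throw new ArgumentNullException(nameof(unlock));
        if (merge == null) throw new ArgumentNullException(nameof(merge));

        // Decrypt only the sources that carry a password; pass the rest through untouched.
        List<byte[]> plain = sources
            .Select(s => s.Password == null ? s.Bytes : unlock(s.Bytes, s.Password))
            .ToList();

        return merge(plain);
    }
}
```

The source order is preserved, so the merged document keeps the same sequence the caller supplied.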
Carrying over document-level features is where Composition does more than byte-splicing. The document-level features that reference pages are merged across all sources, and cross-document name collisions are disambiguated so nothing silently shares state.
Outlines / bookmarks — each source's outline tree is appended under one merged /Outlines root. Destinations are remapped to the new page objects; a bookmark whose target page was dropped (and which has no surviving children) is pruned. Prev/Next/First/Last/Count linkage is rebuilt, preserving open/closed state.
Named destinations — the modern /Names /Dests name tree and the legacy /Dests dictionary are merged into one name tree. Destinations targeting dropped pages are removed; name collisions across documents are suffixed (intro, intro_2, …).
AcroForm fields — a combined /AcroForm with a unified /Fields list, a merged default-resource (/DR) dictionary, OR-combined /NeedAppearances and /SigFlags, and a concatenated calculation order (/CO). Top-level field-name collisions are suffixed (signature → signature + signature_2) so two forms that reuse a field name stay independent rather than sharing a value.
Default-resource fonts — identical standard fonts from different documents are shared under one resource name; a genuinely different font that lands under an already-used name is added under a fresh name and the referring /DA appearance strings are rewritten to match.
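The outline rule in the list above (remap destinations, prune a bookmark whose target page was dropped and which has no surviving children) can be sketched on a toy tree. Bookmark and OutlinePruner are hypothetical types, and page numbers stand in for page objects. Keeping a bookmark with surviving children as a destination-less heading is an assumption of this sketch, since the text only says such a bookmark is not pruned.

```csharp
using System.Collections.Generic;

// Hypothetical outline item; a real outline stores destinations, not page numbers.
public sealed class Bookmark
{
    public string Title;
    public int? TargetPage;   // source page number, or null when there is no destination
    public List<Bookmark> Children = new List<Bookmark>();
}

public static class OutlinePruner
{
    // pageMap: source page number -> output page number, for surviving pages only.
    public static List<Bookmark> Prune(IEnumerable<Bookmark> items, IReadOnlyDictionary<int, int> pageMap)
    {
        var kept = new List<Bookmark>();
        foreach (Bookmark item in items)
        {
            // Children first, so a parent knows whether any of them survived.
            List<Bookmark> children = Prune(item.Children, pageMap);

            int? newTarget = null;
            if (item.TargetPage is int sourcePage && pageMap.TryGetValue(sourcePage, out int outputPage))
                newTarget = outputPage;

            bool targetDropped = item.TargetPage.HasValue && !newTarget.HasValue;
            if (targetDropped && children.Count == 0)
                continue;   // nothing left to point at and nothing left underneath

            kept.Add(new Bookmark { Title = item.Title, TargetPage = newTarget, Children = children });
        }
        return kept;
    }
}
```

Because the result is rebuilt as fresh lists, sibling order (and with it the Prev/Next/First/Last linkage a writer would emit) follows directly from the surviving items.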
Collision suffixing is consistent across features: a bookmark that points at a renamed named destination follows the rename, and a widget on a renamed field carries the new name too.
// Two PDFs that both define a "signature" field merge into
// "signature" + "signature_2" — each keeps its own value.
byte[] combined = Merger.Merge(formA, formB);
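The suffixing scheme described above (intro, intro_2, and so on) can be illustrated with a small allocator. NameDeduper is a hypothetical helper, not part of JetsonPDF. The per-document rename map it returns is one way to keep referrers consistent, so that a bookmark or widget can look up the new name of whatever it pointed at. The sketch assumes names are unique within a single document.

```csharp
using System;
using System.Collections.Generic;

public sealed class NameDeduper
{
    private readonly HashSet<string> _used = new HashSet<string>(StringComparer.Ordinal);

    // Returns the name unchanged if it is free, otherwise the first free "name_N" with N >= 2.
    public string Claim(string name)
    {
        if (_used.Add(name))
            return name;

        for (int n = 2; ; n++)
        {
            string candidate = name + "_" + n;
            if (_used.Add(candidate))
                return candidate;
        }
    }

    // Claims every name of one document and returns old name -> merged name,
    // so destinations and fields that refer to a renamed entry can follow the rename.
    public Dictionary<string, string> ClaimAll(IEnumerable<string> documentNames)
    {
        var renames = new Dictionary<string, string>(StringComparer.Ordinal);
        foreach (string name in documentNames)
            renames[name] = Claim(name);
        return renames;
    }
}
```

Calling ClaimAll once per source document, in merge order, leaves the first document's names untouched and suffixes only the later collisions.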
The runnable PdfCompositionDemo sample builds a report (with an outline and named destinations) and a form (with AcroForm fields), then prints the page counts, outline titles, destination keys, and field names of every output so you can confirm exactly what carried over.
Errors
Both operations validate their arguments eagerly.
| Condition | Exception |
| --- | --- |
| source / sources / inputPaths is null | ArgumentNullException |
| No page numbers passed to Extract | ArgumentException |
| A page number < 1 (page numbers are 1-based) | ArgumentOutOfRangeException |
| ExtractRange with firstPage < 1 or lastPage < firstPage | ArgumentException |
| Empty sources passed to Merge | ArgumentException |
| Source is encrypted and the password didn't unlock it (or none was supplied) | InvalidOperationException |
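Eager validation along the lines of the conditions listed above might look like the following sketch. PageArguments is a hypothetical helper written to mirror the documented exception types. It is not the library's code, and the handling of a null page array is an assumption.

```csharp
using System;

public static class PageArguments
{
    // Mirrors the documented checks for Extract-style calls.
    public static void ValidatePages(byte[] source, int[] pages)
    {
        if (source == null)
            throw new ArgumentNullException(nameof(source));
        if (pages == null)
            throw new ArgumentNullException(nameof(pages));   // assumption: not covered by the list above
        if (pages.Length == 0)
            throw new ArgumentException("At least one page number is required.", nameof(pages));

        foreach (int page in pages)
        {
            if (page < 1)
                throw new ArgumentOutOfRangeException(nameof(pages), page, "Page numbers are 1-based.");
        }
    }

    // Expands an inclusive 1-based range into explicit page numbers.
    public static int[] ExpandRange(int firstPage, int lastPage)
    {
        if (firstPage < 1 || lastPage < firstPage)
            throw new ArgumentException("Expected 1 <= firstPage <= lastPage.");

        var pages = new int[lastPage - firstPage + 1];
        for (int i = 0; i < pages.Length; i++)
            pages[i] = firstPage + i;
        return pages;
    }
}
```

Note that ArgumentNullException and ArgumentOutOfRangeException both derive from ArgumentException, so a caller that wants to tell them apart should catch the more specific types first.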
Scope & limitations
A fresh catalog and page tree are always emitted. What is and isn't carried over:
| Carried over | Not carried over |
| --- | --- |
| Page content, resources, fonts, images | Document structure tree (tagged-PDF /StructTreeRoot) |
| Per-page annotations (links, widgets, markup) | Catalog-level viewer preferences |
| Outlines / bookmarks (remapped + pruned) | Page labels |
| Named destinations (modern tree + legacy dict) | Article threads, OCG layer config |
| AcroForm fields (/Fields, /DR, /CO, flags) | |
The document information dictionary (/Info) is preserved — from the source on extract, and from the first document on merge.
Need to build the pages you're assembling? Author them with the Writer, Fluent, or Flow APIs, then compose the results here. To edit the fields of an existing form rather than merge whole documents, see JetsonPDF.Forms.