Composition

Page-level PDF assembly. The JetsonPDF.Composition package does two things and does them losslessly: PageExtractor pulls a chosen subset of pages out of a PDF into a new file, and Merger concatenates whole PDFs into one. Both copy a page's content, resources, fonts, images, and annotations in their original encoded form — nothing is re-rendered or re-encoded.

The API consists of two static entry points: PageExtractor and Merger.

Package: dotnet add package JetsonPDF.Composition · targets net8.0 / netstandard2.0 / net462 · depends on Common, Reader.

Overview

Composition sits on top of the Reader and does not pull in the Writer. A merge or extract is an object-graph copy rather than a render pass.

Quick start

Add the package to any .NET project:

dotnet add package JetsonPDF.Composition

Then extract or merge — the in-memory byte[] overloads are the core; file and stream overloads wrap them.

using System.IO;               // File.WriteAllBytes
using JetsonPDF.Composition;

// Pull pages 1, 3 and 5 out of a report into a new PDF
byte[] excerpt  = PageExtractor.Extract(reportBytes, 1, 3, 5);

// Concatenate three PDFs into one
byte[] combined = Merger.Merge(coverBytes, bodyBytes, appendixBytes);

File.WriteAllBytes("excerpt.pdf",  excerpt);
File.WriteAllBytes("combined.pdf", combined);

Extract pages

Extracts a subset of pages from an existing PDF into a brand-new PDF. Page numbers are 1-based, and the output keeps them in the exact order you list — so the same call also reorders and duplicates pages.

using JetsonPDF.Composition;

// Single page
byte[] cover    = PageExtractor.Extract(sourceBytes, 1);

// Several pages, in the order given
byte[] picked   = PageExtractor.Extract(sourceBytes, 3, 1, 5);

// Reorder + duplicate: page 2, then page 1 twice
byte[] shuffled = PageExtractor.Extract(sourceBytes, 2, 1, 1);

// Inclusive 1-based range (pages 5..12)
byte[] chapter  = PageExtractor.ExtractRange(sourceBytes, 5, 12);

// Encrypted source — decrypt with the password, output is not encrypted
byte[] unlocked = PageExtractor.Extract(sourceBytes, password: "secret", 1, 2);

File-to-file and stream-to-stream overloads avoid the manual read/write. Streams are never closed by the call, so their lifetimes stay yours.

// File to file
PageExtractor.Extract("report.pdf", "summary.pdf", 1, 2, 10);

// Stream to stream
using var input  = File.OpenRead("report.pdf");
using var output = File.Create("summary.pdf");
PageExtractor.Extract(input, output, 1, 2, 10);

Member | Returns | Notes
Extract(byte[] source, params int[] pageNumbers) | byte[] | 1-based, order preserved.
Extract(byte[] source, string password, params int[] pageNumbers) | byte[] | Decrypts an encrypted source.
Extract(string inputPath, string outputPath, params int[] pageNumbers) | void | Reads and writes files (password overload too).
Extract(Stream input, Stream output, params int[] pageNumbers) | void | Neither stream is closed.
ExtractRange(byte[] source, int firstPage, int lastPage) | byte[] | Inclusive 1-based range.
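
Because Extract takes a plain list of 1-based page numbers, a user-facing page specification such as "1-3,7,5" has to be expanded by the caller first. The following is a minimal sketch of one way to do that. PageSpec and its Parse method are hypothetical names invented for illustration and are not part of JetsonPDF; the sketch only assumes the documented rule that pages are 1-based and kept in the order listed.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, NOT part of JetsonPDF: expands "1-3,7,5" into
// the 1-based page numbers 1, 2, 3, 7, 5, preserving the listed order.
public static class PageSpec
{
    public static int[] Parse(string spec)
    {
        if (spec == null) throw new ArgumentNullException(nameof(spec));

        var pages = new List<int>();
        foreach (string raw in spec.Split(','))
        {
            string part = raw.Trim();
            if (part.Length == 0) continue;   // tolerate "1,,2" and trailing commas

            // "5" is a single page; "5-12" is an inclusive ascending range.
            int dash = part.IndexOf('-');
            int first = int.Parse(dash < 0 ? part : part.Substring(0, dash),
                                  CultureInfo.InvariantCulture);
            int last = dash < 0
                ? first
                : int.Parse(part.Substring(dash + 1), CultureInfo.InvariantCulture);

            if (first < 1 || last < first)
                throw new FormatException("Invalid page range '" + part + "'.");

            for (int p = first; p <= last; p++) pages.Add(p);
        }
        return pages.ToArray();
    }
}
```

The resulting array can be passed straight to the params overload, for example PageExtractor.Extract(sourceBytes, PageSpec.Parse("1-3,7,5")). Repeated entries such as "2,1,1" are kept, which matches the reorder-and-duplicate behaviour described above. Descending ranges like "3-1" are rejected in this sketch rather than reversed.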

Merge documents

Concatenates multiple PDFs into one, in the order supplied. Every page of every source is copied losslessly into a fresh page tree and catalog.

using System.IO;               // File, Directory, Stream
using System.Linq;             // OrderBy, Select
using JetsonPDF.Composition;

// params overload
byte[] combined = Merger.Merge(firstBytes, secondBytes, thirdBytes);

// IEnumerable overload — merge a whole folder in name order
byte[] all = Merger.Merge(
    Directory.EnumerateFiles("chapters", "*.pdf")
             .OrderBy(p => p)
             .Select(File.ReadAllBytes));

// File to file
Merger.Merge(new[] { "a.pdf", "b.pdf", "c.pdf" }, "combined.pdf");

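// Stream to stream: the output stream is written but not closed.
// The input streams are disposed here by the caller so no file handles leak.
using var first  = File.OpenRead("a.pdf");
using var second = File.OpenRead("b.pdf");
using var output = File.Create("combined.pdf");
Merger.Merge(new Stream[] { first, second }, output);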

Encrypted sources must be decrypted first. Merge has no password parameter and throws on an encrypted file it can't read. Extract each source with its password (which yields a decrypted byte[]), then merge the results.
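
The decrypt-then-merge workflow described above can be sketched as a two-step helper. EncryptedMerge and DecryptThenMerge are hypothetical names, not part of JetsonPDF. To keep the sketch self-contained, the two library calls are passed in as delegates whose shapes are assumed to match the documented PageExtractor.Extract(byte[], string, params int[]) and Merger.Merge(IEnumerable<byte[]>) overloads.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, NOT part of JetsonPDF. The delegates stand in for
// PageExtractor.Extract(byte[], string, params int[]) and
// Merger.Merge(IEnumerable<byte[]>) as listed in the member tables.
public static class EncryptedMerge
{
    public static byte[] DecryptThenMerge(
        IEnumerable<(byte[] Source, string Password, int[] Pages)> inputs,
        Func<byte[], string, int[], byte[]> extract,
        Func<IEnumerable<byte[]>, byte[]> merge)
    {
        if (inputs == null) throw new ArgumentNullException(nameof(inputs));
        if (extract == null) throw new ArgumentNullException(nameof(extract));
        if (merge == null) throw new ArgumentNullException(nameof(merge));

        // Step 1: extracting with the password yields an unencrypted byte[] per source.
        List<byte[]> decrypted = inputs
            .Select(i => extract(i.Source, i.Password, i.Pages))
            .ToList();

        // Step 2: merge the decrypted results in the order they were supplied.
        return merge(decrypted);
    }
}
```

In real use the delegates would presumably be the method groups PageExtractor.Extract and Merger.Merge. Note that the page numbers of each source have to be listed explicitly, since the member table lists no password overload of ExtractRange.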

Member | Returns | Notes
Merge(params byte[][] sources) | byte[] | Concatenate in argument order.
Merge(IEnumerable<byte[]> sources) | byte[] | Concatenate a sequence.
Merge(IEnumerable<string> inputPaths, string outputPath) | void | Read files, write the result.
Merge(IEnumerable<Stream> inputs, Stream output) | void | Output stream is not closed.

Errors

Both operations validate their arguments eagerly.

Condition | Exception
source / sources / inputPaths is null | ArgumentNullException
No page numbers passed to Extract | ArgumentException
A page number < 1 (page numbers are 1-based) | ArgumentOutOfRangeException
ExtractRange with firstPage < 1 or lastPage < firstPage | ArgumentException
Empty sources passed to Merge | ArgumentException
Source is encrypted and the password didn't unlock it (or none was supplied) | InvalidOperationException
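
One way to handle these exceptions at a call site is a small wrapper that converts them into an error message. The sketch below is an assumption-laden illustration: ComposeGuard and TryCompose are hypothetical names, not part of JetsonPDF, and the messages are made up. The only facts taken from the table are the exception types. The two derived types are caught before ArgumentException because C# requires more specific catch clauses to come first.

```csharp
using System;

// Hypothetical wrapper, NOT part of JetsonPDF: runs an extract or merge
// operation and maps the documented exception types to a readable message.
public static class ComposeGuard
{
    public static bool TryCompose(Func<byte[]> operation, out byte[] result, out string error)
    {
        result = null;
        error = null;
        try
        {
            result = operation();
            return true;
        }
        catch (ArgumentNullException ex)
        {
            error = "Missing input: " + ex.ParamName;
        }
        catch (ArgumentOutOfRangeException ex)
        {
            error = "Page numbers are 1-based: " + ex.Message;
        }
        catch (ArgumentException ex)
        {
            // Covers the empty page list, bad range, and empty sources rows.
            error = "Bad arguments: " + ex.Message;
        }
        catch (InvalidOperationException ex)
        {
            // Per the table: encrypted source with a wrong or missing password.
            error = "Could not open source: " + ex.Message;
        }
        return false;
    }
}
```

A call would look like ComposeGuard.TryCompose(() => PageExtractor.Extract(sourceBytes, 1, 3), out var pdf, out var error). Any exception type not listed in the table still propagates to the caller.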

Scope & limitations

A fresh catalog and page tree are always emitted. What is and isn't carried over:

Carried over | Not carried over
Page content, resources, fonts, images | Document structure tree (tagged-PDF /StructTreeRoot)
Per-page annotations (links, widgets, markup) | Catalog-level viewer preferences
Outlines / bookmarks (remapped + pruned) | Page labels
Named destinations (modern tree + legacy dict) | Article threads, OCG layer config
AcroForm fields (/Fields, /DR, /CO, flags) |

The document information dictionary (/Info) is preserved — from the source on extract, and from the first document on merge.

Need to build the pages you're assembling? Author them with the Writer, Fluent, or Flow APIs, then compose the results here. To edit the fields of an existing form rather than merge whole documents, see JetsonPDF.Forms.
