PDF/A Explained: The Archival Format for Long-Term Storage

April 14, 2026·6 min read

Most PDFs you create are designed to be useful next week. PDF/A is for documents that need to be useful in 50 years. The "A" stands for archival, and the format is the difference between a tax record that opens cleanly in 2076 and one that needs forensic recovery to even render.

What problem PDF/A solves

A regular PDF can:

  • Reference external fonts that may not exist on a future system.
  • Embed videos in formats that future readers can't play.
  • Run JavaScript whose runtime is long gone.
  • Use compression algorithms that have been deprecated.
  • Link to online resources that have moved or vanished.

Each of these is fine for a document you'll read this year. None of them are fine for a document you need to read 30 years from now.

PDF/A locks down everything that depends on the world outside the file. The result: a self-contained, deterministic document that should render identically forever.

The PDF/A rules

A PDF/A file must:

  • Embed all fonts the document uses. No font lookups on render.
  • Avoid JavaScript entirely.
  • Avoid encryption. Archives need to be readable; encryption keys may be lost over decades.
  • Avoid external dependencies. Hyperlinks to URLs are allowed, but rendering must never depend on content stored outside the file.
  • Avoid audio and video. Media formats become unsupported.
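  • Use unambiguous colour. Either use ICC-based colour spaces or declare an output intent with an embedded ICC profile, so raw RGB or CMYK values have a fixed meaning.
  • Include accessibility tags at conformance level "a" (PDF/A-1a, 2a and 3a).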
  • Include metadata in XMP format for discoverability.

The result is a slightly larger but fully self-contained file.
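
As a rough illustration of the rules above, here is a minimal Python sketch that scans a file's raw bytes for name tokens hinting at prohibited features. The token list and the function name are assumptions chosen for illustration. It is a quick triage heuristic, not a validator: it cannot see inside compressed object streams, and a clean result proves nothing about fonts, colour or metadata.

```python
import re

# Name tokens that hint at features PDF/A forbids. This is an illustrative
# selection, not the full rule set from ISO 19005.
SUSPECT_TOKENS = {
    b"/JavaScript": "JavaScript action",
    b"/JS": "JavaScript action",
    b"/Encrypt": "encryption dictionary",
    b"/Launch": "launch action",
    b"/Sound": "sound annotation or action",
    b"/Movie": "movie annotation or action",
}


def scan_for_suspect_features(pdf_bytes: bytes) -> list[str]:
    """Return sorted labels for suspect tokens found in the raw bytes."""
    findings = set()
    for token, label in SUSPECT_TOKENS.items():
        # The lookahead rejects longer names that merely start with the token.
        if re.search(re.escape(token) + rb"(?![A-Za-z0-9])", pdf_bytes):
            findings.add(label)
    return sorted(findings)
```

Calling scan_for_suspect_features(open("report.pdf", "rb").read()) on a hypothetical report.pdf returns an empty list when none of the tokens appear. Treat any hit as a reason to inspect the source document before exporting.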

The PDF/A flavours

PDF/A comes in numbered parts (1 to 4), and parts 1 to 3 each have lettered conformance levels. The combinations are confusing. Here's the practical map:

  • PDF/A-1a: strictest. Embedded fonts, accessibility tags, structured content. Right choice for documents that need both archival and accessibility (most public-sector requirements).
  • PDF/A-1b: less strict. Embedded fonts, no accessibility requirement. Right choice when the source isn't tagged but you still want archival fidelity.
  • PDF/A-2: extends PDF/A-1 with JPEG 2000 image support, transparency, and layers. Adds level u (all text maps to Unicode) alongside levels a and b.
  • PDF/A-3: PDF/A-2 plus the ability to embed any file as an attachment. Used for invoices that bundle a machine-readable XML file.
  • PDF/A-4: published 2020, simplifies the conformance levels and keeps PDF/A in sync with PDF 2.0.

For most users: PDF/A-2b or PDF/A-2u is a sensible default. Where legal or government rules apply, use the flavour they name; PDF/A-1a is a common requirement when accessibility matters.
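
A file states which flavour it claims in its XMP metadata, through the pdfaid:part and pdfaid:conformance properties. The sketch below reads that claim from the raw bytes. It assumes the XMP packet is stored uncompressed, and the function name is made up for illustration. The result is only what the file declares, not evidence that it conforms.

```python
import re
from typing import Optional

# The properties may be written as XML elements or as attributes.
_PART = re.compile(rb"pdfaid:part(?:>|=[\"'])\s*(\d)")
_CONFORMANCE = re.compile(rb"pdfaid:conformance(?:>|=[\"'])\s*([A-Za-z])")


def declared_flavour(pdf_bytes: bytes) -> Optional[str]:
    """Return the claimed flavour such as '2b', or None if nothing is declared."""
    part = _PART.search(pdf_bytes)
    if part is None:
        return None
    conformance = _CONFORMANCE.search(pdf_bytes)
    # The conformance letter is optional (PDF/A-4 files may omit it).
    level = conformance.group(1).decode("ascii").lower() if conformance else ""
    return part.group(1).decode("ascii") + level
```

A return value of None usually means the file is an ordinary PDF, or that its metadata stream is compressed and this simple scan cannot see it.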

When to use PDF/A

Required:

  • Legal records in many jurisdictions.
  • Tax filings (some authorities require PDF/A specifically).
  • Government correspondence archives.
  • Court submissions in some regions.
  • Long-term medical records under regulatory retention.
  • Academic and library archives: many national libraries require or strongly prefer PDF/A.

Recommended:

  • Personal tax records you'll keep for years.
  • Diplomas, certificates, contracts — anything you'll need a clean copy of decades later.
  • Reference documents in your workflow that you don't want to lose.

Not needed:

  • Everyday sharing. A regular PDF is fine.
  • Email attachments. Adds size with no real benefit.
  • Transient documents. The benefits don't matter for documents you'll discard in a month.

How to create a PDF/A file

Several routes:

  • Word: File → Save As → PDF → Options → ISO 19005-1 compliant (PDF/A). Easy and the most common path.
  • LibreOffice: File → Export As PDF → PDF/A with a choice of versions.
  • Adobe Acrobat Pro: File → Save As Other → Archivable PDF (PDF/A).
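  • Browser tool: Docento.app supports PDF/A export from the browser without uploading the source.
  • Command line: Ghostscript with -dPDFA=2 -sColorConversionStrategy=RGB (older releases used -sProcessColorModel=DeviceRGB), plus a PDFA_def.ps definition file that names the ICC output profile.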

After creation, validate compliance using a checker (see below) — many tools claim PDF/A compliance and produce files that fail validation.

Validating PDF/A compliance

Just exporting "as PDF/A" doesn't guarantee compliance. Always validate:

  • veraPDF: free, open source, the de facto reference checker. Runs locally.
  • Adobe Acrobat Pro: built-in validator.
  • Online PDF/A checkers: convenient, but they upload your file.

Common failures veraPDF flags:

  • A font isn't embedded (often happens with rare or commercial fonts).
  • The colour space isn't ICC-tagged.
  • The XMP metadata is missing, malformed, or inconsistent with the document information dictionary.
  • Accessibility tags are missing for PDF/A-1a or 2a.
  • A link or button uses a prohibited action type, such as Launch or JavaScript.

Fix the source document, re-export, re-validate. Iteration is normal.
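
The validate step can be scripted. The following Python sketch builds a veraPDF command line and reads the verdict from an XML report. It assumes the verapdf launcher is on your PATH, that your release accepts --flavour and --format xml, and that the report carries an isCompliant attribute; check these against your installed version. The function names are illustrative.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def verapdf_command(pdf: Path, flavour: str = "2b") -> list[str]:
    """Build the argument list for one validation run (nothing is executed)."""
    return ["verapdf", "--flavour", flavour, "--format", "xml", str(pdf)]


def is_compliant(report_xml: str) -> bool:
    """Return True only if every isCompliant attribute in the report is 'true'."""
    root = ET.fromstring(report_xml)
    verdicts = [
        element.get("isCompliant")
        for element in root.iter()
        if element.get("isCompliant") is not None
    ]
    # A report with no verdict at all is treated as a failure.
    return bool(verdicts) and all(verdict == "true" for verdict in verdicts)
```

To run it, pass the command to subprocess.run(..., capture_output=True, text=True) and feed the captured stdout to is_compliant. Treating a missing verdict as a failure keeps a broken run from looking like a pass.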

Converting an existing PDF to PDF/A

You can convert a normal PDF to PDF/A:

  • Open in your PDF tool, save as PDF/A, validate.
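  • Command line: gs -dPDFA=2 -dBATCH -dNOPAUSE -sColorConversionStrategy=RGB -sDEVICE=pdfwrite -sOutputFile=output.pdf PDFA_def.ps input.pdf (PDFA_def.ps ships with Ghostscript and must be edited to point at an ICC profile).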

Conversion runs into trouble when the source uses incompatible features (encryption, JavaScript, missing fonts). Depending on settings, the tool either strips the feature or refuses to produce output, so review the result instead of assuming nothing changed.
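
The "strip or refuse" choice maps onto Ghostscript's -dPDFACompatibilityPolicy switch: 1 drops the offending feature and continues, 2 aborts the conversion. Here is a small Python sketch that assembles the command for either behaviour. The function name and defaults are assumptions, and the PDFA_def.ps path is a placeholder for your own edited copy.

```python
from pathlib import Path


def ghostscript_pdfa_command(source, target, part=2,
                             definition_file="PDFA_def.ps", strict=False):
    """Build a Ghostscript argument list for PDF/A conversion (not executed)."""
    # Policy 1 drops features that violate PDF/A; policy 2 aborts instead.
    policy = 2 if strict else 1
    return [
        "gs",
        f"-dPDFA={part}",
        "-dBATCH",      # exit when processing ends
        "-dNOPAUSE",    # do not prompt between pages
        "-sColorConversionStrategy=RGB",
        "-sDEVICE=pdfwrite",
        f"-dPDFACompatibilityPolicy={policy}",
        f"-sOutputFile={target}",
        str(definition_file),  # names the ICC output profile
        str(source),
    ]
```

Pass the list to subprocess.run(..., check=True) to perform the conversion. With strict=True a failing file raises an error, which is often the safer choice for batch jobs where silent stripping would go unnoticed. Either way, validate the output afterwards.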

For documents that started as scans, OCR before PDF/A export — a PDF/A file should ideally have a real text layer. See PDF OCR explained.

What gets lost in PDF/A

  • Encryption: incompatible. If you need encryption, store the unencrypted PDF/A in an encrypted container (encrypted ZIP, encrypted disk).
  • Forms with JavaScript validation: the JS is removed; static fields remain.
  • Multimedia: embedded audio and video are prohibited. PDF/A-3 can carry a media file as a plain attachment, but no viewer is expected to play it.
  • Some advanced rendering effects: transparency (in PDF/A-1) and certain blend modes.

For documents where these features are critical, keep both versions: a working PDF and an archival PDF/A.

PDF/A-3 and the structured invoice case

PDF/A-3 is special. It allows arbitrary file attachments. The use case: a human-readable invoice that also contains a machine-readable XML inside, so accounting software can parse it automatically. This is the basis of e-invoicing standards like ZUGFeRD (Germany) and Factur-X (France/Germany).

If you generate invoices for European customers, PDF/A-3 may be required by their procurement systems.

How long PDF/A actually lasts

PDF/A is designed for long-term preservation, but how long a file actually lasts depends on:

  • Media survival: a PDF/A on a failing hard drive is no better than a regular PDF on the same drive.
  • Format support: ISO continues to standardise PDF, and reader support is essentially universal. PDF will remain readable for the foreseeable future.
  • Migration: even archival formats may need migration eventually. National archives plan for periodic format migration regardless.

For genuine long-term storage, pair PDF/A with sound archival practices: redundant storage, integrity checks, periodic format review. PDF/A is necessary but not sufficient.
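
Integrity checks are the easiest of these practices to automate. Here is a minimal Python sketch, using only the standard library, that records a SHA-256 checksum for each PDF in a folder and later reports files whose content has changed or gone missing. The function names and the flat-folder layout are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large scans do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each PDF file name in the folder to its checksum."""
    return {p.name: sha256_of(p) for p in sorted(folder.glob("*.pdf"))}


def changed_files(folder: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are now missing or have different content."""
    current = build_manifest(folder)
    return sorted(
        name for name, digest in manifest.items()
        if current.get(name) != digest
    )
```

Store the manifest (for example as JSON) alongside each copy of the archive and rerun changed_files on a schedule. A non-empty result tells you which copy to restore from a redundant one.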

Conclusion

Use PDF/A when documents need to outlive software changes — legal, regulatory, archival contexts. Use a regular PDF for everything else. Validate after export with veraPDF. Docento.app supports PDF/A export in the browser without uploading the source, useful when the document is sensitive enough that the archival copy itself needs privacy. For more on the format ecosystem, see history of the PDF format and PDF file formats.
