How to Detect Tampered PDFs: Signs, Tools, and Forensics

May 12, 2026·8 min read

A PDF that has been altered after being shared, signed, or archived can have legal, financial, or operational consequences. Detecting tampering is sometimes obvious (the digital signature shows "invalid") and sometimes subtle (a modified date that contradicts the document's claimed provenance). This guide walks through how to detect PDF tampering and what tools and techniques apply.

Why tampering matters

A few real scenarios:

  • Contracts. A signed PDF where one party later altered a clause and reused the document.
  • Financial reports. A PDF certified for distribution where numbers were edited before forwarding.
  • Legal evidence. A PDF submitted in court that may have been modified by the submitter.
  • Compliance. A regulatory filing that needs to demonstrate integrity from creation to submission.
  • Forensics. A PDF in a fraud investigation that needs analysis of when and how it was modified.

In any of these, "is this file the same as it was when it was created?" is a question with real consequences.

Strong signal: digital signatures

A digitally signed PDF carries cryptographic proof of integrity. When you open it:

  • Adobe Acrobat / Reader. Shows a signature panel at the top: "Signed and all signatures are valid" (green check) or "At least one signature has problems" (warning icon) or "Document has been modified" (red X).
  • Other readers show similar indicators.

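If a digital signature is valid and covers the whole file, the signed bytes are identical to what was signed. Changing any byte inside the signed range breaks the signature. Content appended after signing (an incremental update) leaves the original signature mathematically intact, but a conforming reader reports that the document changed after it was signed. This is the gold standard.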

Limitations:

  • Only works if the file was signed in the first place
  • Signature validity depends on the signing certificate's trust chain
  • Some "valid" signatures use compromised or unverified certificates; verify the certificate authority
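
As a rough illustration of what a signature actually covers, the sketch below reads the /ByteRange entry that a PDF signature dictionary uses to declare which bytes were signed, and reports how many bytes sit beyond it. It does not verify any cryptography, it assumes the entry is visible in the raw bytes, and the function names are illustrative rather than part of any library.

```python
import re
from typing import List, Optional, Tuple

# Matches a signature dictionary entry such as: /ByteRange [0 840 960 240]
BYTE_RANGE = re.compile(rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]")


def signed_ranges(data: bytes) -> List[Tuple[int, ...]]:
    """Return every /ByteRange array found in the raw PDF bytes."""
    return [tuple(int(g) for g in m.groups()) for m in BYTE_RANGE.finditer(data)]


def bytes_after_last_signature(data: bytes) -> Optional[int]:
    """Count bytes that lie beyond the widest signed range.

    Returns None when no /ByteRange is found (the file is probably unsigned,
    or the entry is not visible in the raw bytes).
    """
    ranges = signed_ranges(data)
    if not ranges:
        return None
    # Each array is [offset1, length1, offset2, length2]; the second pair
    # runs to the end of the signed revision.
    covered_end = max(r[2] + r[3] for r in ranges)
    return max(len(data) - covered_end, 0)
```

A handful of trailing bytes (a newline after %%EOF) is normal. A large tail suggests content was appended after signing, which is the point at which to trust the reader's signature panel over any script.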

For background, see digital signatures vs electronic signatures and certified PDFs explained.

Weaker but useful: hash comparison

If you have the original PDF (or a hash of it), compare:

sha256sum file.pdf

Compare the output to the known hash. If they match, the file is unchanged. If they differ, it has been modified.

For this to work:

  • You need a trusted source of the original hash (signed receipt, internal database)
  • The hash must have been recorded before any tampering opportunity

This is the second-best integrity check after digital signatures.
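
For scripted checks, the same comparison can be sketched in Python with the standard library. The function names here are illustrative, and the known hash is assumed to come from a trusted record as described above.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large PDFs do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_known_hash(path: str, known_hex: str) -> bool:
    """Compare against a recorded hash, ignoring case and stray whitespace."""
    return hmac.compare_digest(sha256_of_file(path), known_hex.strip().lower())
```

Normalizing the recorded value avoids false mismatches when the stored hash was copied in uppercase or with a trailing newline.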

Indirect signals of tampering

When you do not have a signature or hash, you look for clues:

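Modification date. PDF metadata records CreationDate and ModDate. A ModDate later than CreationDate means the file was saved again after it was created, which can be an innocent re-save or an edit. What matters is whether the ModDate contradicts the document's claimed history, for example a ModDate that falls after the date the document was supposedly finalized. Inspect: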

exiftool file.pdf | grep -i date

Caveat: dates can be forged. A tampered file might be saved with a backdated ModDate. Use as one signal, not the only one.
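
If you read the raw values yourself rather than through exiftool, they use the PDF date form D:YYYYMMDDHHmmSS followed by an optional UTC offset. The sketch below parses that form and flags the two basic relationships between the dates. It treats a missing offset as UTC, which is an assumption made here for comparison purposes, and the function names are illustrative.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import List, Optional

# D:YYYYMMDDHHmmSS followed by Z or +HH'mm' / -HH'mm'; fields after the year are optional.
PDF_DATE = re.compile(
    r"D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:(Z)|([+-])(\d{2})'?(\d{2})?'?)?"
)


def parse_pdf_date(value: str) -> Optional[datetime]:
    """Parse a raw PDF date string; return None if it does not fit the pattern."""
    match = PDF_DATE.match(value.strip())
    if match is None:
        return None
    year, month, day, hour, minute, second, _zulu, sign, off_h, off_m = match.groups()
    offset = timedelta(0)  # assumption: no offset given means UTC
    if sign:
        offset = timedelta(hours=int(off_h), minutes=int(off_m or 0))
        if sign == "-":
            offset = -offset
    return datetime(
        int(year), int(month or 1), int(day or 1),
        int(hour or 0), int(minute or 0), int(second or 0),
        tzinfo=timezone(offset),
    )


def date_findings(creation: str, modified: str) -> List[str]:
    """Describe how ModDate relates to CreationDate."""
    created, changed = parse_pdf_date(creation), parse_pdf_date(modified)
    if created is None or changed is None:
        return ["unparseable date field"]
    if changed < created:
        return ["ModDate precedes CreationDate"]
    if changed > created:
        return [f"modified {changed - created} after creation"]
    return []
```

A ModDate that precedes CreationDate is the more telling of the two findings, since ordinary tools do not produce it on their own.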

Producer mismatch. The Producer field shows the tool that created the PDF. If the file claims to be from a corporate report generator but the Producer says "Adobe Acrobat Pro DC 2026", someone re-saved it.

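Incremental updates. PDFs can be modified via incremental updates, where changes are appended to the end of the file and the earlier bytes stay in place. A "Save" rather than "Save As" typically produces this. Each revision ends with its own %%EOF marker, so counting the markers is a quick first check:

grep -a -c '%%EOF' file.pdf

A count above one suggests the file carries more than one revision, although linearized ("fast web view") files can legitimately contain two. To see what an earlier revision looked like, truncate a copy of the file just after an earlier %%EOF marker and open the copy. qpdf --check file.pdf remains useful for confirming that the structure is sound, but its report focuses on syntax, encryption, and linearization rather than revision history.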
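
One rough way to look for incremental updates without extra tooling is to count the %%EOF and startxref markers in the raw bytes, since each appended revision normally adds one of each. The sketch below does that and can cut the file back to an earlier marker. The function names are illustrative, and the marker text can also occur inside stream data, so treat the counts as a hint rather than a verdict.

```python
import re

EOF_MARKER = re.compile(rb"%%EOF")
STARTXREF = re.compile(rb"startxref\s+(\d+)")


def revision_markers(data: bytes) -> dict:
    """Summarize the markers that usually accompany each saved revision."""
    return {
        "eof_count": len(EOF_MARKER.findall(data)),
        "startxref_offsets": [int(m.group(1)) for m in STARTXREF.finditer(data)],
        # Linearized files carry an extra cross-reference section by design,
        # so two markers are expected there even without edits.
        "linearized_hint": b"/Linearized" in data[:1024],
    }


def earlier_revision(data: bytes, index: int) -> bytes:
    """Return the bytes up to and including the index-th %%EOF marker (0-based)."""
    ends = [m.end() for m in EOF_MARKER.finditer(data)]
    return data[: ends[index]]
```

Writing the result of earlier_revision(data, 0) to a new file and opening it shows the document as it stood before later updates, provided the file is not linearized.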

Differential file sizes. A PDF that is claimed to be identical to a reference but differs in file size is not the same file. A matching size proves nothing on its own, so follow up with a hash comparison.

Re-saved fonts. Custom fonts in the original may have been replaced with system fonts when re-saved by a different tool. Detect by comparing the font list.

Annotation traces. If the file was annotated and then "annotations removed", traces may remain in the object structure.

XMP version history. Modern XMP metadata can include version history. If the history shows modifications you did not expect, the file was edited.
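
When the XMP packet is stored uncompressed, its history entries can be pulled out with the standard library alone. The sketch below assumes the packet is visible in the raw bytes and that history is recorded under xmpMM:History with stEvt fields, which is the usual Adobe layout; files that compress or omit the packet will simply return nothing. The function name is illustrative.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict, List

XMP_PACKET = re.compile(rb"<x:xmpmeta\b.*?</x:xmpmeta>", re.DOTALL)
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
XMPMM = "{http://ns.adobe.com/xap/1.0/mm/}"
STEVT = "{http://ns.adobe.com/xap/1.0/sType/ResourceEvent#}"


def xmp_history(data: bytes) -> List[Dict[str, str]]:
    """Collect stEvt fields (action, when, softwareAgent, ...) from each history entry."""
    events = []
    for packet in XMP_PACKET.findall(data):
        try:
            root = ET.fromstring(packet)
        except ET.ParseError:
            continue  # skip packets that are damaged or not well-formed
        for history in root.iter(XMPMM + "History"):
            for item in history.iter(RDF + "li"):
                event = {}
                # Fields may be written as attributes on the list item...
                for name, value in item.attrib.items():
                    if name.startswith(STEVT):
                        event[name[len(STEVT):]] = value
                # ...or as child elements inside it.
                for child in item:
                    if child.tag.startswith(STEVT):
                        event[child.tag[len(STEVT):]] = (child.text or "").strip()
                if event:
                    events.append(event)
    return events
```

An empty result means only that no readable history was found, not that the file has none, and any history that is found can itself have been rewritten.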

Tools for detection

Adobe Acrobat Pro. Tools → Print Production → Preflight runs syntax and standards-conformance checks. File → Properties shows the document metadata.

qpdf --check file.pdf. Reports linearization, encryption, and structural or syntax errors.

exiftool file.pdf. Full metadata dump including any version history.

pdftk file.pdf dump_data. Dumps document metadata, bookmarks, and page information.

mutool show file.pdf trailer. Shows the file's trailer, including any cross-reference table abnormalities.

pdf-parser.py (Didier Stevens' tools). Detailed object-level inspection for forensic analysis.

peepdf. Open-source PDF inspection tool. Lists objects, streams, scripts.

origami. Ruby-based forensic toolkit.

For deep forensic analysis, agreement among several tools is more convincing than any one tool's verdict.

A practical detection workflow

For a PDF you suspect of tampering:

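  1. Check digital signatures first. If signed and valid, the signed content is intact (modulo the certificate trust). If signed and invalid, treat the file as tampered until you have ruled out benign causes such as an expired certificate.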
  2. Compare hash if you have a reference. Match means clean; mismatch means modified.
  3. Inspect metadata. Producer, dates, version history.
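  4. Look for incremental updates. Multiple revisions in the file structure deserve a closer look, especially when they postdate a signature. Signing and form filling also add revisions legitimately.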
  5. Inspect annotations and form data. Did the recipient add anything?
  6. Compare structure to a known-good twin. If you have another file from the same source, structural diffs reveal changes.
  7. Inspect content visually. Compare side-by-side with a reference if possible.
  8. Inspect specific suspect regions. If you suspect a particular page or field was tampered, focus there.

For high-stakes cases, document each step and the tools used. Forensic admissibility depends on documented methodology.

Signs of "obvious" tampering

A few patterns that almost always indicate manipulation:

  • Mismatched fonts within the same paragraph. Original text in one font, an edit in another.
  • Misaligned text. A correction that does not match the surrounding baseline.
  • Pixelation in scanned regions. A text replacement on a scanned page often has different rendering.
  • Inconsistent paper texture or background. A modified scanned page may have a slightly different background color.
  • White rectangle over content. A clumsy redaction or correction. Highlight all white objects and inspect.
  • Annotation traces with author names from the wrong organization. A reviewer's annotations from outside the claimed organization.

What tampering can hide

A skilled forger can:

  • Backdate metadata fields
  • Match the Producer to the original
  • Remove version history
  • Re-sign with a similar-looking but different certificate
  • Use fonts that match the original

What is harder:

  • Defeating a digital signature signed with a valid certificate
  • Hiding the modification from a hash check
  • Avoiding visual or font mismatches in modified regions

For high-confidence verification, digital signatures with trusted certificates are essential. Without them, you are doing pattern recognition.

Common gotchas

Valid signature, wrong content. The signer signed exactly what is in the file, but the tampering occurred before signing (e.g., the signer was tricked into signing an altered document). A signature alone proves byte integrity, not content correctness.

Self-signed certificates. A signature with a self-signed certificate (created by the signer themselves) provides integrity but not identity verification. Verify the certificate chain.

Expired certificates. A signature with an expired certificate may show as invalid even if the file is unchanged. Long-Term Validation (LTV), which embeds the timestamp and revocation data needed to validate the signature later, is the proper handling; see certified PDFs explained.

Backdated re-saves. Some tools allow setting CreationDate and ModDate to arbitrary values, so a 2020 CreationDate on a file actually created in 2026 is easy to forge.

Re-saving in different tools. Opening in tool A and saving in tool B changes the Producer field. This is often legitimate (not tampering) but can be confused with malicious modification.

OCR layer mismatch. If the visible page does not match the OCR text layer, the page was modified after OCR (or the OCR is wrong). Inspect both.

XMP packet manipulation. XMP metadata is plain RDF/XML text; an attacker familiar with the format can rewrite history. Trust XMP less than digital signatures.

Operational practices to prevent tampering

If you produce PDFs and care about integrity:

  • Sign documents at the point of finalization
  • Hash and record every critical PDF in an internal log
  • Distribute through tracked channels (DRM, document tracking services)
  • Watermark per-recipient so leaks can be traced
  • Audit periodically against the originals

For receivers of PDFs:

  • Always check digital signatures when present
  • Refuse to act on unsigned documents in high-stakes workflows
  • Compare to a known-good source when possible
  • Inspect metadata routinely

Forensic analysis at scale

For investigators dealing with many PDFs:

  • Automate hash comparison against a database of known files
  • Index metadata across the corpus to spot anomalies
  • Run structural diffs between submitted documents and reference templates
  • Use specialized forensic tools (pdf-parser, peepdf, origami) for deep object inspection
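
The first item in that list can be sketched in a few lines of Python. The example assumes the database of known files is a simple mapping from file name to expected SHA-256 digest, which is a stand-in for whatever record store an investigation actually uses; the function names are illustrative.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Hash a whole file at once; adequate for a sketch, chunk it for very large files."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def classify_corpus(root: Path, known: Dict[str, str]) -> Dict[str, List[str]]:
    """Sort every PDF under root into match, mismatch, or unknown by file name."""
    report: Dict[str, List[str]] = {"match": [], "mismatch": [], "unknown": []}
    for path in sorted(root.rglob("*.pdf")):
        expected = known.get(path.name)
        if expected is None:
            report["unknown"].append(path.name)  # no reference hash on record
        elif sha256_of(path) == expected.strip().lower():
            report["match"].append(path.name)
        else:
            report["mismatch"].append(path.name)
    return report
```

Keying on the file name keeps the sketch short; a real corpus with renamed or duplicated files would need a more stable identifier.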

For legal admissibility, document the process meticulously.

Practical recipe

For a one-off "is this tampered" check:

  1. Open in Adobe Acrobat / Reader
  2. Check signature panel
  3. Look at File → Properties → Description (dates, Producer)
  4. Run exiftool file.pdf for full metadata
  5. Run qpdf --check file.pdf for structural integrity
  6. If suspicious, compare to a reference

For a high-stakes formal verification:

  1. Document chain of custody
  2. Use multiple tools (Acrobat, qpdf, exiftool, mutool)
  3. Inspect bytes directly for known structural patterns
  4. Record findings with screenshots and hashes
  5. Conclude with confidence level

Takeaway

Detecting PDF tampering is straightforward when digital signatures are involved and a forensic exercise when they are not. Hash comparison, signature verification, metadata inspection, and structural analysis form a layered approach that catches most tampering. For prevention, always sign critical documents; for detection in their absence, use multiple tools and document your findings. For the operational side, see digital signatures vs electronic signatures, certified PDFs explained, and PDF encryption explained. The technology to verify integrity is mature; the discipline of using it consistently is what protects organizations from tampering disputes.
