A visually polished PDF and an accessible PDF aren't the same thing. The first looks right; the second works right for everyone, including users of screen readers, keyboard navigation, and assistive technology. Tags are the structural skeleton that turns visual content into accessible content. Most PDFs don't have them. Adding them isn't hard, but it isn't automatic either.
What tags actually are
A PDF page describes visual content: glyphs at specific positions, images at specific coordinates, lines and shapes drawn with specific colours. None of that conveys structure. To a screen reader, a heading and a paragraph look identical — just text at different positions on the page.
Tags are a parallel structure layer in the PDF that says: this group of glyphs is a heading; this group is a paragraph; this group is a list item; this image has the alt text "company logo." The visual content stays the same; the tags add semantic meaning.
The result: a screen reader can announce "heading level 1: Annual Report 2026," then read the paragraphs in the right order, then skip to the next heading. Without tags, the screen reader gets a stream of text fragments in an order it has to guess, typically the order in which the content was drawn on the page, which isn't always the reading order.
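To make the idea of a parallel structure layer concrete, here is a minimal sketch in Python. It is not how a PDF stores tags internally; the `Tag` class and the `announce` function are hypothetical names invented for illustration, and the announcement wording only mimics what a screen reader might say.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Tag:
    """One node in a simplified, hypothetical structure tree."""
    role: str                      # e.g. "H1", "P", "Figure"
    text: str = ""                 # text content the tag wraps, if any
    alt: str = ""                  # alternate text, used for figures
    children: list["Tag"] = field(default_factory=list)


def announce(tag: Tag) -> Iterator[str]:
    """Yield announcements depth-first, i.e. in structure-tree order."""
    if tag.role.startswith("H") and tag.role[1:].isdigit():
        yield f"heading level {tag.role[1:]}: {tag.text}"
    elif tag.role == "Figure":
        yield f"image: {tag.alt}"
    elif tag.text:
        yield tag.text
    for child in tag.children:
        yield from announce(child)


document = Tag("Document", children=[
    Tag("H1", text="Annual Report 2026"),
    Tag("P", text="Revenue grew in every region."),
    Tag("Figure", alt="company logo"),
])

for line in announce(document):
    print(line)
# heading level 1: Annual Report 2026
# Revenue grew in every region.
# image: company logo
```

The page content never appears in this model. Only the roles and their order do, which is all assistive technology needs to navigate the document.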
What tags include
A typical tagged PDF has tags for:
- Headings (<H1>, <H2>, <H3>, etc.) with the right hierarchy.
- Paragraphs (<P>).
- Lists (<L>) with list items (<LI>).
- Tables (<Table>) with rows (<TR>), header cells (<TH>), and data cells (<TD>).
- Figures (<Figure>) with alt text.
- Links (<Link>) with descriptions.
- Captions linked to their figures or tables.
- Footnotes linked to their references.
- Reading order — the order tags appear in determines the order assistive tech reads them.
- Language tags for content in different languages.
Each tag carries metadata: alt text for figures, headers and scope for tables, language for non-default content.
Why tagging matters more than you think
- Legal compliance. Many jurisdictions require accessible documents, especially for public-sector publications. The European Accessibility Act, Section 508 of the US Rehabilitation Act, the UK Public Sector Bodies Accessibility Regulations, and similar laws can all apply to PDFs.
- Inclusion. About 1 in 5 users has some form of disability and benefits from accessibility features.
- SEO. Search engines index tagged PDFs more accurately. A tagged PDF appears in search results with proper structure.
- Reuse. Tagged PDFs convert to HTML and other formats more cleanly than untagged ones — see PDF to HTML.
- Future-proofing. Tags are the closest thing PDFs have to semantic structure; they help any tool — assistive tech today, AI tools tomorrow — understand the document.
How to create tagged PDFs
The cleanest path is to tag during creation:
- Word: enable "Document structure tags for accessibility" in the PDF export options. Use real heading styles in the Word doc — Word maps them to PDF tags. Add alt text to images via right-click → "Edit Alt Text."
- Google Docs: most headings and structure export as tags automatically.
- InDesign: explicit tagging via the Tags panel and Articles panel.
- LaTeX: use the tagpdf package or modern engines (lualatex with tagging).
- HTML to PDF tools (WeasyPrint, Prince): these usually generate tags from HTML structure automatically.
For documents you're starting from scratch, this is much easier than adding tags after the fact.
Tagging an existing untagged PDF
If you have an untagged PDF and need to make it accessible, these are the main tools:
- Adobe Acrobat Pro: "Make Accessible" wizard auto-tags, then a manual review pass via the Tags panel and Reading Order tool.
- CommonLook PDF: dedicated accessibility remediation tool with strong validation.
- PAC 2024: free PDF accessibility checker (Windows only) that validates against PDF/UA.
- Browser tools: Docento.app supports basic tag editing in the browser without uploads.
Auto-tagging handles the easy parts (paragraphs, obvious headings) but always misses things. A manual pass for headings, alt text, and reading order is essential.
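Before remediation, it helps to know whether a file carries any tags at all. The sketch below is a rough heuristic, not a validator: it looks in the raw bytes for the /StructTreeRoot and /MarkInfo entries that a tagged PDF declares in its catalog. The function name `looks_tagged` is made up for this example, and the check assumes the catalog is stored uncompressed. Many PDFs keep it in a compressed object stream, where this scan reports a false negative.

```python
import re


def looks_tagged(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the raw bytes mention a structure tree and /Marked true.

    Assumes the document catalog is not inside a compressed object stream.
    """
    has_struct_tree = b"/StructTreeRoot" in pdf_bytes
    is_marked = re.search(rb"/Marked\s+true", pdf_bytes) is not None
    return has_struct_tree and is_marked


# Hand-written catalog fragments standing in for real files.
tagged_catalog = b"<< /Type /Catalog /MarkInfo << /Marked true >> /StructTreeRoot 12 0 R >>"
untagged_catalog = b"<< /Type /Catalog /Pages 2 0 R >>"

print(looks_tagged(tagged_catalog))    # True
print(looks_tagged(untagged_catalog))  # False
```

For a real file, read it in binary mode and pass the bytes in. Treat a False result as inconclusive and a True result as "tags exist", not "tags are correct"; a checker such as PAC 2024 gives the reliable answer.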
What auto-tagging gets wrong
Common auto-tagging mistakes:
- Headings recognised as paragraphs if the styling isn't bold and large enough.
- Decorative images marked as figures (they should be marked as artifacts, invisible to screen readers).
- Sidebars and pull quotes read in the wrong order.
- Multi-column layouts read across columns instead of down each.
- Tables without header cells, where the auto-tagger doesn't recognise the header row.
- Lists where the bullets become decoration instead of list markers.
- Captions floating free instead of linked to their figure or table.
Plan to spend significant time on a manual review pass. For long documents, consider rebuilding the source with proper structure rather than fixing the PDF.
Reading order
This is where most accessibility audits fail. The reading order — the order in which tags appear in the structure tree — determines the order a screen reader announces content.
Common problems:
- Sidebars read mid-paragraph, breaking the main flow.
- Image captions read before the image, which makes no sense without context.
- Footnote references read at the wrong point.
- Multi-column text read down the whole left column and then the whole right column, which is only correct when nothing (a heading, figure, or table) spans both columns.
- Headers and footers read on every page, drowning out the content.
Reading-order tools let you visualise the current order and rearrange it. The work is tedious but important.
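The multi-column problem can be illustrated with a small sketch. It assumes a simple page with equal-width columns and nothing spanning them; `Block`, `column_order`, and the coordinates are hypothetical, and real remediation tools work on the structure tree, not on a list like this.

```python
from dataclasses import dataclass


@dataclass
class Block:
    text: str
    left: float   # distance of the block's left edge from the left of the page
    top: float    # distance of the block's top edge from the top of the page


def column_order(blocks: list[Block], page_width: float, columns: int = 2) -> list[Block]:
    """Order blocks down each column in turn, left column first."""
    column_width = page_width / columns

    def key(block: Block) -> tuple[int, float]:
        # Clamp so a block at the far right edge still lands in the last column.
        column = min(int(block.left // column_width), columns - 1)
        return (column, block.top)

    return sorted(blocks, key=key)


blocks = [
    Block("Left column, first paragraph", left=50, top=100),
    Block("Right column, first paragraph", left=320, top=100),
    Block("Left column, second paragraph", left=50, top=300),
    Block("Right column, second paragraph", left=320, top=300),
]

# Sorting by vertical position alone reads across the columns.
naive = sorted(blocks, key=lambda b: (b.top, b.left))
# Grouping by column first reads down each column.
fixed = column_order(blocks, page_width=600)

for block in fixed:
    print(block.text)
```

The naive order interleaves the two columns; the column-aware order finishes the left column before starting the right. A heading or figure that spans both columns would need extra handling that this sketch leaves out.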
Alt text for images
Every meaningful image needs alt text: a short description of what the image shows. Rules:
- Be concise but specific. "Photo of a beach at sunset" is better than "Photo." Better still: "Photo of a beach at sunset with three people walking along the shore."
- Don't start with "Image of …" — screen readers already announce that it's an image.
- For decorative images, mark as artifact (no alt text). Don't write alt="decorative".
- For complex images (charts, diagrams), provide a long description elsewhere, in the body text or as a linked file.
For screenshots of forms, charts, or diagrams, the alt text should convey the information the image provides, not just its appearance.
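Some of these rules can be checked mechanically before a human review. The following is a sketch only: `lint_alt_text`, the placeholder list, and the three-word minimum are assumptions chosen for illustration, and no automated check can judge whether a description is actually accurate.

```python
# Values that say nothing about the image; an illustrative, incomplete list.
PLACEHOLDERS = {"decorative", "image", "photo", "picture"}


def lint_alt_text(alt: str, decorative: bool = False) -> list[str]:
    """Return a list of problems found in one image's alt text."""
    text = alt.strip()
    if decorative:
        # Decorative images should be artifacts and carry no alt text at all.
        return ["decorative image should have no alt text"] if text else []
    if not text:
        return ["missing alt text"]

    problems = []
    if text.lower() in PLACEHOLDERS:
        problems.append("placeholder alt text")
    if text.lower().startswith("image of"):
        problems.append("starts with 'Image of'")
    if len(text.split()) < 3:  # arbitrary threshold, assumed for this sketch
        problems.append("too short to be specific")
    return problems


print(lint_alt_text("Photo of a beach at sunset"))  # []
print(lint_alt_text("Image of a beach"))            # ["starts with 'Image of'"]
print(lint_alt_text(""))                            # ['missing alt text']
```

A clean result means only that the obvious mistakes are absent. Whether the text conveys the information in the image still needs a person to decide.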
Tables that work
Tables are accessibility's hardest case. A table is accessible if:
- It has explicit header cells (<TH>) for both row and column headers (where applicable).
- Headers have a Scope attribute (Row, Column, or Both).
- The table has a Summary attribute for complex tables.
- Layout-only tables (used for visual positioning, not data) are not tagged as <Table>, so screen readers don't announce them as data tables; the content inside them is still tagged.
Most auto-taggers struggle with tables. Plan to verify each one manually.
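Part of that manual verification can be framed as a checklist in code. This sketch assumes the table has already been extracted into a simple in-memory model; `Cell` and `check_table` are hypothetical names, and the accepted Scope values (Row, Column, Both) are the ones the PDF specification defines for table header cells.

```python
from dataclasses import dataclass

VALID_SCOPES = {"Row", "Column", "Both"}


@dataclass
class Cell:
    role: str        # "TH" for header cells, "TD" for data cells
    text: str
    scope: str = ""  # only meaningful on TH cells


def check_table(rows: list[list[Cell]]) -> list[str]:
    """Return a list of header-related problems for one table."""
    problems = []
    headers = [cell for row in rows for cell in row if cell.role == "TH"]
    if not headers:
        problems.append("table has no header cells")
    for cell in headers:
        if cell.scope not in VALID_SCOPES:
            problems.append(f"header '{cell.text}' has no valid Scope")
    return problems


table = [
    [Cell("TH", "Region", scope="Column"), Cell("TH", "Revenue")],
    [Cell("TH", "North", scope="Row"), Cell("TD", "1.2m")],
]

print(check_table(table))  # ["header 'Revenue' has no valid Scope"]
```

The check covers only the header rules listed above. It says nothing about merged cells, the Summary attribute, or whether the right cells were chosen as headers in the first place.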
PDF/UA compliance
PDF/UA (ISO 14289) is the formal accessibility standard. A PDF/UA document has:
- All required tags.
- No untagged content.
- Language declared for the document and any non-default-language sections.
- Accessibility metadata in the document properties.
- No content marked as artifact that's actually meaningful.
- No heading-level skips (no <H4> directly after <H2>).
Validate with PAC 2024 or veraPDF. PDF/UA compliance is increasingly required for government and education publications.
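The heading-level rule is simple enough to sketch. The function below takes the heading levels in reading order and reports every place where a level is skipped; `heading_skips` is a hypothetical helper, and treating a document that starts below level 1 as a skip is an assumption of this sketch. It illustrates one check and does not replace a full validator.

```python
def heading_skips(levels: list[int]) -> list[tuple[int, int, int]]:
    """Return (index, previous_level, level) for each heading that skips a level."""
    skips = []
    previous = 0  # no heading seen yet, so the first heading is expected to be level 1
    for index, level in enumerate(levels):
        # Going deeper by more than one level is a skip; going back up is always fine.
        if level > previous + 1:
            skips.append((index, previous, level))
        previous = level
    return skips


# H1, H2, H4, H2, H3: the jump from H2 to H4 is the only skip.
print(heading_skips([1, 2, 4, 2, 3]))  # [(2, 2, 4)]
```

An empty result means the hierarchy descends one level at a time. It does not mean the headings are the right ones, which is still a matter for the manual review.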
Conclusion
Tags are the foundation of an accessible PDF. Generate them at creation time — use real heading styles, alt text, and structured tables in your source. For untagged PDFs, plan a manual remediation pass; auto-tagging is a starting point, not a finish. Validate against PDF/UA. Docento.app supports browser-based tag editing without uploads. For broader context, see PDF accessibility guide.