PDF standards

Overview

PDF stands for Portable Document Format and is a common internet file format. It's used for electronic distribution because it keeps the look and feel of the original document, including the fonts, colours, images, and layout.

Sharing a file in PDF format allows anyone to view the document, regardless of what word processor was used to create it. More importantly, it prevents formatting errors from cropping up due to word-processor incompatibilities, making PDFs a must for official documents like policies and important letters.

If your documents do not meet accessibility standards, you could be breaking the Equality Act 2010.

Why do people use PDFs?

  • They are quick and easy to create from popular applications that people already use to author and share documents, whereas converting content into HTML takes a bit of time. However, creating a fully usable and accessible PDF requires specialist knowledge and can take longer than creating content in HTML.
  • Control over the design - authors and publishers have more control over the layout, design and branding of a PDF. This can be important where there is a need to include complex charts and tables, which can be difficult in HTML. However, there will be people who cannot access the content.
  • They are easy for people to download and print, but you can print and download HTML web pages just as readily.
  • They have the feel of a standalone product - this reflects a culture of paper-based (print) output. There's a common feeling that a PDF publication is more tangible and credible than HTML. We need to transition to a digital-first culture.
  • There may be a need for a static document to show what was said at a particular point in time. In these cases, publishers should continue to publish PDF in addition to HTML. Publishing only a PDF in these cases is strongly discouraged.

Why are PDF files a problem?

  • PDF documents can make content harder to find, use and maintain.
  • They do not change size to fit the browser.
  • They are not designed for reading on screens - people read differently on the web so it's important to create content that is clear, concise and structured appropriately.
  • It is harder to track the use of PDFs - we can get the number of downloads, but we don't know how much they are used offline.
  • It is harder to have an audit trail of changes to PDFs - with HTML, the content management system keeps a record of changes.
  • It is harder to ensure filenames comply with our file standards.
  • They cause difficulties for navigation and orientation - PDFs may open in a new browser window, tab or a separate app. This is more of an issue if the user goes directly to the PDF from a search engine. Without the context of the site, they can't easily browse related content or search the website.
  • They can be hard for some users to access. It depends on how the document was created, e.g. it needs to have a logical structure based on tags and headings, meaningful document properties, readable body text, good colour contrast and text alternatives for images. It takes time to do this properly.
  • Some users need to change browser settings such as colours and text size to make it easier to read. It is difficult to do this for PDF files.
  • Less likely to be kept up to date - compared to HTML, it is harder to update a PDF once it's been created and published. PDFs are also less likely to be actively maintained, which can lead to broken links and users getting the wrong information. HTML documents encourage people to refer to the website for the latest version.
  • Unless created with sufficient care, PDFs can be bad for accessibility and rarely comply with open standards.
  • Making changes to PDF documents requires finding the owner of the source document.
  • PDF documents can quickly become out of date.
  • HTML should always be the default format for publishing content. This will ensure the widest accessibility.

Checklist for creating a document

Only consider publishing a PDF file if the following checks are passed:

  • Is the PDF intended for print e.g. a brochure? If so, don't upload to the website.
  • Is the content specific to the University of St Andrews, or is it generic advice that could be found elsewhere on the internet? Only publish content that is specific to the University.
  • Does the content have a clear intended audience and function? Do you have evidence that this content is actually needed? Only publish content that meets a definitive user need.
  • Does the content already exist on the University website? We should only have one source of truth.
  • Does the content need to record what was said at a particular time e.g. policy, meeting minutes? If so, continue to publish in PDF, but ideally, it should also be available in HTML.
  • Has the PDF passed WCAG 2.1 AA standards?

If you are in any doubt about whether to upload a PDF file to the website, please contact itservicedesk@st-andrews.ac.uk

Corporate identity

When publicly distributing written PDF documents online, you should use the University branded Microsoft Word templates that can be downloaded from the corporate identity pages.

How to make a PDF file accessible

An accessible PDF requires expertise to ensure it meets the following requirements:

  • Give the document a meaningful title - if using Microsoft Word, this has to be set via the Info section of the document.

Screenshot showing how to set the title attribute of a Word document

  • Keep the document structure simple. Use heading styles to define heading 1, heading 2 and so on. Also, use styles for tables and bullet lists so that screen readers can recognise the formatting and read out the content correctly.
  • Do not use bold to mark up subheadings: always use heading styles.
  • Make sure the heading structure follows a logical sequence.
  • Use clear and simple language. Avoid the use of acronyms, abbreviations and symbols. Explain what they mean the first time you use them.
  • Keep sentences and paragraphs short. Aim for around 25 words or fewer per sentence.
  • Use a sans serif font like Arial or Helvetica.
  • Use sentence case. Avoid all-caps text and italics.
  • Make sure the text is left aligned, not justified.
  • Avoid underlining, except for links.
  • Make sure the link text clearly describes where the link goes. It should be understandable on its own, even if read out of context. This is because some screen reader users bring up a list of the links on a page to find what they need quickly.
  • Documents with single columns of text are easier to make accessible than those with a complex layout.
  • Only use tables for data. Keep tables simple - avoid splitting or merging cells.
  • Do not use colour or shape alone to show meaning.
  • If using images or charts, think about how to make the content accessible to people with a visual impairment: make the same point in the text of the document, and give the image or chart alt text.
  • Do not use images containing text as it's not possible to resize the text in the image and screen readers cannot read the text which is part of an image.
  • Avoid footnotes where possible - provide explanations inline instead.
  • Build forms in HTML where possible, or use Microsoft Forms.
  • When saving a Word document as a PDF, set the document structure tags and publish the file in PDF/A format.
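
Two of the requirements above, sentence length and heading sequence, lend themselves to a quick automated check. The following is a minimal Python sketch, not an official University tool. The function names `long_sentences` and `heading_skips` are illustrative, the sentence splitter is deliberately naive, and the heading levels are assumed to have been read from the document by some other means.

```python
import re

MAX_WORDS = 25  # sentence length suggested in the guidance above


def long_sentences(text, max_words=MAX_WORDS):
    """Return the sentences in `text` that contain more than `max_words` words."""
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if len(s.split()) > max_words]


def heading_skips(levels):
    """Return (index, previous_level, level) for each heading that skips a level.

    `levels` lists the heading levels in reading order,
    for example [1, 2, 3, 2] for Heading 1, Heading 2, Heading 3, Heading 2.
    """
    problems = []
    previous = 0  # treat the start of the document as level 0
    for index, level in enumerate(levels):
        # Going deeper by more than one level (such as 1 -> 3) breaks the
        # sequence; moving back up by any amount is fine.
        if level > previous + 1:
            problems.append((index, previous, level))
        previous = level
    return problems
```

For example, `heading_skips([1, 3])` reports that the second heading jumps from level 1 to level 3. Because the splitter treats every full stop followed by a space as a sentence end, abbreviations such as "e.g." will split a sentence early, so treat the results as prompts for a manual read rather than a pass or fail.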

Use the PDF/A format to save files

PDF/A is an open standard intended for downloading or long-term digital storage. The 'A' stands for archiving; it is not related to accessibility.

Saving your document as PDF/A means the document will continue to work for a long time after it is published. It also improves document security by blocking scripts and macros in documents that might contain malicious code. Word can save a document in PDF/A format, but check that the PDF/A option is selected in the PDF options when saving. If your software cannot do this, then another program will need to be used to convert the file to PDF/A.

Checking the accessibility of PDF files

Before uploading a PDF to the website, make sure it is accessible by using a free online PDF accessibility checker tool. You can also use the accessibility checklist created by 18F (the US government’s digital agency) to help you with your manual testing. You should also test whether your PDF is accessible using a screen reader.

Also, make sure that the filename meets the required standards.

Adobe Reader

You can use Adobe Reader to find out if your PDF document is correctly tagged and structured. People using screen readers need the tags and structure to be able to access your document.

Go to ‘Edit’ then ‘Accessibility’ and select ‘Quick check’. To fix any issues, you’ll need to either fix the original document in Word or use Adobe Acrobat Pro.

Adobe Acrobat Pro

Follow Adobe’s instructions on using Acrobat Pro to check if your PDF is accessible.

The PDF should pass the full check for WCAG Level AA without any warnings.
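
Colour contrast is one of the Level AA requirements that can be checked by hand before running the full check. The sketch below applies the WCAG 2.1 contrast ratio formula, with the Level AA thresholds of 4.5:1 for normal text and 3:1 for large text. It is an illustration only: the function names are hypothetical, and it assumes you already know the text and background colours as 8-bit RGB values, since it does not read colours from a PDF.

```python
def _linearise(channel):
    """Convert an 8-bit sRGB channel (0-255) to its linear-light value."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) colour as defined by WCAG 2.1."""
    r, g, b = (_linearise(value) for value in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Contrast ratio between two colours, from 1 (none) up to 21."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground, background, large_text=False):
    """True if the colour pair meets the WCAG 2.1 Level AA contrast minimum."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold
```

For example, `passes_aa((118, 118, 118), (255, 255, 255))` checks mid-grey body text on a white background. A passing ratio does not replace the full accessibility check, which covers many criteria besides contrast.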

Quick screen reader check

If you’re using Windows

NonVisual Desktop Access (NVDA) is a free open-source screen reader for Windows. It can be installed on the desktop or run from a portable USB drive.

With NVDA running, open the PDF and use the following commands to check the PDF:

  • From the top of the PDF (with the num lock off), use Numpad 0 + Numpad 2 to read the PDF from top to bottom and check the reading order.
  • Use the tab key to move through the PDF and check the tab order.
  • Use the h key to move through the PDF and check the heading structure.
  • Use the g key to move through the PDF and check for text descriptions.

These commands will also work with the JAWS screen reader from Freedom Scientific.

If you’re using a Mac

All Apple Macs have VoiceOver built in. Turn VoiceOver on (or off again) using Command + F5. With VoiceOver running, open the PDF and use the following commands to check the PDF:

  • From the top of the PDF, use a two-finger swipe down, or ‘Control + Option + a’, to read the PDF from top to bottom and check the reading order.
  • Use the tab key (repeatedly) to move through the PDF and check the tab order.

VoiceOver does not provide shortcut keys for navigating by headings or graphics.

Useful links