The video below shows a screen reader in use on the GOV.UK homepage.
Screen readers are applications that turn on-screen content into speech or show it on a Braille display. Most screen readers are made up of two things: the screen reader software itself, and a Text To Speech (TTS) engine, which converts the text from the screen reader into speech.
Through our Accessibility community, we often get asked about how to create content that works well with screen readers. The broad answer is to write content that is as clear and simple as possible - as you would in any case.
Here’s a closer look at how screen readers respond to content, and what it means for the people creating that content.
Punctuation
One thing to consider is the way screen readers handle punctuation. This can vary depending on the screen reader. Some announce important punctuation marks by default, like the @ sign in an email address, but do not announce common punctuation marks like full-stops or question marks. Instead they speak the text much like a human would: they pause briefly at the end of a sentence where there is a full-stop, or increase the pitch of the voice where the sentence ends in a question mark.
Some screen readers speak all punctuation by default, and all of them can be adjusted by the user to choose how much punctuation is announced.
Language
Then there is the content itself. The English language is not a simple thing. We have words that are spelled the same, but that sound different: we tie a "bow" or take a "bow". We have words that sound different depending on the context they're used in: last week we "read" something, and now we want to "read" it again.
Some screen readers are good at choosing the right pronunciation based on the surrounding context, but others are not.
Acronyms
When it comes to acronyms and abbreviations there are more differences. Sometimes we speak acronyms as whole words, and at other times one letter at a time. For example, the acronym for Value Added Tax can be said as "V A T", or as "vat" (like a container of liquid).
A screen reader might speak an acronym like a word when a person would not: for example, the acronym for Disability Living Allowance is pronounced "D L A" by people, but some screen readers will say "dla" (like "dlah") instead.
The abbreviation "Gov." is an interesting case. In British English it's an abbreviation for "Government", and screen readers using a TTS engine with a British voice will speak it as it's written - "Gov.". However, a screen reader using a TTS with an American voice might speak it as "Governor", because in the US "Gov." is also a short form of that word.
Capital letters also change the way screen readers pronounce things. A screen reader will say "Gov.UK" as a person would ("Gov dot UK"), unless it is reading it as part of a web address, where it appears in lowercase. Then it will say something like "gov dot uck" instead.
What’s the answer?
There are lots of different screen readers in use - as our recent assistive technology survey found. There are also many different ways a screen reader can be configured, and there are many ways the English language makes things complicated. So it probably isn't possible to write content that will work flawlessly for everyone who uses a screen reader.
It is tempting to try to write content so it sounds right with screen readers: to put spaces between the letters of an acronym ("V A T"), or full-stops between them ("V.A.T"), or even to use hidden text so something is written phonetically ("Vee Ay Tee").
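As a rough sketch, the hidden-text technique usually means pairing a hidden phonetic version with the visible text, something like this (the visually-hidden class name here is illustrative, and would need CSS that hides the element visually while leaving it available to screen readers):

<!-- Visible text, hidden from screen readers -->
<span aria-hidden="true">VAT</span>
<!-- Phonetic version, hidden visually but announced by screen readers -->
<span class="visually-hidden">Vee Ay Tee</span>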
The trouble is that spaces between letters can make things difficult for sighted people to read, especially if they have low literacy or find reading difficult because of a condition like dyslexia. Putting a full-stop between each letter prevents acronyms that can be spoken as whole words from being announced properly by screen readers.
Using hidden text to spell something phonetically might work for people using a screen reader with a TTS engine, but people using a refreshable Braille display will read the text exactly as it's written. Anyone who uses personalised style sheets will see the same thing, and that means they will see content that is spelled incorrectly.
User experience
Received wisdom is that users are accustomed to screen readers pronouncing things strangely sometimes, especially if they use more than one screen reader on a regular basis.
If the user is unsure of the way a word is spoken, they can choose to read it one character at a time, or have the screen reader spell it out for them. Some screen readers also have dictionaries where it's possible to change the way a word is pronounced, or add pronunciations for new or unusual words.
So the best answer seems to be: don't write content that works specifically for screen readers, write content that works well for everyone. Use correct punctuation, spelling and grammar, use standard conventions for acronyms and abbreviations, and use words that are appropriate for your audience.
Help us find out more
There is a lot of guesswork involved in this recommendation though. Little or no rigorous research has been done into the comparative behaviours of screen readers, TTS engines and refreshable Braille displays, or user preferences for these things.
So in the best tradition of starting somewhere, we (the GDS accessibility team) would like to know what you think. If you use a screen reader (with or without a refreshable Braille display), if you don't use a screen reader but have difficulty reading content sometimes, or if you know of any research into these things, please let us know by leaving a comment on this post or by getting in touch through the Cross-Government Accessibility Google Group.
7 comments
Comment by John Brandt posted on
Great observations. I thought I was the only one who was concerned about this. Wish there were better answers.
Comment by Guy Hickling posted on
I also have been pondering over this for a while. I like the idea described in your post of spelling it phonetically in brackets after the abbreviation, using a hidden or ARIA label. That would help screen reader users.
And since Braille readers will already have spelt out the letters of the abbreviation itself, any users seeing the phonetic equivalent appear in brackets immediately afterwards will hopefully recognise what is going on straight away - especially once the method has become accepted practice.
But since abbreviations are a major source of difficulty for screen reader users, maybe we could all prevail on the W3C to recommend a standard method of dealing with them? Perhaps by simply recommending that screen readers and other speech devices should automatically supply the phonetic pronunciation for anything inside the <abbr> element (whether or not it has a title attribute, as the developer pleases). (Including using either "Zed" or "Zee" for Z according to the lang attribute in force at the time.)
Developers could then always use <abbr> on all abbreviations to achieve the desired result for both screen and Braille reader users.
And when the abbreviation is normally pronounced as a word then the developer can simply omit the <abbr> (or W3C could add a new attribute to mean "don't supply the phonetics" - having it this way round would mean that it defaults to leaving screen readers to act as they do now when a developer does not bother to comply with the system and omits the abbr element).
Comment by Carmelo posted on
Couldn't a <span type="abbr">....</span> work the same?
This way the screen reader manufacturers could start immediately, i.e. without waiting for the W3C to propose and approve changes to the HTML tags?
Comment by Léonie Watson posted on
@Guy
I'm not sure we need to go to those lengths. At least, not without some research to find out whether screen reader users need to know the phonetic pronunciation of something.
If we use the abbr element then the title attribute provides the expanded version of the acronym. All screen readers are able to access that information, though not all do by default.
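As a minimal sketch of that markup:

<abbr title="Value Added Tax">VAT</abbr>

A screen reader set to announce abbreviations can then offer "Value Added Tax" in place of "VAT".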
@Carmelo
There is no such thing as type="abbr". For it to be recognised, the W3C would need to add it to the HTML specification, and browsers would need to support it before screen readers could.
Comment by Kevin Key posted on
Your tips are helpful! We need a final proofreader to use before publishing the document. A common mistake would be double periods that were mistakenly typed. The proofreader would stop at each perceived grammatical error, etc.
Comment by Emma Richens posted on
It definitely would be great to get some more user research around this, especially to learn what screen reader users are familiar enough with not to struggle over, and where the hiccups are that deserve better solutions.
I recently did a little research into pronouncing numbers, as colleagues needed something to be read as a time duration rather than as digits. My advice to them was to use ARIA labels, as unfortunately there is no programmatic mechanism to tell a screen reader the context for which pronunciation to use. That would be something the W3C and screen reader developers could collaborate on.
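As a rough sketch of that ARIA label workaround (the duration here is made up for illustration):

<!-- Visible digits, with a spoken-friendly label for screen readers -->
<span aria-label="3 minutes 20 seconds">03:20</span>

Support for aria-label on plain inline elements like this varies between browsers and screen readers, though, so it needs testing.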
Comment by Glen posted on
Advocating the use of W3C HTML for speech is a mistake. W3C maintains the SSML specification for speech.
Since at least 2000 you have had the option of writing web pages in a speech-specific markup language.
One of the problems faced by speech synthesis application developers is that web content writers learn only HTML!
Sun Microsystems JSML (possibly the original)
http://www.w3.org/TR/2000/NOTE-jsml-20000605/
W3C SSML
http://www.w3.org/TR/speech-synthesis11/