It’s happened to everyone.
You read a sentence over and over again. You break it down into little pieces. You try to put it back together again. And still it makes no sense.
Much more frequently, you’ll come across text whose meaning you can have a guess at, if you squint a bit and scratch your head. It’s annoying, but also potentially serious – how important is it for a parent to fully understand the installation instructions for a child car-seat, for example?
As more and more of what we read is written by people who are not professional writers, we are faced with more and more poor writing. Often, text has been put through an automatic dictionary, a grammar checker or a computerised translation process, but in the wrong hands these can make things worse rather than better.
Neil Newbold wants to change all that. A PhD student in computational linguistics in the Department of Computing at the University of Surrey, Neil has created a simple tool called Readability Report to help people improve their writing.
Readability Report is a little download that works with OpenOffice (the free open-source software that is broadly compatible with Microsoft Office’s word-processing and spreadsheet programs). You simply open your document, click on the Readability Report icon, and the program creates a report on your text’s readability.
But let’s go back to basics – what does readability actually mean?
“Readability is how easy it is to understand text,” says Neil. “It’s not to be confused with legibility, or things like presentation and colours; it’s about the text itself.”
“It’s complicated because a lot of it depends on you, the reader – so if you’re an expert in a subject you’re going to find a text a lot easier, while someone not based in that field is going to find it more difficult.
“That’s something we’ve been looking at, and also how motivated the reader is – how interested they are in the subject. Research has shown that if you’re interested in the subject matter you’re going to find the text easier than someone who isn’t.”
Word-processing programs have long offered a readability ‘score’ feature. Should we make more use of it?
“The readability measures used in Word, such as the Flesch and Kincaid scores, are based on word length, and that’s a theory from the Sixties,” reveals Neil. “Word length is a good indicator of readability but isn’t always reliable. Words of three syllables or more are considered ‘hard words’ by the old formulas, but they may actually be words that are used frequently and so are easily understood.”
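For readers curious about what sits behind those scores, here is a minimal Python sketch of the two published formulas, Flesch Reading Ease and the Flesch-Kincaid Grade Level. The formulas themselves are standard; the syllable counter is a rough vowel-group heuristic assumed here for illustration, so results will not exactly match Word or any other tool.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a linguistic standard):
    count runs of vowels, after dropping a trailing silent 'e'."""
    word = word.lower()
    # Drop a silent final "e" ("score" -> "scor") but keep "-le" endings ("simple").
    if len(word) > 2 and word.endswith("e") and not word.endswith("le"):
        word = word[:-1]
    return max(1, len(re.findall(r"[aeiouy]+", word)))


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentences, words, syllables) using simple regex splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        raise ValueError("text contains no words")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """Higher scores mean easier text."""
    sentences, words, syllables = text_stats(text)
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(text: str) -> float:
    """Approximate US school grade needed to follow the text."""
    sentences, words, syllables = text_stats(text)
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    sample = "The cat sat on the mat."
    print(round(flesch_reading_ease(sample), 3))
    print(round(flesch_kincaid_grade(sample), 2))
```

Both formulas look only at sentence length and syllables per word, which is why a long but familiar word is penalised as heavily as a long and obscure one.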
Word-frequency analysis is a more sophisticated tool, but it wasn’t feasible back in the Sixties, and advances in the field of computational linguistics came to something of a halt around that time. Now that word-frequency analysis can be carried out much more easily, we’re starting to see progress again.
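To see how a length rule and a frequency rule can disagree, consider the small Python sketch below. It is not Readability Report’s own method: the word list, the syllable counts, the per-million frequencies and the threshold of one occurrence per million are all invented for illustration, where a real tool would draw on a large reference corpus.

```python
# (word, syllable count, occurrences per million words)
# Every figure below is hypothetical and chosen only to illustrate the idea.
SAMPLE_WORDS = [
    ("cat", 1, 90.0),
    ("family", 3, 400.0),
    ("important", 3, 350.0),
    ("yesterday", 3, 120.0),
    ("ague", 2, 0.2),
    ("nadir", 2, 0.5),
    ("sesquipedalian", 6, 0.05),
]


def hard_by_length(syllables: int, min_syllables: int = 3) -> bool:
    """Length rule: three or more syllables counts as a 'hard word'."""
    return syllables >= min_syllables


def hard_by_frequency(per_million: float, threshold: float = 1.0) -> bool:
    """Frequency rule: a word is hard if it is rarely seen (assumed threshold)."""
    return per_million < threshold


def disagreements(words):
    """Return the words on which the two rules give different answers."""
    return [
        word
        for word, syllables, per_million in words
        if hard_by_length(syllables) != hard_by_frequency(per_million)
    ]


if __name__ == "__main__":
    print(disagreements(SAMPLE_WORDS))
```

With these made-up figures, the length rule flags everyday words such as “family” and “important” while letting short, rare words such as “ague” pass; the frequency rule does the opposite.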
There’s still a problem, though. Even people who bother to use the readability tool in their word processor may not have a clue what the result means, or what they should do with it. Neil has the solution.
“Basically, the current readability tools give you a number, and people say ‘So what?’. My Readability Report provides feedback that’s more useful than just a number,” he says proudly.
The main Readability Report will grade your document as Simple, Easy, Good, Challenging or Difficult, and uses SmartTags to highlight difficult words and phrases in your text (as identified by Plain English Campaign). The ‘SimpleText SmartTags’ provide suitable alternatives for these phrases, which can be inserted automatically into the text.
The Brain Overload Report measures the information density in the text. An expert in a particular subject will often use specific terms and jargon, resulting in too much information being presented to the reader within a short space of time. This report analyses how many concepts and ideas your text refers to, and rates your document as General, Introductory, Scholarly, Technical or Specialised.
The Cohesion Report uses techniques for automatic summarisation to measure how easy your document is to follow. It highlights the sentence that should be the most representative of your document and shows the words that are the strongest themes. This measure is for documents about a specific subject; fictional writing will often score low for cohesion. A document will be graded as Creative, Digressing, Consistent, Coherent or Fluent.
And what about spelling? Counter-intuitively, the program does not consider common spelling mistakes a great hindrance to readability.
“Common mis-spellings are considered relatively easy words by Readability Report,” explains Neil. “In other words, if you spell a word incorrectly and in an unusual way it will be considered more difficult than the same word spelled correctly, or spelled mistakenly but in a common way.
“This reflects how we handle spelling mistakes while reading – we are more likely to understand common mis-spellings than unusual ones.”
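One way to picture this behaviour is to score each spelling purely by how often it has been seen. The Python sketch below is an assumption about how such a score might work, not a description of Readability Report’s internals: the counts, the corpus size and the add-one smoothing are all made up for the example.

```python
import math

# Hypothetical counts of each spelling in a one-million-word corpus.
COUNTS = {
    "definitely": 5000,  # correct spelling
    "definately": 800,   # common mis-spelling
    "defanitly": 3,      # unusual mis-spelling
}


def difficulty(word: str, counts: dict[str, int], corpus_size: int = 1_000_000) -> float:
    """Score a spelling by rarity: the negative log10 of its relative frequency.

    Add-one smoothing gives spellings never seen before a finite, maximal score.
    """
    seen = counts.get(word.lower(), 0)
    return -math.log10((seen + 1) / corpus_size)


if __name__ == "__main__":
    for spelling in ("definitely", "definately", "defanitly", "dephinutlee"):
        print(spelling, round(difficulty(spelling, COUNTS), 2))
```

Under these assumptions the common mis-spelling scores only a little harder than the correct word, while the unusual one scores much harder, and a spelling that has never been seen scores hardest of all.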
The program has already proven popular, with nearly 8,000 downloads since it was launched in the summer of 2009. What’s next?
“One thing I’ve been looking at is information retrieval for search engines,” says Neil. “At the moment, when you type in a query you get back a list of the most common results for that query.
“We want to start applying readability to that process, so when you type in a search query it returns texts that are suitable for your reading age, or ability, or your background.
“By analysing what texts a user has looked at, we can build up an idea of what their interests are, what level of expertise they have, and we can use that history to work out what results they’d most like to be given.”
The ramifications of results-tailoring are potentially enormous. Search-engine optimisation is big business, and a company’s success or failure may depend on how highly its website is ranked by Google. Should search engines become capable of reliably bringing you personalised results, that whole game would change.
So, how readable is the text you have just read?
According to Word, its Flesch Reading Ease is 58.3, whatever that means. According to Readability Report, it is ‘good’ for readability, ‘general’ for brain overload and ‘creative’ for cohesion.
And that’s okay. We can work with that.
For more on Readability Report, please see this in-depth article by Neil Newbold.