Word and Phrase Repetition — the Proximity Question

I can write an algorithm to do pretty much anything with a simple list of words–which, at its most basic level, is what a short story, novel, screenplay, blog post, article or essay is.

Count the number of times a word appears–no problem. Count the number of times each phrase appears–a little trickier, but still no problem. And I can see the use of these checks to writers. Are you relying a little too much on a particular word or phrase? Do you have a habit of beginning one sentence too many with a particular word or phrase? Has an unusual or rarely used word turned up four times in your 80,000-word novel?
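To make those three checks concrete, here is a minimal sketch in Python. It is not SmartEdit's code: the function names, the word pattern (unaccented letters plus apostrophes only), the naive sentence splitting on full stops, question marks and exclamation marks, and the sample text are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of ASCII letters, optionally with
# internal apostrophes (don't, writer's). Real prose needs more care.
WORD_RE = re.compile(r"[a-z]+(?:['’][a-z]+)*")


def tokenize(text):
    """Lower-case the text and return its words in reading order."""
    return WORD_RE.findall(text.lower())


def word_counts(text):
    """Count how many times each word appears."""
    return Counter(tokenize(text))


def phrase_counts(text, length=3):
    """Count every run of `length` consecutive words.

    Punctuation is ignored, so a phrase may straddle two sentences.
    """
    words = tokenize(text)
    return Counter(
        " ".join(words[i:i + length])
        for i in range(len(words) - length + 1)
    )


def sentence_openers(text):
    """Count the first word of each sentence (naive split on . ! ?)."""
    openers = Counter()
    for sentence in re.split(r"[.!?]+", text):
        words = tokenize(sentence)
        if words:
            openers[words[0]] += 1
    return openers


sample = (
    "On the other hand, he opined. She left. "
    "On the other hand, she stayed. He opined again."
)

print(word_counts(sample)["opined"])                      # 2
print(phrase_counts(sample, 4)["on the other hand"])      # 2
print(sentence_openers(sample)["on"])                     # 2
```

The phrase counter is the "little trickier" part only because it has to slide a window across the word list rather than look at one word at a time. Everything else is plain counting.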

This is what computers do. It’s what they’re for. Spotting patterns, counting things, making it possible to draw conclusions by presenting facts that are difficult for the human eye to spot.

But these checks are only worth having if they help the writer. Adding complexity just because you can is not a good idea.

I’ve been asked a few times by SmartEdit users to improve the word and phrase frequency counters to show the proximity of repetitions. Other software does this, so why doesn’t SmartEdit?

It’s been suggested that using the phrase “On the other hand” 250 times in a novel might not, in itself, be a problem, not if those 250 occurrences are spread out evenly over the entire work. Not if one use of the phrase doesn’t pop up within a page of another.

I disagree. Using that particular phrase half a dozen times in a full-length novel, let alone 250 times, is a problem regardless of how close one occurrence is to another.

The proximity checker?

Unless there’s real value for the writer, there’s no gain in adding such complexity. Data for the sake of data, complexity without purpose. Rather than add value, it detracts–from the software and from the results. It makes the writer’s job of editing more difficult, not less; makes them question deliberate use of phrase and word repetition.

The power of this check is not in catching excessive word or phrase usage in a paragraph or on a single printed page; it’s in catching the minor repetitions of words and phrases that jar the reader when encountered more than once.

I remember reading a novel where the writer used the word “opined” once in every chapter. Fair enough, they were long chapters, and the writer may have felt comfortable with the distance between one “opined” and the next. But for me, the reader, every encounter with that dreadful dialog tag was a slap in the face. What was he thinking? Had I not been afflicted with enough opines to last a dozen lifetimes? Can his characters not simply say something?

Why write about this today?

I’m reading Jeanette Winterson’s children’s novel Tanglewreck, and have just reached a passage describing a character who I’m guessing will prove to be one of the bad guys. This wonderful description of Abel Darkwater would be flagged in bright red flashing lights by any sort of word repetition proximity checker.

Abel Darkwater was a round man.

He had a round face, and a round body, and round rings on his round fingers. The gold loops of his pocket-watch chain were round, and when he drew out his watch, which he did as the taxi pulled up to the door, his watch was round and fat and gold.

But the repetition is deliberate–obviously. As is the phrase repetition in Dr. King’s “I Have a Dream” speech.

Catching words or phrases repeated in close proximity to each other is not the purpose of the various repetition checks in SmartEdit. Their purpose is to catch those dreadful opines, scattered like acne over the length of a novel, forever popping at the wrong times and spewing all sorts of unpleasantness upon the reader.

So, for now, the word and phrase repetition counters in SmartEdit will remain as they are. Because even though the algorithm could be written without much difficulty, I’m not convinced it should be.
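For the curious, the kind of proximity check being asked for could be sketched roughly as follows. This is an illustration under stated assumptions, not a SmartEdit feature: the function names are hypothetical, distance is measured in words, and the default window of 300 words is only a rough stand-in for "within a page".

```python
import re

# Assumption: same simple notion of a word as any basic counter would use.
WORD_RE = re.compile(r"[a-z]+(?:['’][a-z]+)*")


def tokenize(text):
    """Lower-case the text and return its words in reading order."""
    return WORD_RE.findall(text.lower())


def phrase_positions(text, phrase):
    """Return the word index at which each occurrence of `phrase` starts."""
    words = tokenize(text)
    target = tokenize(phrase)
    n = len(target)
    return [i for i in range(len(words) - n + 1) if words[i:i + n] == target]


def close_repetitions(text, phrase, window=300):
    """Return (earlier, later) index pairs for consecutive occurrences
    that start fewer than `window` words apart."""
    positions = phrase_positions(text, phrase)
    return [(a, b) for a, b in zip(positions, positions[1:]) if b - a < window]


# The Tanglewreck passage quoted above.
passage = (
    "Abel Darkwater was a round man. "
    "He had a round face, and a round body, and round rings on his round "
    "fingers. The gold loops of his pocket-watch chain were round, and when "
    "he drew out his watch, which he did as the taxi pulled up to the door, "
    "his watch was round and fat and gold."
)

print(len(phrase_positions(passage, "round")))    # 7
print(len(close_repetitions(passage, "round")))   # 6
```

Run against the Abel Darkwater passage, the sketch finds seven uses of "round" and flags all six consecutive pairs as too close together. The code has no way of knowing that the repetition is the point.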

07 Aug 2013