Extracting Character Names — How impossible it proved to be

The initial plan for version 2 of SmartEdit included extracting a list of character names and displaying them alongside all the dialog used by that character through an entire novel. This would have allowed the writer to easily check each individual character’s dialog as part of the editing process.

Sounds great, right?

Had it been possible to implement, such a feature would have been of huge benefit to writers. They could monitor a character’s dialog over the course of an entire work, keeping a close watch on words, phrases and speech that didn’t fit the character, and catching inconsistencies that are easy to miss when the dialog is read as part of a much larger work.

It wasn’t possible, and this surprised me. My early thoughts were that it would turn out to be a straightforward task. When we read a well-written novel, there is rarely any doubt as to who is speaking. Dialog tags accompany most dialog (Sarah said, Mr. Smith replied, etc.), so how hard could it be to use those tags to build a dialog map for each character?

The problem I found was that in most novels, character names are rarely tied directly to dialog. Now and then, sure, but across the length of a novel a name may accompany a character’s dialog only 5% of the time. The remaining 95% of that character’s dialog is attributed using “he said” or “she answered,” descriptive identifiers such as “her friend replied,” or no tags at all.
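To make the difficulty concrete, here is a minimal Python sketch of the naive tag-matching idea. It is not SmartEdit's code: the function name build_dialog_map, the short verb and pronoun lists, and the sample paragraphs are all invented for illustration, and it assumes straight double quotes and one line of dialog per paragraph.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny word lists; a real tool would need far more.
PRONOUNS = {"he", "she", "they"}

# Quoted speech at the start of a paragraph, then a speaker made of one word
# (optionally preceded by a title such as "Mr."), then a speech verb.
TAG = re.compile(
    r'^"(?P<speech>[^"]+)"\s+'
    r'(?P<speaker>(?:[A-Z][a-z]*\.\s)?\w+)\s+'
    r'(?:said|replied|answered|asked)\b'
)


def build_dialog_map(paragraphs):
    """Attribute each paragraph to a named speaker where the tag allows it.

    Returns (dialog_by_name, unresolved_paragraphs).
    """
    by_name = defaultdict(list)
    unresolved = []
    for paragraph in paragraphs:
        match = TAG.match(paragraph)
        if match is None:
            # No tag at all, or a descriptive tag such as "her friend replied".
            unresolved.append(paragraph)
            continue
        speaker = match.group("speaker")
        if speaker.lower() in PRONOUNS:
            # "he said" tells us nothing without tracking the conversation.
            unresolved.append(paragraph)
        else:
            by_name[speaker].append(match.group("speech").rstrip(","))
    return dict(by_name), unresolved


# Invented sample conversation, one paragraph per line of dialog.
sample = [
    '"Did you lock the door?" Sarah said.',
    '"I thought you did," he answered.',
    '"I never touch that lock."',
    '"Then it has been open all night," her friend replied.',
    '"We should go back," Mr. Smith replied.',
]

names, unresolved = build_dialog_map(sample)
print(names)
print(len(unresolved), "of", len(sample), "paragraphs have no usable name")
```

In this sample only the two paragraphs with an explicit name are attributed. The pronoun tag, the descriptive tag and the untagged line all fall through, and resolving them would mean following the conversation the way a human reader does.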

Once a well-written conversation gets going, it’s not uncommon to read full pages of dialog where no actual character names are used. A sufficiently intelligent computer may eventually be able to work through this and put such a map together, but for version 2 of SmartEdit it was not possible.

10 May 2013