Summary
The popularity of audio books is increasing. In the USA fewer people are reading books but 
many more are listening to them on tapes, CDs and in MP3 format. The phenomenon is 
redefining the notion of reading. The purpose of the paper is to present some pros and cons of 
listening to books instead of reading them. The conclusions have been reached on the basis of a 
linguistic analysis of parts of two audio books belonging to two different literary genres: a crime 
novel (Dan Brown, The Da Vinci Code) and a comic one (Helen Fielding, Bridget Jones: The Edge 
of Reason). 
Key words: literature, narratology, linguistics, prosody
 Povzetek
Priljubljenost zvočnic narašča. V ZDA vedno manj ljudi knjige bere, mnogo več jih knjige 
posluša na trakovih, zgoščenkah in MP3 predvajalnikih. Ta pojav poskuša redefinirati branje. 
Namen tega članka je predstaviti nekaj dobrih in slabih strani poslušanja knjig. Zaključki so 
nastali na osnovi jezikovne analize nekaj odlomkov iz dveh zvočnic, ki sodita v dva zelo različna 
književna žanra: kriminalni roman (Dan Brown, Da Vincijeva šifra) in komični roman (Helen 
Fielding, Bridget Jones: Na robu pameti). 
Ključne besede: književnost, narativnost, jezikoslovje, prozodija
 DOI: 10.4312/elope.3.1-2.85-98
     
 An audio book is a recording of the contents of a book read by a professional reader, henceforth 
referred to as the narrator. Some twenty years ago, the first audio books appeared as cassette tapes, 
and nowadays, with the development of technology, audio books are distributed as CDs or in 
digital formats. 
Audio books can be abridged or unabridged and read either by the authors themselves or by 
professional readers, often actors. Sometimes a book is read by more than one person and 
accompanied by music and sound effects. 
Initially, the reason for recording books on tape was to provide people with poor sight with books 
which they could otherwise never enjoy. Over time, other people have come to recognize the 
benefits that listening to books may have. 
The phenomenon is particularly popular in the USA. According to the National Endowment 
for the Arts, fewer Americans are reading books than a decade ago, but almost a third more 
are listening to them. An article published in The New York Times (Harmon 2005) in May 2005 
presents the reasons why people have turned to audio books, as well as their attitudes to reading 
and listening to books.  
One of the most frequently expressed reasons why people prefer audio books is that they can 
listen to them almost everywhere: when driving a car, eating lunch, sitting in doctors’ waiting 
rooms, walking a dog or in bed with no lights on to disturb the sleeping partner. 
The growing popularity of audio books has caused several debates among critics, writers 
and readers. The purists believe that listening to books is inferior to reading and look down 
upon audio book fans. Writers of books prefer the audience to read their books, but believe that 
listening to them is better than nothing. 
There are two types of audio book consumers: the ones who have never liked reading, and those 
who are simply too busy to spend time sitting and reading. They both claim that listening to 
books has several advantages over reading: they can jump among chapters, they have to listen to 
all the text, whereas when reading they tend to skip paragraphs, and, for some, the narrators 
untangle difficult grammatical structures and complex sentences. Their liking or disliking of a 
book often depends on the narrator’s ability to get the most out of the text. 
Due to their popularity, audio books deserve some literary and linguistic analysis. 
In this paper, the subject of the linguistic analysis is two audio books, each read by a single narrator. These 
are: a crime novel by Dan Brown, The Da Vinci Code, and a comic one by Helen Fielding, Bridget 
Jones: The Edge of Reason. The former is read by the actor Jeff Harding, the latter by the author 
herself. The purpose of the analysis was to answer the following three questions: Is reading a 
book really the same as listening to it? What is the function of the narrator in an audio book? 
Can the narrator influence the popularity of a book?
 In order to answer the question whether reading a book is the same as listening to it, 
we have to acknowledge the fact that the two activities make use of different cognitive processes 
which are the consequence of the basic difference between the spoken and written modes: speech 
is a linear, ongoing process, whereas writing is a complete product. Thus in listening, the text 
is perceived dynamically, while in reading the text is presented synoptically. Our understanding 
of a spoken text largely depends on the intonation clues, such as the rhythm, the highlighting 
of important pieces of information, the pitch movement and the pitch range, volume, tempo 
and voice quality. In reading, the only clues that we have are punctuation and division into 
paragraphs and chapters. Those do not always overlap with prosodic clues, as will be shown 
below. It can be said that the visual analogue to listening is a film, and to reading, a painting.
Knowing this, what then are the advantages and disadvantages of listening compared with reading? Apart from 
the fact that listening to books can be done while doing something else, there is no single and 
straightforward answer to this question. Some people prefer reading simply because their visual 
perception is better than their auditory perception; others may be in favour of listening because they are slow readers 
or because their ability to extract the message is better when they listen to a text read aloud. 
There is, however, one very important difference between reading a book and listening to it being 
read by somebody else. Reading a book is a solitary experience during which an invisible and very 
intimate bond is established between the reader and the author. The reader is allowed to form 
his own conclusions, opinions and interpretations as he goes on reading. In his mind, he creates 
his own images of the characters and hears their voices. Listening to a book read aloud is a sort 
of trilateral relation where the third party is the narrator acting as a go-between for the author and 
the listener. Audio book listeners are deprived of the beautiful experience of being immersed in 
the story. Instead, the plot and the characters are interpreted for them and offered ready made. 
In abridged versions of novels whole paragraphs are usually cut and left out. These paragraphs 
often provide important background information for the plot and the character development, as 
is the case with The Da Vinci Code and Bridget Jones: The Edge of Reason. 
 The person reading an audio book has a very serious and responsible job. The success of the 
audio experience is judged by what we hear. It is the narrator’s job to make the story memorable, 
enjoyable and sometimes even understandable. In other words, a good narrator can contribute 
considerably to the success of a book, whereas a bad one can ruin it. 
The task of the narrator is to transmit the story as accurately as possible but with necessary 
additions in the form of the voice quality, rhythm and pitch movement. In addition, he has to be 
able to interpret the text and by means of prosody make necessary alterations as to the coherence 
and cohesion of the text. This is particularly important in abridged audio books where whole 
paragraphs are cut out. 
 A written text is not merely a string of isolated sentences. In order for a sequence of sentences 
to be considered a text, certain criteria have to be met. Among them the most important ones 
are cohesion and coherence. In this way meaningful units are created which often deal with 
topics, hence they can be called topical units. In written texts, the boundaries between topical 
units are visually marked by paragraphs, whereas the unity within a paragraph is made visible 
by punctuation. Similarly, if the written text contains direct speech of one or more people, their 
turns are also visually marked by means of punctuation marks. 
A hearer of a spoken text has no access to the visual clues of coherence and segmentation into 
paragraphs. Instead, he has to rely on prosodic clues to identify coherence relations, major breaks, 
changes of topics and subtopics, as well as the changes of speakers, their moods, emotions and 
attitudes.
Discourse analysis, and intonation theory in particular, distinguishes among several degrees 
of preparedness in speech. Wichmann (2000) describes speech as a continuum from scripted (i.e. 
read) to spontaneous, and claims that every utterance is made with some degree of preparedness. 
Even in the most spontaneous spoken interactions, speakers take time to plan ahead and prepare 
for the next utterance. There are spoken interactions where speech seems spontaneous, but is in 
fact very prepared (e.g. interviews, public lectures, etc.). Reading a written text aloud is mostly 
prepared. Variation may be found in how much it has been rehearsed. In the case of audio books, 
we can assume that the narrators spend quite some time preparing and rehearsing their reading, 
not to mention correcting the recording if they are not satisfied with it. 
Brazil et al. (1980) claim that the reader of a text has two options: he can either interpret and 
perform the text as if he himself were speaking to the listener, or he can step outside the text 
and simply stand as the medium. This depends on the type of text a person is reading aloud. If 
the purpose of the reading is to entertain, then some artistic performance is necessary. If, on the 
other hand, the text read aloud is of a more informative nature, then the role of the reader is to 
pass on someone else’s message merely by converting the written text into speech.
The two books which have been chosen for the linguistic analysis are novels belonging to two 
different literary genres: a crime story (The Da Vinci Code) and a comedy (Bridget Jones: The Edge 
of Reason). They were abridged and recorded as audio books. We can assume that both narrators 
were very familiar with the texts and had prepared and rehearsed the reading before they went 
into the studio to record the audio books. They knew that the purpose of their job was to 
provide an entertaining and highly aesthetic reading of the novels. Thus we can expect their use 
of prosodic features mirroring the text relations to be appropriate.
The sub-sections to follow will discuss the prosodic features that the narrators in the two audio 
books have used to express:
 - division into paragraphs,
 - cohesion and coherence within and between paragraphs,
 - character portrayal.
 A paragraph is a coherent textual unit which usually consists of more than one sentence and deals 
with one topic. Authors divide their texts into paragraphs according to grammatical, 
textual and topical considerations. Grammatically speaking, a paragraph has to coincide with 
syntactically complete sentences. From the textual point of view, a paragraph has to be cohesive 
and coherent. The topical consideration is probably the most flexible and is left to the author’s narrative 
ability. However, readers expect a paragraph to be a topically complete unit. 
In writing, we visually recognise a paragraph as a string of sentences separated from another 
string of sentences by an empty line or an indented beginning. In speech, paragraphs are marked 
by intonation. 
The notion of paragraph intonation is not new. Yule (1980) proposed the existence of a paratone 
which covers a topic or a sub-topic in speech, roughly equivalent to a paragraph in writing.¹ The 
first extensive discussion, by Lehiste (1975), showed that the most common prosodic feature of 
a new paratone is an extra high pitch. Later studies by Brazil et al. (1980), Brown et al. (1980) 
and Yule (1980) all came to the same conclusion. “A new start is marked phonetically … by the 
speaker speaking high in his pitch range and speaking loudly” (Brown et al. 1980, 26). The other 
important prosodic feature involved in the auditory perception of a paratone is the final low pitch 
contour, which is often accompanied by explicit phonetic criteria, such as “pause, lengthening of 
a preceding syllable or a break in the rhythm” (Williams 1996, 51). 
Initial extra-high pitch and final low pitch accompanied by a pause are thus prosodic phenomena 
which occur at the boundary between two paratones. The third prosodic feature typical of 
paratone intonation is a gradual declination of pitch. Wichmann calls this gradual lowering of 
pitch over a topic unit “supradeclination” (2000, 107) in order to distinguish it from the notion 
of pitch declination across a single utterance. In other words, there is a distinction between 
utterance and paratone declination. 
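The gradual lowering of pitch over a topic unit can be illustrated with a small Python sketch. It fits a straight line to the successive pitch peaks of one paratone and reports the slope; a negative slope would indicate declination. The function name, the sample values in Hz, and the choice of a least-squares slope as a summary measure are assumptions made for illustration only; they are not taken from Wichmann (2000) or from the measurements behind this paper.

```python
from typing import Sequence


def declination_slope(peaks_hz: Sequence[float]) -> float:
    """Least-squares slope (Hz per peak) through successive pitch peaks.

    A negative value means the peaks fall over the unit, which is what
    declination predicts. Hypothetical helper, for illustration only.
    """
    n = len(peaks_hz)
    if n < 2:
        raise ValueError("need at least two pitch peaks")
    mean_x = (n - 1) / 2                      # mean of positions 0..n-1
    mean_y = sum(peaks_hz) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(peaks_hz))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


# Invented peak values for one paratone: high onset, gradual fall.
paratone_peaks = [260.0, 230.0, 210.0, 180.0]
slope = declination_slope(paratone_peaks)
```

The same function could be applied to the peaks of a single utterance and to the utterance onsets of a whole paratone, which is one way of keeping utterance declination and paratone declination apart.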
The topical structure of a novel is the author’s construct and a reader has no access to the 
author’s intentions behind the narrative structure. However, there should always be some textual 
motivation behind it. Similarly, the narrator of an audio book has his own understanding of the 
narrative structure and his own interpretation of the story. Thus we can expect that the written 
division into paragraphs may not always overlap with the spoken one. 
My analysis of paratone division in the spoken delivery of a novel was based on the comparison 
between the written division into paragraphs and the acoustic features of their spoken delivery. I 
identified a paratone boundary whenever there was a pause followed by a considerable change in pitch, or 
just a pause with no change in pitch, in the following initial accented syllables. 
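The criterion stated above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `ToneUnit` record, the 0.5 s pause threshold and the 1.2 pitch ratio are hypothetical, since the paper gives no numeric thresholds, and the comparison of a unit's onset pitch with the end of the preceding unit is one possible reading of "a considerable change in pitch".

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToneUnit:
    text: str
    pause_before_s: float  # silence preceding the unit, in seconds
    onset_pitch_hz: float  # pitch of the first accented syllable
    final_pitch_hz: float  # pitch at the end of the unit (must be > 0)


def paratone_starts(units: List[ToneUnit],
                    min_pause_s: float = 0.5,       # assumed threshold
                    min_pitch_ratio: float = 1.2    # assumed threshold
                    ) -> List[Tuple[int, str]]:
    """Return (index, cue) for every unit that opens a new paratone."""
    starts: List[Tuple[int, str]] = []
    for i, unit in enumerate(units):
        if i == 0:
            starts.append((i, "initial"))
            continue
        if unit.pause_before_s < min_pause_s:
            continue  # no pause, hence no boundary under this criterion
        ratio = unit.onset_pitch_hz / units[i - 1].final_pitch_hz
        changed = ratio >= min_pitch_ratio or ratio <= 1 / min_pitch_ratio
        starts.append((i, "pause + pitch change" if changed else "pause only"))
    return starts


# Invented measurements, for illustration only.
sample = [
    ToneUnit("unit a", 0.0, 220.0, 120.0),
    ToneUnit("unit b", 0.1, 180.0, 110.0),
    ToneUnit("unit c", 0.8, 260.0, 100.0),
    ToneUnit("unit d", 0.7, 105.0, 90.0),
]
boundaries = paratone_starts(sample)
```

Comparing the returned indices with the positions of the printed paragraph breaks would then show where a narrator splits a paragraph or merges several into one paratone.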
 The analysis of the written and spoken paragraph division in The Da Vinci Code has shown that 
the narrator follows the author’s division into paragraphs quite strictly. There are, however, two 
types of deviation from the written text. The narrator either divides one written paragraph into 
two paratones (1), or joins several paragraphs into one paratone (2).
(1) printed version: 
 Pain is good, Silas whispered, repeating the sacred mantra of Father Josemaría Escrivá 
– the Teacher of all Teachers. Although Escrivá had died in 1975, his wisdom lived on, 
his words still whispered by thousands of faithful servants around the globe as they knelt 
on the floor and performed the sacred practice known as ‘corporal mortification’. (30)
 spoken version: 
 Pain is good, Silas whispered, repeating the sacred mantra of Father Josemaría Escrivá 
– the Teacher of all Teachers. [pause]  
 ↑² Although Escrivá had died in 1975, his wisdom lived on, his words still whispered by 
thousands of faithful servants around the globe as they knelt on the floor and³ performed 
the sacred practice known as ‘corporal mortification’.
(2) printed version:
 The Louvre’s main entrance was visible now, rising boldly in the distance, encircled by 
seven triangular pools from which spouted illuminated fountains.
 La Pyramide.
 The new entrance to the Paris Louvre had become almost as famous as the museum 
itself. The controversial, neomodern glass pyramid designed by Chinese-born American 
architect I. M. Pei still evoked scorn from traditionalists who felt it destroyed the dignity 
of the Renaissance courtyard. (35)
 spoken version:
 The Louvre’s main entrance was visible now, rising boldly in the distance, encircled by 
seven triangular pools from which spouted illuminated fountains. La Pyramide. The new 
entrance to the Paris Louvre had become almost as famous as the museum itself. The 
controversial, neomodern glass pyramid designed by Chinese-born American architect 
I. M. Pei still evoked scorn from traditionalists who felt it destroyed the dignity of the 
Renaissance courtyard.
 Symbols: ↑ high key, ↓ low key, ↗ rise, ↘ fall, ↘↗ fall-rise, ↗↘ rise-fall, → level tone.
In order to understand why paratones do not overlap with the paragraphs, one has to look at the 
structure of the text. In (1) the division into two paratones can be explained by the narrator’s need 
to give special emphasis, and hence a new paratone, to the information about Father Escrivá. In 
(2) the narrator joins the three paragraphs into one paratone because they all concern the same 
topic, i.e. the pyramid in front of the Louvre. The justification for one paratone instead of three 
paragraphs is even stronger because the spoken version is abridged.
 The analysis of the written and spoken paragraph division in Bridget Jones: The Edge of Reason 
shows a much more faithful narration of the written text in spite of the fact that the novel is 
heavily abridged for the purpose of the audio recording. The reason for that is certainly the 
fact that the narrator is the author herself. In addition, the novel is written as a diary in which 
each chapter begins with the day of the month and the entries in the diary are preceded by the 
time of the day. Each entry, unless it is very long, is typed as one paragraph. The shorter entries 
often deal with more than one topic: Bridget’s thoughts and the actual events which happen 
at the indicated time of the day. By means of intonation the narrator successfully interprets 
the text and a listener has the impression that there is more than one paragraph. The opening 
entry (3) deals with three ideas written as one paragraph but read as three paratones:
(3) printed version:
 7.15 a.m. Hurrah! The wilderness years are over. For four weeks and five days now have 
been in functional relationship with adult male thereby proving am not love pariah 
as previously feared. Feel marvellous, rather like Jemima Goldsmith or similar radiant 
newlywed opening cancer hospital in veil while everyone imagines her in bed with Imran 
Khan. Ooh. Mark Darcy just moved. Maybe he will wake up and talk to me about my 
opinions. (3)
 spoken version:
 7.15 a.m. Hurrah! The wilderness years are over. For four weeks and five days now have 
been in functional relationship with adult male thereby proving am not love pariah as 
previously feared. [pause]
 Feel marvellous, rather like Jemima Goldsmith or similar radiant newlywed opening 
cancer hospital in veil while everyone imagines her in bed with Imran Khan. [pause]
 ↑ Ooh. Mark Darcy just moved. Maybe he will wake up and talk to me about my 
opinions.
Another example of disentangling the complicated paragraph structure of the written text, where 
a reader has to be quite attentive to learn who is talking and to whom, is (4) where Bridget receives 
a telephone call from her friend Magda, a mother of two baby sons, who is simultaneously 
talking to Bridget and the elder son:
(4) written version:
 ‘Bridget, hi! I was just ringing to say in the potty! In the potty! Do it in the potty!’ (11)
 spoken version:
 ‘Bridget, hi! I was just ringing to say [pause]
 ↑ in the potty! In the potty! Do it in the potty!’
There are other cases of paragraph coalescence which are due to the abridgement and not to a 
different interpretation of the written text.
 In written texts cohesion is expressed by different types of grammatical and lexical references 
present either in the actual text or in the context of the situation. In speech cohesion is additionally 
expressed by means of different prosodic features. The pioneering and seminal work on discourse 
intonation was carried out by Brazil (1997), who established that the prosodic features of 
‘tone’, ‘key’ and ‘termination’ play an important part in expressing cohesion and coherence in 
speech. 
The ‘referring’ tones (i.e. the fall-rise, the rise) express anaphoric reference to everything 
that is shared by the interlocutors, whereas the ‘proclaiming’ tones (i.e. the fall, the rise-fall) are 
usually used to express cataphoric reference, i.e. to introduce new information. 
The analysis of narration in the two audio books has confirmed the validity of Brazil’s intonation 
model. The anaphoric and cataphoric references realized by the tones are systematically 
used by both narrators, which suggests that the system works in both British and American 
pronunciation.⁴
Example (5) is taken from The Da Vinci Code and is a good example of the interplay of a proclaiming 
tone (fall, ↘) introducing new information, followed by a referring tone (fall-rise, ↘↗) providing the name of 
the place, and then followed by another piece of new information (fall, ↘) which winds up the whole 
informational unit and completes the sentence:
(5) Shaped like an enormous ↘horseshoe, / the ↘↗Louvre / was the longest building in 
↘Europe. (34)
In this way listeners are provided with prosodic clues of cohesion between the three pieces of 
information. 
By means of the same prosodic features cohesion is established within one paragraph in example 
(6), which is also taken from The Da Vinci Code and describes the room in which the albino Silas 
was staying during his visit to Paris:
(6) The room was ↘spartan / – hardwood ↗floors, / a pine ↗dresser, / a canvas ↘mat / 
that served as his ↘bed. He was a ↘↗visitor here this week, / and for many →years / he 
had been blessed with a ↘↗similar sanctuary / in New York ↘City. (27)
The fall on spartan introduces the topic of the paragraph while the four tone units which follow 
provide the explanation of the spartan ambience of the room. The first two are pronounced 
with referring tones (rise, ↗) because they express a common bit of knowledge regarding simple 
accommodation. The narrator has decided to pronounce canvas mat with a proclaiming tone (fall, ↘), 
thus treating this piece of information as different from expectation; it is common knowledge 
that even the most uncomfortable and simple accommodation would have a bed. The second 
sentence begins with a referring tone (fall-rise, ↘↗), introducing the fact that Silas’s room was his 
temporary residence. The repetition of the referring tone in the third tone unit of the second 
sentence not only makes reference to a miserable accommodation in New York City, but also 
implies Silas’s ascetic lifestyle. 
 If the proclaiming and referring tones are the prosodic features used to express cataphoric and 
anaphoric references respectively, then keys and terminations are the prosodic features used to 
express coherence, i.e. different meaningful relations between sentences and paragraphs.
Brazil (1997, 40) distinguishes between three levels of key and termination: high, mid and low. 
Different keys establish contrastive or equivalent meaningful relations between two pieces of 
information. Thus the high key is used to express contrast and the low key equivalence between 
two packages of information. The mid key is used to add one piece of information to another. 
The function of termination is primarily to limit and predict the addressee’s response: the high 
termination is said to encourage further conversation, while the low termination indicates the 
possible end of conversation. The low termination has an additional function: it marks the end of 
a unit, which Brazil (1997, 117) calls a ‘pitch sequence’ and which often coincides with a written 
paragraph. The pitch sequence is hierarchically higher than a tone unit. Brazil claims that the 
choice of the key in the beginning of a pitch sequence puts the whole sequence in a meaningful 
relation with the previous pitch sequence. Thus a pitch sequence with an initial high key puts the 
whole sequence in a contrastive relation with the previous one, whereas a pitch sequence with an 
initial low key establishes a relation of equivalence between the two successive pitch sequences. 
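The mapping from key to meaning relation described above can be written down as a small lookup, sketched here in Python. The names `KEY_RELATION` and `relate_sequences` are hypothetical. Extending the "mid key adds information" reading from single pieces of information to whole pitch sequences is an assumption of this sketch, since the text above only states the high and low cases for pitch sequences.

```python
from typing import List, Tuple

# Relation signalled by the key at the start of a unit, as summarized above.
KEY_RELATION = {
    "high": "contrast",
    "mid": "addition",      # assumed to carry over to pitch sequences
    "low": "equivalence",
}


def relate_sequences(initial_keys: List[str]) -> List[Tuple[int, int, str]]:
    """For each pitch sequence after the first, return
    (previous index, current index, relation to the previous sequence)."""
    relations = []
    for current in range(1, len(initial_keys)):
        key = initial_keys[current]
        relations.append((current - 1, current, KEY_RELATION[key]))
    return relations


# Three successive pitch sequences opening in mid, high and low key.
example = relate_sequences(["mid", "high", "low"])
```

In this invented sequence the second pitch sequence stands in contrast to the first, and the third is presented as equivalent to the second.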
The analysis of narration in the two audio books has confirmed Brazil’s theory. Examples (7) 
and (8) are taken from The Da Vinci Code and show the relations of contrast and equivalence 
between two paratones respectively. In example (7) there are two packages of information about 
the physical appearance of Robert Langdon and his public image which are in a contrastive 
meaningful relation:
(7) His usually sharp blue eyes looked hazy and drawn tonight. Around his temples, the grey 
highlights were advancing into his thicket of coarse black hair.
 ↑ Last month, much to Langdon’s embarrassment, Boston magazine had listed him as 
one of that city’s top ten most intriguing people ... (23)
In example (8) a whole paragraph is dedicated to a detailed description of corporal mortification 
performed by Silas which is wound up by the second, one-sentence long paragraph containing 
the expected consequences of the whole ritual and thus uttered in low key:
(8) Silas turned now to a heavy knotted rope coiled neatly beside him.  
 ↓The Discipline↓. The knots were caked with dried blood. Eager for the purifying 
effects of his own agony, Silas said a quick prayer. Then, gripping one end of the rope, he 
closed his eyes and swung it hard over his shoulder, feeling the knots slap against his back. 
He whipped it over his shoulder again, slashing at his flesh. Again and again, he lashed.
 ↓Finally, he felt the blood begin to flow.↓ (30)
In example (8) there is also an internal relation of equivalence between the message of the first 
sentence and the second one which can be interpreted as: The Discipline is a heavy knotted rope.
It was said above that the initial extra-high pitch and the final low pitch accompanied by a pause 
were the prosodic phenomena which marked the boundary between two paratones. Although 
this is very often the case, one cannot jump to the conclusion that this is the only 
possibility. Examples (7) and (8) clearly indicate that the meaningful relations between the 
paragraphs have to be taken into consideration.
 Character portrayal is probably the most important and demanding element in audio book narration. 
It is also an element where the narrator’s own image of the characters and his own interpretations of 
their behaviour come to light. Although the narrator reads the words of the author, it is his or her 
voice which gives life to the characters and triggers our imagination. The narrator takes the author’s 
cues and provides the dramatic experience. In doing so, the narrator has two choices: he can keep 
a low profile and merely read the story providing subtle emotional and attitudinal colouring of the 
characters and the events; or he can go out of his way and use the author’s words to make his own 
story by adding passionate and exaggerated prosodic choices. A good narrator will know how to keep 
his interpretation within acceptable limits and not get carried away with exaggerated imitations of 
characters. He will know how to create the appropriate atmosphere, when to become an invisible 
channel of words, as well as when and how to make the narration vivid.
Evaluating the quality of the narration is just as difficult and subjective as the narration itself. 
Our judgements are conditioned by our personal preferences regarding the voice and the accent 
of the narrator. In my analysis of the two audio books I have tried to be as objective as possible 
but unfortunately could not help being partial and critical, too. 
The characters in The Da Vinci Code are of different nationalities, occupations and ages. Although 
male characters prevail, one of the main protagonists is a young French woman, Sophie Neveu. 
The main male character is a Harvard professor, Robert Langdon. Among other important 
characters there are several French police officers, a French albino called Silas, a Spanish bishop 
and a British historian of aristocratic origin, Sir Leigh Teabing. The narrator, Jeff Harding, has 
decided to give each of the characters his or her own tone of voice depending on the character’s 
age, occupation, nationality and gender. Thus he uses English with a heavy French accent when 
reading the words of the French characters, adding a variety of different voice qualities, such as 
roughness pitched low for the main detective, Bezu Fache, or very weak and frightened voices 
pitched either high or low for other minor police officers. 
If the narrator’s French English accent is acceptable to some extent, this cannot be said for his 
imitation of female voices because he sounds like Dustin Hoffman playing ‘Dorothy Michaels’ 
in the well-known comedy Tootsie. This has created a totally misplaced humorous effect in this 
crime story. 
Another weak point in Jeff Harding’s narration is his imitation of the British English aristocratic 
accent which he uses for Sir Leigh Teabing, a knight and a historian of great influence, wealth and 
power. The narrator has failed to acquire those British English vowels which are not present in 
American English or have a different quality and quantity. Among the consonants he uses the voiced 
alveolar tap /t̬/ instead of the British voiceless alveolar plosive /t/ and he pronounces post-vocalic 
/r/ also before consonants. Intonation-wise, he retains his American mid-level pitch contour instead 
of the more lively British intonation which exhibits a number of pitch jumps and pitch slumps. 
The only intonation which he manages to get right is the intonation of exclamations and greetings 
where he puts on a very posh and affected British English accent which is generally associated with 
older aristocrats. Unfortunately, this makes his British English accent inconsistent.
The narrative structure of the whole novel is quite varied: in the foreground there is the death of 
the Louvre’s curator, Jacques Saunière, and the mysterious message which he left, as well as his 
instruction to his granddaughter, Sophie Neveu, to contact Robert Langdon. The two become 
prime suspects in Saunière’s murder and try to escape the French police in order to decode 
the message and find the real murderer. The story is interrupted by Silas’s attempt to find the 
keystone before Sophie and Robert. In order to explain the importance of the keystone, 
the action is often slowed down by longer explanations of a historical nature, as well as the main 
characters’ inner thoughts. 
The narrator is quite successful in using the prosodic features of rhythm and volume, as well as 
the pitch height to distinguish between the meditative and explanatory parts of the novel on the 
one hand, and the actions surrounding them, on the other. In the book, the parts that represent 
the characters’ thoughts are written in italics, and the narrator pronounces them slowly, quietly 
and in the low key of the pitch range:
(9) The spiked cilice belt that he wore around his thigh cut into his flesh, and yet his soul 
sang with satisfaction of service to the Lord. ↓Pain is good.↓ (27)
Increased volume and speed of speech are used to interpret the fast actions, such as the police 
chase after Robert Langdon, who had faked his escape from the Louvre museum by 
throwing the tracking dot through a window onto a passing truck. 
In sum, the narrator of The Da Vinci Code quite successfully uses different prosodic features to 
bring to life the intricate string of events and historical facts and assumptions, but fails with the 
portrayal of some of the important protagonists of the story.
 The characters in Bridget Jones: The Edge of Reason are all British in their thirties and forties. 
The main protagonists are Bridget Jones and her boyfriend Mark Darcy, as well as a number 
of their female and male friends. The novel is written as a diary in which Bridget diligently 
notes her daily thoughts and events. The main theme of this comic novel is Bridget’s brand 
new relationship with Mark, their ups and downs and the cliché differences between male and 
female ways of thinking and behaving. Since the narrator is the author herself, it is possible 
to assume that she knows her characters very well and will be able to bring them to life as 
accurately as possible.
And indeed her narration is faithful to the book in that it tries to convey the characters’ 
feelings and attitudes towards each other and the events surrounding them. Thus she puts 
on sobbing, bellowing, murmuring and whispering voices when her characters behave in 
those ways. When narrating the male voices, she does not exaggerate but simply drops her 
voice to her normal low pitch. 
The narrative structure of the novel is a fast interchange of events and the characters’ 
thoughts, which the narrator manages to convey by means of the rhythm of the narration. 
Bridget’s thoughts and meditations are narrated in a slow rhythm and in a neutral, mid 
pitch range. But when her dreamy, thoughtful mood is interrupted by a sudden event, her 
startled reaction is appropriately marked by high-pitched intonation and increased speed 
of delivery. In example (10) these changes of rhythm and pitch are clearly motivated by the 
action:
(10) 11.05 a.m. Yes. As it says in How to Get the Love You Want – or maybe it was Keeping the 
Love You Find? – the blending together of man and woman is a delicate thing. Man must 
pursue. Will wait for him to ring me.
 11.15 a.m. Was Richard Finch yelling again. Have been put on the fox-hunting item 
instead of Labour Women and have got to do live insert from Leicestershire.
 Right, better get out cuts…
 [fast rhythm, very high pitch] Oh. Telephone. (10-1)
In this paper I have tried to show how narrators use their voices to deliver the contents of a 
written novel as accurately as possible in an audio book. I have specifically looked at the different 
prosodic features which are used to achieve cohesion and coherence within and between different 
topical units, or paratones, as well as at how narrators use their voices for character portrayal and for the 
narrative structure of the plot.
The analysis of two audio books of two different genres has shown that the written division into 
paragraphs is not observed in the spoken version. In the analysis I decided to treat a shorter or 
a longer pause as the only clue for the paratone division. The reason for this is to be found in 
Brazil’s intonation theory, which claims that the paratone initial key puts the whole paratone in 
a particular relation with the previous paratone. The relation can be either that of contrast or 
equivalence. In this way, a speaker achieves coherence between two paratones. 
The analysis into paratones has shown that the two narrators have not strictly observed the 
division of the written text into paragraphs. There are two reasons for that: the first is 
the abridgement of the written novels for the purpose of audio book narration, and the second 
has to do with the narrators’ perception of the information packed into one paragraph. They 
have either split a written paragraph into two or more paratones or joined two or 
three paragraphs into one paratone. Both decisions have been made according to the narrators’ 
perception of topical unity.
Cohesion in speech is achieved by means of different tones, whereas coherence is expressed 
by the prosodic features of key and termination. The interplay of new and old or referred-to 
information has been analysed and compared with the prosodic realizations. The analysis has 
confirmed the theoretical assumptions made by Brazil (1997) that the fall-rising and the rising 
tones make reference to shared knowledge, whereas the falling tone is used to introduce new 
information. The analysis of paratone initial keys has also confirmed Brazil’s theory of high key 
expressing contrast and low key expressing equivalence between two topical units.
Another very important element in audio book narration is the character portrayal and the 
delivery of events. Although this is the most difficult element to evaluate because it is often very 
subjective, I can nonetheless conclude that some artistic performance is necessary to transmit the 
right moods and attitudes of the characters, as well as the pace of events. However, an exaggerated 
performance can produce an unwanted effect, as is the case with the imitation of a female voice by 
the male narrator in The Da Vinci Code.
In conclusion I would like to return to the initial question of whether listening to audio books will 
eventually replace the traditional reading of written books. I believe that this will not happen and 
that the present enthusiasm will eventually wane. Audio books may be a good solution for some 
types of books and some people. Although the idea of listening to a crime story while cooking 
or driving a car may be pleasing, it cannot compare to the quiet and intimate experience of being 
immersed in the story and carried away by the characters.