Subject: Sum: reactions to synthetic speech

A number of weeks ago I asked very informally for people's reactions to synthetic speech (also prerecorded speech) and for studies on emotional reactions to synthetic speech. I wish to thank those who responded:

Osamu Fujimura
Margaret Jackman
Randall A. Major
Corey Miller
Johanna Rubba
Stephen P. Spackman

I had hoped for more responses, but I have started to collect information from friends and colleagues as well. I've realized that I need a more structured way of gathering info, with the possibility that this (up until now) rather informal approach to the matter may suddenly turn into a more formal study.

The respondents react both positively and negatively to synthetic speech; one may be irritated at the "bluntness" of the machine, the lack of flexibility in the programs, etc., but still find the synthetic vocal information handy.

From Per Egil Heggtveit at Telenor, Norway, I have received a list of references on synthetic speech, but none of the studies cover emotional reactions.

Osamu Fujimura wrote:

> I suggest that you ask the question to Marian Macchi.

I did. She responded as follows:

> Two of the US telephone companies have
> introduced a service called "reverse directory assistance", which
> is available to telephone customers. This is a telephone service whereby
> a customer calls a special number, enters a telephone number using
> the touchtone pad, and hears the name and address of the person to
> whom that telephone number is listed. A speech synthesizer (Orator,
> a text-to-speech synthesizer that we have developed here at Bellcore)
> is used to speak the name and address.
> Before the introduction of this automated service, one of the
> telephone companies offered the service with real human operators.
> Today the complaint rate from customers is no higher than it was
> when the service was offered with real operators.
>
> This is not to say that use of synthetic speech is always acceptable.
> In fact, many applications for synthetic speech are not adopted
> because the speech sounds too robotic.

Margaret Jackman wrote:

> My experience with synthetic voices is with our telephone information
> system. It asks what is the name and address of the person for whom
> we want the phone number. I am always annoyed since I know I will
> usually have to repeat it to a real person later.
>
> I am also annoyed with voice mail systems that go on forever - giving
> me 10 different options, instead of the voice operator who puts me
> through to the person I want.
>
> I suppose the problem isn't the synthetic language - it is generally
> very clear and concise. The problem is that when I get one it
> generally wastes my time, and for that reason, I have a negative
> reaction to them.

Randall A. Major wrote:

> I'm not sure if they've worked on reactions or not, but you should try
> contacting Barbara Grosz at
> grosz@eecs.harvard.edu
> They've done a lot of work on synthetic speech and she may be able to
> help you. Good luck!

I contacted Barbara Grosz, who wrote:

> Sorry, but I have not done any experiments of this sort, though I have
> done some work on speech synthesis. My colleague, Julia Hirschberg,
> at AT&T Research may know of some research in this arena, though
> I don't believe she has done any either.

I haven't contacted Julia Hirschberg yet, but I intend to.

Corey Miller wrote:

> You may want to look at an article on the perception of synthetic
> speech by David Pisoni, in Progress in Speech Synthesis,
> van Santen, Sproat, Olive and Hirschberg, Springer, 1997.

I've tried to get a copy of the article through our university library, but the book is too recent, and I was told no copies are available yet.

Johanna Rubba wrote:

> My personal reaction to a synthetic voice on the phone is negative.
> I experience
> offense (because the company involved does not care enough to have a
> real person staffing the phone line; they'd rather downsize and replace
> people with machines); irritation (because I am not going to be able to
> get any questions answered, and am going to be obliged to follow the
> inflexible program set down by the corporation [and these are inevitably
> not well-designed, they waste the customer's time]). I also experience
> irritation because synthetic voices do not sound like real voices,
> meaning I have to put forth extra effort to parse their output, and also
> because I am a perfectionist and don't understand why even relatively
> simple things like normal list intonation (not the weird system used on
> the [non-synthetic, just pre-recorded] directory assistance systems)
> can't be gotten right.
>
> I know enough about computational linguistics to know that achieving
> real-sounding synthetic speech is extremely difficult, esp. if context
> has to be taken into account. Is this an excuse for ugly synthetic
> speech? Only if you think we really need synthetic speech. Do we?
>
> Oh, it's not all negative -- I do experience a low level of curiosity and
> amusement in hearing how much of the sound of real speech the designers
> have managed to capture in the artificial speech, and the particular
> distortions that are found in synthetic speech (my intro ling students
> love it when I mimic synthetic speech for them and point out things like
> stress and intonation. I think some progress has been made in this area,
> but they sure do recognize that flat, syllable-timed, nasal voice!)
>
> I just thought of a good use of synthetic speech that I do like. My word
> processor has an auditory editor that reads my texts back to me. Though
> the speech has some flaws, it's not too terribly bad, and it is a very
> useful function when the eyes are no longer capable of seeing the errors.
> Note that I like this because it's not an interaction; I get to choose
> when I use it, and I don't expect to have a conversation with it.

And finally, Stephen P. Spackman wrote:

> Myself, I *like* machines. I use bank machines instead of live tellers
> whenever practical. But (and this doesn't all bear directly on your
> query, but maybe I'm talking to someone who wants to listen...!):
>
> (1) No deception. A machine should announce itself as such - ideally by
> going "boing" or something before it starts to talk. It's extremely
> annoying to find yourself trying to talk *with* a machine thinking it is
> human. When you find out otherwise you feel both stupid and annoyed at
> your wasted effort. Even answering machine messages have this problem.
>
> (2) Machines are not excused from clearing their throats and saying
> hello. Again, "boing" will do and may even be preferable to "ahem" as
> just mentioned. But I once nearly died of fright when a computer behind
> me in a darkened room in a deserted building at 3am suddenly said "Your
> printer is out of paper." in an extremely calm, pleasant voice but with
> inadequate warning.
>
> (3) Machines are not excused from boundary markers. One of the things I
> *loathe* about automated directory assistance systems and talking clocks
> is that they use the same recorded digits in all positions. This makes
> it extremely hard to copy numbers down and know that you have them
> right, as well as being simply annoying. Even just having separate
> final/nonfinal digits would be an improvement. This is actually *less*
> of a problem with synthesised speech, partly because synthesis systems
> are more likely to do contour, and partly because they sound uniformly
> bad rather than atrociously edited!
>
> (4) Machines are not excused from rephrasing.
> A computer reading phone numbers should say, "seven two _six_,
> one _three_ zero _three_", but if asked to repeat itself should use
> "seven twenty-six, thirteen oh three".
>
> (5) Speech *recognition* systems, at present, fail *consistently* for
> some speakers. The statistics on successfully completed transactions may
> be looking great, while some customers are effectively faced with
> termination of service!
>
> What's specifically wrong with synthetic speech? Total absence of
> pragmatic markers at every level, poor pitch contours, lack of
> interactive adaptation with interlocutor at every level, poorly modelled
> interaction between adjacent segments (which decreases noise immunity
> rather than increasing it, no matter what one's engineering intuitions
> might say :-).

Thanks again to all respondents!

Bente

#########################################################################
Bente Henrikka Moxness
Research assistant
Dept. of Linguistics
NTNU (Norwegian University of Science and Technology)
7055 Dragvoll
Norway
Tel: +47 73 59 15 16
Fax: +47 73 59 61 19
E-mail: benmox@alfa.itea.ntnu.no
#########################################################################
