Subject: Sum: OCR software

A couple of weeks back I posted a query about OCR software for the Mac that is trainable enough to be useful to a linguist scanning Latin- or IPA-based non-English texts. Thanks to Jakob Dempsey, Sarah Rilling, Michael Betsch, Andrew Arefiev, Marc Fryd, and Daniel Loehr for their responses.

In the Mac world, it appears that the front-runner in this area is the widely available OmniPage programme from Caere Corporation (http://www.caere.com for info). It is apparently trainable, although one respondent expressed some doubts about being able to train it to handle more than a single special font. I should also mention that the first sales rep I talked to about OmniPage seemed to think that it might have trouble with the combinations of letters and diacritics typical of IPA-based alphabets. However, the publicity literature on the web site seems to imply that it can be trained to recognize combinations of separate characters, and the last sales rep I talked to seemed to have no doubt that OmniPage could do the job. Jakob Dempsey also mentioned an "expensive Kurzweil product" for the Mac, but I haven't heard anything further about this.

I also got two responses that mentioned Windows-based applications that are highly trainable. One is a product called Optopus, made by a German company called Makrolog in Wiesbaden, which is "exclusively trainable"; that is, it needs to be trained from scratch and so can be configured to any alphabet you like. The other is by a Russian company called BIT Software (www.bitsoft.ru); their programme is called FineReader, and in addition to having a wide range of set alphabets for languages using both Latin and Cyrillic, they report having successfully trained it to recognize Icelandic and Tibetan fonts.
David Beck

======================================================================
David Beck
Department of Linguistics
Sixth Floor, Robarts Library
130 St. George St.
University of Toronto
Toronto, Ontario M5S 3H1
Canada
e-mail: dbeck@chass.utoronto.ca
phone: (416) 978-4029
       (416) 923-2394 (home)
fax: (416) 971-2688
