Subject: machine usable dictionary

As you may know from postings I have made to this list over the last couple of months, Derek Bickerton and I are developing a parser based on a theory of syntax that we have been working out over the last four years. We are about to purchase a machine-usable dictionary with approximately 70,000 entries for $2,500. If anyone could advise us whether that is our best bet, or where we might find other dictionaries, we would appreciate hearing from you.

We are currently working with a dictionary of under 1,000 words, so it is imperative that we obtain a larger one in order to begin working with larger corpora. Toward that end, we would also like to find out which texts were used in past parsing competitions and where the results of those competitions are published. We believe that with a few weeks of work we should be able to modify a dictionary sufficiently to begin experimenting with those texts.

Here are the specs for the parser. It is based on a series of algorithms that have been four years in the making, but the programming itself has taken only 300 hours, using C++. There are approximately 3,000 lines of code, which take up 150K as an executable on disk. About 100K of RAM is required to run the parser, and a 300-word dictionary requires 30K on disk. An average sentence takes under 4 seconds to process on a 486 IBM compatible.

Since this is only a development version, we expect these numbers to change. To date, no optimization has been done, and we expect to shrink both the dictionary's disk usage and the execution time significantly.

Phil Bralich
bralich@uhccux.uhcc.hawaii.edu
