Subject: proper names

Dear linguists,

I am looking for information and references regarding proper names (especially people's names). Proper names are difficult to handle for NLP applications, for various reasons; but one thing is sure: if you wish to analyse large corpora of texts, you will bump into a large number of proper names, and many of these will be foreign proper names.

So, in order to enhance a syntactic analyser for French, I am trying to gather any possible piece of knowledge (references and pointers are welcome), for as many languages as I can, regarding the following points:

- The standard fashion to name people (in French, usually a first name and a last name, but some languages may add additional pieces to this simple pattern).
- What kinds of unusual characters may be found in proper names: for example, in France you may find names from Brittany like Rowarc'h or Floc'h, where the apostrophe is likely to cause segmentation errors; the same holds for some Dutch names such as Op 't Hof, etc.
- What the official rules are for spelling proper names, especially names that are composed of prepositional phrases, like "de Gaulle" or "van Voorst tot Voorst": in other words, which parts are capitalized and which are not.
- What the most frequent morphemes used in proper names are, or in other words whether there is (or used to be) a preferred way of creating new proper names from a restricted list of morphemes.
- What the compounding rules for proper names are.

If there is enough interest, I'll post a summary.

Many thanks in advance,
Francois
