Subject: muc - 7 call for participation

* * * call for participation * * *

seventh message understanding system evaluation and message understanding conference ( muc - 7 )

evaluation : 2 - 6 march 1998
conference : april 1998
washington , d . c . area

sponsored by : the human language systems tipster text program of the defense advanced research projects agency information technology office ( darpa / ito )

the message understanding conferences have provided an ongoing forum for assessing the state of the art and practice in text analysis technology and for exchanging information on innovative computational techniques in the context of fully implemented systems that perform realistic tasks . the evaluations have provided researchers and potential sponsors and customers with a quantitative means to appreciate the strengths and weaknesses of the technologies , and the results reported on at the conferences have sparked customer interest in the potential utility of the technologies .

the seventh message understanding conference ( muc - 7 ) will provide an opportunity for both new and experienced muc participants to participate in a flexible evaluation , suited to development needs and abilities . it will provide :

* opportunity to select among a variety of tasks : named entity ( ne ) , coreference ( co ) , template element ( te ) , template relationship ( tr ) and scenario template ( st ) .
* two tasks for evaluating component technologies ( ne and co ) , which use standard generalized markup language ( sgml ) as output format .
* redesigned information extraction ( ie ) task , with two domain-independent subtasks ( te and tr ) separated from domain-dependent subtask ( st ) .
* emphases of st task on portability and on minimizing human resources required to participate in the evaluation .
* three experimental tracks to explore new data sets and tasks .

participation in muc - 7 is actively sought from both new and veteran organizations .
with the new and redesigned evaluation tasks , muc - 7 offers a good opportunity for organizations to try out new ideas for handling nlp problems that are of both scientific and practical interest without having to participate in the entire range of tasks .

the conference itself will consist primarily of presentations and discussions of innovative techniques , system design , and test results . there will also be an opportunity for participants to demonstrate their evaluation systems . attendance at the conference is limited to evaluation participants and to guests invited by the darpa tipster text program . conference proceedings , including test results , will be published .

schedule :

1 july 97 : application deadline for participation
15 july 97 : release of ne , co , te , tr , and example st training data and scorer
8 september 97 : release of dry run st task definition , training data , and scorer
29 sept - 3 oct 97 : muc - 7 dry run ( all participants )
6 february 98 : release of formal test st task definition , training data , and scorer
2 - 6 march 98 : muc - 7 formal run
7 - 9 april 98 : 7th message understanding conference ( tentative dates )

data and task description :

the texts to be used for system development and testing are news service articles from the new york times news service , supplied by the linguistic data consortium ( ldc ) [ ldc @ ldc . upenn . edu ] . training , dry run , and test data for all the tasks are extracted from a corpus of approximately 158 , 000 articles . sets of articles to be used in the muc - 7 evaluation will be distributed via ftp upon payment of a one-time fee of $ 100 and upon signing of a user agreement for the use of these texts . the user agreement can be retrieved from the ldc catalog ( evaluation agreements ) . the url for the ldc home page is : http : / / www . ldc . upenn . edu .

five separate evaluations will be conducted as part of muc - 7 .
the definition of these evaluations has been worked out since late 1996 by members of the muc - 7 planning committee . the evaluations may be viewed as capturing the results of text analysis at various levels of aggregation of information :

* named entity ( ne ) requires only that the system under evaluation identify each bit of pertinent information in isolation from all others .
* coreference ( co ) requires connecting all references to " identical " entities .
* template element ( te ) requires grouping entity attributes together into entity " objects . "
* template relationship ( tr ) requires identifying relationships between template elements .
* scenario template ( st ) requires identifying instances of a task-specific event and identifying event attributes , including entities that fill some role in the event ; the overall information content is captured via interlinked " objects . "
* experimental tracks using new data sets are variants of the ne task . the task definition is the same as for the basic ne task , but the texts are different .
* experimental track involving a new task is a simplified version of the te task .

key things to note about each evaluation task :

* ne covers named organizations , people , and locations , along with date / time expressions and monetary and percentage expressions ; it requires production of sgml tags as output .
* co covers noun phrases ( common and proper ) and personal pronouns that are " identical " in their reference ; it requires production of sgml tags as output ; the tags for coreferring strings form " equivalence " classes , which are used for scoring .
* te covers organizations , persons , and artifacts , which are captured in the form of template " objects " consisting of a predefined set of attributes .
* tr covers relationships among template elements , including location and time relationships , which are captured in the form of template " relations " consisting of a relationship and the template elements participating in that relationship . tr is a new task for muc - 7 .
* st covers a particular scenario , which is kept secret until one month prior to testing in order to focus on system portability ; however , the generalized structure of a scenario template is predefined , and example scenarios are available for participants to examine . this task is domain dependent .
* tasks for the experimental tracks are derived from ne and te .

there is a world wide web site that allows automated testing following the rules of muc - 6 . it will be of particular value to new participants . the website is password protected ; to obtain a password from chinchor @ gso . saic . com , you must be licensed by the ldc to access the acl / dci disk , from which the muc - 6 articles were taken .

an anonymous ftp site will be available for downloading muc - 7 related material . this cfp and the muc - 7 participation agreement are available to the public from the ftp site . each participant ( after signing the ldc user agreement and a muc - 7 participation agreement ) will receive a password to download the muc - 7 data , definitions , and scoring software at the release times noted above .

the url of the website is http : / / muc . saic . com . the ftp site is ftp . muc . saic . com .

test protocol and evaluation criteria :

muc - 7 participants may elect to do one or any combination of tasks and experimental tracks . participants will have access to shared resources such as the training texts and annotations / templates , task documentation , and scoring software . all muc - 7 participants are encouraged to participate in the dry run and take advantage of material available .

the formal test will be conducted during the first week in march .
it will be carried out by the participants at their own sites in accordance with a prepared test procedure , and the results will be submitted to the ftp site for official scoring with the software prepared by saic for muc - 7 . test sets used for the evaluations will consist of 100 texts , with subsets for some of the tasks . there will be different data sets for the dry run and the formal test .

systems will be evaluated using recall and precision metrics ( all tasks ) , f - measure ( all tasks ) , and error-based metrics ( all tasks except co ) . the computation of these metrics is based on the scoring categories of correct , partial , incorrect , spurious , missing , and noncommittal . muc - 7 participants will be able to familiarize themselves with the evaluation criteria by using the evaluation software , which will be released along with the training data .

instructions for responding to the call for participation :

organizations within and outside the u . s . are invited to respond to this call for participation . by the time of the actual testing phase of the evaluation , systems must be able to accept texts without manual preprocessing , process them without human intervention , and output annotations ( ne , co ) or templates ( te , tr , st ) in the expected format . organizations should plan on allocating approximately two person-months of effort for participation in the evaluation and conference . it is understood that organizations will vary with respect to experience with sgml text annotation , information extraction , domain expertise / engineering , resources , contractual demands / expectations , etc . such factors will be taken into account in any analyses of the results .

organizations wishing to participate in the evaluation and conference must respond by july 1 , 1997 by submitting a short statement of interest via email and a signed copy of the muc - 7 participation agreement via surface mail .

1 .
the statement of interest should be submitted via email to marsh @ aic . nrl . navy . mil and should include the following :

a . evaluation task ( s ) ( choose one or more )
    * named entity
    * coreference
    * template element
    * template relationship
    * scenario template
b . primary point of contact . please include name , surface and email addresses , and phone and fax numbers .
c . does your site have a copy of the muc - 6 proceedings ?

2 . the participation agreement can be downloaded from the anonymous ftp site ( ftp . muc . saic . com ) . a signed copy should be sent by surface mail to elaine marsh , nrl - code 5512 , 4555 overlook ave . sw , washington , d . c . 20375-5337 , usa .

if some questions cannot be deferred until the deadline for responding to this call for participation has passed , you may send them by email to elaine marsh ( marsh @ aic . nrl . navy . mil ) , with copies to ralph grishman ( grishman @ cs . nyu . edu ) and nancy chinchor ( chinchor @ gso . saic . com ) to ensure that your message receives a timely response from one of us .

muc - 7 planning committee :

ralph grishman , new york university , program co-chair
elaine marsh , naval research laboratory , program co-chair
chinatsu aone , systems research and applications
lois childs , lockheed martin
nancy chinchor , science applications international
jim cowie , new mexico state university
rob gaizauskas , university of sheffield
megumi kameyama , sri international
tom keenan , u . s . department of defense
boyan onyshkevych , u . s . department of defense
martha palmer , university of pennsylvania
beth sundheim , nccosc nrad
marc vilain , mitre
ralph weischedel , bbn systems and technologies
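
a note on the metrics named under test protocol and evaluation criteria : the sketch below shows one way recall , precision , and f - measure could be computed from the scoring categories listed there . it is illustrative only and is not the saic scoring software . the half credit given to partial matches , the exclusion of noncommittal items from every formula , and all names used ( Tally , recall , precision , f_measure , beta ) are assumptions based on conventional definitions of these metrics , not details stated in this call . the error-based metrics are left out because the call does not define them .

```python
from dataclasses import dataclass


@dataclass
class Tally:
    """Counts of scored items per category (hypothetical container)."""
    correct: int = 0
    partial: int = 0
    incorrect: int = 0
    spurious: int = 0
    missing: int = 0
    noncommittal: int = 0  # assumption: recorded but not used in any formula


def _credit(t: Tally) -> float:
    # Assumption: a partial match earns half the credit of a correct one.
    return t.correct + 0.5 * t.partial


def recall(t: Tally) -> float:
    # "Possible" items: everything the answer key expected.
    possible = t.correct + t.partial + t.incorrect + t.missing
    return _credit(t) / possible if possible else 0.0


def precision(t: Tally) -> float:
    # "Actual" items: everything the system produced.
    actual = t.correct + t.partial + t.incorrect + t.spurious
    return _credit(t) / actual if actual else 0.0


def f_measure(t: Tally, beta: float = 1.0) -> float:
    # Weighted harmonic mean of precision and recall;
    # beta = 1 weights the two equally.
    p, r = precision(t), recall(t)
    denom = beta ** 2 * p + r
    return (beta ** 2 + 1) * p * r / denom if denom else 0.0


if __name__ == "__main__":
    example = Tally(correct=8, partial=2, incorrect=0, spurious=2, missing=5)
    print(recall(example), precision(example), f_measure(example))
```

in this example the answer key holds 15 items and the system produced 12 , so recall is 9 / 15 and precision is 9 / 12 ; the counts are made up for illustration and are not muc results .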
