
LDC Staff

Staff Member Title Email Phone
Ahtaridis, Ilya Membership Coordinator ilya (215) 573-1275
Awoyale, Yiwola Senior Researcher awoyale (215) 573-5492
Bamba, Moussa Senior Researcher bamba2 (215) 573-1201
Bies, Ann Senior Research Associate bies (215) 746-0245
Bragilevskaya, Natalia Business Administrator natalia (215) 898-4055
Brandschain, Linda Project Manager, Human Subjects Data Collections brndschn (215) 573-9182
Caruso, Christopher Programmer Analyst carusocr (215) 898-2988
Cieri, Chris Executive Director ccieri (215) 573-5489
Ciul, Mike Programmer Analyst mciul (215) 746-0246
Czoka, Karina Financial Coordinator kczo (215) 573-5488
DiPersio, Denise Manager of External Relations dipersio (215) 573-9243
Ellis, Joseph Lead Annotator joellis (215) 573-6072
Graff, David Lead Programmer Analyst graff (215) 898-0887
Griffitt, Kira Lead Annotator kiragrif (215) 573-3595
Grimes, Stephen Programmer Analyst sgrimes (215) 898-0497
Haun, William Programmer Analyst whaun (215) 746-2797
Hill, Wayne Systems Administrator hillw (215) 898-0464
Ismael, Safa Arabic Language Analyst safa (215) 898-0946
Jones, Karen Lead Annotator karj (215) 898-4245
Krug, Gary Programmer Analyst gkrug (215) 573-5491
Kulick, Seth Associate Director skulick (215) 898-0358
Lee, David Programmer Analyst david4 (215) 746-2797
Lee, Haejoong Programmer Analyst haejoong (215) 746-2797
Li, Xuansong Lead Annotator xuansong (215) 898-0946
Liberman, Mark Director myl (215) 573-5490
Ma, Xiaoyi Programmer Analyst xma (215) 898-0887
Maamouri, Mohamed Senior Research Administrator maamouri (215) 746-0244
Mandel, Mark Research Administrator mamandel (215) 573-5615
Mazzucchi, Andrea Senior Systems Analyst amazzu
McMackin, Andrew Systems Programmer amcm (215) 898-0464
Mendonça, Angelo External Relations Programmer Analyst mendonca
Morris, Amanda Lead Annotator amandamo (215) 573-3352
Mott, Justin Lead Annotator/Technical Assistant jmott (215) 898-6072
Parker, Robert Programmer Analyst parkerrl (215) 746-2797
Reed, Marian Marketing/Communications Coordinator mreed (215) 898-2561
Sawyer, Ann Coordinator sawyera (215) 573-5491
Sessa, Stephanie Lead Annotator ssessa (215) 573-6072
Song, Zhiyi Project Manager zhiyi (215) 573-4108
Strassel, Stephanie Senior Associate Director strassel (215) 898-9681
Tavano, Erin Programmer Analyst etava (215) 764-7089
Turner, Ikeila Business Office Coordinator ikeilat (215) 898-0464
Walker, Kevin Senior Programmer Analyst walkerk (215) 573-4172
Wright, Jonathan Research Programmer jdwright (215) 746-0243
Yuan, Jiahong Assistant Professor of Linguistics jiahong (215) 746-3136
Zaghouani, Wajdi Programmer Analyst wajdiz (215) 898-5504
Zakhary, Dalal Lead Annotator/Coordinator zakhary (215) 898-0464
Zakhary, Ramez Lead Annotator rzakhary (215) 573-3595

Ilya Ahtaridis E-Mail

Membership Coordinator. Ilya fosters the continuance and development of LDC membership, oversees all distribution of corpora, and provides information regarding data licensing and usage.

Dr. Yiwola Awoyale E-Mail Home Page

Yiwola's main research program is to prepare an electronic database for a modern dictionary of the Yoruba language, a Kwa language of the Niger-Congo family. Yiwola has joined LDC on sabbatical leave from the University of Ilorin, Nigeria, where he is Professor and Head of the Department of Linguistics and Nigerian Languages. Yiwola has participated actively in the teaching and planning of the Yoruba language for many years, at both university and national levels. In addition to the dictionary project, he also co-ordinates the activities of the African Language Research Council (ALRC), jointly hosted by the LDC and the African Studies Centre. He teaches Yoruba in the African language teaching program of the African Studies Centre.

Dr. Moussa Bamba E-Mail Home Page

Senior Researcher. Moussa's research programme is to prepare an electronic database for the lexicon of three Mandekan (Manding) languages of the Mande group of the Niger-Congo family: (a) Bambara (Bamanankan or Bamanan), (b) Mawukakan (also known as Mawu, Mahou or Mau) and (c) Odienne Jula. Moussa has been a Postdoctoral Fellow at IRCS since 1994 and has been actively involved in the teaching of Manding languages, both in Africa and outside, for many years.

Ann Bies E-Mail Home Page

Senior Research Associate. Ann works on projects involving the syntactic ("treebank") and semantic annotation of English and Arabic, including the Arabic Treebank, the Biomedical Information Extraction (ITR/E) project, the English/Chinese Treebank, and the English/Arabic Treebank. Her previous work includes contributing to the original Penn Treebank, as well as creating an annotated corpus of Early New High German.

Natalia Bragilevskaya E-Mail

Business Administrator. Natalia provides direct financial and administrative support to the Linguistic Data Consortium. Manages all financial transactions and financial reporting. Assists director, executive director and project managers in long and short term planning, creation of budgets, and budget analysis. Acts as liaison between LDC staff and management as well as multiple Penn departments, and government sponsors. Manages HR activities and payroll for all full-time and part-time employees.

Linda Brandschain E-Mail

Linda has been a Project Manager and the head of Human Subjects Data Collection since September 2006. She establishes procedures and specifications for various speech data collection projects such as MIXER, trains and supervises staff, and represents LDC's efforts to sponsors and the research community.

Christopher Caruso E-Mail

Chris is a programmer analyst in the data collections group, providing support to local and remote data collection, automated speech recognition, and machine translation systems. He has worked on the LCTL, Greybeard, and TRECVID projects and currently supports GALE and Mixer.

Dr. Christopher Cieri E-Mail Home Page

Executive Director. Chris provides oversight for the Linguistic Data Consortium including planning, operations, project management, external relations and financial performance.

Mike Ciul E-Mail

Programmer Analyst. Mike is involved with the Arabic Treebanking program, improving tools for annotation of Arabic corpora and English translations and streamlining the annotation pipeline. He is also involved in the development of online tools for teaching Arabic. His contributions include GUI and web development with an eye towards improving usability. Mike's other pursuits include generative computer music written in the Supercollider language.

Denise DiPersio E-Mail

Manager, External Relations. Denise is responsible for the overall management of the External Relations area which includes intellectual property rights, licensing, distribution, publications, membership and the LDC's newswire and broadcast data collections.

Joseph Ellis E-Mail

Lead Annotator. Joseph acts as lead linguistic annotator and coordinator on projects including Machine Reading and Knowledge Base Population. His responsibilities include developing annotation procedures, training and supervising annotation staff, performing linguistic annotation, and overseeing annotation studies involving human subjects.

David Graff E-Mail

Lead Programmer Analyst. As the founding member of the LDC's technical staff, Dave has played a role in the production, preparation and maintenance of virtually every data collection that has been made available through the LDC. Dave's focus has recently shifted toward design of corpus structures and specifications, design of custom user interfaces for transcription and annotation, and the planning and layout of computational and media resources to accommodate the capture and handling of data in large quantity from varied sources.

Kira Griffitt E-Mail

Lead Annotator. Kira is responsible for linguistic annotation, guidelines development and annotator training for projects including DARPA Machine Reading.

Dr. Stephen Grimes E-Mail Home Page

Programmer/Analyst. Stephen supports parallel text and word alignment projects for Chinese and Arabic. This data is used by the GALE project for applications such as machine translation. He also assists with the MADCAT project. Stephen did his graduate work in linguistics at Indiana University and has active interests in many aspects of the Hungarian language.

Safa Ismael E-Mail

Arabic Language Analyst.  Safa performs linguistic annotation and analysis, supervises Arabic annotation staff and coordinates Arabic human subjects collection.  He supports multiple projects including GALE, MADCAT and TransTac.

Gary Krug E-Mail

Programmer/Analyst.  Gary performs technical support and data/tool development for LDC annotation projects including GALE translation and word alignment.

Seth Kulick E-Mail

Seth works on various aspects of annotation projects for English and Arabic, concerning both morphological annotation and syntactic structure ("treebanking"). He is particularly concerned these days with adapting ideas from natural language processing (NLP) to create new techniques for quality control and consistency checking in treebanks. He is also working on issues related to the integration of Arabic morphological and syntactic information in a pipeline, both for annotation and NLP.

David Lee

Programmer/Analyst.  David performs technical support and data/tool development for LDC annotation projects including MADCAT.

Haejoong Lee E-Mail

Programmer Analyst. Haejoong's primary tasks include annotation tool development, data modeling and data handling. He is interested in linguistic database systems that enable efficient storage and query of linguistic data, and works on the Querying Linguistic Database (QLDB) project. He is currently managing the Annotation Graph Library (AGLIB) software package, a part of AGTK. He also provides programming support for the Open Language Archives Community.

Dr. Xuansong Li E-Mail

Xuansong acts as Chinese lead annotator and coordinator for several projects including GALE Word Alignment. She is responsible for hiring and training new annotators, developing annotation guidelines and quality control methods and managing the daily activities of the Chinese annotation team.

Dr. Mark Liberman E-Mail Home Page

Mark is professor of Linguistics and Computer and Information Science at the University of Pennsylvania (1990-) and director of the Linguistic Data Consortium (1992-). His research interests are in phonetics, phonology, speech technology and computational linguistics. He is on the editorial boards of Speech Communications, Computer Speech and Language and The International Journal of Corpus Linguistics. Mark came to Penn after being a member of the technical staff and department head of the Linguistics Research Department at AT&T Bell Laboratories (1975-1990).

Xiaoyi Ma E-Mail

Xiaoyi's responsibilities include using speech recognition technologies to collect and create linguistic data, researching data collection technology, contributing to LDC Online and other tasks related to computational linguistics.


Mohamed Maamouri E-Mail

As of November 2001, Mohamed Maamouri is a Senior Research Administrator at LDC, where he heads the Arabic Treebanking Group and the development of Arabic resources and projects. From 1995 to 2001 he was the Associate Director of the International Literacy Institute (ILI) at the Graduate School of Education of the University of Pennsylvania. Dr. Maamouri was a Professor of Linguistics and English at the University of Manouba (1967-1995) in Tunisia, and formerly the Director of the Bourguiba Institute of Modern Languages (1975-1988) at the University of Tunis. Dr. Maamouri specializes primarily in Arabic linguistics, reading, language development, corpus linguistics, and sociolinguistics. His other interests include educational linguistics, language and literacy acquisition, language policy and planning, as well as bilingualism and multilingual issues.

Mark Mandel E-Mail Home Page

Mark is the Research Administrator of PennBioIE and Less Commonly Taught Languages. For the former, he plans and manages the annotation of biomedical texts. For the latter, he does linguistic work for the collection and development of resources in languages where they are in short supply. He views the coordination part of his work as "Oh, it's an interpretation task! I can deal with that."

Angelo Mendonça E-Mail

External Relations Programmer. Angelo supports LDC's External Relations Group by developing and maintaining LDC's business systems and coordinating and preparing publications of language resources.

Justin Mott E-Mail

Lead Annotator and Technical Assistant. Justin works on English Treebank projects; his responsibilities include performing linguistic annotation, training and managing part-time annotation staff and providing technical support. Prior to joining the LDC in 2005, his graduate work concentrated on historical linguistics, Sanskrit and Paninian grammar.

Bob Parker E-Mail

Programmer/Analyst - Bob provides technical support and tool and data development for the annotation and newswire/broadcast collection groups.

Marian Reed E-Mail

Marketing/Communications Coordinator - Marian develops marketing strategies to increase the dimensions of the LDC's community in addition to conducting the LDC's market research efforts and target market identification.

Zhiyi Song E-Mail

Project Manager. Zhiyi manages translation activities for sponsored programs including GALE and NIST Open MT. Previously Zhiyi managed GALE Distillation and ACE and acted as lead annotator for several Chinese annotation efforts.


Stephanie Strassel E-Mail

Senior Associate Director. Stephanie manages LDC's Annotation Group and oversees linguistic resource development for sponsored programs including GALE, MADCAT, Machine Reading, ACE, REFLEX, EARS, TIDES and NIST Open Technology Evaluations including MT, TAC, RT and TREC Video Events.

Kevin Walker E-Mail Home Page

Kevin has been on the programming staff at the LDC since 1998. He is the POC for the LDC's video contributions to TREC-VID and VACE. His areas of responsibility include:

  • Computer telephony software development (call logging, IVR)
  • Broadcast video collection systems design and integration
  • Tool building for video and speech processing
  • Project specific database design (Oracle, MySQL)
Kevin's interests include signal processing, telecommunications, and systems design/analysis, particularly as they relate to large scale, high quality linguistic data collection.

Dalal Zakhary E-Mail

Lead Annotator/Coordinator. Dalal is responsible for Arabic translation resources including parallel text and word alignment. She manages the Arabic translation and word alignment teams, performs quality control and develops annotation guidelines. Past projects have included Biomedical Information Extraction (ITR/E) and TIDES.

Ramez Zakhary E-Mail

Lead Annotator. Ramez acts as Arabic lead annotator for ACE and GALE Transcription, and his duties include annotator hiring, training, management and quality control. He has also acted as lead annotator for the Biomedical Information Extraction project. Ramez has a B.S. in biological sciences, an M.D. degree and a master's degree in microbiology from Egypt.


Visiting Scholars

Dr. Steven Bird E-Mail Home Page

Senior Research Associate. Steven works on data models, formats and tools for managing language resources. Steven is Associate Professor of Computer Science at Melbourne University (Australia) and collaborates on several LDC research projects.

Wajdi Zaghouani E-Mail

Wajdi is currently managing the Arabic POS and Treebank workflow effort locally and remotely. He had been working as an Arabic computational linguist at Nstein Technologies in Montreal and then at the JRC, a research center of the European Commission in Italy. His main interests are Arabic computational linguistics and in particular named entity extraction techniques, Propbank and lexicon creation. Wajdi holds a B.A. in computational linguistics from the University of Quebec at Montreal and an M.A. in linguistics from the University of Montreal.

LDC Alumni

Anthony Castelletto E-Mail

Tony is the primary programmer for the LDC Publications Group and is responsible for converting, documenting, and verifying LDC publications, as well as managing and training publications staff in order to assure the release of the publications on their scheduled dates. He has contact with LDC data providers and sponsors to determine data quality, structure, and output specifications.

Bill Clark

Bill is a Systems Administrator for the LDC Systems Department with over twelve years of experience in administering BSD systems. He has a BA in Philosophy from Rutgers University in New Brunswick, NJ, and is currently taking graduate courses in Computer Information Systems and Linguistics at Penn. He lives on a boat on the Delaware River and is an avid bicyclist.

Basma Bouziri E-Mail

A graduate student and teaching assistant at ISLT, Basma is serving as a junior visiting scholar. While here Basma will focus on learning Arabic Treebank II annotation and methodology.

James Fiumara 

Until August 2005, James was the Manager of the External Relations areas of Membership and Intellectual Property Rights (IPR) & Licensing. He had oversight for LDC's Newswire & Broadcast News data collection efforts and coordinated LDC data resources and licenses with various research projects and sponsored evaluations.

Meghan Glenn E-Mail 

Project Manager. Meghan manages speech annotation projects including GALE, RT and Phanotics. She also manages machine translation post-editing efforts for the GALE and MADCAT projects. Previously Meghan acted as lead annotator and manager for multiple LDC projects including HARD, TDT, TREC and DUC.

Johnathan Hamiter 

Financial Coordinator. Johnathan has several financial duties, including processing financial transactions, maintaining records of activities and funding schedules for all LDC grants and administrative accounts, resolving issues with vendors, reconciling transactions, resolving wayward charges, and assisting the Business Administrator with grant management and audits.

Chad Jackson

Chad Jackson is a systems programmer who is responsible for creating and administering servers for the LDC. He also provides systems-level programming for the LDC as well as desktop, workstation, and server support. Chad is currently pursuing a Master's degree in Computer Science from the University of Pennsylvania.

Kazuaki Maeda

Sr. Research Programmer/Manager of Software Development. Kazuaki leads a group of technical staff to create various technical and research resources for LDC's data creation projects, such as GALE, MADCAT, Machine Reading, NIST OpenMT, and TAC/KBP. He is also a linguist with interests in phonetics, phonology and computational linguistics.

Nii Olokwei Martey 

Until Summer 2005, Nii managed and coordinated various work areas at the LDC, including collection, recruitment, annotation, training and quality control processes. Starting in 1996, Nii worked on and managed numerous projects at the LDC, including HUB4, HUB5, SPINE and the TDT project.

Abby Neely E-Mail

Human Subjects Data Collection Coordinator. Abby coordinates various human speech collection projects including Mixer, LVDID, and Greybeard. She helps develop guidelines, train new employees, and handle payments.

Michael Maxwell 

Until Fall 2005, Mike collected computational linguistic resources for Less Commonly Taught Languages (aka "Low Density Languages"). He is particularly interested in techniques to rapidly analyze the morphology of a language, including both ways of assisting linguists to do morphological and phonological analysis, and machine learning methodologies. His research interests include phonology and morphology, particularly as these relate to computational linguistics. In the past, he has helped document endangered languages of Ecuador and Colombia.

J. Michael Schultz 

Until Summer 2005, Mike researched and developed search technology for use at the LDC. Two main applications of his research are LDC Online and text annotation. Mike's main interests are in information retrieval, topic detection and tracking and information extraction.


Heather Simpson E-Mail

Project Manager. Heather manages information extraction projects including GALE Distillation, TAC and TRECVid Events. Previously Heather managed the Less Commonly Taught Languages project, and acted as lead annotator for LCTL, ACE and Distillation. 

Shudong Huang 

Project Manager. Until December 2006, Shudong functioned as a Project Manager for the collection and annotation of language data (especially Mandarin/English). He established procedures for projects and trained and supervised his staff. Shudong also provided needs assessment and high level design of project interfaces, as well as representing the LDC's efforts to sponsors and the research community.

Ke Chen 

Programmer Analyst. Until Fall 2006, Ke developed software for collecting, processing, and delivering newswire data. He was also responsible for designing and implementing annotation interfaces for broadcast audio data. He processed and created parallel-language data for the MT research community. He took principal roles in creating the Gigaword, Cynewulf, and Aquaint corpora.

Dr. Olga Babko-Malaya 

Project Manager, Text Annotation. Olga works on projects involving semantic annotation, including the distillation task for GALE and Arabic Propbank. She previously contributed to the English Propbank and Ontobank projects and managed the Propbank II annotation effort. Her work experience includes knowledge engineering at Teknowledge Corporation and research in computational linguistics at the Academy of Sciences in St. Petersburg, Russia. Olga holds a PhD in Linguistics from Rutgers University with a specialization in lexical and formal semantics and the syntax-semantics interface.

Tim Buckwalter 

Senior Programmer Analyst. Tim is involved in Arabic POS-tagging, Treebanking, and EARS. His Arabic morphological parser is distributed by the LDC. His primary research interest is Arabic corpus-based lexicography. He previously was involved in Arabic MT at Alpnet and Arabic text input methods for cell phones at Tegic/AOL.

Andrew W. Cole 

Andy is the Associate Director for Operations, which includes External Relations, Publications, Management Information Systems (MIS), and IT/System Administration. He is married to Janet Lewis, MSN/CNM (Certified Nurse Midwife), and lives in West Philadelphia.


Frank Di Maria 

Programmer. Frank works for the External Relations department and has been active in the development of the web interface for LDC business systems, developing licensing and other sales data for the LDC.


Lauren Friedman

External Data Coordinator. Lauren oversees technical infrastructure development for the Annotation Group and for externally funded projects including GALE. Previously Lauren acted as project manager for translation, and managed outsourced annotation

Huaichuan "Hubert" Jin 

Hubert is a senior programmer analyst at LDC. His current primary project is Arabic Treebanking, where he develops tools and manages data for the project. His interests are information retrieval, machine learning, natural language processing and speech recognition. He was formerly a researcher at BBN working on Hub4 and TDT.

Mark Kimelheim 

Systems Administrator. Mark is responsible for the design, implementation and day-to-day operation of the LDC's


Dr. Wigdan A. Mekki 

Research Coordinator. Wigdan joined the LDC in April of 2002. She had been working as a computational linguist at France-Telecom Research and Development as a Post-Doc Fellow before joining the LDC. Her research dealt with morpho-syntactic analysis, summarization and grammar formalization. She earned her Ph.D. in France at the University of Lyon. Her mission at the LDC is to work as a lead linguist on multiple projects involving the creation of Arabic language resources, development of annotation guidelines and quality assurance measures. She will be providing training, support, and documentation for the annotation staff. Her primary project is currently Arabic TreeBanking.

Julie Medero 

Julie develops tools for use by the annotation group. Her current primary project is ACE; she is responsible for developing tools for the creation, conversion, validation and analysis of annotation data. In addition, she is in charge of the workflow system used to manage several annotation projects.


Shawn Medero 

From February 2005 till January 200, Shawn designed user interfaces, developed web applications, and helped maintain systems infrastructure. He was most proud of his work on LDC Online and the Annotation Collection Kit (ACK). His main interests were user-centered design, data mining & visualization, internet technologies, large scale systems and virtual machines.

Alexis Mitchell 

Coordinator. Alexis is responsible for human subject coordination for LDC's Telephone Collections. Previously she acted as English lead annotator for the ACE Project.


Claudia Mottola 

Claudia provides direct financial and administrative support to the Linguistic Data Consortium. Manages all financial transactions and financial reporting. Assists the director, executive director, and other principal investigators in long and short term planning, creation of budgets, and budget analysis. Acts as liaison between LDC staff and management as well as multiple Penn departments, and government sponsors. Manages HR activities and payroll for all full-time and part-time employees.

Christopher Walker E-Mail

Project Manager, Information Extraction. Christopher coordinates several Information Extraction annotation projects including ACE, and he manages the Less Commonly Taught Languages project. His primary responsibilities include working with program sponsors and affiliated researchers to specify corpora and annotation tasks, identifying deliverables that support program goals, and managing the execution of those deliverables.

Carrie Ann Theisen E-Mail

Lead Annotator. Carrie is the Lead Annotator for the Less Commonly Taught Languages project. She is responsible for hiring, training, and managing native speakers of Less Commonly Taught Languages.

Keith Whiteman E-Mail

Office Manager. Keith's responsibilities include managing the business office, weekly payroll, and facilities liaison.



Contact ldc@ldc.upenn.edu
Last modified: Friday, 08-Jan-2010 03:24:39 EDT
© 1996-2010 Linguistic Data Consortium, University of Pennsylvania. All Rights Reserved.