SI 661 / EECS 595 / LING 541
Natural Language Processing
Fall 2005
Thursdays 2:30-5:30 PM
412 West Hall

Natural Language Processing (NLP) is the study of the computational treatment of natural language. NLP draws on research in Linguistics, Theoretical Computer Science, Mathematics and Statistics, Artificial Intelligence, Psychology, etc.

The course has three major parts:
- Linguistic, mathematical, and computational background
- Levels of linguistic processing: morphology, syntax, semantics, discourse, pragmatics
- Applications: information retrieval, speech processing, text generation, machine translation, information extraction, etc.

It also serves three major goals:
- Learn the basic principles and theoretical issues underlying natural language processing
- Learn techniques and tools used to develop practical, robust systems that can communicate with users in one or more languages
- Gain insight into many open research problems in natural language

READINGS:
Speech and Language Processing (Daniel Jurafsky and James Martin)
Prentice-Hall, 2000
ISBN: 0-13-095069-6

ASSIGNMENTS:
- Four homework assignments (40%)
- Midterm (15%)
- Final project (20%)
- Final exam (25%)
- Additional requirements for SI PhD students

SYLLABUS:
- Introduction (JM1)
- Linguistic Fundamentals
- Regular Expressions and Automata (JM2)
- Morphology and Finite-State Transducers (JM3)
- Word Classes and Part of Speech Tagging (JM8)
- Context-Free Grammars for English (JM9)
- Parsing with Context-Free Grammars (JM10)
- Features and Unification (JM11)
- Natural Language Generation (JM20)
- The Functional Unification Formalism (Handout)
- Lexicalized and Probabilistic Parsing (JM12)
- Language and Complexity (JM13)
- Representing Meaning (JM14)
- Semantic Analysis (JM15)
- Discourse (JM18)
- Rhetorical Analysis (Handout)
- Dialogue and Conversational Agents (JM19)

PROJECTS:
Each student will be responsible for designing and completing a research project that demonstrates the ability to use concepts from the class in addressing a practical problem. A significant part of the final grade will depend on the project assignment. Students can elect to do a project on an assigned topic, or to select a topic of their own. The final version of the project will be put on the World Wide Web, and will be defended in front of the class at the end of the semester (procedure TBA). In some cases (and only with instructor's approval), students may be allowed to work in pairs when the project's scope is significant.

SAMPLE PROJECTS:
- Noun phrase parser
- Paraphrase identification
- Question answering
- NL access to databases
- Named entity tagging
- Rhetorical parsing
- Anaphora resolution
- Entity crossreference
- Document and sentence alignment
- Encyclopedic knowledge extraction
- Information extraction
- Speech processing
- Semantic analysis