|Time and Location||T 11:45am-1:45pm, Room 6496|
Prof. Liang Huang (huang at cs.qc), Instructor
James Cross (jcross at gc.cuny), TA
|Office Hours||Tuesday afternoons at the CS Lab. Additional office hours will be available before homework deadlines and exams.|
|Prerequisites||CS: algorithms and data structures (especially recursion and dynamic programming);
solid programming skills (in Python); basic understanding of formal language and automata theory.
LING: minimal understanding of morphology, phonology, and syntax (we'll review these).
MATH: good understanding of basic probability theory.|
|Textbooks / MOOCs||This course is self-contained (with slides and handouts), but you may find the following textbooks helpful:
You might also find these Coursera courses helpful:
|1||Sep 2||Intro to NLP and rudiments of linguistic theory|
Intro to Python for text processing
|Unit 1: Sequence Models and Noisy-Channel: Morphology, Phonology|
|2||Sep 9||Basic automata theory. FSA (DFA/NFA) and FST.|
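Since the unit's homeworks build on these machines, here is a minimal sketch of simulating a DFA in Python; the example machine (accepting binary strings with an even number of 1s) is an illustrative assumption, not course code.

```python
# Minimal DFA simulator (illustrative sketch, not course code).
# Transitions map (state, symbol) -> next state; missing entries
# act as an implicit dead state.

def run_dfa(trans, start, accept, s):
    """Return True iff the DFA accepts string s."""
    state = start
    for ch in s:
        if (state, ch) not in trans:
            return False
        state = trans[(state, ch)]
    return state in accept

# Example: accept binary strings with an even number of 1s.
even_ones = {('q0', '0'): 'q0', ('q0', '1'): 'q1',
             ('q1', '0'): 'q1', ('q1', '1'): 'q0'}

assert run_dfa(even_ones, 'q0', {'q0'}, '1100')       # two 1s: accept
assert not run_dfa(even_ones, 'q0', {'q0'}, '111')    # three 1s: reject
```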
|3||Sep 16||FSAs/FSTs cont'd;
the noisy-channel model.
|HW1 out: FSA/FSTs, carmel; recovering vowels|
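To make the noisy-channel idea concrete for HW1's vowel-recovery task, here is a toy sketch (the word probabilities are made up, and this is not the HW1 solution): decode by w* = argmax_w P(w) P(obs | w), where the channel deterministically deletes vowels.

```python
# Toy noisy-channel decoder for vowel recovery (illustrative sketch, not
# the HW1 solution; the unigram probabilities are made up).
# Decoding rule: w* = argmax_w P(w) * P(obs | w).

def devowel(word):
    return ''.join(c for c in word if c not in 'aeiou')

P = {'bat': 0.5, 'boat': 0.3, 'bet': 0.2}   # toy language model P(w)

def decode(obs):
    # Deterministic channel: P(obs | w) = 1 iff devowel(w) == obs, so the
    # argmax reduces to the likeliest word whose skeleton matches.
    cands = [w for w in P if devowel(w) == obs]
    return max(cands, key=P.get) if cands else None

print(decode('bt'))   # -> 'bat' (all three match; the LM breaks the tie)
```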
|RELIGIOUS HOLIDAY - NO CLASS|
SVO/SOV word order vs. infix/postfix notation; advantage of SVO: less case-marking; advantage of SOV: no attachment ambiguity
Language models: basic smoothing (Laplace/add-one, Witten-Bell, Good-Turing)
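As a concrete instance of the simplest scheme above, a sketch of add-one (Laplace) bigram smoothing over a made-up toy corpus:

```python
from collections import Counter

# Add-one (Laplace) smoothed bigram probability (illustrative sketch;
# the six-word toy corpus is made up):
#   P(w2 | w1) = (count(w1 w2) + 1) / (count(w1) + V)

corpus = 'the cat sat on the mat'.split()
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)
V = len(unigrams)                         # vocabulary size

def p_laplace(w1, w2):
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + V)

print(p_laplace('the', 'cat'))   # seen bigram: (1+1)/(2+5)
print(p_laplace('cat', 'the'))   # unseen bigram, still nonzero: (0+1)/(1+5)
```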
Language models (cont'd): information theory; entropy and perplexity; the Shannon game
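And a one-function sketch of perplexity as the exponentiated average negative log-probability; the uniform-model check is a standard sanity test:

```python
import math

# Perplexity of a model on a test sequence (illustrative sketch):
#   PP = exp(-(1/N) * sum_i log p(w_i | context))

def perplexity(probs):
    """probs: per-token probabilities assigned by the model."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

# A uniform model over a 10-word vocabulary has perplexity 10,
# the "effective branching factor":
print(round(perplexity([0.1] * 7), 3))   # -> 10.0
```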
Viterbi decoding for HMMs; transliteration
|HW2 out: English pronunciation, Japanese transliteration|
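A compact sketch of Viterbi decoding for an HMM; the weather/activity toy parameters are assumptions, not HW2 data.

```python
# Viterbi decoding for an HMM (a minimal sketch; the weather/activity
# toy parameters below are assumptions, not course data).

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (best_prob, best_path) for the observation sequence."""
    V = [{s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}]
    for o in obs[1:]:
        prev = V[-1]
        cur = {}
        for s in states:
            prob, path = max((prev[t][0] * trans_p[t][s], prev[t][1])
                             for t in states)
            cur[s] = (prob * emit_p[s][o], path + [s])
        V.append(cur)
    return max(V[-1].values())

states = ('Rainy', 'Sunny')
start = {'Rainy': 0.6, 'Sunny': 0.4}
trans = {'Rainy': {'Rainy': 0.7, 'Sunny': 0.3},
         'Sunny': {'Rainy': 0.4, 'Sunny': 0.6}}
emit = {'Rainy': {'walk': 0.1, 'shop': 0.4, 'clean': 0.5},
        'Sunny': {'walk': 0.6, 'shop': 0.3, 'clean': 0.1}}

print(viterbi(('walk', 'shop', 'clean'), states, start, trans, emit))
```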
Discussion of HW2.
More on HMM/Viterbi; sample code.
Intro to HW3 (semi-Markov).
|HW3 out: decoding for Japanese transliteration|
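The "semi-Markov" twist for HW3 is that Viterbi steps over multi-character segments instead of single symbols. Here is a toy segmentation sketch; the segment probabilities are made up, not HW3 data.

```python
import math

# Semi-Markov-style Viterbi for segmentation (illustrative sketch; the
# toy segment probabilities are made up, not HW3 data). Each DP step
# extends the best prefix by a whole multi-character segment.

def segment(s, seg_p, max_len=3):
    n = len(s)
    best = [(-math.inf, None)] * (n + 1)   # (log-prob, backpointer)
    best[0] = (0.0, None)
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            piece = s[i:j]
            if piece in seg_p and best[i][0] > -math.inf:
                score = best[i][0] + math.log(seg_p[piece])
                if score > best[j][0]:
                    best[j] = (score, i)    # remember where segment starts
    assert best[n][0] > -math.inf, 'no segmentation found'
    segs, j = [], n                         # follow backpointers
    while j > 0:
        i = best[j][1]
        segs.append(s[i:j])
        j = i
    return segs[::-1]

seg_p = {'to': 0.3, 'kyo': 0.2, 'tok': 0.1, 'yo': 0.2, 't': 0.05, 'o': 0.05}
print(segment('tokyo', seg_p))   # -> ['to', 'kyo']
```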
|Unit 2: Unsupervised Learning for Sequences: Transliteration and Translation|
Korean vs. Japanese writing systems.|
More on semi-Markov Viterbi.
EM for transliteration.
More on EM: forward-backward and theory||HW4 out: EM for transliteration|
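For reference, a sketch of the forward pass that forward-backward (and hence EM) builds on; the toy parameters are the same assumed weather HMM as the Viterbi sketch above.

```python
# Forward algorithm (the E-step workhorse behind forward-backward);
# an illustrative sketch with assumed toy parameters. alpha[t][s] sums
# the probability of all paths emitting obs[:t+1] and ending in s.

def forward(obs, states, start_p, trans_p, emit_p):
    alpha = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append({s: emit_p[s][o] * sum(prev[t] * trans_p[t][s]
                                            for t in states)
                      for s in states})
    return sum(alpha[-1].values())     # P(obs): total over all state paths

states = ('Rainy', 'Sunny')
start = {'Rainy': 0.6, 'Sunny': 0.4}
trans = {'Rainy': {'Rainy': 0.7, 'Sunny': 0.3},
         'Sunny': {'Rainy': 0.4, 'Sunny': 0.6}}
emit = {'Rainy': {'walk': 0.1, 'shop': 0.4, 'clean': 0.5},
        'Sunny': {'walk': 0.6, 'shop': 0.3, 'clean': 0.1}}

print(forward(('walk', 'shop', 'clean'), states, start, trans, emit))
```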
|10||Nov 4||Machine Translation: IBM Models 1-2|
|11||Nov 11||EM for IBM Model 1|
EM/HMM demo from Jason Eisner
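For concreteness, a compact sketch of the Model 1 EM loop from the last two lectures; the two-sentence toy bitext is made up, and NULL alignment is omitted for brevity.

```python
from collections import defaultdict

# EM for IBM Model 1 (compact illustrative sketch; the two-sentence toy
# bitext is made up, and the NULL word is omitted for brevity).
# t[(f, e)] estimates the translation probability P(f | e).

bitext = [(['das', 'haus'], ['the', 'house']),
          (['das', 'buch'], ['the', 'book'])]

f_vocab = {f for fs, _ in bitext for f in fs}
t = defaultdict(lambda: 1.0 / len(f_vocab))        # uniform initialization

for _ in range(10):                                # a few EM iterations
    count = defaultdict(float)                     # expected counts c(f, e)
    total = defaultdict(float)                     # marginals c(e)
    for fs, es in bitext:                          # E-step
        for f in fs:
            z = sum(t[(f, e)] for e in es)         # normalizer over es
            for e in es:
                count[(f, e)] += t[(f, e)] / z     # posterior alignment mass
                total[e] += t[(f, e)] / z
    for (f, e), c in count.items():                # M-step: renormalize
        t[(f, e)] = c / total[e]

print(round(t[('haus', 'house')], 3))              # approaches 1 as EM sharpens
```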
Pointwise mutual information vs. IBM Model 1 and IBM Model 4
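A small sketch contrasting PMI's counting view with Model 1's probabilistic one; the toy pair counts are made up.

```python
import math
from collections import Counter

# Pointwise mutual information of a word pair (illustrative sketch over
# made-up toy pair counts): PMI(x, y) = log [ P(x, y) / (P(x) P(y)) ].

pairs = [('water', 'wasser'), ('water', 'wasser'),
         ('house', 'haus'), ('water', 'haus')]
joint = Counter(pairs)
px = Counter(x for x, _ in pairs)
py = Counter(y for _, y in pairs)
N = len(pairs)

def pmi(x, y):
    return math.log((joint[(x, y)] / N) / ((px[x] / N) * (py[y] / N)))

print(round(pmi('water', 'wasser'), 3))   # > 0: co-occur more than chance
```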
|Unit 3: Tree Models: Syntax, Parsing, and Semantics|
|13||Nov 25||CFG and CKY||HW5 out: IBM Model 1|
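A minimal CKY recognizer sketch for a grammar in Chomsky normal form; the toy grammar and sentence are assumptions, not HW6 data.

```python
from collections import defaultdict

# CKY recognizer for a CNF grammar (illustrative sketch; the toy grammar
# is an assumption). chart[(i, j)] holds nonterminals deriving words[i:j].

def cky(words, lexical, binary, start='S'):
    n = len(words)
    chart = defaultdict(set)
    for i, w in enumerate(words):                     # fill the diagonal
        chart[(i, i + 1)] |= lexical.get(w, set())
    for span in range(2, n + 1):                      # widen spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):                 # try every split point
                for (B, C), heads in binary.items():
                    if B in chart[(i, k)] and C in chart[(k, j)]:
                        chart[(i, j)] |= heads
    return start in chart[(0, n)]

lexical = {'she': {'NP'}, 'eats': {'V'}, 'fish': {'NP'}}
binary = {('NP', 'VP'): {'S'}, ('V', 'NP'): {'VP'}}   # A -> B C rules
print(cky('she eats fish'.split(), lexical, binary))  # -> True
```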
|14||Dec 2||CKY (cont'd); semantics intro||HW6 out: parsing|
|Compositional semantics (Montague); a bit of distributional semantics (word2vec)|
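word2vec learns dense vectors by prediction; the sketch below shows only the underlying distributional hypothesis (similar contexts imply similar meaning) with raw context counts and cosine similarity. The toy corpus is made up.

```python
import math
from collections import Counter

# Distributional semantics in its simplest form (illustrative sketch,
# not word2vec itself): represent a word by its context counts and
# compare words by cosine similarity. The toy corpus is made up.

corpus = 'the cat drinks milk . the dog drinks water .'.split()

def context_vector(word, window=1):
    vec = Counter()
    for i, w in enumerate(corpus):
        if w == word:
            lo, hi = max(0, i - window), min(len(corpus), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vec[corpus[j]] += 1
    return vec

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u)
    norm = lambda x: math.sqrt(sum(c * c for c in x.values()))
    return dot / (norm(u) * norm(v))

# 'cat' and 'dog' share contexts ('the', 'drinks'), so similarity is high:
print(cosine(context_vector('cat'), context_vector('dog')))   # -> 1.0
```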