Assignment 5: CKY Parser Solution

$35.00 $24.00

In this assignment, you will implement the CKY algorithm for English and apply it to the word recognition and parsing problem. You can use the NLTK modules for representing context-free grammars and parse trees, but you should implement the parser from scratch.   Ingredients. We provide the grammar and the test sentences. The grammar stems…

You’ll get a: . zip file solution

 

 
Categorys:

Description

5/5 – (2 votes)

In this assignment, you will implement the CKY algorithm for English and apply it to the word recognition and parsing problem. You can use the NLTK modules for representing context-free grammars and parse trees, but you should implement the parser from scratch.

 

Ingredients. We provide the grammar and the test sentences. The grammar stems from the

 

Airline Travel Information System (ATIS), a project working on spoken dialog systems for air

 

travel. The ATIS CFG is available in the NLTK data package, together with 98 test sentences.

 

You can initialize the resources this

 

way:

 

# load the grammar

 

grammar = nltk.data.load(“grammars/large_grammars/atis.cfg”)

 

# load the raw sentences

 

s = nltk.data.load(“grammars/large_grammars/atis_sentences.txt”, “raw”)

 

# extract the test sentences

 

t = nltk.parse.util.extract_test_sentences(s)

 

 

 

NLTK already implements a number of parsing algorithms (see nltk.parse for the list). You can try one to see if you loaded the grammar correctly:

 

# initialize the parser

 

parser = nltk.parse.BottomUpChartParser(grammar)

 

# parse all test sentences

 

for sentence in t:

 

parser.chart_parse(sentence[0])

 

However, the NLTK version of the ATIS grammar is not in Chomsky normal form (CNF), which you will need for your CKY parser. Feel free to implement a conversion module. You can then read the grammar from the file using nltk.data.load() and utilize the nice features of the nltk.grammar module on the resulting object.

 

[Note that chart parse() throws an error when it encounters an unknown word, which is undesirable behavior for any parser. If you want to test the grammars with the pre implemented parser (for example, to check whether the CNF version is in fact weakly equivalent), you may need to catch this error]

 

 

 

Problem Statement: ​Parsing

Implement the CKY algorithm for parsing. Also implement a function that extracts the

 

set of all parse trees from the backpointers in the chart. Feel free to use the NLTK module

 

nltk.tree for this purpose; notice that only Immutable Trees can be used as elements of Python sets, whereas raw Trees cannot.

 

Please submit your code, outputs, and a README file containing instructions for running your recognizer and parser. The outputs should consist of at least:

 

  • the list of ATIS test sentences with tab-separated numbers of parse trees

 

  • pictures of the parse trees for an ATIS test sentence of your choice with a number of parses p such that 1 <p< 5.

 

You can visualize an NLTK tree using its draw method.