How to train a model for statistical play-in

From WeizmannWiki

For NL play-in, you can improve the performance of the statistical parser by adding training examples. A training example is a file with an .st suffix that contains the parse trees of previously defined LSC examples.
For existing LSC files created using NL play-in, you can generate the corresponding .st file by right-clicking the relevant .lsc file and choosing 'Export Syntax Tree'.

To train the statistical model, follow these steps:

  1. Click the 'Train' toolbar button:

    TrainAction.png

  2. Click the 'Browse' button to select the full path and choose a name for the statistical grammar file you are about to create:

    TrainDialogBrowse.png

  3. Click the 'Add' button to select .st files for the model to be trained on. You can add (and remove) as many .st files as you wish:

    TrainDialogAdd.png

    Note that .st files for training can be generated from LSCs created using NL play-in by right-clicking an .lsc file and choosing 'Export Syntax Tree'. Examples of LSCs created using NL play-in can be found here.

  4. When you train a new model on user-defined examples, you can interpolate the parameter weights learned from your examples with the weights previously learned from our set of 10,000 randomly generated examples. Use the text boxes to set the weight given to the parameters learned from the user examples and the weight given to those previously learned by the system from random examples. By default, 99% of the parameter weight is assigned to the parameters learned from the user-specified examples.

    TrainDialogWeights.png

  5. Click 'OK'.
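
The weighting in step 4 can be illustrated with a small sketch. This page does not say how the tool combines the two sets of parameters, so the code below assumes plain linear interpolation with two weights that sum to 1. The function name `interpolate_weights`, the dictionary layout, and the example rule names are hypothetical and are not part of the tool.

```python
def interpolate_weights(user_params, prior_params, user_weight=0.99):
    """Linearly interpolate two parameter tables (assumed scheme, not the tool's code).

    user_params  -- parameters learned from the user's .st examples
    prior_params -- parameters previously learned from the random examples
    user_weight  -- share given to the user parameters (0.99 mirrors the 99% default)
    """
    if not 0.0 <= user_weight <= 1.0:
        raise ValueError("user_weight must be between 0 and 1")
    prior_weight = 1.0 - user_weight

    # A parameter missing from one table contributes 0 from that table.
    keys = set(user_params) | set(prior_params)
    return {
        key: user_weight * user_params.get(key, 0.0)
        + prior_weight * prior_params.get(key, 0.0)
        for key in keys
    }


# Hypothetical rule names, for illustration only.
user = {"rule_a": 1.0}
prior = {"rule_a": 0.0, "rule_b": 1.0}
combined = interpolate_weights(user, prior)
```

Under this assumption, a high user weight makes the model follow the user's examples closely, while the small remaining share keeps parameters that appear only in the random examples from dropping to zero.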