Get morphy from sentences using LSTM
Example of morphy.
- Find the data here: http://hdl.handle.net/11234/1-5287
- Download ud-treebanks-v2.13.tgz
- Use `tar -xvzf ud-treebanks-v2.13.tgz` to unpack the tgz file
- Move the data so that the directory layout looks like this:
.
├── ...
├── data
│ ├── UD_Abaza-ATD
│ │ ├── abd_atb-ud-test.conllu
│ │ ├── abd_atb-ud-test.txt
│ │ └── ...
│ ├── UD_Afrikaans-AfriBooms
│ │ ├── af_afribooms-ud-dev.conllu
│ │ ├── af_afribooms-ud-dev.txt
│ │ ├── af_afribooms-ud-test.conllu
│ │ ├── af_afribooms-ud-test.txt
│ │ ├── af_afribooms-ud-train.conllu
│ │ ├── af_afribooms-ud-train.txt
│ │ └── ...
│ ├── ...
│ └── ...
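The `.conllu` files in this layout follow the standard CoNLL-U format: one token per line, ten tab-separated columns, `#` comment lines, and a blank line between sentences. As a rough illustration of what the data contains, here is a minimal sketch that extracts the word form, the universal POS tag (column 4), and the morphological features (column 6) for each token. The function names `parse_feats` and `parse_conllu` are hypothetical and are not taken from this repository; the project's own data loader may work differently.

```python
from typing import Dict, Iterator, List, Tuple

# (form, upos, feats) for one token; hypothetical type used only in this sketch
Token = Tuple[str, str, Dict[str, str]]


def parse_feats(feats: str) -> Dict[str, str]:
    """Turn 'Mood=Ind|Tense=Pres' into a dict; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def parse_conllu(text: str) -> Iterator[List[Token]]:
    """Yield one list of tokens per sentence from CoNLL-U text."""
    sentence: List[Token] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment such as '# text = ...'
        cols = line.split("\t")
        # Skip multiword token ranges (e.g. '1-2') and empty nodes (e.g. '1.1').
        if not cols[0].isdigit():
            continue
        sentence.append((cols[1], cols[3], parse_feats(cols[5])))
    if sentence:
        yield sentence


sample = (
    "# text = Dogs bark.\n"
    "1\tDogs\tdog\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\tVBP\tMood=Ind|Tense=Pres\t0\troot\t_\tSpaceAfter=No\n"
    "3\t.\t.\tPUNCT\t.\t_\t2\tpunct\t_\t_\n"
    "\n"
)
sentences = list(parse_conllu(sample))
```

To read a real file, the same function could be applied to the contents of, for example, `data/UD_Afrikaans-AfriBooms/af_afribooms-ud-train.conllu`, opened with `encoding="utf-8"`.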
To run the code you need Python (we use Python 3.9.13) and the packages listed in requirements.txt.
You can run the following command to install all packages in the correct versions:

    pip install -r requirements.txt

The main.py script is the main entry point for this project. It accepts several command-line arguments to control its behavior:
- `--mode` or `-m`: This option allows you to choose a mode between 'train', 'baseline', 'test', and 'infer'.
- `--config_path` or `-c`: This option allows you to specify the path to the configuration file for training. The default is `config/config.yaml`.
- `--path` or `-p`: This option allows you to specify the experiment path for testing, prediction, or generation.
- `--task` or `-t`: This option allows you to specify the task for the model. This will overwrite the task specified in the configuration file for training.
Here's what each mode does:
- `train`: Trains a model using the configuration given by `--config_path` and the task given by `--task`.
- `baseline`: Runs the baseline benchmark test.
- `test`: Tests the model specified by `--path`. You must specify a path.
- `infer`: Runs inference using the model specified by `--path`. It runs inference on some example sentences located in "infer/infer.txt" and writes the results to "infer/configname_infer.txt". If the path is 'baseline', it runs the baseline inference. You must specify a path.
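The options and modes described above could be parsed roughly as follows. This is only a sketch built from the description in this README, not a copy of main.py: the function names `build_parser` and `parse_args` are hypothetical, and making `--mode` mandatory is an assumption, since the README does not state a default mode.

```python
import argparse
from typing import List, Optional

MODES = ["train", "baseline", "test", "infer"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the options listed in the README."""
    parser = argparse.ArgumentParser(description="Morphology prediction with an LSTM")
    # Assumption: the mode is mandatory (the README gives no default).
    parser.add_argument("--mode", "-m", choices=MODES, required=True)
    parser.add_argument("--config_path", "-c", default="config/config.yaml",
                        help="configuration file used for training")
    parser.add_argument("--path", "-p", default=None,
                        help="experiment path for test and infer")
    parser.add_argument("--task", "-t", default=None,
                        help="overrides the task from the configuration file")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse arguments and enforce that test/infer receive a path."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.mode in ("test", "infer") and args.path is None:
        parser.error("--path is required for modes 'test' and 'infer'")
    return args


train_args = parse_args(["--mode", "train", "--task", "get_pos"])
infer_args = parse_args(["-m", "infer", "-p", "baseline"])
```

Checking the `--path` requirement after parsing keeps the rule in one place, because argparse has no built-in way to make one option required only for certain values of another.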
Here's an example of how to use the script to train a model:

    python main.py --mode train --config_path config/config.yaml --task get_pos

This command will train a model using the configuration specified in config/config.yaml with task=get_pos.
Here's an example of how to run a test on the experiment `separete`:

    python main.py --mode test --path logs/separete

Here is an example of how to run inference using the baseline model:

    python main.py --mode infer --path baseline

Input shape:
| model | crossentropy | accuracy micro | accuracy macro |
|---|---|---|---|
| GET_POS | 0.204 | 0.944 | 0.816 |
Table 1: Test results for pos prediction
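The README does not define the two accuracy columns. Assuming the usual definitions, micro accuracy is the fraction of all tokens that are tagged correctly, while macro accuracy first computes the accuracy for each gold class and then averages over classes, so rare tags weigh as much as frequent ones. The sketch below illustrates the difference on a toy example; the function names are hypothetical and the project's own metric code may differ.

```python
from collections import defaultdict
from typing import Dict, Sequence


def micro_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of tokens whose predicted tag equals the gold tag."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Mean of the per-class accuracies, one class per gold tag."""
    total: Dict[str, int] = defaultdict(int)
    correct: Dict[str, int] = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy data: a frequent tag predicted perfectly, a rare tag predicted half right.
y_true = ["NOUN"] * 8 + ["INTJ"] * 2
y_pred = ["NOUN"] * 8 + ["NOUN", "INTJ"]

micro = micro_accuracy(y_true, y_pred)  # 9 of 10 tokens correct
macro = macro_accuracy(y_true, y_pred)  # mean of 8/8 and 1/2
```

Under these assumed definitions, a macro score below the micro score, as in Table 1, would be consistent with the model making more errors on infrequent tags.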
| model | crossentropy | accuracy micro | all good |
|---|---|---|---|
| BASELINE | - | 0.980 | 0.791 |
| SUPERTAG | 1.700 | 0.436 | 0.002 |
| SEPARATE | 1.70 | 0.893 | 0.046 |
| FUSION | 1.698 | 0.884 | 0.154 |
Table 2: Test results for morphy prediction




