C++ script for converting csv data into the format needed for Field-Aware Factorization machines.

pklauke/LibFFMGenerator

LibffmGenerator

Script for converting tabular CSV data into the data format needed for Field-Aware Factorization Machines. Implementations of Field-Aware Factorization Machines are provided by libraries such as LibFFM and xLearn.

This script supports up to three files (train / validation / test). In each file the label is expected to be in the first column, followed by the numerical columns, with the categorical columns last. The number of numeric columns must be specified with the parameter --numeric.
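
To make the expected column layout concrete, here is a minimal sketch of how one row (label, numeric columns, categorical columns) could be mapped to the `label field:feature:value` text format that LibFFM reads. The helper names `split_csv` and `to_libffm` and the way feature indices are assigned are assumptions made for illustration; they are not taken from LibffmGenerator.cpp, which may number fields and features differently.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Split one CSV line on commas. Quoted fields and escaped commas are not handled.
std::vector<std::string> split_csv(const std::string& line) {
    std::vector<std::string> cells;
    std::stringstream ss(line);
    std::string cell;
    while (std::getline(ss, cell, ',')) cells.push_back(cell);
    return cells;
}

// Hypothetical helper: convert one row laid out as
//   label, numeric_1 .. numeric_n, categorical_1 .. categorical_m
// into "label field:feature:value ..." text.
// feature_ids is shared across rows so that the same category value
// always maps to the same feature index.
std::string to_libffm(const std::vector<std::string>& cells,
                      std::size_t n_numeric,
                      std::unordered_map<std::string, std::size_t>& feature_ids) {
    std::string out = cells.at(0);  // the label is the first column
    for (std::size_t col = 1; col < cells.size(); ++col) {
        const std::size_t field = col - 1;       // one field per input column
        const bool numeric = field < n_numeric;  // numeric columns come first
        // Assumption: a numeric column is a single feature, while a categorical
        // column gets one feature per distinct (column, value) pair.
        const std::string key = numeric
            ? std::to_string(field)
            : std::to_string(field) + "=" + cells[col];
        const auto it = feature_ids.emplace(key, feature_ids.size()).first;
        out += " " + std::to_string(field) + ":" + std::to_string(it->second) +
               ":" + (numeric ? cells[col] : std::string("1"));
    }
    return out;
}
```

With `n_numeric` set to 2, the row `1,0.5,3,red` has the label `1`, the numeric values `0.5` and `3`, and the categorical value `red`. Numeric values are written through unchanged, and a categorical value is encoded as an indicator feature with value 1.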

The first file given is expected to be the training set. Validation and test sets can be specified with the arguments --valid and --test. To speed up memory allocation, the number of samples per set can be specified with the parameters --n_train, --n_valid and --n_test.

An example would look like:
LibffmGenerator.out train.csv --valid valid.csv --test test.csv --n_train 70000 --n_valid 10000 --n_test 20000

Installation

For installation just clone the repository:
git clone https://github.com/pklauke/LibffmGenerator

If the binary file isn't executable on your system, recompile it, e.g.:
g++ -o LibffmGenerator.out LibffmGenerator.cpp

On some operating systems (e.g. Ubuntu) errors may occur during compilation. If this happens, try the following instead:
g++ -o LibffmGenerator.out LibffmGenerator.cpp -std=c++0x
