User's guide

This section describes the main features of the three interfaces provided with the library: the web interface, the command-line tools, and the Matlab interface. The complete list of command-line parameters can be obtained by calling the tools without arguments, e.g., trainmsvm. See section 4 for more information on technical features that may help when dealing with large data sets.

Quick start with the web interface

To start the MSVMpack Server, simply issue the command

msvmserver start
The web interface can now be accessed locally with a browser by entering the following address (see section 2.5 for other options):
http://127.0.0.1:8080/
At the first connection, you will be asked to set up the server from the admin page, to which you can log in with the admin user and default password (admin). You can safely click ``Save settings and start using MSVMpack server" on this page for now (these settings can be changed later) and go to the HOME page, from where you can train and test the M-SVM models.

In the ``Train a model" section, choose a data file from the list, e.g., iris.train (the number of instances and features should then appear to the right of the list). Choose one of the four M-SVM algorithms, a kernel function, e.g., the Gaussian RBF, and click the ``Start training" button at the bottom. The complete command line issued to produce the result is printed and you can click on ``WATCH" to see the output of the algorithm.

When training is complete, you can see the resulting model, e.g., msvm.model (right-click on the msvm.model file and choose ``Save link as..." to download it), and go back to the HOME page.

To test the new model, scroll down to the ``TEST" section of the HOME page and choose a data file, e.g., iris.test, before clicking the ``Test" button. The recognition rate with some additional statistics will be printed on the next page and you can look at the output of the classifier for each data point in the ``output file".

Quick start with the command-line tools

To train an M-SVM with default parameters, simply use the following command (assuming doc/ is the current directory, in which myTrainingData can be found):

trainmsvm myTrainingData
This command trains an M-SVM$ ^2$ model as described in [5] (see the next section and Table 1 for alternative M-SVM models). At the end of training, the resulting M-SVM is stored by default in the file msvm.model. Note that the program assumes that the labels are given in the last column of the file myTrainingData and that they take their values in the set $ \llbracket1,Q\rrbracket$, where $ Q$ is the number of classes (see Sect. 3.9 for a full description of the file format).
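As a concrete illustration of this label convention, the following Python sketch parses a few lines in the whitespace-separated layout described above (the values and the absence of a header are illustrative only; the exact file format is given in Sect. 3.9):

```python
# Illustrative parsing of a data file for trainmsvm: one example per line,
# features separated by whitespace, and the class label (an integer in
# {1, ..., Q}) in the last column.  The exact format is in Sect. 3.9.

def parse_line(line):
    """Split one data line into a feature vector and its integer label."""
    fields = line.split()
    return [float(v) for v in fields[:-1]], int(fields[-1])

sample = """\
5.1 3.5 1.4 0.2 1
7.0 3.2 4.7 1.4 2
6.3 3.3 6.0 2.5 3"""

data = [parse_line(l) for l in sample.splitlines()]
Q = max(label for _, label in data)   # number of classes
```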

The following example shows how to set the parameters otherwise than to the default values:

trainmsvm myTrainingData myMSVM.model -m WW -c 3.0 -k 2 -p 2.5
When a .model file is specified on the command line, it is used to store the resulting M-SVM model. In this example, the model of Weston and Watkins (WW) [11] is considered, the soft-margin parameter $ C$ is set to $ 3$ and the kernel type is set to 2 (Gaussian RBF) with a kernel parameter value of $ 2.5$ (corresponding here to the standard deviation). Use trainmsvm without arguments to see the full list of parameters and their default values.


Once training is done, you can test the classifier on another data set by using predmsvm as:

predmsvm myTestData myMSVM.model pred.outputs
where myTestData is the file containing the test data and myMSVM.model is the file of the trained M-SVM (assumed to be msvm.model if omitted). On output, the file pred.outputs will contain, for each test point, the computed output values for the $ Q$ classes and the predicted label. If the data file myTestData additionally contains the labels of the test examples in the last column, the test error and the confusion matrix are automatically computed.
Note: the training and test files must be specified as the first argument to trainmsvm and predmsvm, respectively.
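To give an idea of how the contents of pred.outputs can be post-processed, here is a Python sketch that assumes one line per test point with the $ Q$ output values followed by the predicted label; this layout is an assumption for illustration (see Sect. 3.9 for the actual file format) and the numbers below are made up:

```python
# Read a pred.outputs-style file: for each test point, Q output values
# followed by the predicted label.  This layout is an assumption for
# illustration; see Sect. 3.9 for the actual file format.

def read_outputs(text, Q):
    """Return a list of (class outputs, predicted label) pairs."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        outputs = [float(v) for v in fields[:Q]]
        rows.append((outputs, int(fields[Q])))
    return rows

sample = """\
0.9 -0.2 -0.7 1
-0.5 1.1 -0.6 2"""

rows = read_outputs(sample, 3)
```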


trainmsvm usage

The default behavior of trainmsvm is to keep training until a predefined level of accuracy is reached in terms of the ratio

$\displaystyle R = \frac{\mbox{value of the dual objective function}}{\mbox{upper bound on the optimum}},$ (1)

thus defining the stopping criterion as $ R \geqslant 1- \varepsilon $ (see [7] for details on the computation of the upper bound). The value of the accuracy level $ 1-\varepsilon $ can be set through the option -a, e.g., for an accuracy level of $ 95\%$:
trainmsvm myData myMSVM.model -a 0.95
In practice, the ratio is only computed every 1000 iterations, on the basis of the best upper bound, i.e., the smallest value of the upper bound obtained up to that point. Each of these values corresponds to the estimate of the primal objective function computed at the current iteration. In addition to these periodic evaluations, the user can force an evaluation on demand with the key shortcut ctrl-C. This extra evaluation also prints some additional information and asks the user whether to stop training or continue.
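The stopping test based on the ratio (1) can be sketched as follows in Python (the function and variable names are illustrative; in MSVMpack the test is implemented in C and performed every 1000 iterations):

```python
# Stopping criterion of trainmsvm: stop once R >= 1 - epsilon, where
# R = (dual objective value) / (best upper bound on the optimum).
# The best upper bound is the smallest primal estimate seen so far.

def should_stop(dual_value, primal_estimates, epsilon):
    """Return True when the accuracy level 1 - epsilon is reached."""
    best_upper_bound = min(primal_estimates)
    R = dual_value / best_upper_bound
    return R >= 1.0 - epsilon

# With -a 0.95 (epsilon = 0.05), a dual value of 96 against a best
# upper bound of 100 gives R = 0.96 and satisfies the criterion.
stop = should_stop(96.0, [120.0, 100.0], 0.05)
```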

The following describes some additional features of the trainmsvm tool.

Infinite training.

Setting the accuracy level to zero with -a 0 disables the evaluation of the model during training and the computation of the ratio (1), which leads to slightly faster training. However, in this case, training must be explicitly stopped by the user through the key shortcut ctrl-C.

M-SVM model type.

The four M-SVMs found in the literature are implemented in MSVMpack. The -m flag allows one to choose the type of M-SVM model according to Table 1.


Table 1: Model types of the four M-SVMs.
Option flag          M-SVM model type        Reference
-m WW                Weston and Watkins      [11]
-m CS                Crammer and Singer      [2]
-m LLW               Lee, Lin, and Wahba     [8]
-m MSVM2 (default)   Guermeur and Monfrini   [5]

Kernel function.

The -k flag is used to choose the kernel function according to Table 2, while -p value is used to set the value of the kernel parameter.


Table 2: Kernel functions.
Option flag        Kernel function                                                            Default parameter
-k 1 (default)     Linear: $ k(x,z) = \langle x, z \rangle $                                  (none)
-k 2               Gaussian RBF: $ k(x,z) = \exp(\frac{-\Vert x - z\Vert _2^2}{2\sigma^2})$   $ \sigma = \sqrt{5\times dim(x)}$
-k 3               Homogeneous polynomial: $ k(x,z) = (\langle x, z \rangle )^d$              $ d=2$
-k 4               Non-homogeneous polynomial: $ k(x,z) = (1+\langle x, z \rangle )^d$        $ d=2$
-k 5, -k 6, -k 7   Custom kernels                                                             0

In addition to the standard kernel functions, MSVMpack allows one to easily add up to three custom kernels as detailed in section 3.3. These kernels can be used by invoking trainmsvm with -k 5 for custom kernel 1, -k 6 for custom kernel 2 or -k 7 for custom kernel 3.
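The standard kernel functions of Table 2 can be written out explicitly. The following Python sketch mirrors their definitions for plain vectors (MSVMpack's actual implementations are in C; the default $ \sigma$ follows the table):

```python
from math import exp, sqrt

# The standard kernel functions of Table 2 (illustrative Python versions).

def dot(x, z):
    return sum(a * b for a, b in zip(x, z))

def linear(x, z):                      # -k 1
    return dot(x, z)

def gaussian_rbf(x, z, sigma=None):    # -k 2; default sigma = sqrt(5 * dim(x))
    if sigma is None:
        sigma = sqrt(5.0 * len(x))
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return exp(-sq_dist / (2.0 * sigma ** 2))

def poly_homogeneous(x, z, d=2):       # -k 3
    return dot(x, z) ** d

def poly_non_homogeneous(x, z, d=2):   # -k 4
    return (1.0 + dot(x, z)) ** d
```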

Multiple kernel parameters.

For kernel functions that take multiple kernel parameters, their values are set by -P #parameters value1 value2 value3... (with a capital P). Alternatively, -P filename or -P #parameters filename can be used to read the parameter values from the file filename, which contains either the number of parameters followed by their values or only the parameter values, respectively. Note that filename cannot start with a number (e.g., 01mykernel_par is wrong, but mykernel_par01 or mykernel_par.01 are correct). See section 3.9 for examples.

Class-dependent values of $ C$ for unbalanced data sets.

For unbalanced data sets, using a different value of the hyperparameter $ C$ for each class can be helpful. The default behavior of the -c option is to assign the same value of $ C$ to every class. However, with the option -C (with a capital C), the user can set the values of $ C_k$, for all $ k\in \llbracket 1,Q\rrbracket$, as for instance in
trainmsvm myTrainingData myMSVM.model -C 1 2.2 0.3
where $ C_1 = 1$, $ C_2 = 2.2$ and $ C_3 = 0.3$.
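MSVMpack leaves the choice of the $ C_k$ to the user. One common heuristic (not part of MSVMpack) is to scale $ C$ inversely with the class frequencies, as in the following Python sketch; the resulting values would then be passed to -C by hand:

```python
# Illustrative heuristic (not built into MSVMpack): choose C_k inversely
# proportional to the class frequencies, C_k = C * n / (Q * n_k), so that
# small classes get a larger penalty on margin violations.

def per_class_C(labels, base_C, Q):
    """Return [C_1, ..., C_Q] for labels taking values in {1, ..., Q}."""
    n = len(labels)
    counts = [labels.count(k) for k in range(1, Q + 1)]
    return [base_C * n / (Q * n_k) for n_k in counts]

# A 3-class set with 60/30/10 examples and base_C = 1.0:
C_values = per_class_C([1] * 60 + [2] * 30 + [3] * 10, 1.0, 3)
```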

Cross validation.

MSVMpack implements $ k$-fold cross validation by first computing a random permutation of the data instances and then dividing the data set into $ k$ subsets of equal size (one of the subsets can be slightly larger when the number of examples is not a multiple of $ k$). This random permutation makes it easier to obtain subsets that contain examples of all categories in the typical case where the data set is sorted with respect to the class labels. To compute the $ k$-fold estimate of the error rate, add the option -cv k to the trainmsvm command line, e.g.,
trainmsvm myTrainingData -m CS -c 7.0 -cv 5
for a $ 5$-fold cross validation, which results in the following output:
================== Cross validation summary =======================
 Fold    Training error rate     Test error rate
  1      6.62 %                  7.00 %
  2      5.88 %                  8.50 %
  3      6.25 %                  3.50 %
  4      7.00 %                  6.50 %
  5      6.12 %                  7.50 %

====== 5-fold cross validation error rate estimate = 6.60 % =======
During the procedure, the models trained on the data subsets and their outputs are stored in the files cv.fold-XX.model and cv.fold-XX.outputs, where XX stands for the index of the corresponding data subset.
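The splitting scheme described above (random permutation, then $ k$ subsets of equal size, with one possibly slightly larger) can be sketched as follows in Python:

```python
import random

# k-fold splitting as described above: permute the indices, then cut them
# into k subsets of equal size; the remainder goes to one (larger) fold.

def kfold_indices(n, k, seed=0):
    """Return a list of k index subsets covering range(n)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)      # random permutation of the data
    size = n // k
    folds = [idx[i * size:(i + 1) * size] for i in range(k)]
    folds[-1].extend(idx[k * size:])      # leftover examples (if n % k != 0)
    return folds

folds = kfold_indices(102, 5)             # 5-fold split of 102 examples
```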

Optimization method.

The optimization method used to train an M-SVM can be chosen with the -o flag. The default optimization method (-o 0) in MSVMpack is the Frank-Wolfe method [4]. Rosen's gradient projection method [10] is also available through -o 1 for the M-SVM$ ^2$ model type.

Data normalization.

Data normalization in MSVMpack is embedded in the model, i.e., the data files need not be changed and the data can retain their original (and usually more meaningful) scale. To normalize a data set before training, use the -n flag on the command line. Note that, even if this flag is missing, MSVMpack may recommend normalization in cases where large differences are detected between the scales of different features. The -u flag can be used to bypass this test and force MSVMpack to use unnormalized data.
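The idea of embedding normalization in the model can be sketched as follows; the structure below is purely illustrative (MSVMpack stores its normalization data in its own model format):

```python
# Illustrative sketch of normalization embedded in a model: the scaling
# statistics are computed once from the training data and stored with the
# model, so data files keep their original scale and test points are
# normalized on the fly at prediction time.

def fit_normalization(X):
    """Per-feature mean and standard deviation of a list of vectors."""
    n, d = len(X), len(X[0])
    mean = [sum(x[j] for x in X) / n for j in range(d)]
    std = [max((sum((x[j] - mean[j]) ** 2 for x in X) / n) ** 0.5, 1e-12)
           for j in range(d)]
    return {"mean": mean, "std": std}

def apply_normalization(norm, x):
    """Normalize one data point with the stored statistics."""
    return [(v - m) / s for v, m, s in zip(x, norm["mean"], norm["std"])]

norm = fit_normalization([[0.0, 100.0], [2.0, 300.0]])
z = apply_normalization(norm, [1.0, 200.0])
```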

Data format.

MSVMpack can handle data in different formats and use a different kernel function for each format. This may be used for instance to increase the speed of kernel computations on single precision floats or integers. See section 4.4 for the details.

Automatic testing.

Computing the test error of an M-SVM at the end of training can be done by simply appending a test data file to the command line, i.e.,
trainmsvm myTrainingData myMSVM.model myData.test
In this case, the corresponding filename (here, myData.test) must include the extension .test. If, in addition, another filename with the .outputs extension is specified, then the outputs of the M-SVM computed on the test set are stored in this file.

Initialization and saving options.

By default, training starts with all $ \alpha_{ik} = 0$ (except for the M-SVM of [2], in which case $ \alpha_{iy_i} = C$), but an initialization file (with the .init extension) conforming to the matrix file format (see Sect. 3.9) for the $ \alpha_{ik}$ can also be specified on the command line. In addition, the values of the $ \alpha_{ik}$ at the end of training can be extracted from the model and saved in a dedicated file (in matrix format) by specifying a filename with the .alpha extension on the command line. In this case, the command line could look like:
trainmsvm myTrainingData myMSVM.model myAlpha.init myAlpha.alpha \ 
          myTestData.test myPrediction.outputs -c 10.0 -k 2
The order in which the optional files are specified is free (but the extensions are not). However, the training data must always be the first argument.
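The default initialization described above amounts to the following Python sketch (plain lists are used for illustration; actual .init files use the matrix format of Sect. 3.9):

```python
# Default initialization of the dual variables: alpha_ik = 0 for all i, k,
# except for the Crammer and Singer model (CS), where alpha_{i, y_i} = C.

def init_alpha(labels, Q, C, model_type):
    """Return the n x Q matrix of initial alpha values."""
    alpha = [[0.0] * Q for _ in labels]
    if model_type == "CS":
        for i, y in enumerate(labels):
            alpha[i][y - 1] = C        # labels take values in {1, ..., Q}
    return alpha

alpha_default = init_alpha([1, 3, 2], 3, 2.0, "WW")   # all zeros
alpha_cs = init_alpha([1, 3, 2], 3, 2.0, "CS")
```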

How .com files can help.

Another way to train an M-SVM with full control over the parameters is to use a command file with a .com extension (this extension is mandatory, but it can be further suffixed, as in first.com.example). A command file uses the following format:

3                          --> Number of classes Q
2                          --> Nature of kernel function
10.0 10.0 10.0             --> Values of C (one for each class)
4                          --> Chunk (or working set) size
Data/iris.app              --> Training data file
Alpha/iris.init            --> File of initial values of alpha
Alpha/iris.alpha           --> File for saved alpha
Assuming myComFile.com is such a file, the corresponding M-SVM is trained by running:
trainmsvm myComFile.com myMSVM.model
A few example .com files are provided in the Comfiles/ directory.
Note: all parameter values from a .com file can be overridden by a command-line option, e.g.,
trainmsvm myComFile.com myMSVM.model -c 10.0
will always use $ C=10$, irrespective of the contents of myComFile.com.

Retraining.

To retrain an existing model, i.e., resume training from where it was stopped, use the -r option. This can also be used to increase the accuracy level of a model in terms of the ratio (1). For instance, if myMSVM.model was trained with -a 0.95,
trainmsvm -r -a 0.99 myMSVM.model
further trains the model until it reaches an accuracy level of $ 99\%$ (there is no need to respecify the training data or the parameter values). Note however that the upper bound in (1) is not saved in the model, so resumed training restarts without a good upper bound on the optimum. Thus, lower values of the ratio $ R$ can be observed at the beginning of resumed training without implying a loss of training time: in such cases, the accuracy level is simply underestimated until a good upper bound is found.

Model sparse format.

To save the model in sparse format (see Sect. 3.9), append the -s option to the command line. To convert an existing model to the sparse format, use
trainmsvm -S myMSVM.model
Note that sparse models cannot be retrained, since some of the training data are lost.

Log files.

Information on the training process is logged to the .log file specified on the command line (if any). This information can then be plotted through a call to plotlog (or plotlog_msvm2 for an M-SVM$ ^2$ model), e.g.:
trainmsvm myData -m WW myTraining.log
plotlog myTraining.log myPlot.ps
In this example, optimization information is logged to myTraining.log, while an M-SVM model of WW type is trained with default parameters on myData. The recorded information (see Sect. 3.9 for details) is then plotted on screen and the plot is saved to myPlot.ps (if the .ps file is omitted, myTraining.log.ps is used). Note that plotlog requires gnuplot for plotting.

Model saved periodically.

In case the program stops unexpectedly (e.g., due to a power failure), the model is periodically saved in the temporary file msvm.model.tmp (where msvm.model is the chosen filename for the model). If training stops normally, the trained model is saved in, e.g., msvm.model and the temporary file is removed. The period is set to 20 minutes (via the constant MONITOR_PERIOD in libtrainMSVM.h).

predmsvm usage

The additional features of predmsvm are the following.


msvmserver usage

The MSVMpack Server can be started with the command line

msvmserver start
optionally followed by a port number (the default port is 8080, since port 80 requires root privileges). The following options are used to control the server launched from the current directory.

Warning:

If started, the MSVMpack Server provides unauthenticated HTTP access to the computer, including the possibility for users to upload any file to the Data/ subdirectory and to process this file with MSVMpack, which was not written with security in mind. This means that running the server on an open network is probably a bad idea.

Note that security issues remain the responsibility of the person running this software.

Setup of the server.

Several options can be set via the ADMIN page (while the server is running) which can be accessed from the HOME page with the admin login (default password is admin).

The data and model file paths are relative to the MSVMpack1.5/webpages/ directory. Their default values place all data files in MSVMpack1.5/webpages/Data/ and all the models trained via the web interface in MSVMpack1.5/webpages/Models/.

From this page, you can also set the default amount of memory for the kernel cache and the maximum number of processors (cores) that MSVMpack can use, or change the admin password.


Web interface

The web interface offers a more intuitive way to use MSVMpack and set the training parameters without having to type any command line in a terminal. It is accessible from any platform (including Windows) with a standard (JavaScript-capable) browser. The only requirement is network access to a Linux computer with a running instance of the MSVMpack Server (see section 2.5 above for server-side instructions). The HOME page is divided into three sections: upload of data files, training, and test.

Data files.

In order to train or test an M-SVM model on some data, you must first upload the corresponding data file to the server. This action can be performed from the HOME page, while data files can be deleted or downloaded from the ADMIN page.

Training.

The HOME page allows one to choose an M-SVM algorithm and set all the parameters for its training before clicking the ``Start training" button.

Model files and test.

All models created with the web interface are saved in a dedicated directory (defined in the ADMIN page). All models found in this directory (and only these) can be used to make predictions from the web interface. In particular, the bottom of the HOME page allows one to choose a trained model and a test set on which to predict labels.

Using MSVMpack from Matlab

MSVMpack can be used from Matlab since version 1.3. However, this interface is only implemented as a set of wrapper functions which call the command-line tools trainmsvm and predmsvm. Therefore, it cannot be used as a platform-independent alternative to MSVMpack.

To start using MSVMpack from Matlab, you must add the matlab subdirectory to Matlab's PATH:



addpath('..path_of_installation../MSVMpack1.5/matlab/');



Then, you can try the test program example.m, which runs a simple example and shows the basic usage of the commands. The Matlab interface offers the following five commands (only the basic usage is shown here; Matlab's help function provides more details).

All these functions implicitly work with data and model files created on the fly to interact with the MSVMpack command-line tools. These files are created in the current directory and also ensure that no data or models are lost between Matlab sessions.

Models and data sets are stored as Matlab structures with self-explanatory fields. For instance, a model is represented as:

model = 

                 name: 'mymsvm'
              version: 1.4000
                 type: 'WW'
             longtype: 'Weston and Watkins (WW)'
                    Q: 3
          kernel_type: 1
      kernel_longtype: 'Linear'
        nb_kernel_par: 0
           kernel_par: []
    training_set_name: 'mymsvm.train'
       training_error: 0.2167
              nb_data: 300
            dim_input: 2
                    C: [3x1 double]
        normalization: []
                    b: [3x1 double]
                alpha: [300x3 double]
                    X: [300x2 double]
                    Y: [300x1 double]
              SVindex: [166x1 double]
and the value of $ \alpha_{ik}$ can be accessed as model.alpha(i,k).

Custom kernels.

The Matlab interface does not provide the possibility to add custom kernels implemented as Matlab code. A custom kernel must be implemented in C as described in section 3.3 and used by passing the corresponding '-k' option to trainmsvm.

lauer 2014-07-03