svm                  package:e1071                  R Documentation

_S_u_p_p_o_r_t _V_e_c_t_o_r _M_a_c_h_i_n_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     `svm' is used to train a support vector machine. It can be used
     to carry out general classification (of C- and nu-type) and
     regression (of eps- and nu-type), as well as density estimation
     (one-classification). A formula interface is provided.

_U_s_a_g_e:

     svm(formula, data = NULL, ...)
     svm(x, y = NULL, type = NULL, kernel = "radial", degree = 3,
         gamma = 1/dim(x)[2], coef0 = 0, cost = 1, nu = 0.5,
         class.weights = NULL, cachesize = 40, tolerance = 0.001,
         epsilon = 0.5, shrinking = TRUE, cross = 0, ...)

_A_r_g_u_m_e_n_t_s:

 formula: a symbolic description of the model to be fit. Note that an
          intercept is always included, whether given in the formula or
          not.

    data: an optional data frame containing the variables in the model.
          By default, the variables are taken from the environment from
          which `svm' is called.

       x: a data matrix or a vector.

       y: a response vector with one label for each row/component of
          `x'. Can be either a factor (for classification tasks) or a
          numeric vector (for regression).

    type: `svm' can be used as a classification machine, as a
          regression machine, or as a density estimator. Depending on
          whether `y' is a factor or not, the default setting for
          `type' is `C-classification' or `eps-regression',
          respectively, but it may be overridden by setting an explicit
          value (see the illustrative call after the list below).
          Valid options are:

             *  `C-classification'

             *  `nu-classification'

             *  `one-classification' (for density estimation)

             *  `eps-regression'

             *  `nu-regression'
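
          For instance, the default may be overridden by an explicit
          choice (an illustrative call, assuming a data matrix `x' and
          a factor response `y' as in the examples below):

             ## illustrative: request nu-classification instead of the
             ## C-classification default for a factor response
             model <- svm(x, y, type = "nu-classification", nu = 0.2)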

  kernel: the kernel used in training and predicting. You might
          consider changing some of the following parameters, depending
          on the kernel type.

          linear: u'*v

          polynomial: (gamma*u'*v + coef0)^degree

          radial basis: exp(-gamma*|u-v|^2)

          sigmoid: tanh(gamma*u'*v + coef0)

  degree: parameter needed for kernel of type `polynomial' (default: 3)

   gamma: parameter needed for all kernels except `linear' (default:
          1/(data dimension))

   coef0: parameter needed for kernels of type `polynomial' and
          `sigmoid' (default: 0)

    cost: cost of constraint violation (default: 1)

      nu: parameter needed for `nu-classification', `nu-regression',
          and `one-classification'

class.weights: a named vector of weights for the different classes,
          used for asymmetric class sizes. Not all factor levels have
          to be supplied (default weight: 1). All components have to be
          named.
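
          A sketch, assuming the `iris' factor levels used in the
          examples below:

             ## illustrative: errors on `setosa' cost ten times as much
             wts   <- c(setosa = 10, versicolor = 1, virginica = 1)
             model <- svm(Species ~ ., data = iris, class.weights = wts)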

cachesize: cache memory in MB (default: 40)

tolerance: tolerance of termination criterion (default: 0.001)

 epsilon: epsilon in the insensitive-loss function (default: 0.5)

shrinking: option whether to use the shrinking-heuristics (default:
          TRUE)

   cross: if an integer value k>0 is specified, a k-fold cross
          validation on the training data is performed to assess the
          quality of the model: the accuracy rate for classification
          and the Mean Squared Error for regression
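
          For example (an illustrative call; the cross-validation
          results appear in the output of `summary'):

             ## illustrative: 10-fold cross validation on iris
             model <- svm(Species ~ ., data = iris, cross = 10)
             summary(model)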

     ...: additional parameters for the low level fitting function
          `svm.default'.

_V_a_l_u_e:

     An object of class `"svm"' containing the fitted model, in
     particular:

      sv: the resulting support vectors

   index: the index of the resulting support vectors in the data matrix

   coefs: the corresponding coefficients

     (Use `summary' and `print' to get some output).
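
     These components can be accessed directly, e.g. (illustrative,
     using a model fitted as in the examples below):

          model <- svm(Species ~ ., data = iris)
          model$index   # positions of the support vectors in the data
          model$coefs   # the corresponding coefficients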

_A_u_t_h_o_r(_s):

     David Meyer (based on C/C++-code by Chih-Chung Chang and Chih-Jen
     Lin)
     david.meyer@ci.tuwien.ac.at

_R_e_f_e_r_e_n_c_e_s:

        *  Chang, Chih-Chung and Lin, Chih-Jen:
           LIBSVM 2.0: Solving Different Support Vector Formulations.
           <URL:
           http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm2.ps.gz>

        *  Chang, Chih-Chung and Lin, Chih-Jen:
           Libsvm: Introduction and Benchmarks
           <URL: http://www.csie.ntu.edu.tw/~cjlin/papers/q2.ps.gz>

_S_e_e _A_l_s_o:

     `predict.svm'

_E_x_a_m_p_l_e_s:

     data(iris)
     attach(iris)

     ## classification mode
     # default with factor response:
     model <- svm (Species~., data=iris)

     # alternatively the traditional interface:
     x <- subset (iris, select = -Species)
     y <- Species
     model <- svm (x, y) 

     print (model)
     summary (model)

     # test with train data
     pred <- predict (model, x)

     # Check accuracy:
     table (pred,y)

     ## try regression mode on two dimensions

     # create data
     x <- seq (0.1,5,by=0.05)
     y <- log(x) + rnorm (length(x), sd=0.2)

     # estimate model and predict input values
     m   <- svm (x,y)
     new <- predict (m,x)

     # visualize
     plot   (x,y)
     points (x, log(x), col=2)
     points (x, new, col=4)

     ## density-estimation

     # create 2-dim. normal with rho=0:
     X <- data.frame (a=rnorm (1000), b=rnorm (1000))
     attach (X)

     # traditional way:
     m <- svm (X)

     # formula interface:
     m <- svm (~a+b)
     # or:
     m <- svm (~., data=X)

     # test:
     predict (m, t(c(0,0)))
     predict (m, t(c(4,4)))

     # visualization:
     plot (X)
     points (X[m$index,], col=2)

