perClass Documentation
version 5.4 (7-Dec-2018)
 SDSTACKGEN Stacked generalization set of classifier outputs


   ALG    Untrained pipeline
   DATA   Data set

   FOLDS  Number of cross-validation folds (opt, default: 10)
   SEED   Random seed for cross-validation initialization

   OUT    Data set with unbiased soft outputs
   PBASE  Pipeline of all base classifiers fused by a mean combiner

 SDSTACKGEN performs stacked generalization for a given untrained pipeline
 or algorithm ALG. Stacked generalization produces a data set of the same
 size as DATA containing unbiased soft outputs of ALG. It may be used to
 construct second-stage training data for trained combiners or for
 ROC variance estimation.  SDSTACKGEN uses rotation-based stratified
 cross-validation.  In each fold, ALG is trained on the fold training set
 and its soft outputs on the fold test set are stored. Finally, all
 per-fold outputs are collected into the output set OUT. The outputs are
 unbiased because each sample is scored by a classifier that did not see
 it during training. The order of samples in OUT and DATA is identical.
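
 The procedure described above can be illustrated with a minimal sketch.
 It is written in Python with NumPy and is not the perClass implementation:
 the function names are hypothetical, and a simple nearest-mean classifier
 with softmax-normalized outputs stands in for ALG as an assumption.

```python
import numpy as np

def stratified_folds(labels, folds, seed=None):
    """Give each sample a fold index so that every class is spread evenly."""
    rng = np.random.default_rng(seed)
    fold_of = np.empty(len(labels), dtype=int)
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)                        # random order within the class
        fold_of[idx] = np.arange(len(idx)) % folds
    return fold_of

def train_nearest_mean(X, y):
    """Stand-in for ALG (an assumption): store one mean vector per class."""
    classes = np.unique(y)
    means = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, means

def soft_outputs(model, X):
    """Soft outputs: softmax over negative squared distances to class means."""
    _, means = model
    d2 = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    z = -d2
    z -= z.max(axis=1, keepdims=True)           # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def stacked_outputs(X, y, folds=10, seed=None):
    """Collect out-of-fold soft outputs in the original sample order.

    Assumes every class has at least two samples, so that no fold
    training set loses a class entirely.
    """
    fold_of = stratified_folds(y, folds, seed)
    out = np.zeros((len(y), len(np.unique(y))))
    models = []
    for f in range(folds):                      # rotation: each fold is the test set once
        test = fold_of == f
        model = train_nearest_mean(X[~test], y[~test])
        out[test] = soft_outputs(model, X[test])  # written back at original positions
        models.append(model)
    return out, models
```

 Because the outputs are written back through a boolean mask, row i of the
 result always corresponds to row i of the input data, which mirrors the
 ordering guarantee stated for OUT and DATA.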

 As an optional second output, SDSTACKGEN returns a pipeline PBASE with all
 trained per-fold classifiers fused by a mean combiner. Using PBASE as the
 base classifier in the final trained combiner system was observed to
 provide higher robustness and performance.
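
 Mean fusion of per-fold classifiers can be sketched as follows. This is
 an illustrative Python/NumPy example with hypothetical function names,
 not the way perClass builds PBASE: it assumes each per-fold classifier
 has already produced a soft-output matrix for the same samples.

```python
import numpy as np

def mean_combiner(per_fold_outputs):
    """Average soft outputs of several classifiers.

    per_fold_outputs: list of arrays, each (n_samples, n_classes),
    all computed on the same samples.
    """
    stacked = np.stack(per_fold_outputs)   # (n_models, n_samples, n_classes)
    return stacked.mean(axis=0)

def decide(soft):
    """Pick the class with the highest fused soft output for each sample."""
    return soft.argmax(axis=1)
```

 Averaging keeps the fused outputs on the same scale as the individual
 ones, so they can be passed to a second-stage combiner unchanged.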

  P. Paclik, T.C.W. Landgrebe, D.M.J. Tax, R.P.W. Duin,
  On deriving the second-stage training set for trainable combiners,
  in Proc. of MCS 2005, Monterey, CA, USA, June 2005