perClass Documentation
version 5.4 (7-Dec-2018)

Chapter 19: Custom algorithms

Table of contents

19.1. Introduction
19.2. Algorithm function
19.3. Algorithm training and execution
19.4. Algorithms performing ROC and setting an operating point
19.5. Setting algorithm current operating point
19.6. Converting algorithms into pipelines for out-of-Matlab execution

19.1. Introduction

Designing pattern recognition systems requires handling systems built from many low-level components such as feature extractors, models, and decision functions. perClass provides a tool to model the entire pattern recognition system: the sdalg algorithm.

Major features of the sdalg algorithm, all demonstrated in this chapter:

- a single function defines initialization, training, and execution of the complete system
- algorithm parameters behave like Matlab structure fields
- application-specific operating points may be set via ROC analysis and re-adjusted after training
- trained algorithms may be converted into sdppl pipelines for out-of-Matlab execution
- algorithms may be cross-validated with sdcrossval

19.2. Algorithm function

Let us illustrate the use of the sdalg algorithm on a simple example. We consider a classifier trained in two steps: first reducing the dimensionality using Principal Component Analysis (PCA), and then training a classifier in the resulting subspace. This classification system is fully defined in the function sda_pca_clf.

  1: function out = sda_pca_clf(alg,data)

  3:    if nargin==0                            % initialization

  5:        alg=sdalg(mfilename);               % create algorithm object attached to this function

  7:        alg.frac=0.9;                       % fraction of preserved variance for PCA
  8:        alg.clf=sdlinear;                   % untrained classifier model

 10:        out=alg;

 12:    elseif totrain(alg)                     % training

 14:        alg.pca=sdpca(data,alg.frac);       % train the PCA projection
 15:        data=data*alg.pca;                  % project the training data

 17:        alg.trclf=data*alg.clf;             % train the classifier in the subspace

 19:        out=setstate(alg,'trained');

 21:    elseif toexecute(alg)                   % execution

 23:        data=data*alg.pca;                  % project the input data
 24:        out=data*alg.trclf;                 % execute the trained classifier

 26:    end

The algorithm function takes two input parameters, namely the algorithm object alg and the data set data, and returns the output out. The type of out differs depending on whether the algorithm is being trained or executed.

The function contains three sections. The first describes algorithm initialization (lines 3-10), the second training (lines 12-19), and the third execution (lines 21-24).
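These three sections correspond to three ways of invoking the function. As a sketch (tr and ts stand for arbitrary training and test sddata objects; the direct execution call is equivalent to the multiplication-operator form shown in Section 19.3):

    alg=sda_pca_clf;            % nargin==0: initialization, returns an untrained sdalg
    tralg=sda_pca_clf(alg,tr);  % totrain branch: returns a trained sdalg
    dec=sda_pca_clf(tralg,ts);  % toexecute branch: returns decisions (an sdlab object)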

In the initialization section, we instantiate the algorithm object and attach it to the function (in our case to sda_pca_clf). We may then add arbitrary parameters that make our algorithm more general. In our case, we want to be able to adjust the fraction of variance preserved by the PCA dimensionality reduction and the classifier model.

Algorithm parameters behave analogously to Matlab structure fields. This is because an sdalg is nothing more than a structure connected to a function.
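For example, we may change the frac parameter of an untrained algorithm exactly as if it were a structure field:

>> alg=sda_pca_clf;  % construct the untrained algorithm
>> alg.frac=0.95;    % preserve 95% of variance instead of the default 0.9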

The final statement of the initialization section makes sure the algorithm object alg is returned in the output variable out.

The training section describes training of the initialized algorithm on the sddata object data. First, we train the PCA projection and store it as alg.pca (line 14). On line 15, we project the input data using the trained PCA. Line 17 trains the model alg.clf on the projected data and adds a default operating point. Finally, on line 19, we set the algorithm state to 'trained' and return it.

The execution section is invoked on a trained algorithm and an sddata object data. In our algorithm, it merely projects the input data by the trained PCA projection and then applies the alg.trclf pipeline, which executes the trained classifier model and performs decisions. The output out is therefore an sdlab object with decisions for each sample in data.

19.3. Algorithm training and execution

We can construct an untrained algorithm by simply calling its definition function without parameters:

>> alg=sda_pca_clf
untrained sdalg 'sda_pca_clf'
 frac            1x1          8  double   0.9
 clf             0x0       1034  sdppl    untrained sdlinear

We will use the medical problem in our example:

>> load medical
>> a
'medical all' 259783 by 11 sddata, 3 classes: 'cancer'(56652) 'non-cancer'(168467) 'shadow'(34664) 

>> b=a(:,:,1:2) %  only cancer and non-cancer classes
'medical all' 225119 by 11 sddata, 2 classes: 'cancer'(56652) 'non-cancer'(168467) 

We will use the first 8 patients as a test set and the remaining ones for training:

>> [ts,tr]=subset(b,'patient',1:8)
'medical all' 112053 by 11 sddata, 2 classes: 'cancer'(33785) 'non-cancer'(78268) 
'medical all' 113066 by 11 sddata, 2 classes: 'cancer'(22867) 'non-cancer'(90199) 

To train the algorithm, we call the algorithm function directly:

>> tralg=sda_pca_clf(alg,tr)
trained sdalg 'sda_pca_clf'
 frac            1x1          8  double   0.9
 clf             0x0       1034  sdppl    untrained sdlinear
 pca            11x2       5272  sdppl    trained sdp_affine
 trclf           2x1      17440  sdppl    trained sdp_decide

A simpler alternative is to use the multiplication operator:

>> tralg=tr*alg
trained sdalg 'sda_pca_clf'
 frac            1x1          8  double   0.9
 clf             0x0       1034  sdppl    untrained sdlinear
 pca            11x2       5272  sdppl    trained sdp_affine
 trclf           2x1      17440  sdppl    trained sdp_decide

Training added the pca and trclf fields. We may display the trained classifier trclf:

>> tralg.trclf
sequential pipeline     2x1 'Gauss eq.cov.+Output normalization+Decision'
 1  Gauss eq.cov.           2x2  2 classes, 2 components (sdp_normal)
 2  Output normalization    2x2  (sdp_norm)
 3  Decision                2x1  weighting, 2 classes, 1 ops at op 1 (sdp_decide)

To execute the algorithm, apply it to the test set:

>> dec=ts*tralg
sdlab with 112053 entries, 2 groups: 'cancer'(15134) 'non-cancer'(96919) 

Algorithms may be executed only on sddata objects, and the execution output may be either decisions (an sdlab object) or soft output (an sddata object).
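For example, because our trained algorithm returns decisions, its test-set performance may be summarized directly. A sketch, assuming the sdconfmat function and the .lab accessor of sddata objects, which are described elsewhere in the perClass documentation:

>> dec=ts*tralg;          % decisions for the test set (sdlab object)
>> sdconfmat(ts.lab,dec)  % confusion matrix of true labels versus decisions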

19.4. Algorithms performing ROC and setting an operating point

In the example above, we trained a PCA projection and a classifier. You may ask: why bother writing an algorithm function when it may be expressed as a simple one-liner?

>> p=sdpca([],0.9)*sdlinear
untrained pipeline 2 steps: sdpca+sdlinear

>> ptr=a*p
sequential pipeline       2x1 'PCA+Gaussian model+Normalization+Decision'
 1 PCA                     2x2  100% of variance
 2 Gaussian model          2x3  single cov.mat.
 3 Normalization           3x3 
 4 Decision                3x1  weighting, 3 classes

Good question! In this section, we demonstrate the real-world problem that algorithms were created to address. Any real-world classifier needs more than training a few sequential steps. It requires application-specific performance optimization using ROC analysis. For example, in our fruit sorting system, we must make sure that at least 99% of apples are found with minimal false positives.

This is very simple to achieve in a custom algorithm.

  1: function out = sda_pca_clf_roc(alg,data)

  3:    if nargin==0                            % initialization

  5:        alg=sdalg(mfilename);

  7:        alg.frac=0.9;
  8:        alg.tpr_apple=0.99;                 % fix desired sensitivity on apple
  9:        alg.clf=sdlinear;

 11:        out=alg;

 13:    elseif totrain(alg)                     % training

 15:        % split the available training data
 16:        [tr,val]=randsubset(data,0.5);

 18:        % train the model
 19:        alg.pca=sdpca(tr,alg.frac);
 20:        tr=tr*alg.pca;
 21:        alg.trclf=tr*alg.clf;

 23:        % estimate ROC (on the validation set)
 24:        alg.roc=sdroc(val,alg.trclf,'measures',{'FPr','apple','TPr','apple'});

 26:        % constrain TPr(apple) and minimize FPr(apple) in one step:
 27:        alg.roc=setcurop(alg.roc,'constrain','TPr(apple)',alg.tpr_apple,...
 28:                   'min','FPr(apple)');

 30:        out=setstate(alg,'trained');

 32:    elseif toexecute(alg)                   % execution

 34:        p=alg.pca*alg.trclf*alg.roc;        % final pipeline: projection, model, ROC
 35:        out=data*p;

 37:    end

On line 8, we add a parameter tpr_apple specifying the desired sensitivity (true positive rate). On line 16, we split the available training data into a part used for training the model (tr) and a validation set (val) used for ROC estimation. This is needed in order to avoid a positive bias of the ROC characteristic. Line 24 estimates the ROC with TPr and FPr measures. On line 27, we set the operating point by first constraining TPr and then minimizing FPr. Finally, in the execution section, we construct the final pipeline composed of the PCA projection, the model, and the ROC.
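The operating point specification is flexible. As a sketch reusing only the setcurop syntax shown above, we could instead minimize the overall mean error at the constrained sensitivity (mean-error is estimated by sdroc alongside the requested measures, as visible in the ROC display below):

    % a variation, not part of the example algorithm:
    alg.roc=setcurop(alg.roc,'constrain','TPr(apple)',alg.tpr_apple,...
               'min','mean-error');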

>> alg=sda_pca_clf_roc
>> alg.clf=sdparzen
untrained sdalg 'sda_pca_clf_roc'
 frac            1x1          8  double   0.9
 tpr_apple       1x1          8  double   0.99
 clf             0x0       1822  sdppl    untrained Parzen

>> tralg=a*alg
....trained sdalg 'sda_pca_clf_roc'
 frac            1x1          8  double   0.9
 tpr_apple       1x1          8  double   0.99
 clf             0x0       1822  sdppl    untrained Parzen
 pca             2x2       9828  sdppl    trained PCA
 trclf           2x1      27378  sdppl    trained Parzen model+Decision
 roc           159x3      21670  sdroc    

>> tralg.roc
ROC (159 w-based op.points, 3 measures), curop: 83
est: 1:FPr(apple)=0.04, 2:TPr(apple)=0.99, 3:mean-error=0.03

The trained algorithm now contains the ROC object with properly set operating point.

Note that this algorithm may now be easily applied to new data or cross-validated using the sdcrossval function.
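A cross-validation call might look as follows. This is a sketch only: the exact arguments of sdcrossval are an assumption here, see the chapter on cross-validation for its actual interface:

>> alg=sda_pca_clf_roc;   % untrained algorithm with default parameters
>> res=sdcrossval(alg,a)  % hypothetical call passing the algorithm and the data set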

19.5. Setting algorithm current operating point

The process of setting the current operating point may be specified in a separate algorithm section. This allows us to re-run it after training. We simply add a tosetcurop section to the algorithm function. It fixes a new operating point in an already trained algorithm:

       elseif tosetcurop(alg) 

           % constrain TPr(apple) and minimize the FPr(apple) in one step:
           alg.roc=setcurop(alg.roc,'constrain','TPr(apple)',alg.tpr_apple,...
                      'min','FPr(apple)');
           out=alg;

       end

Note that no extra parameters are passed from outside. The code simply uses the parameters set in the alg object.

Say we have a trained algorithm built as in the previous section:

>> tralg=a*alg
....trained sdalg 'sda_pca_clf_roc'
 frac            1x1          8  double   0.9
 tpr_apple       1x1          8  double   0.99
 clf             0x0       1822  sdppl    untrained Parzen
 pca             2x2       9828  sdppl    trained PCA
 trclf           2x1      27378  sdppl    trained Parzen model+Decision
 roc           159x3      21670  sdroc    

It fixed the operating point at 99% TPr. We now wish to set a different one at 95% TPr. We simply update the tpr_apple parameter:

>> tralg.tpr_apple=0.95
trained sdalg 'sda_pca_clf_roc'
 frac            1x1          8  double   0.9
 tpr_apple       1x1          8  double   0.95
 clf             0x0       1822  sdppl    untrained Parzen
 pca             2x2       9828  sdppl    trained PCA
 trclf           2x1      27378  sdppl    trained Parzen model+Decision
 roc           159x3      21670  sdroc    

And call the setcurop method on the algorithm:

>> tralg=setcurop(tralg)
tosetcurop sdalg 'sda_pca_clf_roc'
 frac            1x1          8  double   0.9
 tpr_apple       1x1          8  double   0.95
 clf             0x0       1822  sdppl    untrained Parzen
 pca             2x2       9828  sdppl    trained PCA
 trclf           2x1      27378  sdppl    trained Parzen model+Decision
 roc           159x3      21670  sdroc    

The returned algorithm now uses the new operating point:

>> tralg.roc
ROC (159 w-based op.points, 3 measures), curop: 76
est: 1:FPr(apple)=0.01, 2:TPr(apple)=0.95, 3:mean-error=0.03
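The adjusted algorithm may be executed as before; decisions produced via the multiplication operator now reflect the 95% TPr operating point:

>> dec=a*tralg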

19.6. Converting algorithms into pipelines for out-of-Matlab execution

In order to execute a complex algorithm outside of Matlab, we need to convert it into an sdppl pipeline. We may add an extra toconvert section in the algorithm function to describe this conversion.

For our example algorithm, the pipeline construction is simple. We only need to return the concatenation of the PCA projection and the trained classifier.

function out = sda_pca_clf(alg,data)

if nargin==0                            % initialization

    alg=sdalg(mfilename);

    alg.frac=0.9;
    alg.clf=sdlinear;

    out=alg;

elseif totrain(alg)                     % training

    alg.pca=sdpca(data,alg.frac);
    data=data*alg.pca;

    alg.trclf=data*alg.clf;

    out=setstate(alg,'trained');

elseif toexecute(alg)                   % execution

    data=data*alg.pca;
    out=data*alg.trclf;

elseif toconvert(alg)                   % conversion into an sdppl pipeline

    out=alg.pca*alg.trclf;

end

The trained algorithm may now be converted using the sdconvert function:

>> p=sdconvert(tralg)
sequential pipeline     11x1 'PCA+Gauss eq.cov.+Output normalization+Decision'
 1  PCA                    11x2  93% of variance (sdp_affine)
 2  Gauss eq.cov.           2x2  2 classes, 2 components (sdp_normal)
 3  Output normalization    2x2  (sdp_norm)
 4  Decision                2x1  weighting, 2 classes, 1 ops at op 1 (sdp_decide)
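We may verify the conversion inside Matlab: executing the converted pipeline should yield the same decisions as executing the algorithm itself:

>> dec1=ts*tralg;  % execute the trained algorithm
>> dec2=ts*p       % execute the converted pipeline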