perClass Documentation version 5.4 (7-Dec-2018)

kb8: How to build a detector in a single line of code?

Problem: How to define a detector for a class of interest?

Solution: The `sddetector` command provides a ready to use solution.

Detector is a special type of classifier distinguishing one class of interest from everything else. Training a detector means training a statistical model on the class of interest and adjusting the decision threshold (operating point). There are two principaly different ways of adjusting the threshold, depending on our training data set.

If our training data contains only the class of interest, we adjust the threshold indirectly by choosing the fraction of training objects to be rejected. If other classes are present, we may set the threshold to minimize the classification error. Any detector may be build using the `sddetector` command.

Let us build the banana detector, considering the fruit data set.

``````>> a
'Fruit set' 260 by 2 sddata, 3 classes: 'apple'(100) 'banana'(100) 'stone'(60)
``````

First, we will consider a data set `b` containing only banana examples.

``````>> b=a(:,:,'banana')
'Fruit set' 100 by 2 sddata, class: 'banana'
``````

We provide `sddetector` with the data set (`b`), the name of the class of interest and a model (untrained `sdgauss` pipeline). Additionally, we specify the reject fraction of 5%:

``````>> pd=sddetector(b,'banana',sdgauss,'reject',0.05)
1: banana -> banana
sequential pipeline     2x1 'Gaussian model+Decision'
1  Gaussian model          2x1  one class, 1 component (sdp_normal)
2  Decision                1x1  thresholding on banana at op 1 (sdp_decide)
``````

We will visualize the detector decisions on the data set `b`:

``````>> sdscatter(b,pd)
``````

Comparing labels of the data set `b` with the detector decisions will show us that 5 of the 100 objects are rejected:

``````>> sdconfmat(b.lab,b*pd)

ans =

True      | Decisions
Labels    | banana non-ba  | Totals
-------------------------------------
banana    |    95      5   |   100
-------------------------------------
Totals    |    95      5   |   100
``````

Alternatively, if our data set contains examples from other classes, we may use them to set the detector threshold. We now train the detector on the complete data set `a`, not specifying the reject fraction:

``````>> pd=sddetector(a,'banana',sdgauss)
1: apple  -> non-banana
2: banana -> banana
3: stone  -> non-banana
sequential pipeline     2x1 'Gaussian model+Decision'
1  Gaussian model          2x1  one class, 1 component (sdp_normal)
2  Decision                1x1  thresholding ROC on banana at op 28 (sdp_decide)

>> sdscatter(a,pd)
``````

``````>> sdconfmat(a.lab,a*pd)

ans =

True      | Decisions
Labels    | banana non-ba  | Totals
-------------------------------------
apple     |    15     85   |   100
banana    |    97      3   |   100
stone     |    17     43   |    60
-------------------------------------
Totals    |   129    131   |   260
``````

When an additional class or classes are present, `sddetector` relabels them creating the non-target class (called 'non-banana' by default). The detector threshold is then fixed to minimize the mean error between target and non-target classes using ROC analysis.

The advantage of this approach is that the operating point of the trained detector may be adjusted according to our specific needs later.

The ROC characteristic, stored in our detector `pd`:

``````>> pd.roc
ROC (52 thr-based op.points, 3 measures), curop: 33
est: 1:err(banana)=0.10, 2:err(non-banana)=0.06, 3:mean-error=0.08

>> sddrawroc(pd)
``````

We may select a different operating point in the ROC figure and save the detector back in the workspace by pressing the 's' key:

``````>> Setting the operating point 33 in sdppl object pd
sequential pipeline     2x1 'Gaussian model+Decision'
1  Gaussian model          2x1  one class, 1 component (sdp_normal)
2  Decision                1x1  thresholding ROC on banana at op 33 (sdp_decide)

>> sdconfmat(a.lab,a*pd)

ans =

True      | Decisions
Labels    | banana non-ba  | Totals
-------------------------------------
apple     |    12     88   |   100
banana    |    88     12   |   100
stone     |     9     51   |    60
-------------------------------------
Totals    |   109    151   |   260
``````