- 1.1. This manual
- 1.2. Introduction to perClass
- 1.2.1. Versions
- 1.2.2. System requirements
- 1.2.3. Useful general commands
- 1.2.3.1. Displaying perClass version and license information
- 1.2.3.2. Demo examples
- 1.2.3.3. Provide direct feedback to PR Sys Design
- 1.2.3.4. Control messages displayed by perClass
- 1.3. Release notes
1.1. This manual
This manual assumes basic knowledge of pattern recognition and the Matlab environment. In order to embed trained classifiers into custom applications, basic familiarity with the C language is also assumed.
The manual is structured in four parts:
- User's guide - explains software functionality
- Reference manuals for the perClass Toolbox and for the perClass Runtime library - describe the programming interface
- Knowledge base - collects a number of step-by-step usage examples and "howtos"
- Glossary - explains basic pattern recognition terminology
1.2. Introduction to perClass
perClass is a software package for quick development of custom machine learning solutions. With perClass, R&D specialists in many innovative companies have developed algorithms to detect cancer, sort luggage at airports, classify defects of machine parts or spot traffic accidents.
perClass is composed of two parts, namely perClass Toolbox for quick design of classifiers in Matlab and perClass Runtime for classifier deployment in production.
perClass provides tools for:
- Construction of data sets
- Handling of multiple sets of labels and arbitrary meta-data
- Interactive visualization of data and meta-data
- Feature extraction for images and spectra
- Training statistical detectors and classifiers
- Quick evaluation of classifiers
- Optimizing classifier decisions according to performance requirements using two-class and multi-class ROC analysis
- Building hierarchies of classifiers and classifier fusion
- Deploying trained classifiers outside of Matlab in custom real-time applications
1.2.1. Versions
perClass comes in the following versions for commercial use:
perClass Toolbox: for development of machine learning algorithms. The permanent license is bound to a hardware dongle.
perClass SDK: for embedding trained classifiers into custom applications. The annual license is bound to the hardware dongle of the toolbox.
perClass Pro: bundle of Toolbox and SDK. Pro is the complete solution for design of algorithms and their embedding in custom applications.
perClass Enterprise: the complete solution for design of algorithms with perClass Toolbox and embedding them in custom applications with perClass SDK. All functionality is permanent.
These versions are available for academic research and teaching:
Lite: Free limited version for non-commercial use, intended for people who are learning about pattern recognition. It contains only perClass Toolbox and is limited to data sets with a maximum of 300 samples and three classes.
perClass Toolbox Academic: perClass Matlab Toolbox discounted for use by university students and researchers for non-commercial projects. The license is permanent and bound to a hardware dongle, which allows researchers to move between different machines.
perClass Pro Academic: perClass Pro discounted for use by university students and researchers for non-commercial projects only. The license is permanent, bound to a hardware dongle, and includes both the perClass Toolbox and the perClass Runtime library for execution of trained classifiers outside of Matlab in research demos.
For Academic and Commercial versions, group licensing is also available using floating licenses provided by a license server.
1.2.2. System requirements
perClass is supported on the following platforms:
- MS Windows 32-bit
- MS Windows 64-bit
- Linux 32-bit (x86)
- Linux 64-bit (x86)
- Apple Mac OS X 32-bit (x86)
- Apple Mac OS X 64-bit (x86)
perClass requires Matlab 7.5 or later.
1.2.3. Useful general commands
1.2.3.1. Displaying perClass version and license information
perClass version information may be displayed using sdversion. It consists of a numerical part (e.g. 4.0) and a build date (6-May-2013). sdversion also provides several license-related details such as license type (Commercial, Academic or Lite), licensee name and the license expiration date.
>> sdversion
perClass 4.0 (06-May-2013), Copyright (C) 2007-2013, PR Sys Design, All rights reserved
Customer: PR Sys Design (PRSD) Issued: 27-mar-2013
Toolbox with DB,imaging: The license expires on 1-jul-2013.
SDK: The license expires on 1-jul-2013.
Installation directory: '/Users/pavel/matlab/toolboxes/perclass'
1.2.3.2. Demo examples
sddemo lists several basic examples to get started:
>> sddemo
run perclass_exampleX.m where X is the index of the desired example
1 : Working with data sets
2 : Training a classifier and visualizing decisions
3 : Tuning a classifier using ROC analysis
4 : Multi-class ROC analysis
5 : Building detectors
6 : Building a detector-classifier cascade
1.2.3.3. Provide direct feedback to PR Sys Design
The sdfeedback command allows users to submit feedback such as error messages to PR Sys Design directly from within Matlab. Running sdfeedback without arguments opens an edit dialog where the user may paste or type the desired message. An alternative is to provide the message to sdfeedback as a string.
1.2.3.4. Control messages displayed by perClass
The sddisplay command provides global verbosity control in perClass. Running sddisplay without arguments prints the current display state (on/off). To switch off messages printed by perClass, use:
>> sddisplay off
The default sddisplay state is on. When the perclass_mex library is re-loaded into memory, this default state is restored.
Alternatively, you may use the 'nodisplay' option in the functions that support it: sdrelab, sdroc, sddetect and sdcrossval.
1.3. Release notes
Version 5.4 (7-Dec-2018)
- adding single precision deployment for models exported from perClass Mira GUI
- sdsvc adding support for support vector regression with 'target' option
- fix of error in sdscatter with decision backdrop when multiple figure windows were open
- fix in sdlist allowing use of logical indices
- fixing the problem with multiple license checkouts on floating license servers
Version 5.3 (7-Jul-2018)
- internal release adding perClass Runtime deployment in Imec integration
Version 5.2 (9-Jan-2018)
- sdconfmat enables interactive tuning of operating points directly from a confusion matrix (try: load fruit; p=sdfisher(a); r=sdroc(a,p); sdconfmat(r))
  - define multiple constraints by clicking on the confusion matrix fields (directly constrain errors, performance and per-decision precision)
  - specific errors in a confusion matrix can be interactively lowered by mouse scroll-wheel or cursor keys (this is a really cool new way to leverage application-specific knowledge)
  - can be opened from any ROC figure (sddrawroc) via toolbar or 'c' key
  - making constraints in a confusion matrix disables corresponding operating points in the attached ROC plot. This allows one to quickly explore viable subsets of solutions
  - this update makes multi-class classifier tuning easy and practical
- sdsvc support for class weighting (of the C parameter) with 'w' option. This enables a practical way to find a good default solution in skewed problems (some classes much bigger than others)
  - RBF, polynomial and linear SVM supported
  - one-against-one multi-class SVM supported
- sdneural RBF networks significantly faster in training
- sdbox command gets new options to remove outliers:
  - 'bounds' option allows setting min/max values applicable to all features
  - 'min' and 'max' options allow specification of min/max values per feature
- sdimage visualization allows setting color scaling limits by percentile with 'o' key (outliers). This is useful to avoid influence of outliers on visualization.
- new data preprocessing option for spectra (dark-current and white-background normalization) in sdprep
- quickly define a random set of categories using randsubset on sdlist objects. For example, in a medical data set where each sample has a patient label, we may quickly define a set of training/test patients with [Ltr,Lts]=randsubset(data.patient.list,0.5) and then get respective data sets with [tr,ts]=subset(data,'patient',Ltr).
- sdexport for Cubert plugins supports class colors and multiple ROCs (e.g. cascaded detector + classifier with separate ROCs)
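The patient-level split mentioned for randsubset can be illustrated in plain Matlab, without perClass objects. This is only a sketch of the idea, not the toolbox implementation: the patient cell array and all variable names are hypothetical. The point it shows is that whole patients, not individual samples, are assigned to the training or the test part.

```matlab
% Hypothetical per-sample patient labels (one entry per sample).
patient = {'p1','p1','p2','p3','p3','p3','p4','p4'};

% Split the unique patients, not the samples, into two random halves.
[plist, tmp, pidx] = unique(patient);   % pidx maps each sample to its patient
order      = randperm(numel(plist));
ntr        = round(0.5 * numel(plist));
trPatients = order(1:ntr);              % indices of the training patients

% A sample goes to training when its patient is a training patient.
isTrain  = ismember(pidx(:)', trPatients);
trainIdx = find(isTrain);
testIdx  = find(~isTrain);
```

Because the split is made on the patient list, no patient contributes samples to both parts, which is what makes the test estimate independent of the training patients.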
Version 5.1 (29-May-2017)
- adding support for Matlab 2017a
- multi-channel input support in deep learning (color or multi-band images)
- showing figure number for each network to easily match training progress to specific model
- sdextract adds support for raw multi-band extractor (useful e.g. for deep learning on multi-channel inputs)
- support for GPU training (requires Parallel Computing Toolbox)
- Support Vector Machine improvements:
  - probabilistic output for two-class SVM classifiers with 'prob' option
  - linear SVM can be approximated by affine projection
- sdimage support for moving to specific band when opening figure ('band' option or 'b' keystroke). Also possible to change band in already opened figures
- GUI fixes in sdconfmat and sddrawroc for class names with underscores
- specify multiple constraints for multi-class ROC given a matrix (same size as confusion matrix, NaN fields ignored)
- sdscatter adds a command to change class markers randomly (':' keystroke). This is useful for situations with many classes or cases where a data set contains an identical marker for each class.
- sdroc stores confusion matrices by default (earlier this required the 'confmat' option). This allows us to directly view confusion matrices or use them for selection of operating points. There is a new 'no confmat' option to disable this behaviour.
- when expanding results computed on image grid with expand method, labels or decisions can be provided as second argument. This is useful to quickly expand decisions back to the original high-resolution image
- Stacked generalization sdstackgen adds support for user-defined algorithms (that can be converted into a pipeline and provide soft outputs)
- new sdprep command for data preprocessing:
  - divide by the sum (divsum) or mean (divmean) of each feature vector
  - divide by value of a specific band (if a sample represents a 1D spectrum)
  - smoothing and (smoothed) derivative of spectra
  - user-defined kernel for 1D convolution of spectra
  - compute band ratios (such as NDVI index)
  - support for "adding" new computed features after all input features
  - sdprep returns a pipeline that can be added to a classifier and exported for direct deployment using perClass Runtime
  - multiple spectral indices can be computed by a single pipeline using stacked pipelines, see example.
- sdrun_mex supports decision output for stacked decision pipelines ("crisp" combined classifiers)
- licensing:
  - sdactivate command included to allow license activation
  - sd_Activate API for activation of perClass Runtime
  - sd_GetLicenseInfo API to retrieve details of perClass Runtime license
  - sdrun command-line utility gets '-l' option to show license details
- fixes:
  - fix for weighting-based decisions in situations where soft outputs are negative (before, all-negative outputs would all get assigned to the first category)
  - pipeline buffer in sd_LoadPipelineFromBuffer is constant
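Two of the preprocessing steps listed for sdprep in Version 5.1, division by the sum of each feature vector and a band ratio in the NDVI form, can be written out in plain Matlab to show what they compute. This is a minimal sketch on hypothetical toy spectra and does not use sdprep itself; the band positions chosen for "red" and "near-infrared" are assumptions made for the example.

```matlab
% Hypothetical spectra: one sample per row, one band per column.
X = [0.10 0.20 0.30 0.40;
     0.05 0.05 0.10 0.80];

% 'divsum'-style normalization: divide each feature vector by its sum.
Xsum = X ./ repmat(sum(X, 2), 1, size(X, 2));

% Band ratio in the NDVI form (NIR - RED) / (NIR + RED).
% The band positions below are assumptions for this toy data.
red  = 2;
nir  = 4;
ndvi = (X(:, nir) - X(:, red)) ./ (X(:, nir) + X(:, red));

% "Adding" the computed feature after all input features.
Xout = [X ndvi];
```

After the normalization every row of Xsum sums to one, and Xout has one more column than X, holding the computed index.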
Version 5.0 (21-Sep-2016)
- improved support for Matlab R2016a. perClass keeps the support for R2007b and higher.
- support for deep learning
  - easy to train deep networks via a single sddeepnet command
  - simple way to define network architecture and get feedback with most common errors
  - support for convolution, spatial pooling, relu, dropout and batch normalization
  - architecture may be defined in a separate cell array ('arch' option)
  - support for filter groups (multiple filters trained on identical input bands)
  - network training is repeatable if random seed is fixed (via rand('state',X))
  - special GUI showing training progress and best solution found
  - possibility to train further, fine-tune learning rate, batch size or update training/val sets
  - stop when a specific error is reached with 'stop at' option. By default stops at zero error.
  - trained networks are standard perClass pipelines and can be mixed and matched with any other tools such as ROC or cascading
  - training is based on Matconvnet toolbox (bundled in perClass distribution), training support for 64-bit platforms
  - on the deployment side, perClass Runtime remains a single self-contained DLL (there is no change to the Runtime API; it is sufficient to drop the 5.0 DLL into the project, without recompiling, to load and execute deep networks)
  - execution speedup via joining 'conv' and 'bnorm' pipeline stages with + operator
- imaging improvements:
  - new 'resize' object extractor in sdextract resizing any object to a regular grid. This is useful to define data set for deep learning.
  - new easy way to pass decisions on local image regions, computed on a grid, back to the original full resolution image with expand method. This greatly simplifies execution of computationally-intensive algorithms such as deep learning over entire high-res images. The expand method also supports passing of labels defined on a grid back to the original image.
  - sdextract can now compute local regions for specific pixels given by linear pixel indices with 'pixels' option
  - region "raw" extractor in sdextract now automatically sets the imsize of the output data set
- other improvements
  - sdscatter adds a context-menu command to select a subset based on a specific label of a current sample (e.g. 'Select only the patient as this sample represents.')
  - all commands returning figure handles (sdscatter, sdimage, sddrawroc etc.) return only figure number in R2014b and later. This is to avoid excessive display of Matlab figure object details. It is possible to use the returned number to focus the figure using figure(f)
  - sdsvc uses updated libsvm binaries version 3.21 on all platforms
    - solution for problems where the libsvm optimizer does not converge or takes an extremely long time, with new 'verbose' and 'no shrink' options
    - new option 'verbose' shows detailed libsvm output
    - it is now possible to disable "shrinking" heuristics of libsvm (using 'no shrink' option). This may significantly speed up optimization in some cases (when 'verbose' output indicates "Warning: using -h 0 may be faster")
    - in order to repeat exactly past experiments based on earlier perClass versions, the older libsvm 2.91 binaries from older perClass distributions are still supported by sdsvc command
  - affine projections (PCA/LDA) now offer 'weights2' output returning relative weights of input features for each output dimension (see output of info(p) or p' for PCA/LDA pipelines)
- fixes
  - sdlms fix: When building regression model, it is not necessary to have more than one class defined (there is no difference in model output)
  - sdimport now judges type of entire column as nominal or real based on all values, not only first row
  - sdlab insert fix for singleton labels (one entry per category) where the order of categories and entries could become inconsistent
  - fixing problem with horizontal concat of labels in R2016a
  - fixing issues related to subsref on sddata and sdppl related to changed subsref behaviour in R2015b (e.g. p(1:2).output)
  - sddata fix for assignments given a logical subset (data( dec=='target' ).lab='target')
  - fix for a problem when reading comma-separated values in sdrun command
  - fix for sddrawroc legend warning in R2016a
Version 4.8 (19-Apr-2016)
- show all keyboard shortcuts in any sdscatter or sdimage figure with ? key or toolbar button
- sdimage improvements
  - support for zoom and pan with standard Matlab tools
  - zoom is also quickly accessible with +/- keystrokes that follow the cursor and do not require mode switching
  - legend support. This also enables one to use standard print command to save content of sdimage figure
  - New 'go-to-band' functionality with 'g' keystroke
  - cropped images may be related back to the original. Both the reference and assignment operations are supported (Example: Pass the labels from a cropped image to its original image: origim( croppedim ).lab=croppedim.lab)
- sdextract improvements
  - new orientation histogram feature extractor (orihist). Supported both for image neighborhoods ('region' domain) and for objects (defined e.g. by segmentation or clustering)
  - new grey-level morphology extractor (erosion, dilation, opening, closing, edge extraction)
  - sdextract now requires that input is an image data set (sddata object created with sdimage command), not an arbitrary image matrix
- sdscale adds support for non-linear scaling (different variants of exp and log scaling)
- fix for the sdsegment crash on images with more than 5000 regions (new maxsize option allows specifying the max region count)
Version 4.7 (15-Dec-2015)
- improvements of interactive ROC visualization (sddrawroc) showing confusion matrices:
  - it is now possible to visualize a difference confusion matrix showing improvements to a selected operating point. This significantly simplifies selection of a good trade-off in multi-class situations.
  - new toolbar buttons to show confusion matrix (or normalized confmat) in ROC figures
- sdsvc adds support for one-against-one multi-class strategy.
- sdkmeans training may stop earlier if stability is reached. The 'iters' option now specifies the upper bound on the number of iterations.
- sdexport improvements
  - it is possible to specify numerical format used when exporting data to text files ('numformat' option). This allows us to tweak the precision needed.
  - fixes for WEKA Arff format export to support class names with spaces
- the least-mean-square classifier sdlms adds support for per-sample weights (useful to create base learners for boosting algorithms)
- sddetect adds 'confmat' option to store confusion matrices in the detector. This allows us to visualize them later or to set operating points by cost-sensitive optimization.
- sdextract adds support for custom convolution filter banks ('fbank' extractor)
- details on data sets, labels and pipelines may be displayed using info method (or the transpose operator shortcut ')
- fixes addressing incompatibilities due to new Matlab engine in R2015b
- fix for sdparzen NaNs appearing in EM algorithm optimization
- fix for sdcascade with sdrelab actions
- added multiple dongle support for all platforms and all products including perClass Runtime DLLs (see discussion at http://perclass.com/index.php/forums/viewthread/440/).
- visual fix for sdscatter making sure there is precise alignment between scatter plot and decision backdrop on Windows and Linux
Version 4.6 (29-Jun-2015)
- improved setting of a current ROC operating point
  - setcurop method now combines, in one command, a performance constraint with the minimization/maximization of an error criterion, example: r=setcurop(r,'constrain','TPr(apple)',0.9,'min','FPr(apple)')
  - @sdalg/@sdalg algorithms can re-run setting of an operating point with setcurop method.
- runtime improvements
  - support for classifier execution directly on uint8/uint16 input buffers (e.g. direct in-place processing of RGB images).
    - No need to loop over your data and cast into double precision on your application side.
    - Simply add an sdconvert step to any pipeline performing the conversion from uint8 or uint16 to double inside the runtime.
    - Feature selection is supported with the 'select' option. This brings significant speed-ups by processing only desired image channels.
  - new SDK utility/example ex_diff.c executes a classifier pipeline on saved binary data and compares results to stored binary outputs. See ex_diff.m example.
- new deployment platform
  - support of royalty-free deployment to LabView real-time OS (PharLap-ETS platform). For details contact us at info@perclass.com.
- local image feature extractors
  - Leung-Malik multi-orientation/multi-scale filter bank 'fbank:LM'
  - Schmid rotationally-invariant filter bank 'fbank:S'
  - Maximum Response filters combining multiple orientations 'fbank:MR8' and combining both orientations and scales 'fbank:MR4'
  - 'raw' extractor unrolling pixel neighborhood into a feature vector
- other improvements
  - sdimage and sdextract now report an error if an image data set contains multiple copies of the same pixel (this can happen by manually joining image sets multiple times)
  - fix in sdextract for 'class fractions' object-level extractor that now correctly supports situations where a subset of classes was detected over an object
  - sdimage figure now shows the size of a class under cursor
  - setmarkers and setcolors without extra arguments set default values into a data set
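The kind of selection that the setcurop example in Version 4.6 expresses, a constraint on one measure combined with minimization of another, can be sketched in plain Matlab on a hypothetical list of operating points. The numbers, the variable names and the reading of the constraint as "TPr of at least 0.9" are assumptions for illustration; this is not how setcurop is implemented.

```matlab
% Hypothetical operating points of a two-class ROC (one column each).
TPr = [0.99 0.95 0.91 0.85 0.70];   % true positive rate for 'apple'
FPr = [0.40 0.20 0.12 0.05 0.01];   % false positive rate for 'apple'

% Constraint: keep only operating points with TPr of at least 0.9.
feasible = find(TPr >= 0.9);

% Among the feasible points, minimize the false positive rate.
[bestFPr, k] = min(FPr(feasible));
curop = feasible(k);                % index of the selected operating point
```

With these toy values the third operating point is selected: it is the feasible point with the lowest false positive rate.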
Version 4.5 (17-Mar-2015)
- new RBF neural network available via sdneural with sdneural(data,'rbf'). It is implemented using a highly scalable and fast direct (not iterative) formulation.
- improvements of sdmissing
  - new 'class mean' and 'class median' imputation methods that provide more realistic imputation values based on prior knowledge of class labels
  - new 'from',DATA option allowing imputation based on an externally-supplied data set. This is useful when performing missing value imputation in new/test data only based on training data statistics
  - maximizing the amount of data used to estimate imputation statistics with per-feature imputation by default
- sdimage improvements
  - it is now possible to retrieve a data set from a figure via command-line using sdimage(fign,'getdata')
  - improved painting mode - context menu shows the list of all available classes + new one and allows switching freely between them (via corresponding digits)
  - single pixel brush option available
- sdsvc pipelines now expose multiple parameters:
  - sigma for RBF or degree for polynomial kernel
  - C parameter used in training
  - svcount - the number of support vectors
  - svind - indices of support vectors in the training set
- support for joining multiple feature selection pipelines with + operator
- new sddata median method
- sdalg improves descriptive display for embedded algorithms (alg in an alg)
- setprop now requires correct shape of sample properties to avoid ambiguity
- fix of isclass issue with sometimes incorrectly functioning 'only' option
- sdimport fix improving detection of nominal data from numerical NaN values
- when converting a custom algorithm into pipeline with sdconvert, check is made whether the algorithm function is on Matlab path
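The idea behind the 'class mean' imputation combined with statistics taken from a separate training set (the Version 4.5 sdmissing items) can be sketched in plain Matlab. This is an illustration under stated assumptions, not the sdmissing implementation: the matrices, the numeric class labels and the use of NaN as the missing-value marker are hypothetical choices for the example.

```matlab
% Hypothetical training data: rows are samples, NaN marks a missing value.
Xtr = [1.0 10; 3.0 NaN; NaN 30; 8.0 50];
ytr = [1; 1; 2; 2];                  % class label of each training sample

% Hypothetical new data with known class labels.
Xnew = [NaN 20; 7.0 NaN];
ynew = [1; 2];

% Estimate per-class, per-feature means from the training data only.
classes = unique(ytr);
mu = zeros(numel(classes), size(Xtr, 2));
for c = 1:numel(classes)
    for f = 1:size(Xtr, 2)
        v = Xtr(ytr == classes(c), f);
        mu(c, f) = mean(v(~isnan(v)));   % use every available value
    end
end

% Replace each missing value by the mean of its class.
Ximp = Xnew;
for i = 1:size(Xnew, 1)
    c    = find(classes == ynew(i));
    miss = isnan(Xnew(i, :));
    Ximp(i, miss) = mu(c, miss);
end
```

Computing the mean feature by feature means that a sample with one missing value still contributes its other values to the statistics.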
Version 4.4 (3-Dec-2014)
- significantly faster multi-class ROC estimation
- compatibility with Matlab R2014b new graphics subsystem (sdscatter and sdimage)
- sdsvc added support for one-class classification based on RBF kernel
- sddetect support for user-specified performance measures for internally-estimated ROC: sddetect(data,'target',model,'measures',{'TPr','target','precision','target'})
- sdextract improvements:
  - object-level shape features (Hu moments and shape eigenvalues). Shape representation can be computed both from a binary object mask or from object image content (on a single band/feature)
  - object-level extractor computing class fractions inside each object from per-pixel decisions. This is useful for defect detection/material classification inside pre-segmented objects.
  - fix of single pixel shift in extracted data
- sdrelab accepts relabeling rules defined by two sdlab objects of the same size (category in the first gets relabeled into the corresponding category in the second sdlab object)
  - this provides very easy passing of object labels/decisions back to all object pixels in an original image
- sdlda now supports user-specified class priors with 'prior' option
- speed up of pipeline execution by user-defined joining of pipeline steps with plus (+) operator (applicable to scaling, affine projections and neural nets)
- added support for long SQL statements (up to 32kB)
- added sddb istable method to test if a table exists in an SQLite data base
- sdfeatsel improvements
- sdversion adds a platform string to simplify platform-specific support
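What the object-level "class fractions" extractor from Version 4.4 computes can be shown with a small plain-Matlab sketch: for each object, the fraction of its pixels assigned to each class. The object ids, the decisions and the variable names are hypothetical, and the sketch does not call sdextract.

```matlab
% Hypothetical per-pixel object ids and per-pixel decisions (class indices).
obj = [1 1 1 1 2 2 2 3 3 3];
dec = [1 1 2 1 2 2 2 1 3 3];

nobj = max(obj);
ncls = max(dec);

% Count the pixels of each class inside each object ...
counts = accumarray([obj(:) dec(:)], 1, [nobj ncls]);

% ... and normalize each row so that it sums to one.
frac = counts ./ repmat(sum(counts, 2), 1, ncls);
```

Each row of frac is a feature vector describing one object, for example [0.75 0.25 0] for the first object above, which can then be used to classify the object as a whole.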
Version 4.3 (24-Jun-2014)
- fix for sdsvc problem with soft-output offset which impacted performance of multi-class classifiers. Two-class SVMs, followed with ROC, were not affected.
- more informative warning given for sdrandforest when the data set size requires larger number of nodes with 'maxnodes' parameter
Version 4.3 (26-May-2014)
- Complete perClass documentation is now available from within Matlab via doc command. In Matlab 2012b or later, you need to click on "Supplemental Software" to access documentation for 3rd-party toolboxes.
- new sdlogistic classifier with non-linear decision boundary due to polynomial feature space expansion
  - direct multi-class formulation, optimization by gradient descent
  - polynomial expansion accessible also separately via sdexpand
- sdextract feature extraction improvements
  - simplified syntax: sdextract(data,domain,feature,options)
  - domain string describes where features are computed ('region','object','color','bands')
  - feature name is always given as third argument
  - 'region' extractors (local image features in a sliding window)
  - 'object' extractors (one feature vector per object defined by object labels)
- sdsegment improvements (connected components in images)
  - significant speedup for large images
  - fixed problem in connected component labeling for complex shapes
- sdimage improvements
  - create new label set from current (copy labeling, 'N' key shortcut). Use-case: Quick labeling of image regions by hand. First, find connected components, copy labeling into a new set and use "Rename class" to assign meaningful class labels.
  - create new image from data of a specific class (or subset of classes by name/regular-expression with '/' key). Use-case: Quickly extract a subset of an image.
  - support for bounding boxes with 'bbox' option in GUI, sddata sets and matrix output
  - improved class rename command (support for class merging, does not change class colors), now bound to 'r' keystroke.
  - improved color handling (added commands for random color change for all classes ('R' key) or specific class ('.' key))
  - fixed error when drawing classifier decision with interactive ROC on a subset of an image (sdimage(sub,p,'roc'))
  - fixed a problem where it was not possible to switch back to the label layer with 'space' key
- sdscatter improvements
  - Improved handling of axes limits:
    - There is an "auto" mode and "manual" mode of axes limits
    - The default "auto" mode sets the limits of the entire data set (or to visible subset with 'v' keystroke)
    - The "manual" mode is toggled by standard Matlab zooming, panning or by setting the x- and y- limits via toolbar buttons or 'x'/'y' key-strokes
    - New toolbar button shows/switches the "auto"/"manual" modes (also via 'a' key-stroke)
  - fix in the label painting: New markers are chosen to be different than the existing ones, markers/colors do not change
  - fixed error in "new from current labels" command when label set already present
  - fix in sdconfmat opened from the scatter plot ('c' keystroke). It can now be closed with a standard keyboard shortcut.
- other improvements
  - easy access to image subsets with a( b ).lab='something' syntax when a and b are images of the same size. a(b) then means "return a subset of a with pixels available also in b".
  - fixed problem with sdrelab renaming large label sets (e.g. when loading large sets of file names into sdlab object (example: sdlab(dir('path/*jpeg'))) and extracting sub-strings using regular expressions)
  - sdcrossval for leave-one-out throws an error if 'seed' option is used (does not make sense in leave-one-out)
  - sdrelab makes sure that markers for new categories are different from already present ones (default markers are user-adjustable in default_markers.m function)
  - fix for multi-class sdsvc: colors are now set properly
- C/C++ API improvements
  - fixed performance timer problem on MS Windows platforms.
Version 4.2 (5-Mar-2014)
- sdconfmat visualization in a figure with 'figure' option provides easy visual representation of confusion matrix
- sdscatter improvements
  - show confusion matrix with visible subset of data ('c' key)
  - filter improvements
    - visual indication of all filters above the plot
    - invert filter on current labels ('i' key)
    - remove filter on current labels ('r' key)
  - tagging individual samples with double-click or 't' key.
  - create new set of labels from current labels ('N' key)
  - label visible samples as ... ('L' key)
  - switch label painting on and off with 'p' key
- sddrawroc visualizes confusion matrix in a separate figure with 'c' key
- sdscatter fixes
  - fixed feature plot with scatter misalignment
  - fixed issue with feature plot update with auto limits ('a' key)
  - when painting already existing class, use its marker
  - fixed 'previous filter' glitch when applying 'hide this class' command multiple times
- SDK
  - fix for compiler warning 'null character(s) preserved in literal' when embedding pipelines into C header file with sdexport 'header' option.
Version 4.1 (3-Dec-2013)
- classifier improvements and changes
  - new sdmixture implementation scalable to very large data sets
    - new fast automatic estimation of number of components
  - Gaussian models provide access to component labels p.complab and component subsets with p(1,ind)
  - crisp combination of classifier decisions
  - new least mean-square classifier sdlms
  - sdkmeans and sdknn set default threshold to accept all training data (not only prototypes)
  - sdscale allows manual construction of scaling pipelines
  - sdneural fix when training further from initial model
  - fix in lookup-table classifier sdlut which now rejects examples precisely at the border
- ROC improvements
  - multi-class ROC optimizers return richer sets of operating points (also for cost-sensitive optimization)
  - if sdroc is provided with a cost specification matrix, it is used to set the operating point by default
- feature extraction and selection improvements
  - convolution based local image features in sdextract (Gaussian and Gaussian derivatives, Sobel)
  - new extractor for spectral data sdbands supporting both mean and linear discriminant per band. Bands may be defined manually, sequentially or by clustering.
  - fix in sdfeatsel for backward selection using a classifier error as criterion
  - fix in sdextract for error when providing a mask
  - sdprox can be used to create data sets from a custom-computed proximity matrix. It adds all meta-data of prototypes as new feature properties of the output set.
- GUI improvements
  - sdimage uses colors defined in class lists
  - sdimage change color via context menu
  - sdscatter allows manual setting of x and y axis limits via toolbar buttons or 'x' and 'y' keystrokes
  - fix in sdscatter where changing marker failed when on filter subset
  - fix in sdscatter feature distribution plots (small classes do not disappear)
- core functionality
  - speedup of data set, label and list operations when many categories (thousands) are present
  - sdlab adds 'size'/'sizes' and 'fraction' fields
  - sdlab now supports construction with category + number of consecutive entries also when categories are repeating: sdlab('apple',4,'banana',3,'apple',2)
  - fix in sdimport when loading mixed real-value and nominal data set
  - sddata returns empty if feature does not exist
  - all objects now support isempty method (discussed in http://perclass.com/index.php/forums/viewthread/411/ )
- SDK
  - new API function sd_GetClassifierCount returns the number of classifiers in a pipeline (a pipeline describing a cascaded system contains multiple classifiers)
  - sdrun utility now displays the number of classifiers in a pipeline together with number of op.points and current op.point for each classifier
  - the sd_LoadPipeline function now returns error -226 if a pipeline file cannot be opened instead of reporting an "unknown pipeline format (-224)"
Version 4.0 (26-Jul-2013)
- support for nominal features and embedded SQLite database
  - create sddata sets from cell arrays with categorical data
  - train classifiers on categorical data
  - easy conversion between categories as strings/sdlab labels and numerical representation
  - new command sdnominal to check if the nominal representation is identical in data sets and classifiers
  - work with large local databases from within Matlab using SQL language
  - multiple active connections, multiple live queries per connection (live updates in underlying database available in open connections)
- licensing changes
  - introducing separate DB and imaging licensing options for perClass Toolbox
  - single license file per customer (multiple host definitions in one license file)
  - Enterprise version now has all functionality permanent (toolbox and SDK)
  - in addition to deployment dongles, deployment host-based licenses are available (no need for extra hardware)
  - new deployment option for servers
- new products
  - perClassNET - a .Net interface allowing calling perClass classifiers from C#, VisualBasic.Net, Excel etc.
  - perClass Connector - executing classifiers within the MS SQL Server environment via stored procedures
  - perClass OEM product for high volume deployment (contact sales@perclass.com)
- new pipeline functionality (making pipelines simpler to use and smarter)
  - all classifiers return decisions by default
  - pipelines store input feature labels (p.inlab) and descriptive output (p.output)
  - pipeline details via transpose operator (p')
  - unary minus removes decision step if present: p=sdfisher(data); soft_out=data*-p;
- data and label improvements
- classifier improvements
  - new fast implementation of sdneural scalable to thousands of samples
  - sddetector renamed into sddetect. By default, one-class detectors do not lose any training target.
  - simplified classifier combining with sdcombine, sdstack and [ ] concatenation
  - sdreject by default makes one-class detector accepting all training examples
  - arbitrary relabeling of classifier decisions with sdrelab
  - improved handling of sdsvc optimizer failures (adding NaNs in grid search results)
- ROC improvements
- GUI improvements
  - toolbar buttons for most common tasks
  - sdscatter zoom and pan
  - significantly faster opening of sdscatter and sdimage
  - better guidelines in sdimage crop
  - interactive color selector to change backdrop decision colors in sdscatter (press 'c' key)
  - marker colors (Matlab markeredgecolor) may be set in a data set using setmarkers
  - sdfeatplot unique-value mode ('u' key) uses stem plot by default
- feature extraction improvements
  - sdfeatsel speedup by supporting a user-defined number of steps and subset initialization
  - band extraction for spectral data
  - RGB2HSV color conversion
- Runtime API improvements
  - added sd_GetOpCount and sd_GetOp functions to retrieve details on operating points
Version 3.4 (9-Oct-2012)
- local image feature extraction with custom callbacks in sdextract (out=sdextract(data,'block',16,'feat',@my_extractor))
  - user-defined feature extractors may use additional parameters: out=sdextract(data,'block',16,'feat',@my_extractor,{'levels',8,'range',[0 256]})
- sdimage improvements
  - creating an RGB label image by painting labels/decisions into image regions: LI=sdimage(im,'labim')
  - label images may be blended with another RGB image with LI=sdimage(im,'labim','blend',origim)
  - shrink image data set to grid by a command in Image menu or using sdimage(im,'grid'). This allows us to easily inspect feature images.
- new sdsegment command to define connected components based on labels of an image data set
- support for regularization in sdgauss, sdlinear and sdquadratic. This is an alternative to dimensionality reduction. Regularization allows for training good models in problems with a limited amount of training data.
- sdcrossval support for precomputed proximity matrices. With 'prox' option, care is taken that both training and test sets are represented only by training prototypes.
- sdpca can optimize dimensionality based on error of a given classifier. Example: p=sdpca(data,sdlinear) returns PCA minimizing the sdlinear error
- sdexport got a 'no header' option and support for export of raw data matrices
- added support for direct assignments into feature and data properties: a.featlab(3)='moment'
- added sdlab support for concatenation with a string: [lab; 'aaa']
- fixed handling of features with constant values in sdscatter plots
- fixed a problem under Windows where scatter plot became unresponsive when changing dimensions
- fix of sdconfmat display output for normalized matrices
Version 3.3.1 (7-Aug-2012)
- sdlab object supports == and ~= operators for quick comparisons (example: get a number of errors with sum(a.lab~=dec))
- sdscatter toolbar legend button shows current data set legend
- fix for possible crash when creating nested stacked combiners
- fix for sdrandforest repeatability problem (example: rand('state',1); p=sdrandforest(data); gives identical results)
- fix for sdrelab display with multiple rounds of renaming
Version 3.3 (21-May-2012)
- sdversion now displays installation directory
- improvements of sdsvc
  - reporting a clear error message when libsvm optimizer does not find any solution
  - improving grid-search performance
  - removing output normalization for multi-class
- sdscatter improvements:
  - sample inspector is positioned next to the scatter figure
  - user may pass a parameter to a callback function which is executed when the user clicks on a data sample
- sdtree and sdrandforest show number of thresholds and base classifiers in the pipeline display string
- fix in sdlda improving performance for badly-conditioned data sets e.g. with binary values
- sdrelab now shows informative error message if relabeling map is not composed of input/output pairs.
- fixes of sddata class subset allowing logical or cell array: sub=a(:,:,{'banana'}) or sub=a(:,:,a.lab.list~='stone')
- sddecide fix to properly handle sddecide(p*r) call
- sdfeatplot adds 'absolute' option to visualize absolute frequencies instead of default relative ones
- sdrun MEX library can now return decision names with L=sdrun(pind,'list') call.
Version 3.2 (14-Mar-2012)
- improvements of interactive image view sdimage
  - interactive crop function
  - definition of connected objects. Small objects are by default tagged for easy removal.
  - custom level of label transparency may be set using 'alpha' option (sdimage(im,'alpha',0.4))
- hand-drawn polygon classifier improvements in sdscatter
  - polygon classifier directly returns decisions
  - inside/outside decisions may be changed from sdscatter
  - confusion matrix may be shown for the current subset of data
- feature handling improvements
  - sdfeatsel can find/remove features with zero variance
    - example: sdfeatsel(data,'var>0') returns features with non-zero variance
  - features may be selected in sddata and sdfeatsel using a cell string of names. Groups of features may be easily selected by substring with support for regular expressions. For example:
    - data(:,'/Moment') selects all features containing substring 'Moment'
    - sdfeatsel(data,{'Skew','~/Energy'}) selects 'Skew' and all features that do not contain 'Energy'
  - features can be removed from sddata given logical or cell array with names (data(:,data.featlab=='/Moment')=[])
Version 3.1.2 (22-Dec-2011)
- sdexport now supports export to C45 data format
- sdscatter visualization of sdtree decisions now allows interactively changing the number of thresholds with a slider
- fixing the problem with sdrun MEX which was returning an int, not a double, pipeline index (issue pC-1267)
- fixing the issue with setting the number of decision tree nodes
Version 3.1.1 (24-Nov-2011)
- sdscatter now shows a value of the soft output under the mouse cursor in the figure title
- polygon drawn in sdscatter figure may be saved also by pressing the 's' key
- fix of sdscatter problem when showing soft outputs
- sddecide now gives informative error message when given an empty ROC object (result of constraining when no operating point is available)
- fix of erroneous sdp_combine product combiner output
Version 3.1 (14-Nov-2011)
- fast and highly scalable decision tree (sdtree) and random forest (sdrandforest) classifiers
- polygon classifier may be drawn interactively in sdscatter figures
- classifier acceleration with 2D lookup table (sdlut) classifier
- sdfeatsel now supports feature selection based on trained decision tree
- sdimport supports reading of sdlab labels separately
- randsubset method supports bootstrap sampling
- sdexport displays minimum version of perClass runtime required to execute exported pipeline. This means that deployed runtimes may be updated only when needed.
- sdfeatplot adds interactive selection of feature threshold.
- sdcascade now supports classifier cascades as inputs
- Improvements of sdscatter displaying classifier decisions:
  - Decision under cursor is now shown in the Figure title
  - The color of the decision under cursor may be changed using 'c' key
  - Pipeline drawing decisions may be saved back into Matlab workspace with the 's' key. Pipeline contains changed decision colors.
- sdimage improvements:
  - show decisions of arbitrary pipeline in Matlab workspace
  - execute k-means clustering on image data using 'Cluster with k-means' menu command ('c' keystroke)
  - switching to a different class by pressing a digit
- fix for the sdscatter backdrop problem with flipped fonts (bug in Windows ATI drivers). Added Shift-A keystroke to switch off the alpha level (see http://perclass.com/index.php/forums/viewthread/260)
- fix for crash due to memory leak occurring when computing local histogram features for specific data ranges
- fix for internal use of tic/toc functions inside sdexe and classifier execution.
- fix for possible crash when using rejection-based operating point
- fix for the error raised when switching between multiple scatter/ROC plots
Version 3.0.0 (6-Jun-2011)
PRSD Studio is renamed into perClass (How to transition to 3.0)
new functionality for handling image data
- support for storing image data in data sets through sdimage command. Arbitrarily-shaped regions from multiple images may be stored in a sddata object. Single or multi-band images are supported.
- visualization of arbitrarily-shaped pixel subsets.
- texture and appearance features may be computed in local image regions using new sdextract function
  - support for user-defined grid (region size and step)
  - local histograms, features of local histograms, co-occurrence matrices
  - high-speed feature extraction (extracting 86000 co-occurrence matrices from 1024x1300 image takes 230 ms on a laptop)
new functionality for execution runtime
- significantly faster execution runtime
- labels and decisions are represented by integers, not doubles
- C API for precision timers (sd_Tic and sd_Toc) available to custom applications out of Matlab on all platforms
- new deployment tool sdrun for easy execution of trained classifiers using Matlab compiler. sdrun is implemented as a single statically-linked mex. To bring perClass classifier execution into a custom Matlab application you only need to copy one mex binary and include the pipeline and license files.
- new ASCII-based pipeline file format allows embedding of classifiers directly in source code (see ex_buffer.c SDK example)
new toolbox functionality
- new sdcluster function for direct clustering of a data set with user-defined model. sdcluster returns data set with cluster labels. Clustering is performed per class and original labels are preserved. See the new chapter on clustering.
- .* operator. Applying a pipeline returning decisions to a data set with .* operator returns a data set with decisions set as new labels. This is useful to get clustering results or image labels in one step. Example: b=a.*p is equivalent to dec=a*p; b=a; b.lab=dec;
- sdscatter improvements
  - 'show all' menu command for each property
  - save filter to workspace. Filter is stored as a structure which may be easily edited by hand and loaded back into sdscatter
- sdfeatplot improvements
  - left/right cursor keys move to first/last feature, respectively
  - 's' keystroke switches to stem-plot highlighting individual histogram bins
  - 'u' keystroke uses only unique values, instead of default histogram binning
  - 'a' keystroke sets automatic binning
  - 'x' keystroke allows specifying the name of a variable defining x-axis bins (e.g. logarithmic)
  - 'lab' option specifies the label set used (default: 'lab', example: sdfeatplot(data,'lab','tissue'))
  - 'bins' option allows bin specification from command-line
- sdrelab includes new 'all' option that sets all samples to a specific class. This works both for labels and data sets.
- sdrelab may rename pipeline decisions by providing a new list. Example: pd=sddetector(a,'target',sdgauss); pd2=sdrelab(pd,sdlist('accept','reject'));
- generate more data from a Gaussian model using sdgenerate. Example: p=sdmixture(data); b=sdgenerate(p,1000);
- sdconfmat header lines may be suppressed with 'no header' option. This is useful when concatenating multiple confusion matrices, e.g. for each patient into one larger table.
- sdknn accepts k also directly after data set as the second argument. Example: p=sdknn(a,10) instead of p=sdknn(a,'k',10)
- sdsvc accepts the type (linear, RBF or polynomial) as a direct parameter. Example: p=sdsvc(a,'linear')
- operating point marker and color in sddrawroc may be changed by the second parameter.
new core-level functionality
- length of sddata returns number of samples (see discussion at: http://prsdstudio.com/index.php/forums/viewthread/301)
- subset and sdrelab preserve the user-specified order of classes when processing a single set of labels.
- sorting label list using sdlist sort method or sdlab sortlist method. Only order of classes is changed, not the sample labeling.
- sddata find supports regular expressions to return sample indices. Example: ind=find(a,'/substring') returns indices of all samples with class name containing substring.
- direct assignments into sddata property. Example: a(1:10).lab='orange' or a.lab(1:10)='orange'
- label assignment supports also class indices. Example: lab(1:10)=2 assigns first ten samples to second class in the lab.list.
fixes:
- sdp_affine fix for scaling, labels optional
- sdlab does not include extra space in sdlab('Feature',1:10) constructor
- sdpca allows 1D output
- sdscatter window will not jump out of the screen when switching on the distribution plots
- sddata/randsubset and sddata/subset return subset indices in column order
Version 2.4.0 (7-Feb-2011)
- new execution utilities (commercial and academic full versions only)
  - execution of classifiers from Microsoft Excel worksheets (Windows only)
  - command-line utility sdrun for direct execution of classifiers outside Matlab (all platforms)
  - GUI execution demo (Windows only)
  - LabView interface example
- toolbox improvements
  - improvements in horizontal label concat (omitting internal spaces + scalability to very large data sets (one million samples under half a second))
  - fixing sdmixture problem where training set priors were not used by default
  - sdtree classifier adds 'levels' option that may significantly speed up training
  - PRTools AdaBoost classifiers with decision tree or stump base learners may be converted into pipelines using the sdconvert command.
Version 2.3.0 (13-Dec-2010)
- sdcrossval now provides per-fold measurements of execution speed
- fixing a bug in sdkmeans that could cause crash for multi-class data sets with high overlap
- fixing the problem in sdscatter where regular expression could not be applied to a subset of samples
- fix for the call subset(data,'lab',{}) which was not throwing error
- randsubset now works for PRTools datasets
Version 2.2.5 (24-Nov-2010)
- regular expressions allow simple definition of data subsets and sdrelab. Strings starting with slash / character are interpreted as regular expressions. For example: subset(data,'/good') returns all classes containing the word 'good'.
- sdscatter enhancements:
  - undo the last label painting operation (u key or Undo painting command in scatter right-click menu)
  - cycle through all classes showing one at a time (< and > keys)
  - select class subset by regular expression (/ key)
  - class to top (t key)
- new dissimilarity measures in sdprox (Spectral Angle Mapper, Kolmogorov distance, Match distance)
- sdmindist classifier directly applicable to dissimilarity representations
- sdfeatplot enhancements:
  - allows selection of a label set used for plotting the per-group distributions. sdfeatplot(data,'lab','patient') will show per-patient histogram for each feature.
  - change of default behaviour: sdfeatplot now uses all data to construct histograms, use 'maxsamples' option to limit sample count used for large data sets.
  - fixing the problem in sdfeatplot related to constant-value features; sdfeatplot now also shows the constant feature value if present.
- sddrawroc supports interactive zooming
- fix in sdscatter sample inspector showing correct labels when focusing on a sample subset
Version 2.2.4 (5-Oct-2010)
- support for Mac OS X 64-bit platform
- new sdimport command for loading sddata objects from text files. User may specify what columns correspond to data matrix, labels and additional sample properties.
- sdexport command can store sddata in a comma-separated file
- sdsvc support for linear and polynomial kernels including automatic grid search
- support for incremental Support Vector Data Description (incsvdd) from DD_Tools.
- adding support for creating sdlab labels using a vector and class names
- sdfeatplot allows user definition of line styles used for plotting class-feature distributions
- sdp_affine turns empty offsets into zero vectors
- sddetector now supports test sets also in one-class mode ('reject' and 'test' options used together)
- fix of a bug related to scaling proximity data with sdscale
- fix of a bug in sdprox where prototypes were unnecessarily sorted
Version 2.2.3 (29-July-2010)
- sdsvc allows identifying training samples that became support vectors (using original property of support vectors set p{1}.proto)
- sddetector support for externally defined test set using test option
- sdfeatsel floating search provides history of feature subsets selected by individual steps.
- sdfeatsel adds a test option which may be used to supply external data set used for evaluation of 1-NN error criterion
- sddecide allows construction of an operating point manually. Support for both weighting-based discriminants and thresholding-based detectors.
- sdsvc support for setting external data set used for error estimation in parameter grid search with test option
- sddata supports cell array properties
- sdscatter user callbacks are now accessible using 'callback' option
- untrained classifier pipelines now return names using getname
Version 2.2.2 (22-June-2010)
- fast feature selection sdfeatsel scalable to large data sets (forward search with 1000 samples, 50 features under five seconds). Individual, random selection, forward, backward and floating searches are supported using 1-NN error on a validation set as a criterion. Feature subset size is selected automatically.
- sdscatter can call a user-defined function when a data sample is clicked. This allows custom visualization such as loading an image corresponding to a sample from disk and showing it in a separate figure.
- support for untrained high-level operations on data (subset, randsubset, sdrelab, sdroc). This allows one to easily express complex sequences of training operations.
- extended sdscale supporting also robust domain scaling (robust in presence of outliers)
- cascades may now be trimmed to return output after a specific stage using sdconvert, e.g. pc2=sdconvert(pc,'until',2). This helps us to understand how the later stages of hierarchical classifiers improve performance.
- experimental support for Mac OS X 64-bit platform
- sdscatter fix for decision colormap when showing classifier decisions
Version 2.2.1 (3-May-2010)
- interactive visualization of feature distributions in sdscatter for both axes (use 'Show feature distribution' in 'Scatter' menu or press 'd'). This greatly simplifies understanding of overlap in very large data sets where scatter plot is not too informative.
- sdkmeans classifier and clustering scalable to very large data sets (1 million samples, 10 clusters in 3.3 sec). sdkmeans provides fast prototype selection method for k-NN classifiers. Classification performance is further improved by prototype pruning (similar effect to editing the training set).
- sdkcentres classifier and clustering
- randsubset allows limiting the maximum number of samples using 'atmax' option. This is useful to limit sample size but tolerate that some classes have fewer samples.
- find and subset now allow that some of the class names do not exist and return what is present (and not empty [] as before)
Version 2.1.0 (21-Apr-2010)
- fixing a bug in sddecide related to adding an operating point in an ROC object
- fixing an error message in sdlab constructor
- adding RBF support vector machine training using sdsvc command. sdsvc is based on libSVM and offers automatic grid search for sigma and C parameters and one-against-all multi-class support.
- adding a reject option to a trained discriminant using the sdreject function (also for multi-class classifiers; both outlier rejection and rejection close to the decision boundary)
- sdcrossval support for estimating ROC with variances using operating point averaging (cross-validate pipeline returning soft outputs and provide fixed operating points using the 'ops' option)
- adding sdcrossval support for custom sdalg algorithms that are not convertible into a pipeline (algorithm needs to return the list of all possible decisions)
- sddrawroc now saves complete sdroc objects back in the workspace, not only operating points (by pressing 's' key)
- sddecide support for default op.point based on thresholding (e.g. for sdsvc on two-class problems)
- support for clustering using sdmixture with 'cluster' option
- sdscatter adding the "show only this class" command (press 'o' key)
- default mean-error performance measure in sdcrossval is no longer included if user requests a specific set of measures
- sdneural may switch off the default use of validation for teaching purposes (to illustrate overfitting of the network). Use 'valfrac',[] to suppress the use of validation set.
- fixing the problem with sdroc using 'confmat' and 'reject' options together
- fixing the bug in sdlab constructor for single label per class
- improving compatibility with PRTools (sdimage, sddetector, sdreject, sdcrossval, sdstackgen, sdscatter visualizing images using sample inspector)
Version 2.0.9 (8-Mar-2010)
- adding support for subset by logical array for sddata and sdlab objects (example: a( a.lab=='banana' ))
- sdtest raises a warning if some of the true classes are not matched to classifier decisions (all samples from these classes are considered misclassified)
- fixed sdscatter problem with the order of classes in "class on top" and "change markers"
- usability improvements in sdfeatplot (click to change figure title; legend properly displaying special characters)
- 'mean-error' performance measure may specify optional class priors used for weighting the class errors
- global display verbosity may be handled using prsd_display command (use prsd_display off to switch off display output of PRSD Studio functions).
- 'nodisplay' options added in sdmixture, sdparzen, sdcrossval
- randsubset supports random selection of objects from some classes only (example: [tr,ts]=randsubset(a,[0.5 0]) returns 50% of the first class for the training)
- sdcrossval outputs string with the result summary, result struct and the evaluation object.
Version 2.0.8 (19-Feb-2010)
- fix in sdimage for multi-dimensional images (image cubes)
- pipelines now provide operating points via p.ops field
- API interface simplification and cleanup
- low-level output of pipelines on matrices and using C API returns indices to decision list as decisions, not the internal codes
- sdlist and sdlab internal numerical representation is not exposed to the user anymore
- feature selection pipeline sdp_fsel now may get the feature labels directly from the data set: pf=sdp_fsel(data,[3 4])
- sddetector handles output polarity automatically (k-NN output is distance, mixture output is similarity)
- adding easy display of sdlab object details (class sizes, fractions) using the transpose operator (lab')
Version 2.0.5 (22-Dec-2009)
- classifier output visualization using sdscatter can now switch between different soft outputs interactively using cursor keys
- added constrain method for easy application of ROC performance constraints
- enhanced setcurop method to choose operating point minimizing or maximizing specific performance measure or setting op.point based on costs
- new performance measure nconfmat - the entry in normalized confusion matrix
- 'target' and 'non-target' options in sddetector and sdroc setting the desired target/non-target names
- setstate method in sdalg algorithm allows calling the algorithm function directly (instead of using the multiplication operator)
Version 2.0.4 (14-Dec-2009)
- sdrelab now allows adding a string prefix to all classes in all labels present using 'add to all' option. This makes it easy to compare two data sets with multiple labelings (classes, patients, tissues).
- adding sdscale command for data scaling
Version 2.0.3 (9-Dec-2009)
- isclass method for quick check if certain classes are present (useful for custom algorithms)
- sdnorm function adding normalization step to a trained pipeline (this constructs a general discriminant)
- sdlab fix for incorrect class size when initialized with a list and indices
- adding initial version of auto-conversion for older-format @sdppl/sdppl and @sdops/sdops objects
- fix for the inMathOverflow warning/error in sdtree training
Version 2.0.2 (4-Dec-2009)
- new sdlab object simplifies handling of labels, decisions and indexed meta-data
- new sddata object brings easy handling of sample meta-data
- multiple sets of labels or meta-data in a dataset, unified access to sample properties
- simple queries using multiple criteria (give me all samples labeled as "Cancer" from patient 1,2 and 5 using subset(a,'class','Cancer','patient',[1 2 5]))
- access to classes is greatly simplified
- sdroc handles classifier output polarity automatically (sdexe stores the output type in output_type data property)
- user may change class markers. Data set remembers class markers. Scatter markers are stored in the 'marker' property inside the class list.
- dissimilarity representation contains as feature properties all prototype sample properties
- labels and decisions may be easily concatenated. This allows us to add new labels with a breakdown of errors (confusion-matrix entries) in one command.
- writing custom sdalg algorithms is significantly simplified
1.x Compatibility changes
- sdppl objects use new internal format.
- sderror replaced by sdtest
Version 1.3 (30-Nov-2009)
- fix in sdnmean classifier: now computing pooled diagonal covariance using class priors
- adding missing parse_measures.p file
- fixing p-code compatibility problem with Matlab 7.4
Version 1.2.5 (12-Oct-2009)
- fixes in findprop for numerical properties
- adding 'all' and 'nodisplay' options to sdrelab
Version 1.2.4 (13-Aug-2009)
- sdtree implements training of decision tree classifier scalable to large number of samples
- fix in prsd_feedback correcting the problem with PRTools not on Matlab path
Version 1.2.3 (15-Jul-2009)
- visualization: sdscatter provides more detailed information in sample inspector including all sample meta-data
- sdrelab: adding prefix or suffix to all class names.
- sdrelab: renaming a single class by relative index
- simpler installation: PRSD Studio Lite installation no longer requires software activation
- sdroc: support for reject option on classifiers with distance soft output (sdknn)
- selprop, findprop support for set of property values defined by cell array
Version 1.2.2 (16-Jun-2009)
- libPRSD: support for AdaBoost execution using decision tree as base classifiers
- visualization: sdscatter allows interactive change of classifier parameters using slider (k in k-NN, smoothing in Parzen, number of base classifiers in AdaBoost)
- visualization: sdimage may be connected to ROC plot and visualize decisions at different operating points in real-time
- sdneural provides target option that allows one to approximate trained classifiers
- sdroc: fraction of all objects may be rejected by specifying fraction after reject option
Version 1.2.1 (27-May-2009)
- sdnbayes implementing Naive Bayes classifier with automatic selection of number of histogram bins
- sdroc now supports cost-based selection of operating point for two-class scenario (in addition to the existing multi-class cost-based optimization)
- sddecide may be used in pipelines to define default operating point
- sdp_affine can construct simple feature scaling pipelines
Version 1.2 (19-May-2009)
- sdmixture supports automatic estimation of number of components
- sdneural implementing feed-forward neural network training
- sdcrossval now supports untrained pipelines
Version 1.1.6 (9-May-2009)
- sdparzen Parzen classifier implementing scalar and vector smoothing
- sdknn k-th nearest neighbor classifier with prototype selection and support for both detection and multi-class classification
Version 1.1.5 (1-May-2009)
- libPRSD now supports loading pipelines also from a buffer using prsd_LoadPipelineFromBuffer (pipelines may be now stored in application resources or sent over network).
- sdroc supports rejection both far away and close to the decision boundary using the reject option.
- sdscatter: the figure title may be selected interactively by clicking on the title area
- simplified selection of performance measures in sdroc
Version 1.1.4 (19-Mar-2009)
- adding support for group licenses via license server
- support for construction of arbitrary hierarchical classifiers using decision-level fusion and their execution through libPRSD
- sddetector brings one-command construction of detectors based on arbitrary model (both in one-class setting specifying a threshold using fraction of rejected samples and in two-class setup using ROC analysis to fix the threshold minimizing mean error).
- sddrawroc allows saving the current operating point into any relevant object (sdroc, sdops, pipelines, sddecide mappings, custom sdalg algorithms)
- introducing sdmixture for training Gaussian mixture models (one- or multi-class, variable number of components per class, different stopping criteria (iterations or likelihood delta))
- sdrelab allows defining classes by the ~ (tilde) negation operator (e.g. turn all that is not apple into "non-apple")
- sdscatter allows the user to flip through order of classes (z-order) by + and - key-strokes
- number of usability improvements in construction of pipelines and interaction with PRTools (sdroc and sdops objects can be now directly concatenated into pipelines; sdmap wraps pipelines for use in PRTools)
- many improvements in confusion matrix estimation: sdconfmat
- setprop now allows quickly setting a property to a constant value. This makes it very easy to quickly tag a group of samples with a specific label.
- sdconfmat can now add new labels with all confusion matrix combinations as a property. This can be used to quickly visualize different types of error directly in the feature-space
- sdconfmat cosmetic fix: string confusion matrix scales nicely with long class names
- new function selprop returning a subset of a dataset with given property values
returning a subset of a dataset with given property values - significant improvements in scalability of sdroc to large datasets in speed and memory usage. Practical even for datasets with 100 000 samples and tens of thousands of operating points.
- improved ROC optimizer brings better quality sets of operating points
sdconfmat
can now estimate confusion matrices for sets of operating points from the soft outputssdexe
can return numerical decision codes ('code' option). This is useful for low-level work with classifier outputs.- pipelines can return numerical decision codes using
.*
operator (e.g.dec=data.*p
) sdeaclust
clustering can be now executed on new data. Scalable to very large datasets (images).
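The idea behind labelling samples with confusion matrix combinations (noted above for sdconfmat) can be illustrated with a minimal sketch in plain Matlab. It does not call sdconfmat or any other perClass function; the variable names (truth, decision, combo) and the 'true-decision' label format are assumptions made for illustration only.

```matlab
% Hypothetical per-sample ground truth and classifier decisions.
truth    = {'apple'; 'apple'; 'pear'; 'pear';  'apple'};
decision = {'apple'; 'pear';  'pear'; 'apple'; 'apple'};

% One combined label per sample, e.g. 'apple-pear' for an apple decided as pear.
combo = strcat(truth, '-', decision);

% Count how often each combination occurs (names are sorted alphabetically).
[names, ~, idx] = unique(combo);
counts = accumarray(idx(:), 1);

for i = 1:numel(names)
    fprintf('%-12s %d\n', names{i}, counts(i));
end
```

Each distinct value of combo corresponds to one cell of the confusion matrix, so using it as a sample label lets each type of error be highlighted separately in a scatter plot.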
Version 1.1.3 (26-Jan-2009)
- fix in sdscatter allowing to paint labels with legend switched on
- fix in sdscatter retaining the type of numerical properties in a dataset saved back to workspace
- sdscatter can now switch visibility of classes or groups on/off. That's helpful when inspecting large datasets with many overlapping sample groups (patients). See context menu in sdscatter Figure windows. Painting now applies only to visible samples.
- initial support for hierarchical systems composed of multiple classifiers returning decisions (sdp_cascade)
- support for meta-classes and different features at each classifier node.
- ROC analysis for hierarchical systems
- sdconfmat added
- the order of labels and decisions (lablists) can be fixed by the user
- sdconfmat can correctly handle situations where only some classes/decisions are present in the test set (given the full lablists)
- sdconfmat can return the string with a table
- support for normalization of confusion matrices
- lablists may be supplied as cell arrays of strings or string arrays
- support for weight-based operating points with reject option (rejection both close to the boundary and distance-based)
- sdroc automatically shows rejected fraction and all per-class TPrs
- support for similarity-based and distance-based classifier outputs
- adding reject fraction estimate to sdroc
- support for leave-one-out over a property (object, person, patient...)
- fix for the bug where sdscatter raised an error when the mouse pointer was moved too quickly over the new window
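Leave-one-out over a property, listed above, means that all samples sharing one property value (for example one patient) are held out together in each fold. The following is a minimal sketch in plain Matlab under that assumption. It does not use the perClass API; the names data, patient and n_test are hypothetical.

```matlab
% Hypothetical data: 7 samples, 2 features, each sample tagged with a patient id.
data    = rand(7, 2);
patient = [1 1 2 2 2 3 3]';

ids    = unique(patient);
n_test = zeros(numel(ids), 1);

for i = 1:numel(ids)
    test_idx  = (patient == ids(i));   % all samples of the held-out patient
    train_idx = ~test_idx;             % samples of all remaining patients

    % A classifier would be trained on data(train_idx,:) and
    % evaluated on data(test_idx,:) at this point.
    n_test(i) = sum(test_idx);
    fprintf('fold %d: %d training, %d test samples\n', ...
            i, sum(train_idx), n_test(i));
end
```

Splitting by patient rather than by sample keeps samples of the same patient from appearing in both the training and the test set, which would otherwise give optimistic error estimates.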
Version 1.1.2 (18-Nov-2008)
- adding fast approximated k-NN (see the example in our blog)
- adding a k-centres classifier capable of both one-class classification and multi-class discrimination
- feature selection algorithm sda_featsel now also supports backward feature selection
Version 1.1.1 (09-Nov-2008)
- adding leave-one-out evaluation to sdcrossval
- adding sdfeatsel: robust feature selection using internal cross-validation loop. It supports custom-made feature selection algorithms
- two example algorithms added illustrating the use of feature selection during training (sda_featsel_example1) and in an inner cross-validation loop based on sdfeatsel (sda_featsel_example2)
Version 1.1 (04-Nov-2008)
- fixing critical bug in 1.1 26-Oct-2008 related to problem with dongles
- fixing the issues with one-sample test sets in ROC
Version 1.1 (26-Oct-2008)
- sdscatter gets full support for GUI menus and class renaming
- new sdimage command visualizing an image stored in a dataset. Support for label painting, class renaming, multiple sample groupings, connection to sdscatter
- sdscatter support for interactive sample inspector (datasets with 1D data using bar plot or 2D images)
- sddrawroc can now show confusion matrices at the cursor and at the selected operating point (if present, i.e. if the 'confmat' flag was specified in the sdroc command)
- sdexe now automatically converts sdalg algorithms into pipelines
- sdstackgen now also returns a robust base classifier (mean fusion of per-fold trained base classifiers) as a second output
- improved support for prtools classifiers with output conversion
- fix: sdroc now stores confusion matrices in multi-class situations using the 'confmat' flag
- fix to scaling using affine projection. scalem is now supported for all affine scaling types
Version 1.0 (15-Sep-2008)
- added randomization cross-validation scheme: sdcrossval(nmc,data,'method','random')
- ROC object may be queried using short names of measurements: r(:,'err(Cancer)')
- activation support for commercial demos
Version 1.0 (02-Sep-2008)
- fix: included missing sda_prtools wrapper
- new feature: sdscatter now allows for user-defined titles (sample details moved to the figure title bar)