Feature extraction, table of contents
- 8.3.1. Introduction
- 8.3.2. Quick example of the entire process
- 8.3.2.1. Preparing pixel classifier
- 8.3.2.2. Applying pixel classifier
- 8.3.2.3. Segmenting objects
- 8.3.2.4. Extracting object features
- 8.3.3. Object features
- 8.3.3.1. Object size
- 8.3.3.2. Mean of object pixels
- 8.3.3.3. Sum of object pixels
- 8.3.3.4. Histogram of a specific input feature per object
- 8.3.3.5. Shape features on object mask
- 8.3.3.6. Shape features on object content
- 8.3.3.7. Example of computing per-object histogram of local gradient
- 8.3.4. Copying labels into object data set
- 8.3.5. Bounding box of objects
8.3.1. Introduction
Object feature extraction derives one feature vector for each object identified in the original data set.
A typical use-case is an image-based object recognition task such as defect detection. The original image is first processed by local (region) feature extraction. A local classifier is then trained to distinguish, for example, object from background, or to detect a defect. When applied to a new image, each pixel is assigned to one of the classes of interest (e.g. object/background).
In the second step, the decisions are spatially segmented using connected component analysis. Each object (connected spatial region) is then represented by a set of features, and an object-level classifier is built. This allows us to distinguish objects based on content (original local features) or shape.
This chapter concerns the derivation of an object-level representation using the sdextract command.
8.3.2. Quick example of the entire process
We will use the dice data set.
>> I=sdimage('dice01.jpg','sddata')
101376 by 3 sddata, class: 'unknown'
We will:
- Prepare pixel classifier
- Apply it to an image
- Segment connected objects from its decisions
- Extract object features
8.3.2.1. Preparing pixel classifier
First, we will train a dice/background local classifier. We will paint dice and background labels:
We save the labeled image back to the Matlab workspace into the variable I by pressing the 's' key or via the Image menu / Create data set in workspace.
>> Creating data set I in the workspace.
101376 by 3 sddata, 2 classes: 'dice'(4020) 'background'(97356)
We will train a Gaussian detector on the 'dice' class and make sure the non-targets are still called 'background' (without the 'non-target' option, the default would be 'non-dice'):
>> pd=sddetect(I,'dice',sdgauss,'non-target','background')
1: back -> background
2: dice -> dice
sequential pipeline 3x1 'Gaussian model+Decision'
1 Gaussian model 3x1 full cov.mat.
2 Decision 1x1 ROC thresholding on dice (2000 points, current 1)
It is useful to visualize the detector behaviour on an image:
>> sdimage(I,pd,'roc')
We will set the detector operating point to allow some errors in the background, rather than having holes in the dice. By pressing 's' we can save the setting back into the pd detector pipeline.
8.3.2.2. Applying pixel classifier
Our dice detector may now be applied to a new image (open the sdimage figure, use "Apply classifier" from the "Image" menu, or press 'd' and enter pd as the name of the pipeline to apply):
>> I2=sdimage('dice02.jpg','sddata')
101376 by 3 sddata, class: 'unknown'
>> sdimage(I2)
8.3.2.3. Segmenting objects
We will segment the objects based on the decisions with the "Image" / "Connected components" / "Find connected component" command.
Segmented objects are defined by a new set of 'object' labels. We can save the data set in the Matlab workspace as A:
>> Creating data set A in the workspace.
101376 by 3 sddata, 17 'object' groups: [135 996 932 895 126 939 947 126 377 915 156 212 158 290 1512 92583 77]
>> A.lab'
ind name size percentage
1 dice-object 30 135 ( 0.1%)
2 dice-object 43 996 ( 1.0%)
3 dice-object 57 932 ( 0.9%)
4 dice-object 75 895 ( 0.9%)
5 dice-object 82 126 ( 0.1%)
6 dice-object 87 939 ( 0.9%)
7 dice-object 98 947 ( 0.9%)
8 dice-object 102 126 ( 0.1%)
9 dice-object 103 377 ( 0.4%)
10 dice-object 105 915 ( 0.9%)
11 dice-object 121 156 ( 0.2%)
12 dice-object 122 212 ( 0.2%)
13 dice-object 138 158 ( 0.2%)
14 dice-object 145 290 ( 0.3%)
15 dice-small objects 1512 ( 1.5%)
16 background-object 1 92583 (91.3%)
17 background-small objects 77 ( 0.1%)
The object labels group pixels corresponding to each spatially connected object.
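The segmentation itself is performed by the toolbox. Purely as an illustration of what connected component labeling computes, here is a minimal pure-Python sketch (assuming 4-connectivity; the toolbox's actual connectivity and small-object handling may differ):

```python
from collections import deque

def connected_components(mask):
    """Label 4-connected regions of nonzero pixels in a 2-D grid.

    Returns a grid of integer labels: 0 = background, 1..N = objects.
    """
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for sr in range(h):
        for sc in range(w):
            if mask[sr][sc] and labels[sr][sc] == 0:
                current += 1              # start a new object
                labels[sr][sc] = current
                queue = deque([(sr, sc)])
                while queue:              # flood-fill the whole component
                    r, c = queue.popleft()
                    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < h and 0 <= nc < w
                                and mask[nr][nc] and labels[nr][nc] == 0):
                            labels[nr][nc] = current
                            queue.append((nr, nc))
    return labels

# Two separate blobs of detector decisions -> two 'object' groups.
mask = [[1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 1]]
labels = connected_components(mask)
```

Each spatially connected blob of positive decisions receives its own label, which is what the per-object grouping above relies on.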
A command-line alternative is:
>> A=sdsegment(I2.*pd,'minsize',100)
101376 by 3 sddata, 17 'object' groups: [135 996 932 895 126 939 947 126 377 915 156 212 158 290 1512 92583 77]
8.3.2.4. Extracting object features
Object features may be extracted with sdextract using:
>> obj=sdextract(A,'object','mean')
Extracting a feature vector from each of 17 objects ('object' labels):
17 by 3 sddata, class: 'unknown'
For the general syntax of sdextract, see the sdextract reference.
The mean object extractor computes one mean vector for each object, defined by 'object' labels. In our RGB image example, this corresponds to mean color information.
We open the extracted data set obj in a scatter plot. We can observe that the set of dice objects is separated from the green background (switch to 'object' labels in the 'Scatter' menu).
>> sdscatter(obj)
ans =
2
You can visualize a specific object in sdimage:
>> sub=A(:,:,'/57')
932 by 3 sddata, 'object' lab: 'dice-object 57'
>> sdimage(sub)
The object-level classifier would then be trained on the properly labeled data set obj.
8.3.3. Object features
8.3.3.1. Object size
- name: 'size'
- description: Size of an object
- use case:
- Classify objects by size (you may, e.g., tune the threshold with ROC analysis instead of fixing it manually)
- applies to: Input features do not matter, only object size
- required parameter: None
- optional parameters: None
- output dimensionality: 1
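As an illustration of what the 'size' extractor computes, here is a minimal pure-Python sketch with hypothetical per-pixel object labels (the toolbox derives these labels during segmentation):

```python
from collections import Counter

# Hypothetical per-pixel object labels, one entry per segmented pixel.
object_labels = ['obj1', 'obj1', 'obj2', 'obj1', 'obj2', 'obj2', 'obj2']

# 'size' extractor: one scalar feature per object = its pixel count.
sizes = Counter(object_labels)
```

The input feature values are ignored; only the per-object pixel counts matter.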
8.3.3.2. Mean of object pixels
- name: 'mean'
- description: Mean vector of each object
- use case:
- Handle uniform types of objects
- applies to: All input features
- required parameter: None
- optional parameters: None
- output dimensionality: Identical to input data set
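A minimal sketch of what the 'mean' extractor computes, using hypothetical RGB pixels tagged with object labels (in the dice example above, this yields one mean color per object):

```python
from collections import defaultdict

# Hypothetical (RGB vector, object label) pairs for segmented pixels.
pixels = [([10, 20, 30], 'obj1'),
          ([30, 40, 50], 'obj1'),
          ([0, 100, 0], 'obj2'),
          ([0, 80, 0], 'obj2')]

# Group pixel vectors by object label.
groups = defaultdict(list)
for vec, lab in pixels:
    groups[lab].append(vec)

# 'mean' extractor: one mean vector per object, same dimensionality
# as the input data set.
means = {lab: [sum(channel) / len(vecs) for channel in zip(*vecs)]
         for lab, vecs in groups.items()}
```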
8.3.3.3. Sum of object pixels
- name: 'sum'
- description: Vector with sum over each object
- use case:
- Per-pixel values carry evidence (e.g. a soft classifier output such as confidence)
- applies to: All input features
- required parameter: None
- optional parameters: None
- output dimensionality: Identical to input data set
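A minimal sketch of the 'sum' extractor in the evidence-accumulation use case, with hypothetical per-pixel confidences (the toolbox sums each input feature over all pixels of an object):

```python
from collections import defaultdict

# Hypothetical (soft detector output, object label) pairs.
confidences = [(0.9, 'obj1'), (0.8, 'obj1'), (0.1, 'obj2'), (0.2, 'obj2')]

# 'sum' extractor: accumulate per-pixel evidence into one value per object.
totals = defaultdict(float)
for value, lab in confidences:
    totals[lab] += value
```

An object made of many confident pixels ends up with a large total, which an object-level classifier can threshold.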
8.3.3.4. Histogram of a specific input feature per object
- name: 'hist'
- description: Each object is represented by a distribution over a specific input feature
- use case:
- For objects that exhibit characteristic internal structure (e.g. separate black and white objects from gray background)
- The chosen input feature may highlight a different aspect of interest (e.g. per-object histogram of local orientation or gradient)
- applies to: Single input feature
- required parameter: Data range with 'range' option
- optional parameters: Number of histogram bins (default: 8)
- output dimensionality: Number of bins
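A minimal sketch of the 'hist' extractor's binning for one object, given the 'range' and 'bins' parameters. Assumptions: values are clamped to the range and counts are normalized to per-object probabilities (consistent with the gradient example later in this chapter); the toolbox's exact bin-edge handling may differ.

```python
def object_histogram(values, value_range, bins):
    """Fixed-range histogram of one input feature, normalized per object."""
    lo, hi = value_range
    counts = [0] * bins
    for v in values:
        # Clamp to the range, then map linearly to a bin index.
        idx = int((min(max(v, lo), hi) - lo) / (hi - lo) * bins)
        counts[min(idx, bins - 1)] += 1  # top edge falls into the last bin
    total = len(values)
    return [c / total for c in counts]

# Gradient magnitudes of one hypothetical object: range [0, 255], 3 bins
# (low / middle / high gradient).
h = object_histogram([10, 20, 120, 130, 140, 250], (0, 255), 3)
```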
8.3.3.5. Shape features on object mask
- name: 'shape'
- description: The shape of each object is represented by several moment-based features
- use case:
- For characterizing shape of objects
- applies to: Any input feature count (data content is not used, only object shape)
- required parameter: None
- optional parameters: None
- output dimensionality: 9
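The toolbox does not spell out which 9 moment-based features it uses; as an illustration of the underlying machinery, here is a pure-Python sketch computing a few common moment descriptors (area, centroid, and normalized second-order central moments) from a binary object mask:

```python
def shape_moments(mask):
    """A few moment-based shape descriptors of a binary object mask.

    Returns (area, centroid_row, centroid_col, mu20, mu02, mu11),
    where muXY are central moments normalized by area. This is only
    an illustration; the toolbox's actual 9 features may differ.
    """
    pts = [(r, c) for r, row in enumerate(mask)
           for c, v in enumerate(row) if v]
    area = len(pts)
    cr = sum(r for r, _ in pts) / area      # centroid row
    cc = sum(c for _, c in pts) / area      # centroid column
    mu20 = sum((r - cr) ** 2 for r, _ in pts) / area
    mu02 = sum((c - cc) ** 2 for _, c in pts) / area
    mu11 = sum((r - cr) * (c - cc) for r, c in pts) / area
    return area, cr, cc, mu20, mu02, mu11

# A 2x3 rectangle of object pixels.
feats = shape_moments([[1, 1, 1],
                       [1, 1, 1]])
```

Central moments are invariant to the object's position in the image, which makes them convenient shape descriptors.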
8.3.3.6. Shape features on object content
- name: 'gray shape'
- description: The shape and content of each object are represented by several moment-based features
- use case:
- For characterizing shape of objects by computing moment invariants on gray-level object content
- applies to: 1 (single feature/band used for computing gray-level moment invariants)
- required parameter: None
- optional parameters: None
- output dimensionality: 9
8.3.3.7. Example of computing per-object histogram of local gradient
In this example, we separate dice objects from background regions based on their strong gradient structure. We characterize it by a per-object histogram of the gradient.
We apply the dice detector trained above and segment objects, preserving even single-pixel islands:
>> B=sdsegment(I2.*pd,'minsize',1)
101376 by 3 sddata, 182 'object' groups
We smooth the image with a Gaussian filter and then extract the gradient with the Sobel operator:
>> G=sdextract(B(:,1),'region','gauss','sigma',2)
block: 8, sigma: 2.0 yields kernel coverage 91.5%
96945 by 1 sddata, 167 'object' groups
>> S=sdextract(G,'region','sobel')
95697 by 2 sddata, 167 'object' groups
For more details on local filter features, see this chapter.
When we visualize the local gradient (the first feature in S), we can observe high variability within the dice objects and low variability in the background:
>> sdimage(S)
Therefore, we extract per-object histograms, binning the values into three output features (very low, middle, and high gradient):
>> obj=sdextract(S(:,1),'object','hist','range',[0 255],'bins',3)
Extracting a feature vector from each of 167 objects ('object' labels):
167 by 3 sddata, class: 'unknown'
We obtain one output feature vector for each of the 167 connected components in the data set S.
>> sdscatter(obj)
The dice objects have a low probability of low gradients (1st feature) and a high probability of middle gradients (2nd feature). The segmentation islands in the background fall into the high-probability region on the first feature.
8.3.4. Copying labels into object data set
When extracting features per object with sdextract, it is often useful to copy object-related meta-data into the output set.
In our dice example, we may wish to preserve the information about which image a specific object comes from. We may already have this information available in the original image set, e.g. as an image label:
>> I=sdimage('dice01.jpg','sddata')
101376 by 3 sddata, class: 'unknown'
>> I.image
sdlab with 101376 entries from 'dice01.jpg'
Any data set created by region feature extraction will also contain this information:
>> B=sdsegment(I.*pd,'minsize',100)
101376 by 3 sddata, 18 'object' groups: [115 134 102 948 998 114 926 158 943 934 246 997 358 126 109 1532 92579 57]
>> sub=B(:,:,'/74')
926 by 3 sddata, 'object' lab: 'dice-object 74'
>> sub.image
sdlab with 926 entries from 'dice01.jpg'
Once we extract object features, a new per-object data set is created and only 'object' labels are copied by default:
>> obj=sdextract(B,'object','size')
Extracting a feature vector from each of 18 objects ('object' labels):
18 by 1 sddata, class: 'unknown'
>> obj'
18 by 1 sddata, class: 'unknown'
sample props: 'lab'->'class' 'class'(L) 'object'(L) 'bbox'(N)
feature props: 'featlab'->'featname' 'featname'(L)
data props: 'data'(N)
We may copy other label sets using the 'copy' option:
>> obj2=sdextract(B,'object','size','copy','image')
Extracting a feature vector from each of 18 objects ('object' labels):
18 by 1 sddata, class: 'unknown'
>> obj2'
18 by 1 sddata, class: 'unknown'
sample props: 'lab'->'class' 'class'(L) 'object'(L) 'image'(L) 'bbox'(N)
feature props: 'featlab'->'featname' 'featname'(L)
data props: 'data'(N)
>> obj2(1).image
sdlab with one entry: 'dice01.jpg'
Only sdlab label sets may be copied, not general meta-data.
Multiple sets of labels may be specified in a cell array:
>> obj3=sdextract(B,'object','size','copy',{'image','class'})
Extracting a feature vector from each of 18 objects ('object' labels):
18 by 1 sddata, 2 classes: 'background'(2) 'dice'(16)
Note that a label set may be copied to the object data set only if its value is unique within each object (otherwise, the single per-object value would be undefined). In our example, we are lucky: the dice/background 'class' labels are always unique within any connected object found.
8.3.5. Bounding box of objects
sdextract adds to each extracted per-object feature vector new 'bbox' meta-data containing the object's bounding box in the original image.
>> obj=sdextract(B,'object','mean','copy','image')
Extracting a feature vector from each of 18 objects ('object' labels):
18 by 3 sddata, class: 'unknown'
>> obj'
18 by 3 sddata, class: 'unknown'
sample props: 'lab'->'class' 'class'(L) 'object'(L) 'image'(L) 'bbox'(N)
feature props: 'featlab'->'featname' 'featname'(L)
data props: 'data'(N)
>> obj(9).bbox
ans =
144 220 37 36
The information is stored in the [row column height width] format, where row and column give the upper-left corner of the object.
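To make the [row column height width] convention concrete, here is a small sketch that crops a bounding box from an image array. Note the bbox values reported by the toolbox are 1-based Matlab coordinates; this sketch uses 0-based indices for simplicity:

```python
def crop(image, bbox):
    """Crop a [row, col, height, width] bounding box from a 2-D array.

    Indices are 0-based here; the toolbox's 'bbox' values are 1-based.
    """
    r, c, h, w = bbox
    return [row[c:c + w] for row in image[r:r + h]]

# A 3x4 image with a 2x2 object region starting at row 1, column 1.
image = [[0, 0, 0, 0],
         [0, 5, 6, 0],
         [0, 7, 8, 0]]
patch = crop(image, (1, 1, 2, 2))
```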
We may use this information to highlight a specific object in the original image, based on the extracted data set obj only.
The image name is available because we copied the image labels with the 'copy' option above:
>> obj(9).image
sdlab with one entry: 'dice01.jpg'
>> im=imread(+obj(9).image);
>> figure; imagesc(im)
>> h=rectangle('position',[obj(9).bbox(2) obj(9).bbox(1) obj(9).bbox(4) obj(9).bbox(3)])
>> set(h,'edgecolor',[1 1 1])