Based on the theory of pattern classification, I am listing some factors that matter when working on pattern classification problems. The list is drawn from the notes I made while reading the subject over the last month.
- The function describing the classification model should be represented in mathematical form (though not restricted to it alone) to ensure precision and avoid errors in representing the model.
- If the objects to be classified overlap, a segmentation step that isolates the individual objects must be performed before the classification function can work as desired. This requirement can be waived if the approach takes groups of objects as its input.
- A single criterion for selecting and classifying objects into categories might not work for every problem. Depending on the problem domain, multiple criteria should be considered.
- A trade-off has to be maintained between the cost of obtaining the attributes of the object under consideration and the quality of the classification. A higher cost of obtaining the attributes can be justified if it brings the classification function closer to handling novel patterns correctly.
- After a few rounds of observation, we may be inclined to create decision rules [based on decision theory] and identify the decision boundaries for classification. We need to ensure that the decision boundary is not biased towards the training data set alone.
- Much of the work lies in selecting an appropriate set of features (those useful for discriminating between the patterns) and in dealing with redundant features. For redundant features, we need to verify whether the redundancy improves or degrades the performance of the classification function.
- If the features essential for discrimination are costly to measure, or provide little improvement, and we are forced to decide on the basis of a limited set of features, then some degree of error will inevitably enter the decision.
- The central aim of designing a classifier is to suggest actions when presented with novel patterns, that is, objects it has not seen before.
- When all training samples are separated perfectly, a complex, non-linear decision boundary would seem to have been tuned to the specific training samples alone, rather than to the underlying characteristics of the true model of the objects under consideration.
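
The points above about redundant features and limited feature sets can be checked empirically rather than argued. The following is only a minimal sketch under stated assumptions: the data are synthetic, the three features (one informative, one near copy of it, one pure noise) are invented for illustration, and a simple nearest-centroid rule stands in for whatever classification function is actually in use. It scores each feature subset on held-out samples.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Synthetic two-class data (hypothetical, for illustration only)."""
    y = rng.integers(0, 2, size=n)                      # class labels 0 or 1
    informative = y * 2.0 + rng.normal(size=n)          # class means 0 and 2
    redundant = informative + 0.1 * rng.normal(size=n)  # near copy of the first feature
    noise = rng.normal(size=n)                          # unrelated to the class
    return np.column_stack([informative, redundant, noise]), y

def nearest_centroid_accuracy(X_train, y_train, X_test, y_test, cols):
    """Fit one centroid per class on the chosen columns, score on held-out data."""
    A, B = X_train[:, cols], X_test[:, cols]
    centroids = np.stack([A[y_train == k].mean(axis=0) for k in (0, 1)])
    # Distance from every held-out point to each centroid: shape (n_test, 2)
    dists = np.linalg.norm(B[:, None, :] - centroids[None, :, :], axis=2)
    return float((dists.argmin(axis=1) == y_test).mean())

X_train, y_train = make_data(200)
X_test, y_test = make_data(1000)

subsets = {
    "informative only": [0],
    "informative + redundant": [0, 1],
    "all three": [0, 1, 2],
    "noise only": [2],
}
scores = {
    name: nearest_centroid_accuracy(X_train, y_train, X_test, y_test, cols)
    for name, cols in subsets.items()
}
for name, acc in scores.items():
    print(f"{name:26s} held-out accuracy = {acc:.3f}")
```

Comparing the rows shows whether adding the redundant feature helps or hurts for this particular classifier and data; with a different classifier or real measurements, the answer may well differ.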
Mood: determined


Comments
>>Representation of the function describing the classification model should be in a mathematical form (but should not be constrained to it alone) to ensure preciseness, and avoid errors in representation of the model.<<
What is the alternative to a mathematical form?
>>Based on few rounds of observations, we might tend to create decision rules [based on decision theory], and identify the decision boundaries for classification. We need to ensure that the decision boundary doesn't get biased towards the training data set alone.<<
How do we do that? Please elaborate.
>>Considering all training samples being separated perfectly, the complex, non-linear decision boundary would seem to have turned to specific training samples alone, rather than the underlying characteristics of the true model of the object in consideration.<<
If classification is done correctly for all samples, can we not assume that such a criterion function for a complex, non-linear decision boundary is working well? On what criterion is this assertion based? Or can't we say that the training samples were simply a good fit for the criterion function of the mathematical model?
>>What is the alternative to a mathematical form?<<
Honestly, I don't know of an alternative. Rather, the note is meant to say that the function describing the classification model shouldn't be put into mathematical form merely for the sake of having mathematical notation. If exceptional conditions arise that are hard for the modeler to express while modeling the function, they may be noted in plain language. Later, after closer observation, the modeler may find ways of representing those exceptional conditions mathematically as well.
Your concern is a valid one, but the note's intended meaning is different. As a beginner, I don't intend to delve into the depths of the subject for the time being; I would rather keep away from them, for fear of getting lost and losing interest, as has happened to me until now. Once I have covered a reasonable breadth and built some competence, I may choose to delve into further details, or perhaps defer even that.
>>How do we do that? Please elaborate.<<
To ensure that the decision boundary doesn't get biased towards the training data set alone, we can consult domain experts for their insights into the problem domain and obtain further significant information about the training data provided. Repeated rounds of interaction are likely to bring out points that were missed earlier.
Also, if the decision boundary fitted to the training data set consists of complex functions (say, polynomials of high degree), we might want to take another look at the set of features on which the classification model was based. An iterative approach can be adopted. Even after several iterations we might not reach the desired result, but our confidence in the result should be higher than it was before.
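
One way to make such an iterative approach concrete is to set aside samples that are never used for fitting, and to let those held-out samples, rather than the training samples, decide how complex the boundary should be. The sketch below is an assumption-laden illustration, not a prescribed method: the one-dimensional data, the 15% label noise, and the use of a least-squares polynomial whose sign serves as the decision rule are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_samples(n):
    """Hypothetical 1-D data: label is +1 when x > 0, with about 15% of labels flipped."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.where(x > 0, 1.0, -1.0)
    flip = rng.random(n) < 0.15
    y[flip] *= -1.0
    return x, y

def accuracy(coeffs, x, y):
    """Classify by the sign of the fitted polynomial and compare with the labels."""
    pred = np.where(np.polyval(coeffs, x) > 0, 1.0, -1.0)
    return float((pred == y).mean())

x_train, y_train = make_samples(40)    # used for fitting
x_val, y_val = make_samples(400)       # held out, never used for fitting

results = {}
for degree in range(1, 10):
    coeffs = np.polyfit(x_train, y_train, degree)  # least-squares fit to the +/-1 labels
    results[degree] = (
        accuracy(coeffs, x_train, y_train),
        accuracy(coeffs, x_val, y_val),
    )
    print(f"degree {degree}: train = {results[degree][0]:.3f}, held-out = {results[degree][1]:.3f}")

# Pick the degree that does best on the held-out samples, not on the training samples.
best_degree = max(results, key=lambda d: results[d][1])
print("degree chosen by held-out accuracy:", best_degree)
```

The same loop could iterate over candidate feature sets instead of polynomial degrees; the point is only that the comparison is made on samples the boundary has not been fitted to.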
>>If classification is done correctly for all samples, can we not assume that such a criterion function for a complex, non-linear decision boundary is working well? On what criterion is this assertion based? Or can't we say that the training samples were simply a good fit for the criterion function of the mathematical model?<<
We can certainly assume so, but we also understand that the criterion for selecting a specific data point as part of the training set can be biased or, rather, domain-dependent. Interestingly, as you have pointed out to me on other occasions, it can be argued that no set of samples taken for experimentation can be an absolute representative of the entire data; similarly, we can't say that the training samples are a best fit for the criterion function.
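
A small experiment may make this less abstract. The sketch below assumes two overlapping Gaussian classes (synthetic data, invented for illustration) and uses a one-nearest-neighbour rule as a stand-in for a boundary that separates every training sample correctly; neither choice comes from the notes themselves. The rule classifies its own training samples perfectly by construction, yet that says little about fresh samples drawn from the same source.

```python
import numpy as np

rng = np.random.default_rng(2)

def draw(n):
    """Two overlapping Gaussian classes in 2-D (hypothetical data)."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 2)) + y[:, None] * 1.5  # class 1 is shifted by 1.5 on both axes
    return X, y

def one_nn_predict(X_train, y_train, X_query):
    """Label each query point with the label of its nearest training sample."""
    d = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    return y_train[d.argmin(axis=1)]

X_train, y_train = draw(100)
X_new, y_new = draw(2000)   # fresh samples from the same source

train_acc = float((one_nn_predict(X_train, y_train, X_train) == y_train).mean())
new_acc = float((one_nn_predict(X_train, y_train, X_new) == y_new).mean())
print(f"training accuracy = {train_acc:.3f}, accuracy on new samples = {new_acc:.3f}")
```

Each training sample is its own nearest neighbour, so the training accuracy is perfect regardless of how the classes overlap; only the second number reflects how the rule behaves on samples it has not seen.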
P.S. Please note the way English constructs are used in my notes. They reflect the uncertainties and questions that arise as I try to understand the concepts. I hope to gain clarity on these only once I work on them. For the moment, you and other readers will have to bear with me.