Please use this identifier to cite or link to this item:
http://hdl.handle.net/2080/2311
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Vipsita, S | - |
dc.contributor.author | Rath, S K | - |
dc.date.accessioned | 2015-04-29T07:52:54Z | - |
dc.date.available | 2015-04-29T07:52:54Z | - |
dc.date.issued | 2013-06-03 | - |
dc.identifier.citation | Hindawi Publishing Corporation Computational Biology Journal Volume 2013, Article ID 898090, 12 pages | en_US |
dc.identifier.uri | http://dx.doi.org/10.1155/2013/898090 | - |
dc.identifier.uri | http://hdl.handle.net/2080/2311 | - |
dc.description.abstract | We address the problem of protein superfamily classification, in which the family membership of a newly discovered amino acid sequence is predicted. Correct prediction is of great concern to researchers and drug analysts, as it helps them in the discovery of new drugs. Because this problem falls broadly under the category of pattern classification, we have sought to optimize feature extraction in the first stage and classifier design in the second stage, with the overall objective of maximizing the accuracy of the classifier. In the feature extraction phase, a genetic algorithm (GA)-based wrapper approach is used to select a few eigenvectors from the principal component analysis (PCA) space, with the selection encoded as a binary string in the chromosome. The eigenvectors at the positions of the 1s in the chromosome are selected to build the transformation matrix, which then maps the original high-dimensional feature space to a lower-dimensional feature space. Using PCA-NSGA-II (nondominated sorting GA), the nondominated solutions obtained from the Pareto front address the trade-off between the number of eigenvectors selected and the accuracy obtained by the classifier. In the second stage, the recursive orthogonal least squares algorithm (ROLSA) is used to train a radial basis function network (RBFN), selecting the optimal number of hidden centres as well as updating the output layer weighting matrix. This approach can be applied to large data sets with much lower computer memory requirements. Thus, very small architectures with few hidden centres are obtained that show a higher level of performance accuracy. | en_US |
dc.language.iso | en | en_US |
dc.publisher | Hindawi Publishing Corporation | en_US |
dc.subject | Genetic Algorithm | en_US |
dc.subject | Pattern classification | en_US |
dc.subject | Principal component analysis | en_US |
dc.title | Two-Stage Approach for Protein Superfamily Classification | en_US |
dc.type | Article | en_US |
Appears in Collections: | Journal Articles |
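
The abstract describes selecting eigenvectors according to the positions of the 1s in a binary chromosome and using them as a transformation matrix. A minimal sketch of that single step follows. It is not the authors' implementation: the function names `select_eigenvectors` and `project`, the toy eigenvectors, and the assumption that the sample is already mean-centred are all illustrative. The GA search itself and the classifier-based fitness evaluation are not shown.

```python
from typing import List, Sequence

Vector = List[float]


def select_eigenvectors(
    eigenvectors: Sequence[Vector], chromosome: Sequence[int]
) -> List[Vector]:
    """Keep eigenvector i exactly when chromosome[i] == 1."""
    if len(eigenvectors) != len(chromosome):
        raise ValueError("chromosome needs one bit per eigenvector")
    return [vec for vec, bit in zip(eigenvectors, chromosome) if bit == 1]


def project(sample: Vector, selected: Sequence[Vector]) -> Vector:
    """Map a sample to the reduced space: one dot product per selected eigenvector."""
    return [sum(x * w for x, w in zip(sample, vec)) for vec in selected]


# Hypothetical 4-dimensional example. The eigenvectors are the standard basis
# purely so the result is easy to read; real PCA eigenvectors would come from
# the covariance matrix of the training data.
eigenvectors = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
chromosome = [1, 0, 1, 0]  # 1s at positions 0 and 2 select those eigenvectors
sample = [2.0, 5.0, -1.0, 7.0]  # assumed to be mean-centred already

selected = select_eigenvectors(eigenvectors, chromosome)
reduced = project(sample, selected)
print(reduced)  # [2.0, -1.0]
```

In a wrapper setting, each candidate chromosome would be scored by training and evaluating the classifier on the reduced features, which is what makes the number of selected eigenvectors and the accuracy two competing objectives.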
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
898090.pdf | - | 1.35 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.