Data analysis in chemometrics: selection versus compression (DOI: 10.2436/20.2003.01.30)
Keywords:
Chemometrics, latent structures, principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), compression, selection, data mining, soft sensor, multivariate process diagnosis.
Abstract
Chemometrics uses data mining tools for empirical modeling of biochemical systems. The explosive development of information and communications technology has enabled the manufacture of a wide variety of sensors that register large amounts of data, which are then stored on computing devices. The challenge is to extract the potential information contained in the data efficiently, and success depends heavily on the analysis strategy used. With so much data available, it is necessary to reduce the number of variables to analyze. In this paper we present two strategies for this simplification: compression and selection. The main difference between them is that selection discards some variables, whereas after compression all variables can be recovered. If selection is made at the beginning of the investigation, there is a risk of eliminating variables that carry information useful for solving the problem at hand. The recommendation is therefore to compress first and, if needed, select afterwards. The benefits of this recommendation are illustrated with actual examples.
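The contrast between the two strategies can be sketched numerically. The following is a minimal illustration, not the procedure used in the paper: it assumes synthetic data generated from two latent factors, uses PCA computed through a singular value decomposition as the compression step, and uses "keep the two highest-variance variables" as a hypothetical selection rule. All names (`scores`, `loadings`, `keep`) and the data dimensions are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): 50 samples, 6 measured variables
# driven by 2 latent factors plus a small amount of noise.
n_samples, n_vars, n_latent = 50, 6, 2
T = rng.normal(size=(n_samples, n_latent))   # latent factors
P = rng.normal(size=(n_latent, n_vars))      # how factors map to variables
X = T @ P + 0.01 * rng.normal(size=(n_samples, n_vars))
Xc = X - X.mean(axis=0)                      # column-centre before PCA

# Compression: PCA via SVD, keeping k components.
k = 2
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U[:, :k] * s[:k]     # 50 x 2 compressed representation
loadings = Vt[:k]             # 2 x 6, involves every original variable
X_hat = scores @ loadings     # approximate reconstruction of all 6 variables
compression_error = np.linalg.norm(Xc - X_hat) / np.linalg.norm(Xc)

# Selection (hypothetical rule): keep the k variables with the largest variance.
keep = np.argsort(Xc.var(axis=0))[-k:]
X_sel = np.zeros_like(Xc)
X_sel[:, keep] = Xc[:, keep]  # the discarded columns are simply gone
selection_error = np.linalg.norm(Xc - X_sel) / np.linalg.norm(Xc)

print(f"relative error after compression: {compression_error:.4f}")
print(f"relative error after selection:   {selection_error:.4f}")
```

Both approaches store two columns per sample, but the compressed scores, together with the loadings, allow every original variable to be approximated, while the selected subset carries no record of the discarded variables unless a further model is fitted. The comparison is deliberately simple and says nothing about which selection rule would be appropriate for a real data set.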
License
The intellectual property of articles belongs to the respective authors.
On submitting articles for publication to the journal Revista de la Societat Catalana de Química, authors accept the following terms:
- Authors assign to the Catalan Society of Chemistry (a subsidiary of Institut d’Estudis Catalans) the rights of reproduction, communication to the public and distribution of the articles submitted for publication to Revista de la Societat Catalana de Química.
- Authors answer to the Catalan Society of Chemistry for the authorship and originality of submitted articles.
- Authors are responsible for obtaining permission for the reproduction of all graphic material included in articles.
- The Catalan Society of Chemistry declines all liability for the possible infringement of intellectual property rights by authors.
- The contents published in the journal, unless otherwise stated in the text or in the graphic material, are subject to a Creative Commons Attribution-NonCommercial-NoDerivs (by-nc-nd) 3.0 Spain licence, the complete text of which may be found at http://creativecommons.org/licenses/by-nc-nd/3.0/es/deed.en. Consequently, the general public is authorised to reproduce, distribute and communicate the work, provided that its authorship and the body publishing it are acknowledged, and that no commercial use and no derivative works are made of it.
- The journal is not responsible for the ideas and opinions expressed by the authors of the published articles.