Data analysis in chemometrics: selection versus compression
DOI: 10.2436/20.2003.01.30

Authors
Alberto Ferrer, Departament d’Estadística i Investigació Operativa Aplicades i Qualitat, Universitat Politècnica de València.

Keywords: Chemometrics, latent structures, principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), compression, selection, data mining, soft sensor, multivariate process diagnosis.

Abstract
Chemometrics uses data mining tools for empirical modeling of biochemical systems. The explosive development of information and communications technology has enabled the manufacture of a wide variety of sensors that can register large amounts of data, which are then stored on computing devices. The challenge is to extract efficiently the information potentially contained in the data, and success depends heavily on the analysis strategy used. With so much data available, a procedure is needed to reduce the number of variables to analyze. In this paper we present two strategies for this necessary simplification: compression versus selection. The key difference between them is that selection discards some variables, whereas after compression all variables may be recovered. If selection is made at the beginning of the investigation, there is a risk of eliminating variables that carry information useful for solving the problem at hand. The recommendation is therefore to compress first and, if needed, select afterwards. The benefits of this recommendation are illustrated with real examples.

Author Biography
Alberto Ferrer, Departament d’Estadística i Investigació Operativa Aplicades i Qualitat, Universitat Politècnica de València.
Alberto Ferrer is an agricultural engineer and holds a doctorate from the Universitat Politècnica de València. He is currently a full professor in the Departament d’Estadística i Investigació Operativa Aplicades i Qualitat at the Universitat Politècnica de València, where he leads the Multivariate Statistical Engineering research group, which develops statistical methodology for the analysis, monitoring and diagnosis of complex processes. He is an associate editor of the journal Technometrics, a member of the editorial board of the journal Quality Engineering, a member of the Council of the International Society for Business and Industrial Statistics (ISBIS), and a member of the European Network for Business and Industrial Statistics (ENBIS) and of the Xarxa Espanyola de Quimiometria (Spanish Chemometrics Network).

Downloads: PDF (Catalan)
Issue: No. 10 (2011)
Section: Articles

License
The intellectual property of articles belongs to the respective authors. On submitting articles for publication to the journal Revista de la Societat Catalana de Química, authors accept the following terms:
Authors assign to the Catalan Society of Chemistry (a subsidiary of Institut d’Estudis Catalans) the rights of reproduction, communication to the public and distribution of the articles submitted for publication to Revista de la Societat Catalana de Química.
Authors answer to the Catalan Society of Chemistry for the authorship and originality of submitted articles.
Authors are responsible for obtaining permission for the reproduction of all graphic material included in articles.
The Catalan Society of Chemistry declines all liability for the possible infringement of intellectual property rights by authors.
The contents published in the journal, unless otherwise stated in the text or in the graphic material, are subject to a Creative Commons Attribution-NonCommercial-NoDerivs (by-nc-nd) 3.0 Spain licence, the complete text of which may be found at http://creativecommons.org/licenses/by-nc-nd/3.0/es/deed.en.
Consequently, the general public is authorised to reproduce, distribute and communicate the work, provided that its authorship and the body publishing it are acknowledged, and that no commercial use and no derivative works are made of it.
The journal is not responsible for the ideas and opinions expressed by the authors of the published articles.