A biplot correlation range for group-wise metabolite selection in mass spectrometry

Youngja H. Park, Taewoon Kong, James R. Roede, Dean P. Jones, Kichun Lee

Research output: Contribution to journal; Article; Research; peer-review

Abstract

Background: Analytic methods are available to acquire extensive metabolic information in a cost-effective manner for personalized medicine, yet disease risk and diagnosis mostly rely upon individual biomarkers based on statistical principles of false discovery rate and correlation. Due to functional redundancies and multiple layers of regulation in complex biologic systems, individual biomarkers, while useful, are inherently limited in disease characterization. Data reduction and discriminant analysis tools such as principal component analysis (PCA), partial least squares (PLS), or orthogonal PLS (O-PLS) provide approaches to separate the metabolic phenotypes, but do not offer a statistical basis for selection of group-wise metabolites as contributors to metabolic phenotypes.

Methods: We present a dimensionality-reduction-based approach termed 'biplot correlation range (BCR)' that uses biplot correlation analysis with direct orthogonal signal correction and PLS to provide the group-wise selection of metabolic markers contributing to metabolic phenotypes.

Results: Using a simulated multiple-layer system of the kind that often arises in complex biologic systems, we show the feasibility and superiority of the proposed approach in comparison with existing approaches based on false discovery rate and correlation. To demonstrate the proposed method on a real-life dataset, we used LC-MS-based metabolomics to determine the spectrum of metabolites present in liver mitochondria from wild-type (WT) mice and thioredoxin-2 transgenic (TG) mice. Using BCR, we select discriminatory variables in terms of increased score in the direction of class identity. The results show that BCR provides a means of identifying metabolites contributing to class separation that methods based on false discovery rate or statistical total correlation spectroscopy can hardly find in complex data analysis for predictive health and personalized medicine.
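To make the general idea of correlation-based variable selection on a PLS projection more concrete, the following is a minimal sketch under stated assumptions, not the authors' BCR procedure. It fits a single PLS component to two-class data, correlates every variable with the resulting score, and keeps the variables whose absolute correlation exceeds a cutoff. The function names, the simulated data, and the 0.7 cutoff are hypothetical choices made for illustration. The direct orthogonal signal correction step and the range-based criterion named in the abstract are omitted.

```python
import numpy as np


def pls1_score_correlations(X, y):
    """Pearson correlation of each column of X with the first PLS score.

    Illustrative only: one PLS component, no orthogonal signal correction.
    """
    Xc = X - X.mean(axis=0)           # center every variable
    yc = y - y.mean()                 # center the class label
    w = Xc.T @ yc                     # first PLS weight vector (covariance with y)
    w = w / np.linalg.norm(w)         # normalize to unit length
    t = Xc @ w                        # first PLS score, one value per sample
    # Xc and t both have zero mean, so the correlation is a normalized dot product.
    return (Xc.T @ t) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(t))


def select_by_correlation(corr, threshold=0.7):
    """Indices of variables with |correlation| at or above the (assumed) cutoff."""
    return np.flatnonzero(np.abs(corr) >= threshold)


# Simulated two-class data: 20 samples per class, 50 variables,
# of which only the first 5 differ between the classes.
rng = np.random.default_rng(0)
n_per_class, n_vars = 20, 50
y = np.repeat([0.0, 1.0], n_per_class)
X = rng.normal(size=(2 * n_per_class, n_vars))
X[:, :5] += 3.0 * y[:, None]          # shift the informative variables in class 1

corr = pls1_score_correlations(X, y)
selected = select_by_correlation(corr)
print("selected variable indices:", selected)
```

Because the selection is made on correlations with a shared latent score rather than on per-variable tests, variables that move together along the class direction tend to be picked as a group, which is the intuition behind group-wise selection. Any real analysis would need the preprocessing and the selection rule described in the paper itself.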

Original language: English
Article number: 4
Journal: BioData Mining
Volume: 12
Issue number: 1
DOI: 10.1186/s13040-019-0191-2
State: Published - 2019 Feb 4

Fingerprint

Biplot
Mass Spectrometry
Metabolites
Mass spectrometry
Least-Squares Analysis
Biomarkers
Precision Medicine
Medicine
Partial Least Squares
Phenotype
Range of data
Thioredoxins
Mitochondria
Discriminant analysis
Liver
Principal component analysis
Metabolomics
Redundancy
Liver Mitochondrion
Complex Systems

Keywords

  • Biplot correlation
  • Feature selection
  • Metabolomics

Cite this

Park, Youngja H. ; Kong, Taewoon ; Roede, James R. ; Jones, Dean P. ; Lee, Kichun. / A biplot correlation range for group-wise metabolite selection in mass spectrometry. In: BioData Mining. 2019 ; Vol. 12, No. 1.
@article{38583b30ac9340e7b98de69ae9524106,
title = "A biplot correlation range for group-wise metabolite selection in mass spectrometry",
keywords = "Biplot correlation, Feature selection, Metabolomics",
author = "Park, {Youngja H.} and Taewoon Kong and Roede, {James R.} and Jones, {Dean P.} and Kichun Lee",
year = "2019",
month = "2",
day = "4",
doi = "10.1186/s13040-019-0191-2",
language = "English",
volume = "12",
journal = "BioData Mining",
issn = "1756-0381",
number = "1",

}

TY - JOUR

T1 - A biplot correlation range for group-wise metabolite selection in mass spectrometry

AU - Park, Youngja H.

AU - Kong, Taewoon

AU - Roede, James R.

AU - Jones, Dean P.

AU - Lee, Kichun

PY - 2019/2/4

Y1 - 2019/2/4

KW - Biplot correlation

KW - Feature selection

KW - Metabolomics

UR - http://www.scopus.com/inward/record.url?scp=85061136192&partnerID=8YFLogxK

U2 - 10.1186/s13040-019-0191-2

DO - 10.1186/s13040-019-0191-2

M3 - Article

VL - 12

JO - BioData Mining

JF - BioData Mining

SN - 1756-0381

IS - 1

M1 - 4

ER -