Iterative pruning PCA improves resolution of highly structured populations

11/04/2023 by นพพร ม่วงระย้า

Metadata

Document Title

Iterative pruning PCA improves resolution of highly structured populations

Author

Intarapanich A, Shaw PJ, Assawamakin A, Wangkumhang P, Ngamphiw C, Chaichoompu K, Piriyapongsa J, Tongsima S

Affiliations

National Science & Technology Development Agency - Thailand; National Center for Genetic Engineering & Biotechnology (BIOTEC); National Electronics & Computer Technology Center (NECTEC); Mahidol University

Type

Article

Source Title

BMC Bioinformatics

ISSN

1471-2105

Year

2009

Volume

Open Access

Green Published, gold

Publisher

BMC

DOI

10.1186/1471-2105-10-382

Format

PDF

Abstract

Background: Non-random patterns of genetic variation exist among individuals in a population owing to a variety of evolutionary factors. Therefore, populations are structured into genetically distinct subpopulations. As genotypic datasets become ever larger, it is increasingly difficult to correctly estimate the number of subpopulations and assign individuals to them. The computationally efficient nonparametric, chiefly Principal Components Analysis (PCA)-based methods are thus becoming increasingly relied upon for population structure analysis. Current PCA-based methods can accurately detect structure; however, the accuracy in resolving subpopulations and assigning individuals to them is wanting. When subpopulations are closely related to one another, they overlap in PCA space and appear as a conglomerate. This problem is exacerbated when some subpopulations in the dataset are genetically far removed from others. We propose a novel PCA-based framework which addresses this shortcoming.

Results: A novel population structure analysis algorithm called iterative pruning PCA (ipPCA) was developed which assigns individuals to subpopulations and infers the total number of subpopulations present. Genotypic data from simulated and real population datasets with different degrees of structure were analyzed. For datasets with simple structures, the subpopulation assignments of individuals made by ipPCA were largely consistent with the STRUCTURE, BAPS and AWclust algorithms. On the other hand, highly structured populations containing many closely related subpopulations could be accurately resolved only by ipPCA, and not by other methods.

Conclusion: The algorithm is computationally efficient and not constrained by the dataset complexity. This systematic subpopulation assignment approach removes the need for prior population labels, which could be advantageous when cryptic stratification is encountered in datasets containing individuals otherwise assumed to belong to a homogenous population.
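
The iterative pruning idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how ipPCA tests for remaining structure or how it splits a group, so the stopping rule (an eigenvalue-ratio heuristic), the splitting step (2-means on the first principal component), and all names and parameters (`first_pc`, `two_means_1d`, `ip_pca`, `simulate`, `ratio_threshold`, `min_size`) are assumptions chosen for illustration. The simulated data place one population far from two others that are closer to each other, the situation the abstract identifies as difficult for a single PCA.

```python
import numpy as np


def first_pc(genotypes):
    """Return first-PC scores and a crude structure statistic for one group."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    eigenvalues = s ** 2
    # Assumed heuristic (not from the source): compare the top eigenvalue
    # with the mean of the remaining ones.
    ratio = eigenvalues[0] / eigenvalues[1:].mean()
    return u[:, 0] * s[0], ratio


def two_means_1d(scores, n_iter=20):
    """Split 1-D scores into two groups with a plain 2-means loop."""
    centers = np.array([scores.min(), scores.max()], dtype=float)
    labels = np.zeros(len(scores), dtype=int)
    for _ in range(n_iter):
        # Label 1 marks points closer to the second center.
        labels = (np.abs(scores - centers[1]) < np.abs(scores - centers[0])).astype(int)
        for k in (0, 1):
            if np.any(labels == k):
                centers[k] = scores[labels == k].mean()
    return labels


def ip_pca(genotypes, ratio_threshold=3.0, min_size=5):
    """Repeatedly run PCA on each group and split it until no structure is flagged."""
    n = genotypes.shape[0]
    assignment = np.full(n, -1)
    pending = [np.arange(n)]  # groups of row indices still to be examined
    next_label = 0
    while pending:
        idx = pending.pop()
        split = None
        if len(idx) >= 2 * min_size:
            scores, ratio = first_pc(genotypes[idx])
            if ratio > ratio_threshold:
                labels = two_means_1d(scores)
                left, right = idx[labels == 0], idx[labels == 1]
                if min(len(left), len(right)) >= min_size:
                    split = (left, right)
        if split is None:
            # No further structure flagged: this group becomes one subpopulation.
            assignment[idx] = next_label
            next_label += 1
        else:
            pending.extend(split)
    return assignment


def simulate(seed=0, n_per_pop=40, n_snps=500):
    """Toy genotypes (0/1/2): population A is distant, B and C are closer to each other."""
    rng = np.random.default_rng(seed)
    freq_a = rng.uniform(0.1, 0.9, n_snps)
    base = rng.uniform(0.1, 0.9, n_snps)
    drift = rng.normal(0.0, 0.1, n_snps)
    freq_b = np.clip(base + drift, 0.05, 0.95)
    freq_c = np.clip(base - drift, 0.05, 0.95)
    blocks = [rng.binomial(2, f, size=(n_per_pop, n_snps))
              for f in (freq_a, freq_b, freq_c)]
    truth = np.repeat([0, 1, 2], n_per_pop)
    return np.vstack(blocks).astype(float), truth


genotypes, truth = simulate()
assignment = ip_pca(genotypes)
print("groups found:", len(set(assignment.tolist())))
```

The script prints the number of groups the sketch finds, which can be compared with the three simulated populations. The threshold of 3.0 is an arbitrary illustrative value, and a real analysis would need a statistically justified test for structure in place of this heuristic.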

License

CC BY

Rights

Intarapanich et al; licensee BioMed Central Ltd.

Publication Source

WOS

Name from Authors Collection

Shaw PJ.

Assawamakin A.

Wangkumhang P.

Apichart Intarapanich

Chumpol Ngamphiw

Chaichoompu K.

Jittima Piriyapongsa

Tongsima S.
