DSpace Repository

NAToRA, a relatedness-pruning method to minimize the loss of dataset size in genetic and omics analyses

dc.contributor.author Leal, T.P.
dc.contributor.author Furlan, V.C.
dc.contributor.author Gouveia, M.H.
dc.contributor.author Saraiva Duarte, J.M.
dc.contributor.author Fonseca, P.A.
dc.contributor.author Tou, R.
dc.contributor.author Scliar, M.D.O.
dc.contributor.author Araujo, G.S.D.
dc.contributor.author Costa, L.F.
dc.contributor.author Zolini, C.
dc.contributor.author Peixoto, M.G.C.D.
dc.contributor.author Carvalho, M.R.S.
dc.contributor.author Lima-Costa, M.F.
dc.contributor.author Gilman, Robert Hugh
dc.contributor.author Tarazona-Santos, E.
dc.contributor.author Rodrigues, M.R.
dc.date.accessioned 2022-06-01T13:53:56Z
dc.date.available 2022-06-01T13:53:56Z
dc.date.issued 2022
dc.identifier.uri https://hdl.handle.net/20.500.12866/11724
dc.description.abstract Genetic and omics analyses frequently require independent observations, which is not guaranteed in real datasets. When relatedness cannot be accounted for, solutions involve removing related individuals (or observations) and, consequently, reducing the available data. We developed a network-based relatedness-pruning method that minimizes dataset reduction while removing unwanted relationships from a dataset. It uses the node degree centrality metric to identify highly connected nodes (or individuals) and implements heuristics that approximate the minimal reduction of a dataset, which allows its application to complex datasets. When compared with two other popular population genetics methodologies (PLINK and KING), NAToRA shows the best combination of removing all relatives while keeping the largest possible number of individuals in all datasets tested, with effects on the allele frequency spectrum and Principal Component Analysis similar to those of PLINK and KING. NAToRA is freely available, both as a standalone tool that can be easily incorporated into a pipeline and as a graphical web tool that allows visualization of the relatedness networks. NAToRA also accepts a variety of relationship metrics as input, which facilitates its use. We also release the genealogies simulator software used for the different tests performed in this study. en_US
dc.language.iso eng
dc.publisher Elsevier
dc.relation.ispartofseries Computational and Structural Biotechnology Journal
dc.rights info:eu-repo/semantics/restrictedAccess
dc.rights.uri https://creativecommons.org/licenses/by-nc-nd/4.0/deed.es
dc.subject Complex network theory en_US
dc.subject Population genetics en_US
dc.subject Genetic kinship en_US
dc.subject Genealogies simulator en_US
dc.title NAToRA, a relatedness-pruning method to minimize the loss of dataset size in genetic and omics analyses en_US
dc.type info:eu-repo/semantics/article
dc.identifier.doi https://doi.org/10.1016/j.csbj.2022.04.009
dc.relation.issn 2001-0370
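
The abstract describes pruning relatives on a network by node degree centrality. The general idea can be illustrated with a minimal greedy sketch. This is not NAToRA's actual implementation or heuristics, which this record does not detail. The function name prune_related, the (id, id, kinship) input layout, and the 0.0884 cutoff in the usage example are assumptions made for illustration only.

```python
from collections import defaultdict


def prune_related(pairs, threshold):
    """Greedy degree-based pruning (illustrative sketch, not NAToRA itself).

    pairs: iterable of (id_a, id_b, kinship) tuples (hypothetical layout).
    threshold: kinship value at or above which two individuals count as related.
    Returns the set of individuals to remove so that no related pair remains.
    """
    # Build an undirected relatedness network: one edge per related pair.
    adj = defaultdict(set)
    for a, b, kinship in pairs:
        if a != b and kinship >= threshold:
            adj[a].add(b)
            adj[b].add(a)

    removed = set()
    while True:
        # Individuals that still have at least one relative in the network.
        connected = [n for n in adj if adj[n]]
        if not connected:
            break
        # Pick the highest-degree node; ties are broken by smallest id.
        worst = min(connected, key=lambda n: (-len(adj[n]), n))
        # Detach it from all of its relatives, then drop it.
        for nbr in adj[worst]:
            adj[nbr].discard(worst)
        adj[worst].clear()
        removed.add(worst)
    return removed


# Usage: A is related to B, C and D; E and F fall below the cutoff.
example = [("A", "B", 0.25), ("A", "C", 0.25), ("A", "D", 0.25), ("E", "F", 0.01)]
print(prune_related(example, threshold=0.0884))
```

Removing the most connected individual first tends to resolve several relationships with a single exclusion, which is why a degree-based choice usually keeps more individuals than removing one member of every related pair. A greedy pass like this one is only an approximation and does not guarantee the smallest possible set of removals.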


Files in this item

There are no files associated with this item.

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/restrictedAccess