Applies flexible quality control filters to an object of class SNPDataLong.
Supports filtering by call rate, minor allele frequency (MAF), and Hardy-Weinberg equilibrium (HWE),
as well as removal of monomorphic SNPs and exclusion of specific chromosomes. Optionally removes SNPs without positions
and SNPs that share a genomic position (keeping the one with the highest MAF).
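The quantities behind these filters can be sketched in base R. This is an illustration only, not the package's implementation: it assumes genotypes coded as allele counts 0/1/2 with NA for missing calls, individuals in rows and SNPs in columns (as in the example at the end of this page), and that "same position" means same chromosome and same position. The thresholds and the toy data are made up for the sketch.

```r
# Toy genotype matrix (assumed coding: 0/1/2 allele counts, NA = missing).
geno <- cbind(
  snp1 = c(0, 1, 2, 1, NA),
  snp2 = c(0, 0, 0, 0, 0),     # monomorphic
  snp3 = c(2, 2, 1, NA, NA),   # low call rate
  snp4 = c(0, 0, 1, 0, 0)      # shares its position with snp1
)
map <- data.frame(Name = colnames(geno), Chromosome = 1,
                  Position = c(10, 20, 30, 10))

# Call rate: proportion of non-missing genotypes per SNP.
call_rate <- colMeans(!is.na(geno))

# Frequency of the counted allele, folded to the minor allele.
p <- colMeans(geno, na.rm = TRUE) / 2
maf <- pmin(p, 1 - p)

# A SNP is monomorphic when its MAF is zero.
is_mono <- maf == 0

# Among SNPs at the same chromosome and position, keep the one with the
# highest MAF: sort by decreasing MAF, then drop later duplicates.
map_ord <- map[order(maf, decreasing = TRUE), ]
best <- map_ord$Name[!duplicated(map_ord[c("Chromosome", "Position")])]
is_best_at_pos <- colnames(geno) %in% best

# Combine the filters (illustrative thresholds).
keep <- call_rate >= 0.7 & maf >= 0.05 & !is_mono & is_best_at_pos
names(keep)[keep]
```

In this toy data only snp1 passes every filter: snp2 is monomorphic, snp3 has a call rate of 0.6, and snp4 loses to snp1 at the shared position.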
qcSNPs(x, ...)
# S4 method for class 'SNPDataLong'
qcSNPs(
x,
missing_ind = NULL,
missing_snp = NULL,
min_snp_cr = NULL,
min_maf = NULL,
hwe = NULL,
snp_position = NULL,
no_position = NULL,
snp_mono = FALSE,
remove_chr = NULL,
action = c("report", "filter", "both")
)
x: An object of class SNPDataLong.
...: Additional optional arguments.
missing_ind: Maximum allowed proportion of missing data per individual (currently not implemented).
missing_snp: Maximum allowed proportion of missing data per SNP (currently not implemented).
min_snp_cr: Minimum acceptable call rate for SNPs (e.g., 0.95). SNPs below this threshold are removed.
min_maf: Minimum minor allele frequency allowed for SNPs (e.g., 0.05). SNPs with a lower MAF are removed.
hwe: p-value threshold for the Hardy-Weinberg equilibrium test (e.g., 1e-6). SNPs violating this threshold are removed.
snp_position: Logical. If TRUE, removes SNPs mapped to the same position, retaining only the one with the highest MAF.
no_position: Logical. If TRUE, removes SNPs without defined genomic positions.
snp_mono: Logical. If TRUE, removes monomorphic SNPs (those with no variation).
remove_chr: Character vector of chromosomes to exclude (e.g., c("X", "Y")).
action: One of "report" (returns a list of removed SNPs), "filter" (returns the filtered SNPDataLong), or "both" (returns both).
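This page does not say which Hardy-Weinberg test qcSNPs applies. As a rough illustration of what a p-value threshold such as 1e-6 is compared against, the sketch below uses a chi-square goodness-of-fit test with one degree of freedom; the package may use a different test (for example an exact test). The helper hwe_pvalue is hypothetical and not part of the package, and genotypes are assumed to be coded 0/1/2.

```r
# Hypothetical helper: chi-square HWE p-value for one SNP (0/1/2 coding).
hwe_pvalue <- function(g) {
  g <- g[!is.na(g)]
  n <- length(g)
  if (n == 0) return(NA_real_)              # no called genotypes

  # Observed genotype counts: 0, 1 and 2 copies of the counted allele.
  obs <- c(sum(g == 0), sum(g == 1), sum(g == 2))

  # Allele frequencies estimated from the genotype counts.
  p <- (2 * obs[3] + obs[2]) / (2 * n)
  q <- 1 - p

  # Expected counts under Hardy-Weinberg proportions q^2, 2pq, p^2.
  expected <- n * c(q^2, 2 * p * q, p^2)
  if (any(expected == 0)) return(NA_real_)  # monomorphic: test undefined

  chi2 <- sum((obs - expected)^2 / expected)
  pchisq(chi2, df = 1, lower.tail = FALSE)
}

# Genotype counts exactly in HWE proportions (25/50/25).
in_hwe <- c(rep(0, 25), rep(1, 50), rep(2, 25))
# No heterozygotes at all: a strong departure from HWE.
out_hwe <- c(rep(0, 50), rep(2, 50))

hwe_pvalue(in_hwe)          # large p-value, SNP would be kept
hwe_pvalue(out_hwe) < 1e-6  # TRUE, SNP would fail a 1e-6 threshold
```

The chi-square approximation is unreliable when expected counts are small, which is one reason a real implementation might prefer an exact test.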
Depends on the action argument:
- "report": a list of the SNPs removed by each filter and the SNPs retained.
- "filter": the filtered SNPDataLong object.
- "both": a list containing the filtered object and the detailed report.
if (FALSE) { # \dontrun{
set.seed(123)
mat <- matrix(sample(c(0, 1, 2, NA), 100,
                     replace = TRUE, prob = c(0.4, 0.4, 0.15, 0.05)),
              nrow = 10, ncol = 10)
colnames(mat) <- paste0("snp", 1:10)
rownames(mat) <- paste0("ind", 1:10)
map <- data.frame(Name = colnames(mat), Chromosome = 1, Position = 1:10)
x <- new("SNPDataLong",
         geno = mat,
         map = map,
         path = "dummy_path",
         xref_path = rep("chip1", 10))
# Example using multiple filters
qcSNPs(x,
       min_snp_cr = 0.8,
       min_maf = 0.05,
       snp_mono = TRUE,
       no_position = TRUE,
       snp_position = TRUE,
       action = "filter")
} # }