This section contains a table showing the samples clustered into groups, with one row per sample. The table shows whether your samples are grouping together in unexpected ways, which may indicate sample mislabelling.
Tool used: biometrics
BAM type: Collapsed BAM
Regions: MSK-ACCESS-v1_0-curatedSNPs.vcf

Producing the table is a two-step process: (1) extract SNP genotypes from each sample using the biometrics extract command and (2) perform a pairwise comparison of all samples to determine sample relatedness using the biometrics genotype command. Please see the biometrics documentation for further details on the methods.
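To make the pairwise comparison step more concrete, here is a minimal Python sketch of comparing genotype calls between every pair of samples. It is not the biometrics implementation: the sample names, the genotype calls, and the simplified discordance definition (fraction of homozygous sites in the first sample whose call differs in the second) are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical genotype calls per sample at four SNP sites (made-up data).
genotypes = {
    "sampleA_tumor":  ["AA", "AG", "GG", "CC"],
    "sampleA_normal": ["AA", "AG", "GG", "CC"],
    "sampleB_tumor":  ["GG", "AG", "AA", "CC"],
}


def is_homozygous(call):
    """True when both alleles of a two-letter genotype call are identical."""
    return len(call) == 2 and call[0] == call[1]


def discordance(reference, query):
    """Simplified discordance (an assumption, not the biometrics formula):
    fraction of homozygous reference sites where the query call differs."""
    hom_sites = [(r, q) for r, q in zip(reference, query) if is_homozygous(r)]
    if not hom_sites:
        return None  # no informative sites to compare
    mismatches = sum(1 for r, q in hom_sites if r != q)
    return mismatches / len(hom_sites)


# Compare every unordered pair of samples once.
pairwise = {
    (a, b): discordance(genotypes[a], genotypes[b])
    for a, b in combinations(sorted(genotypes), 2)
}

for pair, rate in pairwise.items():
    print(pair, rate)
```

In this toy data, the two sampleA entries share every homozygous call, while sampleB differs at most of them, so a low rate suggests two samples come from the same individual and a high rate suggests they do not.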
Below is a description of all the columns.
| Column Name | Description |
| --- | --- |
| sample_name | The sample name. |
| expected_sample_group | The expected group for the sample based on user input. |
| predicted_sample_group | The predicted group for the sample based on the clustering results. |
| cluster_index | The integer cluster index. All rows with the same cluster_index are in the same cluster. |
| cluster_size | The size of the cluster this sample is in. |
| avg_discordance | The average discordance between this sample and all other samples in the cluster. |
| count_expected_matches | The count of expected matches when comparing the sample to all others in the cluster. |
| count_unexpected_matches | The count of unexpected matches when comparing the sample to all others in the cluster. |
| count_expected_mismatches | The count of expected mismatches when comparing the sample to all other samples (inside and outside its cluster). |
| count_unexpected_mismatches | The count of unexpected mismatches when comparing the sample to all other samples (inside and outside its cluster). |
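
As a rough illustration of how the columns described above might be used to spot possible mislabelling, the Python sketch below flags rows whose predicted group differs from the expected group or that report any unexpected matches or mismatches. The rows are made up, and the assumption that the table is available as comma-separated text with a header row is ours; the helper name flag_suspect_rows is hypothetical.

```python
import csv
import io

# Made-up rows using the column names described above; real output will differ.
table_text = """sample_name,expected_sample_group,predicted_sample_group,cluster_index,cluster_size,avg_discordance,count_expected_matches,count_unexpected_matches,count_expected_mismatches,count_unexpected_mismatches
P1-tumor,P1,P1,0,2,0.0,1,0,2,0
P1-normal,P1,P1,0,2,0.0,1,0,2,0
P2-tumor,P2,P2,1,2,0.01,0,1,2,0
P3-normal,P3,P2,1,2,0.01,0,1,2,0
"""


def flag_suspect_rows(rows):
    """Return sample names whose row hints at a possible mislabelling."""
    suspects = []
    for row in rows:
        group_differs = (
            row["expected_sample_group"] != row["predicted_sample_group"]
        )
        unexpected = int(row["count_unexpected_matches"]) + int(
            row["count_unexpected_mismatches"]
        )
        if group_differs or unexpected > 0:
            suspects.append(row["sample_name"])
    return suspects


rows = list(csv.DictReader(io.StringIO(table_text)))
suspects = flag_suspect_rows(rows)
print(suspects)
```

A flagged sample is only a candidate for review; whether it is truly mislabelled still needs to be confirmed against the sample records.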