Fingerprinting
Detecting sample swaps.
Introduction
This section contains a table of samples clustered into groups, with one row per sample. Use it to check whether samples group together in unexpected ways, which may indicate sample mislabelling.
Methods
Tool used: biometrics
BAM type: Collapsed BAM
Regions: MSK-ACCESS-v1_0-curatedSNPs.vcf
Producing the table is a two-step process: (1) extract SNP genotypes from each sample using the biometrics extract command, and (2) perform a pairwise comparison of all samples to determine sample relatedness using the biometrics genotype command. Please see the biometrics documentation for further details on the methods.
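The idea behind the pairwise comparison can be illustrated with a small Python sketch. This is not the biometrics implementation: the genotype calls, the simplified discordance definition (fraction of shared sites with differing calls), the single-linkage grouping, and the 0.05 threshold are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical genotype calls at the same ordered SNP sites (None = no call).
genotypes = {
    "P1_tumor":  ["AA", "AG", "GG", "AA", "CT", "CC"],
    "P1_normal": ["AA", "AG", "GG", "AA", "CT", "CC"],
    "P2_tumor":  ["AG", "GG", "AA", "AA", "CC", "CT"],
    "P2_normal": ["AG", "GG", None, "AA", "CC", "CT"],
}


def discordance(calls_a, calls_b):
    """Fraction of sites called in both samples where the genotypes differ.

    Simplified, assumed definition; biometrics may compute this differently.
    """
    shared = [(a, b) for a, b in zip(calls_a, calls_b)
              if a is not None and b is not None]
    if not shared:
        return None
    return sum(a != b for a, b in shared) / len(shared)


def cluster_samples(genotypes, threshold):
    """Group samples whose pairwise discordance is below the threshold."""
    parent = {sample: sample for sample in genotypes}

    def find(sample):
        # Follow parent links to the cluster representative.
        while parent[sample] != sample:
            parent[sample] = parent[parent[sample]]
            sample = parent[sample]
        return sample

    for a, b in combinations(genotypes, 2):
        d = discordance(genotypes[a], genotypes[b])
        if d is not None and d < threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for sample in genotypes:
        clusters.setdefault(find(sample), []).append(sample)
    return sorted(sorted(members) for members in clusters.values())


# Assumed threshold, chosen only for this toy example.
clusters = cluster_samples(genotypes, threshold=0.05)
for index, members in enumerate(clusters):
    print(index, members)
```

In this toy data the two samples from each patient have identical calls, so they fall into the same cluster, while samples from different patients disagree at most sites and stay apart.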
Interpretation
Below is a description of all the columns.
Column Name | Description |
sample_name | The sample name. |
expected_sample_group | The expected group for the sample based on user input. |
predicted_sample_group | The predicted group for the sample based on the clustering results. |
cluster_index | The integer cluster index. All rows with the same cluster_index are in the same cluster. |
cluster_size | The number of samples in the cluster that contains this sample. |
avg_discordance | The average discordance between this sample and all other samples in the cluster. |
count_expected_matches | The count of expected matches when comparing the sample to all others in the cluster. |
count_unexpected_matches | The count of unexpected matches when comparing the sample to all others in the cluster. |
count_expected_mismatches | The count of expected mismatches when comparing the sample to all other samples (inside and outside its cluster). |
count_unexpected_mismatches | The count of unexpected mismatches when comparing the sample to all other samples (inside and outside its cluster). |
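As a rough sketch of how the columns above might be used to triage a run, the Python below flags samples with any unexpected matches or mismatches and lists them by total unexpected count. The CSV layout and all values are hypothetical, only the columns needed here are included, and the ranking is an illustrative heuristic rather than part of biometrics.

```python
import csv
import io

# Hypothetical excerpt of the table, exported as CSV (values are made up).
TABLE = """sample_name,expected_sample_group,predicted_sample_group,count_unexpected_matches,count_unexpected_mismatches
P1_tumor,P1,P1,1,0
P1_normal,P1,P1,1,0
P2_tumor,P2,P2,0,1
P2_normal,P2,P1,2,1
P3_tumor,P3,P3,0,0
"""


def flag_suspect_samples(rows):
    """Return names of samples with unexpected matches or mismatches.

    Samples are ordered by their total unexpected count, highest first;
    ties keep their original order in the table.
    """
    scored = []
    for row in rows:
        total = (int(row["count_unexpected_matches"])
                 + int(row["count_unexpected_mismatches"]))
        if total > 0:
            scored.append((total, row["sample_name"]))
    scored.sort(key=lambda item: -item[0])  # stable sort keeps tie order
    return [name for _, name in scored]


rows = list(csv.DictReader(io.StringIO(TABLE)))
suspects = flag_suspect_samples(rows)
print(suspects)
```

Note that one mislabelled sample raises the unexpected counts of every sample it is compared against, so several rows are flagged here. The sample with the highest total (P2_normal in this made-up data, which also has differing expected and predicted groups) is a reasonable first candidate to investigate.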