Download

Two sets of denovo-db download files are currently available below, each in .tsv and .vcf format, along with the release notes (denovo-db.v.1.6.1.pdf).

| Sample Source | .tsv | .vcf |
| --- | --- | --- |
| non-SSC Samples | denovo-db.non-ssc-samples.variants.v.1.6.1.tsv.gz | denovo-db.non-ssc-samples.variants.v.1.6.1.vcf.gz |
| SSC* Samples | denovo-db.ssc-samples.variants.v.1.6.1.tsv.gz | denovo-db.ssc-samples.variants.v.1.6.1.vcf.gz |

*: The use of Simons Simplex Collection (SSC) and Simons VIP data sets is limited to projects related to advancing the field of autism and related developmental disorder research. Questions on SSC/VIP consents should be directed to collections@sfari.org.
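
As a rough illustration of working with the .tsv.gz downloads, the sketch below streams rows from a gzip-compressed, tab-separated file using only the Python standard library. The function name `read_denovo_tsv` is hypothetical. The handling of the file preamble is an assumption, not something this page states: the sketch skips any leading lines that start with `##` and treats the first remaining line as the column header, removing a leading `#` if one is present.

```python
import csv
import gzip
from typing import Dict, Iterator


def read_denovo_tsv(path: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per data row of a gzip-compressed TSV file.

    Assumptions (not confirmed by the download page): leading lines that
    start with '##' are metadata, and the first line after them holds the
    column names, possibly prefixed with '#'.
    """
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        # QUOTE_NONE: treat quote characters as ordinary text in fields.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        header = None
        for row in reader:
            if not row:
                continue  # skip blank lines
            if header is None:
                if row[0].startswith("##"):
                    continue  # assumed metadata line before the header
                header = list(row)
                header[0] = header[0].lstrip("#").strip()
                continue
            # Pair each value with its column name.
            yield dict(zip(header, row))
```

With a downloaded file in the working directory, `for row in read_denovo_tsv("denovo-db.non-ssc-samples.variants.v.1.6.1.tsv.gz")` would then iterate over the variants one at a time without loading the whole file into memory.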

 

The fields in the online variant table and in the .tsv download file are described below.

| Field | Description |
| --- | --- |
| SampleID | If the study gives a sample identifier, we use it exactly. If there is no sample identifier, we use the name of the study followed by a running number, so that every variant has a unique sample identifier. |
| SSC* Sample | The use of Simons Simplex Collection (SSC) and Simons VIP data sets is limited to projects related to advancing the field of autism and related developmental disorder research. Questions on SSC/VIP consents should be directed to collections@sfari.org. |
| StudyName | Name of the study. |
| PubmedID | PubMed ID for the study publication. |
| NumProbands | The total number of probands involved in the study. |
| NumControls | The total number of controls involved in the study. |
| SequenceType | The sequence type used in the study. |
| PrimaryPhenotype | The main phenotype for which the patient was included in the study. |
| Validation | The result of an orthogonal validation method (for example, Sanger sequencing). The value is either yes (validated) or unknown (validation status not known). Variants that failed validation are removed early in the pipeline and are not represented in denovo-db. |
| Chr | Chromosome. |
| Position | Genomic position in hg19. |
| Variant | Reference allele > alternate allele. |
| rsID | dbSNP rs identifier of the variant. If there is no rsID, this field is 0. |
| DbsnpBuild | dbSNP build in which the variant was found. |
| AncestralAllele | Ancestral allele at the position. |
| 1000GenomeCount | Count of the variant in 1000 Genomes. |
| ExacFreq | Frequency of the variant in the ExAC database. |
| EspAaFreq | Frequency of the variant in African American samples in the ESP database. |
| EspEaFreq | Frequency of the variant in European American samples in the ESP database. |
| Transcript | Transcript on which the variant resides. |
| codingDnaSize | Coding DNA size of the transcript. |
| Gene | Gene name. |
| FunctionClass | The functional classification of the variant. |
| cDnaVariant | cDNA representation of the variant. |
| ProteinVariant | Protein representation of the variant. |
| Exon/Intron | Exon or intron location of the variant, if available. |
| PolyPhen(HDiv) | HDiv score of the variant from PolyPhen. |
| PolyPhen(HVar) | HVar score of the variant from PolyPhen. |
| SiftScore | SIFT score of the variant. |
| CaddScore | CADD score of the variant. |
| LofScore | LoF score of the variant, calculated using dbNSFP. (reference) |
| LrtScore | LRT score of the variant. (reference) |
| muPIT | muPIT is an interactive browser-based application that maps single-nucleotide variants to available three-dimensional protein structures. |
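
A few of the field conventions above can be turned into small helpers, sketched here under stated assumptions. The function names are hypothetical. The sketch assumes that the Variant field is stored as a single string with a `>` separator (for example `C>T`), that field values arrive as strings, and that an empty rsID can be treated like the documented placeholder 0. None of these details is confirmed beyond the descriptions in the table.

```python
from typing import Optional, Tuple


def split_variant(variant: str) -> Tuple[str, str]:
    """Split a 'reference>alternate' string into (reference, alternate)."""
    ref, sep, alt = variant.partition(">")
    if not sep:
        raise ValueError(f"expected 'ref>alt', got {variant!r}")
    return ref.strip(), alt.strip()


def normalize_rsid(value: str) -> Optional[str]:
    """Return None when rsID is the placeholder 0, else the stripped value.

    Treating an empty string as missing is an extra assumption.
    """
    value = value.strip()
    return None if value in ("", "0") else value


def is_validated(value: str) -> bool:
    """True only for 'yes'; 'unknown' (or anything else) gives False."""
    return value.strip().lower() == "yes"
```

Keeping these as separate functions makes it possible to apply only the conversions a given analysis needs, and the raw strings from the file remain untouched.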
