This is YSEQ's new Whole Genome NGS Specifically for Genealogy Researchers with 400 Base Long Reads

The whole genome contains over 200 times as much data as the Y chromosome alone! Sequencing has improved dramatically over the last seven years, so it no longer makes sense to pay a high price for a type of sequence that covers only 70% of the Y chromosome. At a substantially lower price per base, you can get nearly 100% of the Y chromosome with YSEQ's Whole Genome test.

You’ll be able to get a complete and accurate sequence of the Y chromosome, the mitochondrial DNA, plus all the other chromosomes from 1 to 22 and of course the X chromosome.*

Why are 400-Base Reads Important?

The genome sequence you’ll receive from most other companies is composed of very short individual reads of 100 or 150 bases each. This greatly limits what can be learned from the data. Many important kinds of variation in the genome are impossible to pick up with such short reads: deletions, insertions, inversions, and other complex rearrangements of your genome, which can have a much greater impact than simple one-base changes (SNPs). In addition, many STRs cannot be read with such short reads. Below, we give an example of how long reads let us read STRs that are otherwise missed.

The greatest advantage of long reads is that they allow us to create de novo assemblies. That is, instead of relying on a standard, one-size-fits-all reference sequence, we can decipher the real, unique Y chromosome sequence as it is found in each haplogroup.

Sample Data

In our pre-launch WGS400 test runs, we've collected some astounding sequencing data that show the new possibilities.

FTP Link: https://genomes.yseq.net/WGS/400SE/

We've asked independent experts to have a look at the data and we present their results here:

Analysis from James Kane (https://ydna-warehouse.org/statistics.html)

Analysis by YFull from FGC WGS (left) and YSEQ WGS400 (right) from the same person.

Analysis with the online tool bam.bioio.io

400 Base Reads improve the readability of Y-STR markers

Normally, WGS testing is done with 2 x 100 base or 2 x 150 base paired-end reads. This is good enough if you only want to look at short SNP markers, but longer repeating elements, such as STR markers, can't be covered with short reads.

For example the Y-STR marker DYS684 (aka. DYS1005) is approximately 250 bases long, and looks like this:

Different people may have different numbers of CTTT or CCTTT repeat units. Because the repeat section is longer than the read, a 150-base read can be mapped to many positions within this sequence, and it can't anchor to both sides of the repeat section at the same time. This makes it impossible to identify the correct repeat count and to find other people who match you at this marker. 400-base reads can cover the whole stretch of long STR markers, and with this technology we can resolve almost every STR marker that has been tested in existing Y-STR databases.
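The anchoring argument can be illustrated with a small Python sketch. It is only a toy model, not YSEQ's analysis pipeline: the flanking sequences, the 15-base minimum anchor, and the 60-unit repeat are made-up assumptions chosen to give a repeat of roughly the length mentioned above, and they are not the real DYS684 sequence.

```python
def spans_repeat(read_len, repeat_len, min_flank=15):
    """True if one read can cover the repeat plus an anchor on each side.

    min_flank is an assumed anchor length, not a published threshold.
    """
    return read_len >= repeat_len + 2 * min_flank


def count_repeat_units(read, left_flank, right_flank, unit):
    """Count copies of `unit` between the two flanks.

    Returns None when the read does not contain both flanks, i.e. when
    the repeat count cannot be determined from this read alone.
    """
    start = read.find(left_flank)
    if start == -1:
        return None
    start += len(left_flank)
    end = read.find(right_flank, start)
    if end == -1:
        return None
    return read[start:end].count(unit)


# Hypothetical toy locus: unique flanks around 60 CTTT units (240 bases).
LEFT = "GATTACAGGC"
RIGHT = "AGGTCCATGA"
UNIT = "CTTT"
locus = LEFT + UNIT * 60 + RIGHT

# A 400-base read reaches across the whole locus; a 150-base read that
# starts at the left flank ends inside the repeat.
read_400 = "A" * 70 + locus + "G" * 70
read_150 = locus[:150]

print(spans_repeat(400, 250), spans_repeat(150, 250))
print(count_repeat_units(read_400, LEFT, RIGHT, UNIT))
print(count_repeat_units(read_150, LEFT, RIGHT, UNIT))
```

In this toy case the 400-base read yields a repeat count, while the 150-base read returns None because it never reaches the right-hand flank. Real STR callers work on aligned reads and must also handle sequencing errors and mixed repeat units such as CTTT and CCTTT, which this sketch ignores.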

More STR examples

Long STR markers can't be covered with 150-base paired-end reads, but 400-base single reads can reach across both ends of the repeat section. This is why longer reads are important for measuring STR markers.

YFull STR Details.
Long STRs with many repeats are significantly better covered by a WGS400.

*The YSEQ analysis does not contain any medically relevant information, since its purpose is genealogy. However, health-related information is contained in the raw data and can be extracted by a third-party expert. When ordering this test, you'll need to sign a written consent form stating that you understand the medical implications and the risks associated with decoding this information. You'll need to indemnify YSEQ from any responsibility associated with health information encoded in your raw data. This product is for Research Use Only! It is not designed for extracting reliable medical information from the raw data files!

