Introduction Y-DNA Profiles
The Y-DNA Profile pages are designed to give all available test results, for both STRs and SNPs, on a connected group of profiles for each of a series of clades (in this case R-Z18, R-Z372 and R-L257; see menu above). The advantage of these pages is that, while still adhering to all rules of privacy, they allow us to present all "known" Z18+ profiles and not only those in a specific project, as on the FT-DNA site. All non-members are presented anonymously, however (privacy mode; see below). We do require all profiles given here to be Z18+ or estimated to be so. The presentation format is tuned to give the maximum amount of useful information with a minimum of scrolling. If you allow it, you will find all results of the tests applied to your kit on a single line. If any scrolling is needed, e.g. to look at related profiles, it will be in one direction only: vertical. The web pages were designed with a reasonably large screen (HD or UHD/4K) in mind; HD screens (1920x1080 pixels) are considered standard. This introduction gives the further background of these pages and walks you through all columns in the tables presented.
Please note that the tables presented are not based on, or the result of, any spreadsheet, with all the serious but inherent maintainability problems that would entail. These pages are generated by an advanced piece of software that pulls the data from a Profile Database, formats the table(s) and outputs them in HTML format as an object to the Content Management System (CMS) used to build this site. The CMS in turn combines all information into a web page, adds meta-information (e.g. the update date) and finally applies styling to the web page. This looks nothing like a simple spreadsheet. The software has been offered to other projects, e.g. for use on their web site, on a SaaS basis only (as a service only: you will have to do the clerical work yourself).
Each page starts with a diagram showing the top levels of the tree of the clade concerned (e.g. R-Z372 or R-L257). This diagram is considered self-explanatory and is not repeated or described here. There are far more elaborate trees available on the web (some even include family/private SNPs), but we want to concentrate on clarity for everybody first. And, have we already told you, we don't like scrolling, especially if it is two-dimensional (both vertical and horizontal). We try not to require it without very good reason or obvious added value. To serve our readers, the next levels of the tree will be added shortly (without the need for any scrolling, of course). We also plan to see if there is a demand for adding family trees separately (a family page for genetic genealogy).
The next item is the results table (see example above). It is preceded by a list of all clades in the table (immediately below e.g. the top level); each entry links to the section of the table where that clade is given (these links are intra-page; you will not leave the page by using them). The table itself can be seen as a series of sub-tables, but technically it is one single table. It starts with a table header that presents a Reference Group of modals of nearby clades, used for comparison, and a Modal Group consisting of the modal of the haplogroup on the page (e.g. Z18). This modal is used as the reference for all profiles that follow. For each modal, 111 markers are given. The word modal indicates that, for each marker listed, the allele value that occurs most frequently is given, not something like an average. The modals listed are treated as not originating from any country, not having undergone any tests and not being positive for any SNPs.
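To make the difference between a modal and an average concrete, here is a minimal sketch in Python. It is not the software behind these pages: the function name `modal_profile`, the dictionary layout and the allele values are all made up for illustration, and how ties are broken is an assumption (here, the value seen first wins).

```python
from collections import Counter

def modal_profile(profiles):
    """Return, per marker, the allele value that occurs most frequently.

    `profiles` is a list of dicts mapping marker name -> allele value.
    Ties are broken by first occurrence (an assumption, not a rule of the site).
    """
    markers = {marker for profile in profiles for marker in profile}
    modal = {}
    for marker in sorted(markers):
        counts = Counter(p[marker] for p in profiles if marker in p)
        modal[marker] = counts.most_common(1)[0][0]
    return modal

# Hypothetical allele values, for illustration only.
example_profiles = [
    {"DYS393": 13, "DYS390": 23},
    {"DYS393": 13, "DYS390": 24},
    {"DYS393": 14, "DYS390": 23},
]
example_modal = modal_profile(example_profiles)
print(example_modal)
```

For DYS393 the modal in this example is 13, whereas the average of 13, 13 and 14 would be a value that no profile actually has.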
The tables that follow each give all data for a first-level subclade (e.g. R-S11601 or R-DF95 in R-Z18). The full presentation for R-Z18 is thus divided into three levels: (1) three pages, for R-Z18, R-Z372 and R-L257 (see menu above); (2) sections or subclades, each starting with a horizontal dark grey bar with the name of the section; and (3) groups or subclades within the clade (each starting with a light bar).
For each section (second level; dark grey bar), typically a clade, a name is given on the left and, in the middle, a definition of the motif, i.e. the set of off-modal allele values that are typically shared by members of the clade. An off-modal is an allele value on a slow marker that deviates from the modal and is shared by the majority of profiles in the section.
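The definition of a motif can be sketched as follows. This is only an illustration under stated assumptions: the function name `motif`, the list of slow markers passed in, the "more than half" reading of "majority" and the allele values are all hypothetical and not taken from the actual software.

```python
from collections import Counter

def motif(profiles, modal, slow_markers):
    """Return the off-modal values shared by a majority of the profiles.

    Only markers in `slow_markers` are considered. A value is part of the
    motif if it differs from the modal and occurs in more than half of the
    profiles that have a result for that marker (assumed threshold).
    """
    result = {}
    for marker in slow_markers:
        values = [p[marker] for p in profiles if marker in p]
        if not values:
            continue
        value, count = Counter(values).most_common(1)[0]
        if value != modal[marker] and count > len(values) / 2:
            result[marker] = value
    return result

# Hypothetical modal and section profiles, for illustration only.
section_modal = {"DYS426": 12, "DYS388": 12}
section_profiles = [
    {"DYS426": 12, "DYS388": 13},
    {"DYS426": 12, "DYS388": 13},
    {"DYS426": 12, "DYS388": 12},
]
section_motif = motif(section_profiles, section_modal, ["DYS426", "DYS388"])
print(section_motif)
```

In this made-up section, two of the three profiles share DYS388=13 against a modal of 12, so that value forms the motif; DYS426 equals the modal everywhere and is left out.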
For each group (third level; wider light bar), an age estimate is to be given on the right side, following the algorithm in physicist Ken Nordtvedt's Generation111 spreadsheet (the spreadsheet has been re-engineered as part of the program); results are rounded to get reasonable estimates. The age is to be understood as an estimate of the minimum coalescence age for all profiles in the group. The method is based on STR variance (mutation rates). Such methods were once very popular: Ken's sheet is dated May 2012, and STR variance was in fact the only age-estimating technique available before large numbers of Full-Y results with large numbers of SNPs became available as an outcome of FT-DNA's Big-Y. Presently, other techniques based on SNPs (loosely called SNP counting) are in mainstream use; these use the mutation rates of SNPs instead of those of STRs, but there is more to say about the difference between these techniques. The age estimate is not displayed in this version of the software.
For each profile a few items are given for identification: KitNr, Name and Country. If the kit owner is not currently a member of the R-Z18 project, or is a member but has indicated that he values his privacy, he is listed anonymously in "privacy mode", i.e. only the Country is given (the other identification fields remain blank). The country given is restricted to countries in Europe, as the R-Z18 Project concentrates on the period before 1500, i.e. before the migrations started, when most Z18+ members are expected to have still been in Europe. For each profile that is not privacy-restricted, the tests that have been taken are given, as this information is meaningful to other members of the haplogroup. To facilitate the study of the history of the haplogroup, the column Geopin indicates that the geographic position of the ancestor's place of birth is available, as far as this position is in Europe (otherwise "u" is specified). If the information is not available, the column is blank. Geno2, Big-Y and various Panels are the tests indicated; they are shown to help in interpreting the test results.
Next, a number of SNP tests are listed with the known test results for each profile. The SNPs listed vary per section, so that they are relevant to the people currently in the section. Please note that only public SNPs are listed, not the private or semi-private SNPs that tend to result from a Big-Y test. The software can handle any number of SNPs, but a maximum of 15 seems useful for web use and complies with the objective of non-scrolling readability.
Interpretation of these results, as always, requires a thorough understanding of two key concepts of SNP-ology: alias and equivalence. A SNP A is an alias of SNP B if both refer to the same mutation (!!) at the same location on the Y-chromosome; the two names A and B refer to the exact same thing. A SNP A is equivalent to SNP B if the two have exactly the same position in the Y-Tree. As a result, all people in the haplogroup who have tested both A and B are either both A+ and B+, or both A- and B-. In R-Z18 this applies to e.g. Z18 and Z14, so these are considered equivalent. It is important that all these people are in the same haplogroup (let's say Hg R), because there could well be a SNP Z18 in another haplogroup that is NOT equivalent to Z14. So an alias applies to a position on the physical Y-chromosome, whereas equivalence applies to a logical position in a Y-Tree.
When a set of new SNPs is discovered, it is not always immediately clear what their exact position in the Y-Tree is, especially with respect to each other, and thus whether two newly discovered SNPs happen to be equivalent. In R-Z18, when Z14 was found (BTW, this is in fact a very slow-mutating STR, not a SNP or Indel), initial testing found Z18+ people who were Z14+ and others who were Z14-, suggesting that Z14 was downstream of Z18. Later testing made clear that all Z18+ people were in fact also Z14+ (and all Z18- people were Z14-), indicating that the two are equivalent. As a second case, first testing showed that all S11601+ people were also S10198+ (and all S11601- people were S10198-), suggesting that the two were equivalent; later testing found a (single) person to be S11601- and S10198+, suggesting that S11601 is one level below S10198. That is, the equivalence relation between two SNPs may well change as more samples are tested over time.
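The way an apparent equivalence can break when one more sample arrives can be sketched in a few lines of Python. The function name `appear_equivalent` and the sample records are hypothetical; the records only mimic the S11601/S10198 story told above and are not real test results.

```python
def appear_equivalent(samples, snp_a, snp_b):
    """True if no sample tested for both SNPs shows differing results.

    `samples` is a list of dicts mapping SNP name -> '+' or '-'.
    Samples that lack a clear result for either SNP are ignored, so the
    answer only reflects the samples tested so far.
    """
    for sample in samples:
        a, b = sample.get(snp_a), sample.get(snp_b)
        if a in ("+", "-") and b in ("+", "-") and a != b:
            return False
    return True

# Made-up samples mirroring the narrative, for illustration only.
early_samples = [
    {"S11601": "+", "S10198": "+"},
    {"S11601": "-", "S10198": "-"},
]
later_samples = early_samples + [{"S11601": "-", "S10198": "+"}]

print(appear_equivalent(early_samples, "S11601", "S10198"))
print(appear_equivalent(later_samples, "S11601", "S10198"))
```

A result of True therefore never proves equivalence; it only says that no counterexample has been tested yet.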
SNP naming itself has been a source of much confusion ever since FT-DNA's Big-Y started (some five years ago). Before then, SNPs were discovered by testing labs, and when a new SNP was found, the lab first verified whether the SNP was already known and, if so, used the existing name. If not, it assigned a new name consisting of a lab-specific prefix (e.g. L, which stands for TK at the FT-DNA lab) followed by a sequential number, such as L257 in R-Z18, found at FT-DNA in a WTY test back in 2010. In those days, lists of the prefixes used by testing labs circulated on the web. At a certain moment, a specific (valued) customer required Full Genomes Corporation (FGC) to name all candidate SNPs found without checking whether they already existed, and that started the trend of commercial companies using SNP names for commercial and marketing purposes. Nowadays, nobody checks for existing SNP names when naming newly discovered mutations, and companies are renaming existing SNPs on a massive scale. This is very clear in the SNP database, where a few years back literally thousands of renamed SNPs could be found that carried, for instance, both an FGC name and e.g. a Y-name (otherwise unrelated names, of course). People new to this field are in general puzzled by these names and start asking questions on forums.
To help overcome this problem, our results overview has a series of columns for SNPs. If a column is named e.g. Z18, then a profile is marked '+' if it is positive for Z18 or one of its aliases; '&' if it is positive for one of its equivalents (or their aliases); '-' if it is negative for all equivalents; and '?' if the result is inconclusive. A blank cell means that no test for the SNP was taken or that no test result is known.
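The rules for filling a SNP column can be sketched as a small function. This is one possible reading of the rules above, not the actual implementation: the function name `snp_cell`, the data layout and the order in which the rules are checked are assumptions. The example uses Z18 with Z14 as its equivalent, as described earlier; no aliases are listed because none are named in this text.

```python
def snp_cell(results, names, equivalents):
    """Return the symbol shown in a SNP column for one profile.

    results:     dict mapping SNP name -> '+', '-' or '?' (known results only)
    names:       the column's SNP and its aliases
    equivalents: SNPs equivalent to the column's SNP, and their aliases
    Assumed precedence: '+' first, then '&', then '-', otherwise '?'.
    """
    own = [results[n] for n in names if n in results]
    equivalent = [results[n] for n in equivalents if n in results]
    if "+" in own:
        return "+"       # positive for the SNP itself or an alias
    if "+" in equivalent:
        return "&"       # positive for an equivalent SNP
    known = own + equivalent
    if not known:
        return ""        # no test taken or no result known
    if all(r == "-" for r in known):
        return "-"       # negative on everything that was tested
    return "?"           # inconclusive

column_names = {"Z18"}
column_equivalents = {"Z14"}
for profile_results in ({"Z18": "+"}, {"Z14": "+"}, {"Z18": "-"}, {"Z14": "?"}, {}):
    print(repr(snp_cell(profile_results, column_names, column_equivalents)))
```

The point of the '&' symbol is that a profile tested only for Z14 still shows up usefully in the Z18 column, while remaining distinguishable from a direct Z18+ result.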
The last part of the table is the Markers, divided into four sets according to the "standard" 25, 37, 67 and 111 marker tests. For historic reasons (the software was initially written like this back in 2010), the table gives a result for DYS464e (the fifth copy of DYS464), although few people have a result for this copy. Note that the allele values for all markers are presented as the last digit only (least significant digit). There are a number of technical reasons for this; one benefit is that all results can be presented on a standard HD-type screen.
The colors are the result of comparing the value for a marker with the modal at the top of the table. If a result is equal to the modal, the cell is green; red-ish colors (light red, dark red) indicate that the result is numerically higher than the modal, and blue-ish colors (light blue, dark blue) that it is numerically lower. Experience shows that the table is easiest to read by watching the colors: clusters of related profiles are easily spotted by looking for consistent vertical patterns of red-ish or blue-ish cells (shared off-modals). Many clusters in R-Z18 have been found like this (using far more extensive selections of profiles, of course). To ease reading the table, the marker or STR names are repeated for each section of the table.
When you hover over a marker cell, a pop-up appears describing the cell as "row-name:allele (modal-dif)", where modal-dif is either '=' for equal to the modal, '+n' for a value n higher than the modal or '-n' for a value n lower than the modal (please note that on Windows this only works if the browser window is in focus; this is how MS Windows behaves). The column headers of slow markers are highlighted as per YSearch WU7FP. Names of markers whose allele value does not change in any profile on the page (including the modal) are shown in an orange-like color (like DYS426 in most haplogroups in Hg R).
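The way a marker cell is derived from a result and the modal can be sketched as follows. The function name `marker_cell` and the returned dictionary are hypothetical. Two details are assumptions because this text does not specify them: that the light shade is used for a difference of one and the dark shade for larger differences, and that the pop-up shows the full allele value rather than the last digit.

```python
def marker_cell(row_name, allele, modal):
    """Describe one marker cell: displayed digit, color and hover text."""
    diff = allele - modal
    if diff == 0:
        color, modal_dif = "green", "="
    elif diff > 0:
        # Assumed shading rule: light for a difference of 1, dark beyond that.
        color = "light red" if diff == 1 else "dark red"
        modal_dif = f"+{diff}"
    else:
        color = "light blue" if diff == -1 else "dark blue"
        modal_dif = str(diff)  # already carries the minus sign
    return {
        "text": str(allele % 10),  # last (least significant) digit only
        "color": color,
        "hover": f"{row_name}:{allele} ({modal_dif})",
    }

# Hypothetical row name and allele values, for illustration only.
print(marker_cell("Kit A", 13, 13))
print(marker_cell("Kit A", 25, 23))
print(marker_cell("Kit A", 10, 11))
```

Showing only the last digit loses nothing in practice as long as the color and the pop-up are available, since together they recover the full value relative to the modal.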
We have a number of plans for future additions to these pages (suggestions for other additions, in line with the objectives and the design of these pages, are welcome), such as:
- Integration of results from other test labs (e.g. YSeq; the software to integrate these other results is available, but not currently operational)
- Addition of age estimates on the basis of STR variance (the software is available, but not currently operational)
- Addition of a geographic map for each haplogroup (the software is available, but not currently operational; currently a geographic map of all R-Z18 is presented elsewhere on this site)