We are starting to dig into the data we have already accumulated, with the sole purpose of informing our current and potential users. While being educational is important, this is not a scientific analysis per se, although our product, Gen P, naturally relies on scientific concepts. Nevertheless, even scientists can benefit from the data here. For instance, to our knowledge, the table below of the top 9 most abundant human proteins/protein families is already based on many more individuals than the biggest saliva deep proteomics study, published this year by Grassl et al. That study used data from 8 individuals between 24 and 40 years of age. Our data also spans a much wider age range, which is no coincidence, since Gen P’s main ambition is to provide deep proteomic age profiles and patterns.
The list of 9 proteins/protein families below is sorted by median abundance. First of all, the quantified protein data Gen P provides is expressed as relative abundance: this means that differences in the calculated intensities of the same protein across multiple samples accurately reflect relative differences in its abundance. There are different ways to identify the most abundant proteins (and we will provide data using other methods as well), but we start by taking all the measured abundances of each protein, calculating the median abundance for each protein, and then listing the top 9 proteins/protein families in descending order. One more curation step happens in between, and it explains why we speak of proteins/protein families: when top abundant proteins belong to the same protein family, we group them together. For instance, S100 is a family of several abundant proteins, such as S100-A9 and S100-A8, which have slightly or sometimes markedly different functions assigned to them. Cornulin, on the other hand, is a single protein, not a protein family.
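For readers who like to see the steps spelled out, the ranking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Gen P's actual pipeline: the function name `top_families`, the `family_of` lookup, and all numbers are hypothetical, and we assume here that a family is ranked by the highest median among its member proteins, which the text does not specify.

```python
from statistics import median


def top_families(abundances, family_of, n=9):
    """Rank proteins/protein families by median relative abundance.

    abundances: dict mapping protein name -> list of relative abundances,
                one value per sample (hypothetical input layout).
    family_of:  dict mapping protein name -> family name; proteins not
                listed are treated as standalone entries.
    """
    # Step 1: median abundance of each protein across all samples.
    medians = {p: median(v) for p, v in abundances.items() if v}

    # Step 2: sort proteins by median abundance, highest first.
    ranked = sorted(medians.items(), key=lambda kv: kv[1], reverse=True)

    # Step 3: collapse proteins into families. ASSUMPTION: a family takes
    # the rank and value of its most abundant member.
    families = {}
    for protein, med in ranked:
        name = family_of.get(protein, protein)
        if name not in families:
            families[name] = med

    # Step 4: keep the top n entries (dicts preserve insertion order).
    return list(families.items())[:n]


# Made-up numbers, for illustration only.
example_abundances = {
    "S100-A9": [8.0, 9.0, 10.0],
    "S100-A8": [7.0, 8.0, 9.0],
    "Cornulin": [6.0, 7.5, 7.0],
    "ProteinX": [1.0, 2.0],
}
example_families = {"S100-A9": "S100", "S100-A8": "S100"}

print(top_families(example_abundances, example_families, n=2))
```

In this toy example S100-A9 and S100-A8 collapse into a single S100 entry, while cornulin stays a standalone protein. Other ways of summarizing a family (for example, summing member abundances) would be equally plausible and could change the order.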