These notes record a discussion of Ifat’s draft, with the aim of completing the research project.

Overall

Andy: (1) There seem to be far fewer imprinted genes than some previous studies suggested (Gregg, …, Dulac). (2) Age dependence of imprinting. (3) First(?) study using human samples.

note: We started discussing the necessity of error control for supporting the first conclusion above. I will write a more formal account on error control soon.

Regression

The regression models the response as a function of age, where age is the age at death and the null hypothesis is that the age coefficient is zero, i.e. age has no impact on imprinting. The response seems to be a variable that in some way aggregates over 8 or 13 genes. But what is its definition? What kind of aggregation is it (summation, pooling, …)?
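To make the question about the null hypothesis concrete, here is a minimal sketch of testing "age has no impact" on an aggregated per-individual score. Everything specific in it is an assumption, not taken from the draft or from Ifat's R script: the aggregation is taken to be a plain mean over genes (`aggregate_scores` is a hypothetical helper), the data are toy values, and the slope is tested by permuting the responses rather than by whatever test the manuscript uses.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def aggregate_scores(per_gene_scores):
    """Hypothetical aggregation: mean of one individual's per-gene scores."""
    return sum(per_gene_scores) / len(per_gene_scores)


def slope_permutation_test(ages, responses, n_perm=2000, seed=0):
    """Return (observed slope, two-sided permutation p-value) for H0: slope = 0."""
    rng = random.Random(seed)
    observed = ols_slope(ages, responses)
    shuffled = list(responses)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between age and response
        if abs(ols_slope(ages, shuffled)) >= abs(observed):
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (assumed): 8 individuals, 3 genes each, score declining with age.
ages = [25, 34, 41, 50, 58, 66, 73, 81]
per_gene = [[0.90 - 0.002 * a + d for d in (-0.02, 0.0, 0.02)] for a in ages]
responses = [aggregate_scores(scores) for scores in per_gene]

slope, p_value = slope_permutation_test(ages, responses)
print(f"slope = {slope:.4f}, permutation p-value = {p_value:.4f}")
```

Replacing the mean in `aggregate_scores` by a sum, or pooling read counts across genes before scoring, gives different responses, which is why the definition of the aggregated variable matters for interpreting the test.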

note: We looked at Ifat’s R script for regression analysis and the definition of LOI_R. When I get access to her files on the other server I’ll look at them.

Error rates

The manuscript provides no error rates for the classification of genes as mono- or biallelically expressed.

  1. frequentist approach
    • p-values based on the null distribution of the test statistic
    • FDR control based on an estimate of the fraction of monoallelically expressed genes
  2. Bayesian approach
    • probabilities of mono/biallelic expression
    • prior probability based on (a) an estimate of the fraction of monoallelically expressed genes and (b) distance from known imprinted genes (further extension: HMM)
    • posterior given (a) expression data (RNA-seq), (b) genotype data (SNP array), (c) the likelihood and (d) the prior
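The frequentist route can be sketched as follows. This is an illustration under assumptions that are not in the draft: the null (biallelic expression) is modelled as a 50/50 binomial split of reads between the two alleles at a heterozygous SNP, and FDR is controlled with Benjamini-Hochberg q-values scaled by `pi0`, a hypothetical estimate of the fraction of biallelic (null) genes, i.e. one minus the estimated fraction of monoallelically expressed genes. With `pi0=1.0` this is plain Benjamini-Hochberg.

```python
from math import comb


def binom_two_sided_p(k, n):
    """Two-sided exact p-value for H0: k of n reads follow Binomial(n, 0.5)."""
    extreme = max(k, n - k)
    upper_tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2 ** n
    # Binomial(n, 0.5) is symmetric, so doubling the upper tail is two-sided.
    return min(1.0, 2 * upper_tail)


def bh_qvalues(pvals, pi0=1.0):
    """Benjamini-Hochberg q-values, scaled by pi0 (assumed null fraction)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to enforce monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pi0 * m * pvals[idx] / rank)
        qvals[idx] = running_min
    return qvals


# Toy read counts (assumed): (reads from one allele, total reads) per gene.
counts = [(48, 100), (97, 100), (10, 10), (30, 60)]
pvals = [binom_two_sided_p(k, n) for k, n in counts]
qvals = bh_qvalues(pvals, pi0=0.99)
for (k, n), p, q in zip(counts, pvals, qvals):
    print(f"{k}/{n}: p = {p:.3g}, q = {q:.3g}")
```

Genes with a q-value below the chosen FDR level would be called monoallelically expressed. If the binomial assumption is dropped in favour of an empirical null, only `binom_two_sided_p` needs replacing.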

Notes

In the frequentist approach we only need the likelihood function for biallelic expression, whereas in the Bayesian one we also need that for monoallelic expression (and the prior, of course).
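A minimal sketch of the Bayesian calculation shows where the second likelihood and the prior enter. The modelling choices are assumptions made for illustration only: the SNP is taken to be truly heterozygous (genotype-calling error is ignored), the biallelic likelihood is Binomial(n, 0.5), the monoallelic likelihood is an equal mixture of Binomial(n, 1 - eps) and Binomial(n, eps) because either allele may be the expressed one, `eps` is a hypothetical leakage/error rate, and `prior_mono` is a hypothetical prior fraction of monoallelically expressed genes.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of k successes in n trials with success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def posterior_mono(k, n, prior_mono=0.01, eps=0.05):
    """Posterior probability of monoallelic expression given k of n reads."""
    lik_bi = binom_pmf(k, n, 0.5)
    # Either allele may be the expressed one, hence the 50/50 mixture.
    lik_mono = 0.5 * binom_pmf(k, n, 1 - eps) + 0.5 * binom_pmf(k, n, eps)
    numerator = prior_mono * lik_mono
    return numerator / (numerator + (1 - prior_mono) * lik_bi)


# Toy read counts (assumed): balanced versus strongly skewed allelic ratio.
for k, n in [(50, 100), (98, 100), (2, 100)]:
    print(f"{k}/{n}: P(mono | data) = {posterior_mono(k, n):.4f}")
```

A position-dependent prior (for example, one that increases near known imprinted genes) would replace the constant `prior_mono` with a per-gene value without changing the rest of the calculation.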

Andy: a permutation-derived null distribution of the test statistic seems preferable to the binomial assumption.

The form of likelihood depends on the dependency structure of the following variables:

Error of genotype calling

A different kind of error rate is provided: the error of genotype calling (Figures error rate 1 and error rate 2).

Andy: discordant call is when RNA-seq suggests monoallelic expression and the Chip-array suggests heterozygosity

Figure error rate 1.

Andy: the acceptable minimum number of fragments covering a given SNP.

Figure error rate 2.

Association of HLA genes to schizophrenia

A nonsignificant tendency was found for HLA-DQB1. Is it worth following up?

Andy: not really worth it.

Imputation of HLA types uses two sources of information:

  1. SNPs (HIBAG)
  2. RNA-seq (PHLAT)