Algorithm BLink. Step 2

Input:

Physical position of a pair of markers, s1, s2, the prior precision alpha, bilocus haplotype counters nAB, nAb, naB, nab and the number of individuals with unphased haplotypes nHH for this pair of markers.

Output:

Estimation of the allele population frequencies fA, fB and the haplotype population frequencies fAB, fAb, faB and fab for the pair of markers.

Procedure:

  • distance=abs(s2-s1)
  • nKnownHaplotypes=nAB+nAb+naB+nab
  • alphaHap=alphax(1-exp(-distance/100000))
  • fAB=(nAB+alphaHap)/(nHaplotypes+alpha)
  • nA=nAB+nAb+nHH
  • nB=nAB+naB+nHH
  • fA=(nA+2xalphaHap)/(nHaplotypes+nHH+alpha)
  • fB=(nB+2xalphaHap)/(nHaplotypes+nHH+alpha)
  • fAB=IP(fAB, nHH, nAB, nKnownHaplotypes, fA, fB)
  • fAb=fA-fAB
  • faB=fB-fAB
  • fab=1-fAB-fAb-faB
  • where IP(fAB, nHH, nAB, nKnownHaplotypes, fA, fB, MaxIteration=default) uses an Iterative Proportional (IP) algorithm to take into account unknown haplotypes.

    Algorithm Recursive IP for biallelic and bilocus haplotypes.

    Input:

    Current frequency estimation of haplotype AB, fAB, the total number of individual with unknown phase nHH, the total number of haplotypes AB in the sample, nAB, the total number of known haplotypes in the sample, nKnownHaplotypes and allele frequencies fA, fB.

    Output:

    Estimation of the haplotype population frequency fAB.

    Note:

    In our implementation, variable threshold is 5E-5 and default value for MaxIterations is 1000.

    Procedure:

  • fAb=fA-fAB
  • faB=fB-fAB
  • fab=1-fAB-fAb-faB
  • Expected=(fABxfab)/(fABxfab+fAbxfaB)
  • Maximum=(nAB+nHHxExpected)/(nKnownHaplotypes+2xnHH)
  • IF abs(fAB-Maximum) > threshold AND MaxIterations>0 THEN IP(Maximum, nHH, nAB, nKnownHaplotypes, fA, fB, MaxIterations-1) END IF
  • RETURN Maximum