This week’s edition of Nature has a brief paper (doi:10.1038/nature07390) reporting the identification of an HIV-positive tissue sample collected in 1960 in Leopoldville (now Kinshasa), in what was then the Belgian Congo and is now the Democratic Republic of the Congo.  Sequence data derived from the tissue were used to investigate the chronology of the emergence of HIV from its likely simian origin.

This is a piece of research which hit the news services (see for example this page at the BBC news website).  The research has a number of features which mark it out for media interest: an important virus, a serious disease with a global spread, and a simple take-home message as to the origin of the virus.  This raised my interest and I looked at the paper.  Incidentally, the paper raises issues to do with the complexity of statistical analysis: I imagine many readers like me, and the journos who wrote articles in the press, have little or no chance of understanding what an “unconstrained Bayesian Markov chain Monte Carlo method” is, and are similarly limited in their ability to critically assess conclusions reached by that means!  I am forced to assume that all is above board in the statistical and computational aspects of this paper, and that the referees have done their job!  In addition, it’s always interesting in studies of ancient DNA (and in cases where samples were not originally preserved with nucleic acids in mind) to know what measures were taken to ensure that contamination with modern DNA did not happen.

HIV is thought to have entered human populations relatively recently, with HIV-1 most likely derived from chimpanzee SIV (SIVcpz).  A lot of work’s been done on the evolution of this virus: on the basis of DNA sequence analysis (of cDNA derived from HIV samples), HIV-1 isolates fall into three groups, and one of these, the M group, is the one that’s spread globally, being responsible for >95% of HIV infections.  A phylogeny of HIV and related viruses is shown below (and another here).

Within Group M are a number of subtypes, lettered A to K.  There’s a relationship between geographical distribution and each of these subtypes, which appear to have resulted from independent founder events.  Subtype B, for example, is found in North America and Europe (and pretty much all the isolates from those areas fall into B).  The novel isolate described here (DRC60) groups with subtype A, while a previous African archival sample (ZR59 – dating from 1959) groups with subtype D.

Twenty-seven archived samples of patient tissue were screened by RT-PCR, and one was identified as containing HIV RNA.  I mentioned above that extreme care needs to be taken when attempting to recover ancient DNA (for example from museum or archaeological samples).  There are several reasons for this:

  • the Polymerase Chain Reaction (PCR) is extremely sensitive to contamination with extraneous DNA – for example modern DNA from other work going on in the lab
  • the DNA (or in this case RNA) won’t have been preserved in ideal conditions – this makes PCR amplification of the ancient samples difficult (and makes it more likely that any product recovered comes from contaminating modern DNA)
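
To put rough numbers on that sensitivity, here is a minimal sketch of idealised PCR arithmetic, in which each cycle multiplies the number of template copies by (1 + efficiency).  The function name, the 30-cycle run and the 60% efficiency assumed for damaged template are all my own illustrative assumptions and are not taken from the paper.

```python
def copies_after_pcr(starting_copies, cycles, efficiency=1.0):
    """Idealised PCR yield.

    Each cycle multiplies the copy number by (1 + efficiency), so perfect
    amplification (efficiency=1.0) doubles the template every cycle.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return starting_copies * (1.0 + efficiency) ** cycles


# Assumed figures for illustration only (not taken from the paper).
CYCLES = 30
# One intact modern molecule that has strayed into the tube.
contaminant = copies_after_pcr(1, CYCLES)
# Ten damaged old template molecules that amplify poorly.
degraded = copies_after_pcr(10, CYCLES, efficiency=0.6)

print(f"contaminant:     {contaminant:.2e} copies")
print(f"degraded sample: {degraded:.2e} copies")
print(f"contaminant outnumbers sample {contaminant / degraded:.0f}-fold")
```

Under these assumptions a single stray modern molecule ends up swamping the product from the genuine sample, even though the sample started with ten times as many templates.  Real reactions plateau and behave less tidily, but the sketch shows why the precautions matter.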

Most labs dealing with ancient DNA samples take a number of measures to ensure their work isn’t compromised by contamination.  One recent review of these criteria can be found in this article in Trends in Ecology and Evolution, and another in this article in PLoS One.  I don’t know whether this list of criteria is the canonical one used in the field: certainly several points would be relevant to any PCR experiment!  Steps taken by Worobey et al. to avoid (and detect) contamination are listed in detail in the supplementary data.  Chief among these are:

  • work was carried out in labs experienced in recovering ancient nucleic acids
  • results were reproducible across repeated independent extractions
  • samples were analysed in two independent laboratories
  • mRNA quality was evaluated using RT-PCR to amplify an endogenous human gene (this revealed mRNA quality to be poor – only short fragments could be recovered)
  • as with the earlier ZR59 sample, the new sample appeared to be basal in the phylogeny, as expected of a genuinely old sequence rather than a modern contaminant
  • a number of measures taken to avoid contamination are described in the Methods section

A number of relatively small fragments of the HIV genome were recovered from DRC60, and used to infer patterns of HIV spread and diversity in the early period of transmission in the human population.  The estimate of HIV emergence in present-day DRC is that the most recent common ancestor (MRCA) of the M group strains arose in the early years of the 20th century – different values and confidence limits arise from different statistical methods, but the results are generally pretty comparable.  What is particularly interesting is that the propagation of HIV within human populations in this area seems to follow the establishment of the major urban areas (Kinshasa in 1881, Brazzaville in 1883, Yaounde in 1889 and Bangui in 1899 – the estimated date for the MRCA is 1908, with a 95% highest posterior density interval of 1884-1924).  Since one must presume that transmission of SIV from chimpanzees to humans (likely via hunting of chimpanzees) may well have happened repeatedly, it seems probable that what allowed HIV to establish itself and spread was the growth of urban areas.
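
For readers (like me) who are hazy on what a “95% highest posterior density interval” is: a Bayesian MCMC run produces a long list of sampled values for the quantity of interest, and the interval is the narrowest range containing 95% of those samples.  The sketch below shows that calculation only.  The function name and the sample values are invented for illustration; they are not the paper’s posterior and this is not the authors’ software.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval holding a fraction `mass` of the samples."""
    if not 0.0 < mass <= 1.0:
        raise ValueError("mass must be in (0, 1]")
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")
    # Number of samples the interval must contain; the small offset guards
    # against floating-point noise pushing e.g. 9.0 up to 10.
    window = max(1, math.ceil(mass * n - 1e-9))
    # Slide a window of that size along the sorted samples, keep the narrowest.
    start = min(range(n - window + 1),
                key=lambda i: ordered[i + window - 1] - ordered[i])
    return ordered[start], ordered[start + window - 1]


# Invented draws for illustration only (NOT the paper's posterior).
toy_draws = [1890, 1900, 1903, 1905, 1906, 1908, 1909, 1911, 1915, 1940]
low, high = hpd_interval(toy_draws, mass=0.9)
print(f"90% HPD interval of the toy draws: {low} to {high}")
```

Because the interval is chosen to be as narrow as possible rather than to cut equal tails off each end, it need not be symmetric about the point estimate.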

So, to return to the question as to why this hit the media.  Without knowing what press releases were circulated (see my prior posting on my own BBSRC-mediated press release), my guess is that the answer is that there is a single, easily understood message – that the origins of HIV are a bit clearer with this new data point.  Clearly the details of the analysis (which are beyond my capacity to critically review) are going to be lost in a news report!  Where does the research take us beyond an interesting take on the history of a pandemic (and it’s not unique – recall the studies on the 1918-19 flu virus isolates)?  Well, the potential for new diseases emerging from animal reservoirs is always there, particularly where human populations live in close proximity to wild animal populations, and an understanding of the dynamics of novel disease emergence will prove important.

Michael Worobey, Marlea Gemmel, Dirk E. Teuwen, Tamara Haselkorn, Kevin Kunstman, Michael Bunce, Jean-Jacques Muyembe, Jean-Marie M. Kabongo, Raphaël M. Kalengayi, Eric Van Marck, M. Thomas P. Gilbert, Steven M. Wolinsky (2008). Direct evidence of extensive diversity of HIV-1 in Kinshasa by 1960. Nature, 455 (7213), 661-664. DOI: 10.1038/nature07390