Biological Chemistry

Human Gene Count Pared Down Again

Reanalysis finds there are just 20,000 to 25,000 human genes

by Stu Borman
October 25, 2004 | A version of this story appeared in Volume 82, Issue 43

GENOMICS

A detailed reanalysis of the human genome sequence yields the surprising conclusion that there are only 20,000 to 25,000 human genes [Nature, 431, 931 (2004)].

Before the genome was sequenced, conventional wisdom predicted about 100,000 human genes. In 2001, when the International Human Genome Sequencing Consortium (IHGSC) and an independent group led by Celera Genomics, Rockville, Md., reported the first substantially complete human genome sequences, the estimate shrank to 30,000 to 40,000.

Now, the estimate has dropped by about another one-third. The new study confirms 19,599 protein-coding genes and identifies another 2,188 probable protein-coding segments. The resulting sum of human genes is about 14% higher than the number of genes in nematode worms and about 15% lower than the number in thale cress plants. The error rate in the new data is about 0.001%, an accuracy 10 times better than the Human Genome Project's original goal.

The findings were authored by more than 2,800 IHGSC researchers. The new data will facilitate "more precise studies of our genetic instruction book and how it influences health and disease," says coauthor Francis S. Collins, director of the National Human Genome Research Institute.

The sequence now covers more than 99% of the euchromatic (gene-containing) human genome. About 20% of the genome is heterochromatic (highly condensed and repetitive noncoding DNA) and remains unsequenced.

The reanalysis shows that segmental duplications (large, near-identical sequence copies) cover 5.3% of the genome. A related study [Nature, 431, 927 (2004)] finds that shotgun sequencing (Celera's shortcut technique) was less effective in resolving segmental duplications than IHGSC's systematic clone-ordered approach. Many sequencers have since adopted shotgun sequencing alone, but combining it with clone-ordered sequencing would be preferable for optimal accuracy, the researchers say.
