ASIN Patent

Abstract

The present invention relates generally to genetic analysis and, in particular, to genomics. More specifically, the invention relates to a method of amplification of a nucleic acid (or a nucleotide sequence of interest) which may be applied to sequencing in general and, in particular, to genome sequencing.

Inventors

Tillett; Daniel (Randwick, AU)

Assignee

Takara Shuzo Co., Inc. (Kyoto, JP)

Appl. No.: 581822

Filed: May 20, 2002

PCT Filed: May 1, 2000

PCT NO: PCT/AU00/00391

PCT PUB.NO.: WO00/66768

PCT PUB. Date: November 9, 2000

Foreign Application Priority Data

Apr 30, 1999[AU] PQ0087

Other References

Shuber et al., "A Simplified Procedure for Developing Multiplex PCRs", Genome Research, 5:488-493 (1995).

Liu et al., "Thermal Asymmetric Interlaced PCR: Automatable Amplification and Sequencing of Insert End Fragments from P1 and YAC Clones for Chromosome Walking", Genomics, 25:674-681 (1995).

Waclaw Szybalski, "From the double-helix to novel approaches to the sequencing of large genomes", Gene, 135:279-290 (1993).

Churcher et al., "Sequencing strategies", Chapter 20, 517-527, The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

Ghiso et al., "A Subset of 1200 Hexamers is Sufficient to Sequence over 95% of cDNAs by Hexamer String Primer Walking", Genomics, 17:798-799 (1993).

Primary Examiner: Horlick; Kenneth R.

Attorney, Agent or Firm: Leydig, Voit & Mayer, Ltd.

TECHNICAL FIELD

The present invention relates generally to genetic analysis and, in particular, to genomics. More specifically, the invention relates to a method of amplification of a nucleic acid (or a nucleotide sequence of interest) which may be applied to sequencing in general and, in particular, to genome sequencing.

BACKGROUND ART

The development of methods for automated DNA sequence analysis, together with advances in bioinformatics, has revolutionised biology and medicine and ushered in the new field of genomics, the study of genes and genomes. These techniques have been used to decipher the entire genomes of a number of bacteria (5, 7, 9, 10, 20), archaea (3) and eukaryotes (6, 11).

The traditional approach to sequencing large genomes, including the human genome, uses a three-stage divide-and-conquer strategy (29). The first stage involves the construction of a number of clone libraries of the study organism's DNA by randomly cutting the DNA into fragments, separating these into differing size classes, and then inserting the fragments into appropriate vectors capable of propagation in a yeast or bacterial host.

The second stage involves (a) construction of a low-resolution physical map by identification of shared chromosomal landmarks on overlapping yeast or bacterial artificial chromosome (YAC or BAC) clones. The landmarks may be, for example, unique sites that can be amplified by polymerase chain reaction (PCR) (sequence-tagged sites or STSs) or restriction-enzyme digestion sites; and (b) the construction of high-resolution (sequence-ready) maps by randomly subcloning YAC or BAC inserts into cosmid vectors and identifying their landmark overlaps.

The third and final stage involves selecting a minimally overlapping set of cosmid clones, randomly fragmenting each into small pieces, and subcloning into M13 phage or plasmid vectors. For each cosmid, approximately 800 M13 phage clones are sequenced and assembled to construct the sequence of the 40-kilobase (Kbp) cosmid insert. This random (shotgun) approach is redundant, as every nucleotide is sequenced about eight times.
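The eight-fold redundancy quoted for the cosmid stage follows from simple arithmetic: average coverage is the total number of bases read divided by the length of the insert. The sketch below is illustrative only. The function names are hypothetical, and the per-clone read length is not stated in the text, so it is back-calculated from the quoted figures rather than taken from the source.

```python
def fold_coverage(num_reads: int, read_length_bp: float, target_bp: int) -> float:
    """Average number of times each base is read: (reads * read length) / target length."""
    return num_reads * read_length_bp / target_bp


def implied_read_length(num_reads: int, coverage: float, target_bp: int) -> float:
    """Read length (bp) at which num_reads reads give the stated coverage."""
    return coverage * target_bp / num_reads


# Figures quoted in the text: ~800 M13 clones, a 40 kbp cosmid insert,
# and roughly eight-fold redundancy. The read length is derived, not quoted.
read_len = implied_read_length(num_reads=800, coverage=8.0, target_bp=40_000)
print(read_len)                               # 400.0
print(fold_coverage(800, read_len, 40_000))   # 8.0
```

Under these assumptions each clone would contribute about 400 bp of sequence, which sits in the same range as the 400 to 500 bp spacing mentioned for primer walking.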

The complexity and cost of the "divide-and-conquer" approach have driven the development of new strategies. The Institute for Genomic Research (TIGR) has pioneered the direct shotgun sequencing of megabase-sized (Mbp) genomes. In this approach, the small fragments of chromosomal DNA are cloned directly into the M13 vector. Clones are randomly sequenced and the chromosome sequence constructed by direct assembly. This whole-genome random sequencing strategy has been applied to the sequencing of a number of bacterial and archaeal genomes, including the 1.9 Mbp genome of Haemophilus influenzae (9), the 0.58 Mbp genome of Mycoplasma genitalium (10), and the 1.66 Mbp genome of Methanococcus jannaschii (3). This approach eliminates the need for any prior physical mapping, significantly reducing the overall per base pair cost of producing a finished sequence. However, as with all random sequencing approaches, the inherent problem is the requirement for a high level of sequence redundancy. In other words, every nucleotide has to be sequenced numerous times until, by computer alignment, sequence contigs (clusters of aligned sequences) can be constructed. The initial shotgun assembly of the H. influenzae genome, for example, involved the generation of 11.6 Mbp of random sequence data (greater than 6-fold genome coverage), and yet still contained 140 contig gaps requiring labour-intensive closure (9).

An alternative to the inherent inefficiencies of random shotgun sequencing is primer walking (25). In this procedure, a primer designed from a known sequence is used to extend sequence information into the flanking unknown region. The new sequence information is used to design the next primer, and the process is continued until the entire sequence of the region of interest is determined. Although the primer walking strategy appears attractive for large-scale sequencing projects, the need for time-consuming and expensive synthesis of individual primers every 400 to 500 bp makes it impracticable. The use of a presynthesized library of short primers would avoid the requirement for the synthesis of each new primer. Unfortunately, libraries of even relatively short primers are enormous: for example, a complete octamer library contains 65,536 primers, while a complete decamer library contains over a million individual primers.
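The library sizes quoted above are instances of 4^k, since each of the k positions in a primer can hold one of four bases. The following is a minimal sketch of that count; the function names are hypothetical and chosen only for illustration.

```python
from itertools import product

BASES = "ACGT"


def library_size(k: int) -> int:
    """Number of distinct DNA oligonucleotides of length k (4 choices per position)."""
    return len(BASES) ** k


def enumerate_library(k: int):
    """Lazily yield every k-mer; only practical for small k."""
    for letters in product(BASES, repeat=k):
        yield "".join(letters)


for name, k in [("hexamer", 6), ("octamer", 8), ("nonamer", 9), ("decamer", 10)]:
    print(f"{name:8s} k={k:2d} -> {library_size(k):,} primers")
```

This gives 4,096 hexamers, 65,536 octamers, 262,144 nonamers and 1,048,576 decamers, matching the octamer and decamer figures in the text.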

Two basic solutions have been proposed to enable primer walking and yet avoid the synthesis of large primer libraries. The first involves reducing the size of the primer libraries by selecting an optimised subset of useful octamers, nonamers, or decamers (4, 12, 24, 26). The second, Sequential Primer Elongation by Ligation of 6-mers (SPEL-6), involves the assembly of large primers (18 bp or longer) by the annealing of at least three contiguous complementary hexamers (drawn from a presynthesized library of the full set of all 4096 hexamers or 1024 singly degenerated hexamers) to a single-stranded DNA template. The annealed hexamers are joined by ligation and a standard sequencing reaction performed (15-19, 27). A number of related techniques based on this approach have been developed, including the use of hexamers but omitting ligation (21, 22), or based on the ligation of self-complementary hexamer strings (8).
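The hexamer-selection step of the SPEL-6 idea can be sketched in a few lines: for a chosen priming site on the single-stranded template, the complementary primer is cut into contiguous 6-mers, each of which would be drawn from the presynthesized library. This is an illustration under stated assumptions, not the published protocol. The template sequence, the start position and the function names are all invented for the example, and the ligation and sequencing chemistry is not modelled.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def hexamers_for_site(template: str, start: int, n_hexamers: int = 3) -> list:
    """Split the primer complementary to template[start:start + 6*n] into contiguous hexamers.

    Each hexamer is written 5'->3'; joined in order they form the full primer.
    """
    length = 6 * n_hexamers
    site = template[start:start + length]
    if len(site) < length:
        raise ValueError("priming site runs off the end of the template")
    primer = reverse_complement(site)
    return [primer[i:i + 6] for i in range(0, length, 6)]


# Hypothetical single-stranded template, invented for illustration only.
template = "GATTACAGGCTTAACCGGATCCTTGACA"
hexamers = hexamers_for_site(template, start=4)
print(hexamers)
# The three hexamers, joined, are the 18-mer complementary to the site.
print("".join(hexamers) == reverse_complement(template[4:22]))
```

The 1024 figure for singly degenerated hexamers is consistent with 4^5, that is, five fixed positions plus one degenerate position, although the text does not spell out the construction.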

A large number of technical difficulties exist with both approaches, which have prevented their widespread use. Simulation studies of large sequencing projects have suggested that reduction of primer sets by more than 80% to 90% affects priming flexibility and general utility (1, 26). In the case of an octamer primer library, this results in library sets containing 6,000 to 12,000 primers, with a nonamer primer library requiring four times as many primers. While primer libraries of this size are technically possible, they would be both expensive to construct and unwieldy to use. A number of investigators have designed smaller octamer and nonamer primer sets containing 1000 to 3000 primers; however, these sets are limited in use to protein-coding sequences with little G-C variability (12, 14, 24). Of a more fundamental nature is the failure of many short oligonucleotides to successfully prime sequencing reactions; for example, in one report only approximately one half of 121 nonamer primers worked (2). This common problem appears linked to the formation of template secondary structures which prevent efficient binding of the primer to the correct site (18).
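The reduced-library figures can be checked against the full library sizes. The sketch below applies an 80% and a 90% reduction to the complete octamer and nonamer libraries. It is plain arithmetic with hypothetical function names, and it assumes the quoted 6,000 to 12,000 range is a rounded form of the result rather than an independently derived number.

```python
def reduced_library(k: int, reduction_percent: int) -> int:
    """Primers remaining after removing reduction_percent of a complete 4**k library."""
    full = 4 ** k
    return full * (100 - reduction_percent) // 100


for name, k in [("octamer", 8), ("nonamer", 9)]:
    smallest = reduced_library(k, 90)  # 90% of the library removed
    largest = reduced_library(k, 80)   # 80% of the library removed
    print(f"{name}: {smallest:,} to {largest:,} primers")
```

For octamers this gives roughly 6,500 to 13,100 primers, in line with the range stated in the text, and the nonamer values are four times larger because 4^9 is four times 4^8.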

The complexity of the SPEL-6 hexamer ligation strategy has limited its utility for large-scale sequencing projects. In addition to a complete hexamer primer library (containing 4096 primers), this technique requires: (1) enzymatic phosphorylation of the hexamer primers, (2) a single-stranded DNA template or chemical denaturation of double-stranded DNA, (3) a DNA ligation reaction in the presence of single-stranded binding protein, (4) a deproteination step before sequencing, and (5) the use of the Sequenase enzyme (18). In addition, sequencing failures are common, as the low annealing temperature required for hexamer primer annealing also promotes the formation of template secondary hairpin structures that prevent efficient primer annealing. Finally, both the reduced library and the SPEL-6 approaches are unable to use fluorescent-labelled primers, and are thus limited in the use of sequencing hardware and chemistries.
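The need for a low annealing temperature with hexamers can be illustrated with the Wallace rule, a common rule of thumb for very short oligonucleotides in which the melting temperature is estimated as 2 degrees C per A or T plus 4 degrees C per G or C. This rule is not mentioned in the source and is only a rough approximation; the function name and example sequences below are hypothetical.

```python
def wallace_tm(oligo: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    if at + gc != len(oligo):
        raise ValueError("oligo must contain only A, C, G and T")
    return 2 * at + 4 * gc


# Illustrative hexamers spanning the possible range of base composition.
for hexamer in ("AAAAAA", "GATTAC", "GCGCGC"):
    print(hexamer, wallace_tm(hexamer))
```

By this estimate every hexamer falls between 12 and 24 degrees C, well below the temperatures at which template secondary structure is disrupted, which is consistent with the annealing problem described above.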

It is an object of the present invention to overcome or ameliorate at least one of the disadvantages of the prior art, or to provide a useful alternative.

Method of amplification of nucleic acids | USPTO 6,737,253
