Abstract
Thyroglobulin, the dimeric glycoprotein (19 S, 2 × 330 kDa) specific to the thyroid gland, is the support for thyroid hormone synthesis. Elucidating the mechanism of thyroid hormone synthesis requires knowledge of the primary sequence of the protein. The sequence of the first 2190 coding nucleotides from the 5'-end of the human mRNA is presented. It was obtained by sequencing 2 previously described overlapping clones and by constructing and sequencing a single-stranded cDNA corresponding to the 5'-end of the mRNA. The nucleotide sequence represents a quarter of the human thyroglobulin mRNA, from which a polypeptide sequence of 730 amino acids at the NH2-terminal end of the monomer has been deduced. This sequence shows a repetition of 5 highly conserved motifs, each of approximately 50 amino acids, the analysis of which allowed us to establish a consensus sequence. The deduced sequence also contains the hormonogenic tyrosine residue recently described in the mature protein, located 4 amino acids after the NH2-terminal Asn; a prepeptide signal for thyroglobulin secretion comprising 19 amino acids preceding the Asn residue that is the NH2-terminal residue of the mature protein; and 6 signal tripeptides (Asn-Xaa-Thr or Ser) for N-glycosylation of the chain.
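The two computational steps implied above, converting a coding length to a residue count and locating Asn-Xaa-Thr/Ser tripeptides, can be illustrated with a minimal sketch. The function names and the toy peptide are hypothetical and are not taken from the thyroglobulin sequence; the pattern follows the tripeptide exactly as stated here and does not apply the common refinement that excludes Pro at the Xaa position.

```python
import re


def coding_length_to_residues(n_nucleotides: int) -> int:
    """Number of complete codons (3 nucleotides each) in a coding stretch."""
    return n_nucleotides // 3


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn in Asn-Xaa-(Ser/Thr) tripeptides.

    The protein is given in one-letter amino acid code (N = Asn,
    S = Ser, T = Thr). A lookahead is used so that overlapping
    tripeptides are all reported.
    """
    return [m.start() + 1 for m in re.finditer(r"N(?=.[ST])", protein.upper())]


# 2190 coding nucleotides correspond to 730 amino acids.
print(coding_length_to_residues(2190))

# Hypothetical toy peptide (not thyroglobulin): Asn at positions 2 and 7
# is followed by Xaa-Thr and Xaa-Ser; Asn at position 11 is not.
print(find_sequons("MNGTAKNASLNQAQ"))
```

Applied to the deduced 730-residue sequence, a scan of this kind would be expected to report the 6 candidate N-glycosylation sites mentioned above; a matching tripeptide indicates only a potential site, not confirmed glycosylation.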