Genomes and Gene Mapping (基因组与基因作图), Lecture 2
Xiang Guangsheng (向光盛), Department of Biochemistry, Yangtze University School of Medicine (长江大学医学院), 2011-10-25

Objectives
- Review concepts
- Position of genes
- Organization of the human genome

Human genome = nuclear genome + mitochondrial genome
- Human nuclear genome: 24 distinct chromosomes (22 autosomes, X and Y), 3200 Mbp, about 30,000 genes
- Mitochondrial genome: 16,569 bp, 37 genes

The concept of genomics
- A genome is the complete genetic material of a living organism, virus, or organelle; in eukaryotes, the genome is the DNA of one (haploid) set of chromosomes.
- Genomics develops and applies new DNA mapping and sequencing technologies, together with computer programs, to analyze the structure and function of the entire genome of living organisms (including humans).
- Genomics comprises three subfields: structural genomics, functional genomics, and comparative genomics.

Structural genomics
Structural genomics has been carried out through the Human Genome Project (HGP). The HGP set out to build high-resolution genetic and physical maps of the human genome and, ultimately, to determine the complete genomic DNA sequence of humans and other important model organisms. The HGP therefore belongs to structural genomics.

Main tasks and content of the HGP
The HGP comprises the following research areas:
1. Physical mapping
2. Genetic mapping
3. Genomic DNA sequencing
4. Building computer systems for analysis and data management

The extensive genomic information obtained through the HGP forms the core of structural genomics and is the foundation for functional genomics research. It also provides a "platform" for the detailed study of every single-gene disorder and will be the starting point for research on complex polygenic diseases.

Functional genomics
Once the whole genome of an organism has been sequenced, research enters the post-sequencing stage: analyzing the sequence in detail and describing the function of every gene in the genome, including gene expression and its regulatory patterns. This is functional genomics.

Research strategies and main content of functional genomics
1. Identifying genes in the DNA sequence
2. Inferring gene function by homology searching
3. Determining gene function experimentally
4. Describing gene expression patterns

2. Recombination

1st mechanism for genetic diversity: independent assortment of chromosomes
Mendel's laws imply independent assortment. That is, genes on the same chromosome are inherited together; genes on different chromosomes are inherited independently. With 23 human chromosome pairs, there are 2^23 ≈ 8.4 × 10^6 possible distinct gametes.
(Figure label: somatic cell.)

2. No DNA replication between the first and second meiotic divisions

If two genes recombine x% of the time, they are said to be separated by a genetic map distance of
x centimorgans (cM). This is the quantitative measure of recombination.
Phenotype: from the Greek "to show".

The genetic map and the physical map are colinear, but not quite proportional.

Human Mitochondrial Genome
- Small (16.6 kb) circular DNA
- rRNA, tRNA and protein-encoding genes (37 in total)
- 1 gene per 0.45 kb
- Very few repeats
- No introns
- 93% coding
- Genes are transcribed as multimeric transcripts
- Recombination not evident
- Maternal inheritance

Strand composition
- The H strand is enriched in G; the L strand is enriched in C
- 7S DNA: a short repetitive segment of the H strand attached to the L strand (abortive replication); an element of the triple-stranded DNA structure

What are the mitochondrial genes?
- 24 of 37 genes are RNA-coding: 22 mt tRNAs and 2 mt ribosomal RNAs (12S and 16S)
- 13 of 37 genes are protein-coding (synthesized on ribosomes inside mitochondria): some subunits of the respiratory complexes and oxidative phosphorylation enzymes

Limited autonomy of mitochondria
- NADH dehydrogenase: 7 subunits mt-encoded, 41 nuclear-encoded
- Succinate-CoQ reductase: 0 subunits mt-encoded, 4 nuclear-encoded
- Cytochrome b-c1 complex: 1 subunit mt-encoded, 10 nuclear-encoded
- Cytochrome c
oxidase: 3 subunits mt-encoded, 10 nuclear-encoded
- ATP synthase complex: 2 subunits mt-encoded, 14 nuclear-encoded
- tRNA components: 22 tRNAs mt-encoded, none nuclear-encoded
- rRNA components: 2 components mt-encoded, none nuclear-encoded
- Ribosomal proteins: none mt-encoded, 80 nuclear-encoded
- Other mt proteins: none mt-encoded; mtDNA polymerase, RNA polymerase, etc. are nuclear-encoded

Two overlapping genes encoded by the same strand of mtDNA (unique example)
- Two independent AUG start codons lie in different reading frames relative to each other
- The second stop codon is formed from TA + A, with the final A supplied by the poly(A) tail

Mitochondrial codon table
- 22 tRNAs cover 60 codon positions via third-base wobble

Human Nuclear Genome
- 3200 Mb
- 23 (XX) or 24 (XY) distinct linear chromosomes
- 30,000-35,000 genes
- 1 gene per 100 kb
- Introns in most genes
- 1.5% of DNA is coding
- Genes are transcribed individually
- Repetitive DNA sequences (45%)
- Recombination at least once for each chromosome
- Mendelian inheritance (X and autosomes); the Y is inherited paternally

Human Genome Organization
[Diagram, from Dr Finbarr Hayes' lecture]
- Nuclear genome: 3000 Mb, 65,000-80,000 genes
  - Genes and gene-related sequences (30%), unique or moderately repetitive
    - Coding DNA (10%)
    - Noncoding DNA (90%): pseudogenes, gene fragments, introns, untranslated sequences, etc.
  - Extragenic DNA (70%)
    - Unique or low copy number (80%)
    - Moderate to highly repetitive (20%): tandemly repeated or clustered repeats; interspersed repeats
- Mitochondrial genome: 16.6 kb, 37 genes
  - Two rRNA genes, 22 tRNA genes, 13 polypeptide-encoding genes

Human nuclear genome
- Euchromatic portion: 3000 Mb
- Constitutive heterochromatin: 200 Mb
- Heterochromatin is distributed unevenly among chromosomes

Gene-poor chromosome regions (with extra heterochromatin)
- Short arms of the acrocentric chromosomes 13, 14, 15, 21, 22
- Part of the long arms of chromosomes 1, 9, 16
- Long arm of chromosome Y

Human genome base content
- 41% GC on average
- 38% GC for chromosomes 4 and 13
- 49% GC for chromosome 19
- Some regions show wide swings in GC content (e.g. from 33.1% to 59.3%)
- GC content is correlated with Giemsa staining, and with genes: gene density is higher where GC content is higher
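The GC percentages above can be turned into a quick calculation. The following is a minimal Python sketch, not part of the original lecture: the function names and the toy sequence are hypothetical, and the expected-frequency formula assumes that C and G are equally frequent and that neighbouring bases are independent. Under those assumptions, a 41% GC genome would be expected to show any one C-plus-G dinucleotide, such as CpG, at a frequency of about 0.042.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    return gc / acgt if acgt else 0.0


def expected_cpg_frequency(gc):
    """Expected CpG dinucleotide frequency for a given GC fraction.

    Assumption: freq(C) == freq(G) == gc / 2 and adjacent bases are independent.
    """
    return (gc / 2) ** 2


def cpg_observed_over_expected(seq):
    """Observed/expected CpG ratio: count(CG) * length / (count(C) * count(G))."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


if __name__ == "__main__":
    # 41% GC is the genome-wide average quoted on the slide.
    print(round(expected_cpg_frequency(0.41), 3))  # 0.042
    # Toy sequence, for illustration only.
    print(gc_fraction("ATGCGC"))
```

A ratio well below 1 from `cpg_observed_over_expected` indicates CpG depletion in the sequence examined, while values near or above 1 are what one would look for in a candidate CpG island.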
CpG dinucleotide: conspicuous depletion
- Expected frequency is 0.042 (4.2%)
- Observed frequency is five times lower
- Cause: methylation-dependent mutation. Methylated cytosine in a CpG tends to deaminate to thymine, so CpG dinucleotides are gradually lost
- CpG islands are found in the regulatory regions of human genes

Location of CpG islands in the gene
- CpG islands do NOT have a deficit of CpG dinucleotides

Repeats

3 main components in eukaryotic genomes
- DNA purified from a human does not self-anneal as a simple sigmoidal curve. Instead we see a curve that is the sum of the reannealing of many different components.
- C0 = the initial concentration of nucleotides; t = time in seconds
- The C0t curve is a measure of sequence complexity
- (Figure labels: repeats; no repeats.)

Human C0t DNA (commercial preparation)
This is human DNA that has been denatured and allowed to reanneal to a C0t value of 1. The double-stranded component is then purified from the single-stranded component and supplied commercially. It contains most of the human repetitive DNA but very little single-copy DNA (unique genes). It is used to suppress background hybridization of complex probes.

Satellite DNA is repetitive DNA that can be separated by buoyant density
- Equilibrium density gradient centrifugation of sheared DNA in a cesium chloride gradient

Satellite DNA
- Alpha satellite (centromere DNA)
- Microsatellites
- Minisatellites
Do you still remember what these are? If not, please refer to the previous lectures and to the book.

Repetitive DNA
- Moderately repeated DNA
  - Tandemly repeated rRNA, tRNA and histone genes (gene products needed in high amounts)
  - Large duplicated gene families
  - Mobile DNA (transposons), to be discussed soon
- Simple-sequence DNA
  - Tandemly repeated short sequences
  - Found in centromeres and telomeres (and elsewhere) = minisatellites and microsatellites

Human mobile DNA (transposons)
- Moves within the genome
- LINEs (long interspersed
nuclear elements): L1, L2, L3; LINEs are 21% of human DNA (1,00,000 copies)
- SINEs (short interspersed nuclear elements): Alu is 10.7% of human DNA (1,200,000 copies); MIR and MIR3 are 3% of human DNA (500,000 copies)
- LTR elements (long terminal repeats): ERV and MaLR are 8% of human DNA (500,000 copies)
- DNA transposons: MER1 (Charlie), MER2 (Tigger) and others are 2.8% of human DNA (350,000 copies)
- Total: approximately 45% of human DNA

LINEs and ERVs

Long interspersed nuclear elements (LINEs): 20% of genome
- LINE1 is active (there are also many truncated, inactive copies); LINE2 and LINE3 are inactive
- ORF products: RNA binding, and also endonuclease
- LINEs prefer AT-rich euchromatic bands
- Internal promoter
- In everyone's genome 60-100 copies of LINE1 are still capable of transposing, and may occasionally cause disease by gene disruption

Mechanism of LINE repeat jumps
- A full-length LINE transcript is generated from the promoter in the 5' UTR
- ORF1 and ORF2 are translated into proteins that stay bound to the LINE mRNA
- The ORF1/ORF2/mRNA complex moves back into the nucleus
- The product of ORF2 cuts double-stranded DNA
- The freed 3' end serves as a primer for reverse transcription of the LINE, starting from its 3' UTR
- (Figure labels: orf1, orf2, 5' and 3' ends.)

ORF2 and ORF1 function
- ORF1 keeps ORF2 and the LINE mRNA bound together and carries them back into the nucleus
- ORF2 (endonuclease) cuts dsDNA to provide a free 3' end that primes synthesis from the LINE 3' UTR
- ORF2 (reverse transcriptase) makes a cDNA copy of the LINE mRNA, which becomes integrated into chromosomal DNA (because it is attached to it through the freed 3' end)
- TTTT/A is the ORF2 endonuclease cleavage site, which is why integration prefers AT-rich regions

LINE replication is not a very efficient process
- The reverse transcriptase of LINE elements is a "weak" enzyme (it has low processivity)
- Many insertions are truncated (such copies cannot copy themselves further)
- Most insertions are only 900 bp (instead of 6.1 kb); only 1 in 100 insertions is successful

Illustration of full-size LINEs and their fossil derivatives

Short
interspersed nuclear elements (SINEs): 13% of genome
- Non-autonomous (no reverse transcriptase)
- 100-400 bp long; no open reading frames
- Derived from tRNA (transcribed by RNA pol III, retaining an internal promoter)
- Share sequences with the 3' ends of LINEs
- Depend on the LINE machinery for their movement

Alu elements
- Derived from the signal recognition particle RNA, 7SL
- Do not share their 3' end with a LINE
- The internal promoter is active but requires an appropriate flanking sequence for activation, so an Alu is active only if it is lucky with its integration site
- Integrate in GC-rich sequences
- The only active SINE in the human genome

As Alu repeats do not have open reading frames, they have to use the reverse transcriptase and endonuclease provided by LINE repeats or other transposons. (Mark A. Batzer and Prescott L. Deininger)

After integration, Alu copies rapidly mutate at the sites of their 24 CpGs
- Figure: alignment of Alu-subfamily consensus sequences. (Mark A. Batzer and Prescott L. Deininger)

The expansion of Alu elements in the primate lineage (Mark A. Batzer and Prescott L. Deininger)

Potential Alu-mediated damage to the human genome
- Insertional mutagenesis
- Alu-mediated unequal recombination

Diseases sometimes caused by de novo Alu integration
- Neurofibromatosis (Schwann cell tumors)
- Haemophilia
- Breast cancer
- Apert syndrome (distortions of the head and face and webbing of the hands and feet)
- Cholinesterase deficiency (congenital myasthenic syndrome)
- Complement deficiency (hereditary angioedema)

Diseases sometimes caused by Alu-mediated unequal recombination
- Insulin-resistant diabetes type II (insulin receptor)
- Lesch-Nyhan syndrome (overproduction of uric acid leading to a neurologic syndrome)
- Tay-Sachs disease
- Complement component C3 deficiency
- Familial hypercholesterolaemia
- Thalassaemia
- Several types of cancer, including Ewing sarcoma, breast cancer, and acute myelogenous leukaemia

Positive role of
Alu repeats in evolution
- Insertion of a repeat near a gene may change its expression pattern or gene structure, or lead to alternatively spliced mRNA isoforms
- LTRs contain promoters; Alu repeats contain transcription factor binding sites

Human repeat distribution depends on the GC content of integration sites

Alu paradox
- Alu repeats are found in GC-rich (gene-rich) regions more often than in AT-rich ones
- Yet de novo integration of Alu repeats happens in AT-rich areas (because they hijack the ORF2 product of LINEs)
- Alus are subject to positive selection (as they create new genes): they supply genome segments ready to become genes, with promoter-like elements and exon-like boundaries
- They are also GC-rich themselves, so they transform AT-rich regions into GC-rich ones

LTR transposons
- Any transposon flanked by long terminal repeats; DNA-based transposons and retrotransposons
- DNA transposons: contain transposase; already silent in the human genome, present as fossils (Charlie and Tigger types)
- Endogenous retroviral sequences (ERVs): contain gag and pol genes; only HERV-K still appears capable of moving

DNA transposons and retrotransposons
- Figure (LINE, SINE): Kazazian, Science, Vol 303, Issue 5664

Human RNA genes (non-coding RNA transcripts)
- Roughly 3000 RNA genes in the human genome: rRNA, tRNA, small nuclear RNA, small nucleolar RNA, SRP RNA, microRNA, antisense RNA, non-coding mRNA isoforms of genes, and RNAs from transcribed pseudogenes
- Lecturer's note: this figure is not true; in my opinion the number is close to 100,000
- miRNA and antisense RNA are underestimated; "other non-coding RNA" are not represented

rRNA genes (1200 genes)
- 18S, 5.8S and 28S rRNAs are encoded by single transcription units, located in 5 clusters on chromosomes 13, 14, 15, 21, 22
- 5S rRNA genes are in tandem arrays; the largest is on chromosome 1q41-42
- All of this serves to increase gene dosage

tRNA genes (497 nuclear genes + 324 putative pseudogenes)
- Humans have fewer tRNA genes than the worm (584), but more than the fly (284)
- The frog X. laevis
has thousands of tRNA genes
- The number of tRNA genes correlates with the size of the oocytes: in large oocytes a lot of protein needs to be synthesized simultaneously

tRNA genes (497 nuclear genes + 324 putative pseudogenes), continued
- 49 families according to codon recognition (there should be 61, one for every coding triplet); the paradox is resolved by codon wobble
- Very rough correlation between tRNA gene number and amino acid frequency in proteins
- 280 out of 497 genes are on chromosome 6, most of them clustered in the same 4 Mb region; the others are also more or less clustered (chromosomes 1 and 7)
- All chromosomes still carry at least one tRNA gene; chromosomes 22 and Y are the exceptions

Representation of amino acids by human tRNA (examples)

Small nuclear RNA (snRNA)
- Uridine-rich; numbered U1, U2, U3, etc.
- Include the spliceosomal RNAs U6 and U1: U6 (44 genes) and U1 (16 genes)
- Sometimes clustered, as very irregular or almost perfect groups, e.g. the RNU1 locus at 1p36 and RNU2 at 17q21
- For U6 snRNA 1135 fragmental/pseudogenic s