High-Performance Processors (PPT course slides)


USTC CS, AN Hong, 2023/5/21

Slide 1. Structural hazards: hazards caused by memory access
  • Instruction fetch and data access both go to the same memory.
  • Detection is easy in this case. (In the pipeline diagram, a highlighted right half means read and a highlighted left half means write.)

Slide 2. Solution to structural hazards: stall
  • The instruction fetch is delayed by one cycle.

Slide 3. Control hazards: what's the problem?
  • [Pipeline diagram labels: "Need address here", "Compute address here", "Branch Delay".]
  • Example: BEQ rs, rt, offset    if R[rs] = R[rt] then PC <- PC + offset
  • Branch handling splits into two subproblems:
      • deciding the branch direction (branch condition dependence);
      • for taken branches, minimizing the execution delay by obtaining the target address as early as possible (branch address dependence).

Slide 4. Control Hazard Solution #1: Stall
  • Stall: wait until the decision is clear.
  • Impact: 2 lost cycles (i.e. 3 clock cycles per branch instruction), which is slow.
  • Moving the decision to the end of decode saves 1 cycle per branch.

Slide 5. Control Hazard Solution #2: Predict
  • Predict: guess one direction, then back up if wrong.
  • Impact: 0 lost cycles per branch instruction if right, 1 if wrong (right 50% of the time).
  • More dynamic scheme: history of 1 branch (90%).

Slide 6. Control Hazard Solution #3: Delayed Branch
  • Delayed branch: redefine branch behavior so that the branch takes place after the next instruction.
  • Impact: 0 clock cycles per branch instruction if an instruction can be found to put in the "slot" (50% of the time).
  • As more instructions are launched per clock cycle, this becomes less useful.

Slide 7. Data Hazard on R1
  • Read After Write (RAW): InstrJ tries to read an operand before InstrI writes it.
  • Caused by a "dependence" (in compiler nomenclature). This hazard results from an actual need for communication.

Slide 8. Data Hazard on r1: Read after write hazard (RAW)
    add r1,r2,r3
    sub r4,r1,r3
    and r6,r1,r7
    or  r8,r1,r9
    xor r10,r1,r11

Slide 9. Data Hazard on r1: Read after write hazard (RAW)
  • [Pipeline diagram: the five instructions of slide 8 in instruction order against time (clock cycles); stages IF, ID/RF, EX, MEM, WB.]
  • Dependencies backwards in time are hazards.

Slide 10. Data Hazard Solution: Forwarding
  • [Pipeline diagram: the same five instructions; stages IF, ID/RF, EX, MEM, WB.]
  • "Forward" the result from one stage to another.

Slide 11. Forwarding (or Bypassing): What about Loads?
    lw  r1,0(r2)
    sub r4,r1,r3
  • [Pipeline diagram: the two instructions against time (clock cycles); stages IF, ID/RF, EX, MEM, WB.]
  • Dependencies backwards in time are hazards.
  • There is a data hazard even with forwarding. It cannot be solved with forwarding; the instruction dependent on the load must be delayed (stalled).

Slide 12. Forwarding (or Bypassing): What about Loads?
  • [Pipeline diagram: the same lw/sub sequence, now with a stall.]

Slide 13. Software Scheduling to Avoid Load Hazards
  • Try producing fast code for
        a = b + c;
        d = e - f;
    assuming a, b, c, d, e, and f are in memory.
  • Slow code:
        LW  Rb,b
        LW  Rc,c
        ADD Ra,Rb,Rc
        SW  a,Ra
        LW  Re,e
        LW  Rf,f
        SUB Rd,Re,Rf
        SW  d,Rd
  • Fast code:
        LW  Rb,b
        LW  Rc,c
        LW  Re,e
        ADD Ra,Rb,Rc
        LW  Rf,f
        SW  a,Ra
        SUB Rd,Re,Rf
        SW  d,Rd
  • The compiler optimizes for performance. The hardware checks for safety.

Slide 14. Data Hazard Solution (3): Out-of-Order Execution
  • Data dependences must be detected at run time.
  • Precise exceptions are needed: out-of-order execution, in-order completion.
  • Time: T1 T2 T3 T4 T5 T6 T7 T8 T9 T10 T11 T12
        sub $2,$1,$3      IF ID EX ME WB
        add $14,$5,$4     IF ID EX ME WB
        sw  $15,100($6)   IF ID EX ME WB
        and $12,$2,$3     IF * ID EX ME WB
        or  $13,$6,$2     IF ID EX ME WB

Slide 15. Data Hazard Solution (4): Data Speculation
  • In a wide-issue processor, e.g. 8 to 12 instructions per clock cycle:
      • the issue width is larger than a basic block (5 to 7 instructions);
      • multiple branches: use multiple-branch prediction (e.g. a trace cache);
      • multiple data dependence chains: very hard to execute them in the same clock cycle.
  • Value speculation is primarily used to resolve data dependences:
      • in the same clock cycle;
      • on long-latency operations (e.g. load operations).

Slide 16. Data Hazard Solution (4): Data Speculation
  • Why is speculation useful? Speculation lets all these instructions run in parallel on a superscalar machine:
        addq $3 $1 $2
        addq $4 $3 $1
        addq $5 $3 $2
  • What is value prediction? Predicting the values of instructions before they are executed.
  • Compare:
      • Branch prediction eliminates control dependences; the predicted data are just two values (taken or not taken).
      • Value prediction eliminates data dependences; the predicted data are taken from a much larger range of values.

Slide 17. Data Hazard Solution (4): Data Speculation
  • Value locality: the likelihood of a previously seen value recurring repeatedly within a storage location.
  • Observed in any storage location: registers, cache memory, main memory.
  • Most work focuses on values stored in registers, to break potential data dependences: register value locality.
  • Why value prediction?
      • Results of many instructions can be accurately predicted before they are issued or executed.
      • Dependent instructions are no longer bound by the serialization constraints imposed by data dependences.
      • More parallelism can be exploited.
      • Predicting values for dependent instructions can lead to beneficial speculative execution.

Slide 18. Redundant instructions
  • If the dynamic instances of every static instruction generated during program execution are cached, each result-producing dynamic instruction falls into one of three types:
      • New-result instructions: dynamic instructions that produce a value for the first time (5%).
      • Repeated-result instructions: dynamic instructions whose result equals that of another dynamic instance of the same static instruction (80% to 90%).
      • Derivable instructions: dynamic instructions whose result can be derived from earlier results (5%).
  • Redundant instructions: repeated-result instructions and derivable instructions.

Slide 19. Source of Value Locality (Sources of value predictability)
  • Question: where does value locality occur? How often does the same value result from the same instruction twice in a row?

        Instruction type                           Value locality?
        Single-cycle arithmetic (e.g. addq $1 $2)  Somewhat
        Single-cycle logical (e.g. bis $1 $2)      Yes
        Multi-cycle arithmetic (e.g. mulq $1 $2)   No
        Register move (e.g. cmov $1 $2)            Yes
        Integer load (e.g. ldq $1 8($2))           Yes
        Store with base register update            No
        FP multiply                                Somewhat
        FP add                                     Somewhat
        FP move                                    Yes
        FP load                                    Yes

Slide 20. Source of Value Locality (Sources of predictability)
  • Data redundancy: text files with white space, empty cells in spreadsheets
  • Error checking
  • Program constants
  • Computed branches
  • Virtual function calls
  • Glue code: allows calling from one compilation unit to another
  • Addressability: pointer tables store constant addresses loaded at run time
  • Call contexts: caller-saved and callee-saved registers
  • Memory alias resolution: conservative assumptions by the compiler regarding aliasing
  • Register spill code

Slide 21. Three Generic Data Hazards
  • Write After Read (WAR): InstrJ writes an operand before InstrI reads it.
  • Called an "anti-dependence" by compiler writers. It results from reuse of the name "r1".
  • Cannot happen in the DLX 5-stage pipeline because:
      • all instructions take 5 stages,
      • reads are always in stage 2, and
      • writes are always in stage 5.

Slide 22. Three Generic Data Hazards
  • Write After Write (WAW): InstrJ writes an operand before InstrI writes it.
  • Called an "output dependence" by compiler writers. It also results from reuse of the name "r1".
  • Cannot happen in the DLX 5-stage pipeline because:
      • all instructions take 5 stages, and
      • writes are always in stage 5.
  • WAR and WAW hazards will appear in more complicated pipelines.

Slide 23. Summary: factors that affect instruction-level parallelism
  • Pipeline CPI = Ideal pipeline CPI + Structural stalls + RAW stalls + WAR stalls + WAW stalls + Control stalls
  • Improving the ideal CPI: multiple issue (static or dynamic).
  • Overcoming hazards in the pipeline:
      • Structural hazards: caused by resource conflicts. Solution: add resources.
      • Data hazards: caused by RAW, WAW, and WAR dependences.
          • Software solutions: static scheduling by the compiler, loop unrolling, register renaming, software pipelining.
          • Hardware solutions: forwarding, register renaming, dynamically scheduled out-of-order execution (scoreboarding, Tomasulo's algorithm).
      • Control hazards: caused by branches. Solutions: static or dynamic prediction and speculative execution.

Slide 24. Summary: data hazards (also called data dependences)
  • Within a basic block of a program, data dependences take the following forms.
  • True data dependence: data flows between the two instructions, so they are genuinely dependent.
      • RAW (Read After Write): for instructions i and j, instruction j is RAW-dependent on instruction i if (1) j uses a result produced by i; the relation is also transitive: (2) if j is RAW-dependent on i and k is RAW-dependent on j, then instruction k is RAW-dependent on instruction i.
  • False data dependence (also called name dependence): the registers or memory locations used by instructions are called names. If two instructions use the same name but no data flows between them, the dependence between them is false. There are two cases:
      • WAR (Write After Read): for instructions i and j, if i executes first and the name written by j is a name read by i, then j is WAR-dependent on i (also called an anti-dependence).
      • WAW (Write After Write): for instructions i and j, if i and j write the same name, then j is WAW-dependent on i (also called an output dependence).

Slide 25. Summary: techniques for exploiting instruction-level parallelism
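The branch costs quoted for the Stall and Predict solutions can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `cycles_per_branch` and its parameters are assumptions, and the model takes just the figures stated in the slides (1 base cycle per branch, 2 lost cycles when stalling, 1 lost cycle per wrong prediction).

```python
def cycles_per_branch(accuracy: float, mispredict_penalty: int = 1) -> float:
    """Average cycles per branch: 1 base cycle plus the expected lost cycles.

    accuracy           -- fraction of branches predicted correctly (0.0 to 1.0)
    mispredict_penalty -- lost cycles per wrong guess (1 in the Predict slide)
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return 1.0 + (1.0 - accuracy) * mispredict_penalty


# Stalling on every branch: 1 base cycle + 2 lost cycles.
STALL_CYCLES = 1 + 2
# Deciding at the end of decode saves one of the two lost cycles.
EARLY_DECISION_CYCLES = 1 + 1

if __name__ == "__main__":
    print("stall:", STALL_CYCLES)
    print("early decision:", EARLY_DECISION_CYCLES)
    print("predict, 50% right:", cycles_per_branch(0.5))
    print("predict, 90% right:", round(cycles_per_branch(0.9), 2))
```

Under these assumptions, a 50% static guess averages 1.5 cycles per branch and a 90% accurate one-branch history averages about 1.1, against 3 cycles when stalling.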
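The RAW, WAR, and WAW definitions above reduce to three set-membership checks on the names an instruction reads and writes. The following is a minimal sketch, not the course's own code: the `Instr` record and the `classify` function are hypothetical names, and each instruction is assumed to write at most one name.

```python
from typing import FrozenSet, NamedTuple, Optional, Set


class Instr(NamedTuple):
    """Hypothetical instruction record: one written name, a set of read names."""
    text: str
    dest: Optional[str]
    srcs: FrozenSet[str]


def classify(i: Instr, j: Instr) -> Set[str]:
    """Dependences of the later instruction j on the earlier instruction i."""
    found: Set[str] = set()
    if i.dest is not None and i.dest in j.srcs:
        found.add("RAW")  # j reads what i writes: true dependence
    if j.dest is not None and j.dest in i.srcs:
        found.add("WAR")  # j writes what i reads: anti-dependence
    if i.dest is not None and i.dest == j.dest:
        found.add("WAW")  # both write the same name: output dependence
    return found


add = Instr("add r1,r2,r3", "r1", frozenset({"r2", "r3"}))
sub = Instr("sub r4,r1,r3", "r4", frozenset({"r1", "r3"}))

if __name__ == "__main__":
    print(classify(add, sub))  # sub reads r1 after add writes it
    print(classify(sub, add))  # reversed order: add writes r1 after sub reads it
```

With the `add`/`sub` pair from the slides, `classify(add, sub)` reports only RAW; swapping the order turns the same register reuse into a WAR name dependence.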
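The difference between the slow and fast code in the software-scheduling slide can be counted directly. The sketch below assumes a 5-stage pipeline with forwarding in which only one case stalls: an instruction that reads a register loaded by the immediately preceding LW costs one stall cycle. The tuple encoding and the name `load_use_stalls` are illustrative assumptions.

```python
from typing import List, Optional, Tuple

# (opcode, written register or None, registers read)
Op = Tuple[str, Optional[str], Tuple[str, ...]]


def load_use_stalls(program: List[Op]) -> int:
    """Count one-cycle stalls caused by using a load result in the next instruction."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        if prev_op == "LW" and prev_dest in cur[2]:
            stalls += 1
    return stalls


slow: List[Op] = [
    ("LW", "Rb", ()), ("LW", "Rc", ()), ("ADD", "Ra", ("Rb", "Rc")),
    ("SW", None, ("Ra",)), ("LW", "Re", ()), ("LW", "Rf", ()),
    ("SUB", "Rd", ("Re", "Rf")), ("SW", None, ("Rd",)),
]
fast: List[Op] = [
    ("LW", "Rb", ()), ("LW", "Rc", ()), ("LW", "Re", ()),
    ("ADD", "Ra", ("Rb", "Rc")), ("LW", "Rf", ()), ("SW", None, ("Ra",)),
    ("SUB", "Rd", ("Re", "Rf")), ("SW", None, ("Rd",)),
]

if __name__ == "__main__":
    print("slow stalls:", load_use_stalls(slow))
    print("fast stalls:", load_use_stalls(fast))
```

Under this model the slow ordering stalls twice (ADD after `LW Rc`, SUB after `LW Rf`), while the reordered code has no load immediately followed by its consumer.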
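The question "how often does the same value result from the same instruction twice in a row?" can be measured with a last-value table indexed by static instruction. The sketch below is one simple way to do that, not necessarily the predictor the slides have in mind; the function name and the `(pc, value)` trace format are assumptions, and the trace itself is made-up illustration data.

```python
from typing import Dict, Iterable, Tuple


def last_value_hit_rate(trace: Iterable[Tuple[int, int]]) -> float:
    """Fraction of dynamic instructions whose result equals the previous
    result of the same static instruction (identified by its pc)."""
    last: Dict[int, int] = {}
    predictions = 0
    hits = 0
    for pc, value in trace:
        if pc in last:              # a prediction exists only after the first instance
            predictions += 1
            if last[pc] == value:
                hits += 1
        last[pc] = value            # remember the most recent result
    return hits / predictions if predictions else 0.0


# Hypothetical trace: pc 0x10 keeps producing 5 (e.g. a reloaded constant),
# pc 0x14 is a counter whose value changes every time.
trace = [(0x10, 5), (0x14, 1), (0x10, 5), (0x14, 2), (0x10, 5), (0x14, 3)]

if __name__ == "__main__":
    print(last_value_hit_rate(trace))
```

In this toy trace the repeated-result instruction is always predicted correctly and the counter never is, which mirrors the contrast between high-locality and low-locality instruction types in the table.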
