《编译原理概述》PPT课件.ppt
编译原理,邹阳计算机与信息学院,2023/7/29,2,个人联系方式 电子邮件:办公室:电学楼4509,2023/7/29,3,第一章 概论,课程要求什么是编译程序为什么学习编译原理与技术编译过程概述编译程序的结构编译程序的生成,2023/7/29,4,0.课程要求,源语言目标语言编译技术,2023/7/29,5,课程的性质、地位和任务,本课程是计算机专业的一门重要的专业基础课,既是一门理论性、实验性、技术性很强的课程,又是理论与实践紧密结合的课程。本课程的主要任务是介绍程序设计语言编译程序构造的基本原理和设计方法。通过本课程的学习,使学生掌握和理解编译的基本过程,各个编译阶段的功能与常用的一些设计方法和技巧。,2023/7/29,6,课程定位计算机核心课程,操作系统,2023/7/29,7,主要内容,概述形式语言与自动机词法分析语法分析语法制导翻译及中间代码符号表代码优化代码生成存储管理面向对象语言的编译,2023/7/29,8,教材,张素琴,吕映芝等,编译原理(第二版),清华大学出版社。,2023/7/29,9,参考书目,陈火旺,程序设计语言编译原理,国防工业出版社钱焕延,编译技术,第二版,东南大学出版社K.C.Louden,编译原理及实践,冯博琴译,机械工业出版社,2023/7/29,10,成绩评定,平时作业与课堂表现:20%期末考试:80%课程实践:1学分,单独算一门课,2023/7/29,11,1.什么是编译程序,机器语言machine language汇编语言assembly language高级语言high-level language翻译程序translator汇编程序assembler编译程序compiler解释程序interpreter,2023/7/29,12,58 20 1020(1000)59 20 1024(1004)47 C0 1014(1008)50 20 1028(100C)47 F0 101C(1010)58 20 1024(1014)50 20 1028(1018),LD R2,xCP R2,yBNG LESSST R2,maxJMP EXITLESS:LD R2,yST R2,maxEXIT:,if(xy)max=x;else max=y;,2023/7/29,13,Simply stated,a compiler is a program that reads a program written in one language-the source language-and translates it into an equivalent program in another language-the target language.(from“Compilers:principles,techniques and tools”,by A.V.Aho,R.Sethi,and J.D.Ullman),2023/7/29,14,从功能上看,一个编译程序就是一个语言翻译程序,它把一种语言(称作源语言)书写的程序翻译成另一种语言(称作目标语言)的等价的程序.,2023/7/29,15,其他与编译相关的程序,LinkerLoaderPreprocessorEditorDebuggerProfilerProject manager,2023/7/29,16,2.为什么学习编译原理与技术,Computer Science程序设计语言软件技术思维方法,2023/7/29,17,2.为什么学习编译原理与技术,Various Applications人工智能-句法模式识别机器人语言软件描述/发布-CORBA IDLXML,SGML,xxML通信协议解释,2023/7/29,18,3.编译过程概述,词法分析lexical analysis语法分析syntax analysis语法树syntax tree语义分析semantic analysis中间代码生成intermediate code generation优化 optimization目标代码生成 target code generation,2023/7/29,19,2023/7/29,20,出错处理,表格管理,词法分析,从左至右读字符流的源程序、识别(拼)单词例:position:=initial+rate*60;,position:=initial+rate*60;,单词类型单词值 标识符1(id1)position 算符(赋值):=标识符2(id2)initial 算符(加)+标识符3(id3)rate 算符(乘)*整数 60 分号;,又如一个C源程序片断:int a;a=a+2;词法分析后可能返回:单词类型单词值 保留字 int标识符(变量名)a界符;标识符(变量名)a算符(赋值)=标识符(变量名)a 算符(加)+整数 2界符;,有关术语,词法分析(lexical analysis or scanning)-The stream of characters making up a source program is read from left to right and grouped into tokens,which are sequences of characters that have a collective meaning.单词-token保留字-reserved word标识符-identifier(user-defined name),功能:层次分析.依据源程序的语法规则把源程序的单词序列组成语法短语(表示成语法树).position:=initial+rate*60;规则:=“:=”:=“+”:=“*”:=“(”“)”:=:=:=,语法分析,赋值语句,标识符,表达式,表达式,+,表达式,表达式,标识符,整数,标识符,:=,表达式,*,id1:=id2+id3*N,术语,语法分析(syntax analysis or parsing)The purpose of syntax analysis is to determine the source programs phrase structure.This process is also called parsing.The source program is parsed to check whether it conforms to the source languages syntax,and to construct a suitable representation of its phrase structure.语法树(推导树)(parse tree or derivation tree),语义分析,语义审查(静态语义)上下文相关性类型匹配类型转换,例:Program p();Var rate:real;procedure initial;position:=initial+rate*60/*error*/*error*/*warning*/;,又如:int arr 2,abc;abc=arr*10;,语义分析,术语,语义分析(semantic analysis)The parsed program is further analyzed to determine whether it conforms to the source languages contextual constraints:scope rules,type rulese.g.To relate each applied occurrence of an identifier in the source program to the corresponding declaration.,中间代码生成,源程序的内部(中间)表示三元式、四元式(*id3t1t2)t2=id3*t1,id1:=id2+id3*60(1)(inttoreal,60-t1)(2)(*,id3t1t2)(3)(+,id2t2t3)(4)(:=,t3-id1),中间代码生成(intermediate code generation),This is where the intermediate representation of the source program is created.We want this representation to be easy to generate,and easy to translate into the target program.The representation can have a variety of forms,but a common one is called three-address code or 4-tuple code.,代码优化,id1:=id2+id3*60(1)(inttoreal60-t1)(2)(*id3t1t2)(3)(+id2t2t3)(4)(:=t3-id1)变换(1)(*id360.0t1)(2)(+id2 t1id1),代码优化,t1=b*c t1=b*c t2=t1+0 t2=t1+t1t3=b*c a=t2t4=t2+t3a=t4,代码优化(code optimization),Intermediate code optimizationThe optimizer accepts input in the intermediate representation and output a version still in the intermediate representation.In this phase,the compiler attempts to produce the smallest,fastest and most efficient running result by applying various techniques.Object code optimization,目标代码生成,(*,id360.0t1)(+,id2t1id1),movfid3,R2mulf#60.0,R2movfid2,R1addfR2,R1movfR1,id1,符号表管理,记录源程序中使用的名字收集每个名字的各种属性信息类型、作用域、分配存储信息,Const1常量值:35Var1变量类型:实层次:2,符号表(symbol table),Symbol table is a data structure which is employed to associate identifiers with their attributes.An identifiers attribute consists of information relevant to contextual analysis,and is obtained from the identifiers declaration.,出错处理,检查错误、报告出错信息、排错、恢复编译工作,2023/7/29,43,2023/7/29,44,出错处理(error handling)(error reporting and error recovery),The compiler should report the location of each error,together with some explanation.The major categories of compile-time error:syntax error,scope error,type error.After detecting and reporting an error,the compiler should attempt error recovery,means that the compiler should try to get itself into a state where analysis of the source program can continue as normally as possible.,2023/7/29,46,编译程序中的主要数据结构,记号表token(单词的内部表示)语法树syntax tree符号表symbol table常数表literal table中间代码intermediate code临时文件temporary file,2023/7/29,47,4.编译程序的结构,2023/7/29,48,出错处理,表格管理,2023/7/29,49,几个重要概念,P.6趟(pass,遍)单趟扫描多趟扫描编译的前端(front end)编译的后端(back end),2023/7/29,50,对源程序(包括源程序中间形式)从头到尾扫描一次,并做有关的加工处理,生成新的源程序中间形式或目标程序,通常称之为一遍。,2023/7/29,51,前端:通常将与源程序有关的编译部分称为前端。词法分析、语法分析、语义分析、中间代码生成、代码优化 特点:与源语言有关,后端:与目标机有关的部分称为后端。目标程序生成(以及与目标机有关的优化)特点:与目标机有关,2023/7/29,52,5.编译程序的生成,交叉编译自展(自编译)移植自动生成程序,2023/7/29,53,交叉编译器(P.305),由于目标机指令系统与宿主机的指令系统不同,编译时将应用程序的源程序在宿主机上生成目标机代码,称为交叉编译。,2023/7/29,54,2023/7/29,55,小结,本章重点对编译程序的功能和结构做一综述,要求了解编译程序各个成分在编译阶段的逻辑关系以及它们怎样作为一个整体完成编译任务,以及几个重要的概念:编译,遍,自展,交叉编译,移植,2023/7/29,56,作业,阅读第1、2章查阅有关编译技术应用方面的文献,