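Course Lab Report (Full-time Master's Students)

Course: Programming Languages and Compilers
Experiment: C- Compiler Design
Major and class: Computer Science and Technology, Class 4
Student: 张洁坤 (Zhang Jiekun)
Student ID: S********
Group members and IDs: none
Instructor: 杨晓波 (Yang Xiaobo)

Contents
Chapter 1 Introduction and Overall Framework
1.1 Purpose of the experiment
1.2 Experimental environment
1.3 Overall framework of the C- compiler
Chapter 2 Lexical Analysis
2.1 Lexical analysis comprises two classes
2.2 C keyword table
2.3 Identifier lexical rules
Chapter 3 Syntax Analysis
3.1 Class CParser
3.2 Grammar
3.3 Basic tree structures
3.4 Supported statements and operations
Chapter 4 Building the Symbol Table
4.1 Helper classes
4.2 Main class: building the symbol table
Chapter 5 Type Checking
Chapter 6 Code Generation
6.1 PCode
6.2 80X86 ASM
Chapter 7 Summary
References

Chapter 1 Introduction and Overall Framework

1.1 Purpose of the experiment:
This experiment is intended to deepen the understanding of the theory covered in the programming languages and compilers course and to strengthen the ability to apply that knowledge as a whole.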
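It also clarifies how the phases of compilation relate to one another, gives practice in implementing lexical analysis, syntax analysis, and semantic analysis, builds familiarity with symbol table management and its role during compilation, and covers error-handling mechanisms and their use.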

1.2 Experimental environment:
Hardware: host: 586 or above, with a mouse; memory: 256 MB or more; display: VGA or better; disk space: 500 MB or more
Software: Microsoft Visual C++ 6.0

1.3 Overall framework of the C- compiler
[Figure: flowchart of the compiler pipeline. Boxes: input file; start; lexical analysis; syntax analysis; build symbol table; type checking; code generation; end. Data structures shown: syntax tree, symbol table.]

Chapter 2 Lexical Analysis

2.1 Lexical analysis comprises two classes:
(1) Class CTokenizer: splits a string into individual tokens (a whole file is treated as one string here; in MFC, CFile -> CString), attaches a simple type to each token, and returns it through NextToken():

    #define TT_EOL      '\n'
    #define TT_EOF      -1
    #define TT_INTEGER  -2
    #define TT_REAL     -3
    #define TT_WORD     -4
    #define TT_STRING   '"'
    #define TT_CHAR     '\''

(2) Class CScaner: determines the specific type of each token. TokenType is defined as follows:

    enum TokenType
    {
        // reserved keywords
        _AUTO, _DOUBLE, _INT, _STRUCT,
        _BREAK, _ELSE, _LONG, _SWITCH,
        _CASE, _ENUM, _REGISTER, _TYPEDEF,
        _CHAR, _EXTERN, _RETURN, _UNION,
        _CONST, _FLOAT, _SHORT, _UNSIGNED,
        _CONTINUE, _FOR, _SIGNED, _VOID,
        _DEFAULT, _GOTO, _SIZEOF, _VOLATILE,
        _DO, _IF, _STATIC, _WHILE,
        _READ, _WRITE, _PRINTF,
        // operations
        ASSIGN, PLUS, MINUS, TIMES, DIV, MOD,
        BITWISE_AND, BITWISE_OR, BITWISE_NOT, LOGICAL_NOT, LT, GT,
        // punctuation
        LPARAN, RPARAN, LBRACE, RBRACE, LSQUARE, RSQUARE, COMMA, DOT, SEMI, COLON,
        // compound operations
        EQ /* == */, NEQ /* != */, PLUS_PLUS /* ++ */, MINUS_MINUS /* -- */,
        PLUS_ASSIGN /* += */, MINUS_ASSIGN /* -= */, TIMES_ASSIGN /* *= */, DIV_ASSIGN /* /= */,
        NGT /* <= */, NLT /* >= */, LOGICAL_AND /* && */, LOGICAL_OR /* || */,
        // others
        _EOF, _ID, _NUM, _STRING, _CHARACTER, _LABEL, _ERROR, _NONE
    };

CScaner uses a CMap<CString, LPCSTR, enum TokenType, enum TokenType> m_KeyIndex to associate each keyword CString with its TokenType, which makes both lookup and reverse lookup convenient.
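
The keyword lookup and reverse lookup described above can be sketched roughly as follows. This is a minimal illustration, not the report's implementation: it swaps MFC's CMap for standard-library containers so that it compiles without MFC, covers only a handful of keywords, and renames the enumerators with a TK_ prefix because names that start with an underscore followed by a capital letter are reserved in standard C++. The class name KeywordIndex and its members lookup, spelling, and add are hypothetical.

```cpp
#include <string>
#include <unordered_map>

// Small subset of the report's TokenType, renamed with a TK_ prefix
// (assumption for this sketch; the report uses names such as _INT and _ID).
enum TokenType { TK_INT, TK_IF, TK_WHILE, TK_RETURN, TK_ID };

// Hypothetical stand-in for m_KeyIndex; the report uses MFC's CMap instead.
class KeywordIndex {
public:
    KeywordIndex() {
        add("int", TK_INT);
        add("if", TK_IF);
        add("while", TK_WHILE);
        add("return", TK_RETURN);
    }

    // Forward lookup: a word that is not a keyword is classified as TK_ID.
    TokenType lookup(const std::string& word) const {
        auto it = byName_.find(word);
        return it == byName_.end() ? TK_ID : it->second;
    }

    // Reverse lookup: spelling of a keyword token, or "" if it has none.
    std::string spelling(TokenType type) const {
        auto it = byType_.find(static_cast<int>(type));
        return it == byType_.end() ? std::string() : it->second;
    }

private:
    // Registers one keyword in both directions.
    void add(const std::string& name, TokenType type) {
        byName_[name] = type;
        byType_[static_cast<int>(type)] = name;
    }

    std::unordered_map<std::string, TokenType> byName_;
    std::unordered_map<int, std::string> byType_;
};
```

Keeping two maps costs a little memory but makes the reverse direction (for example, printing a token's spelling in an error message) as cheap as the forward one.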

2.2 C keyword table:

    auto      double    int       struct
    break     else      long      switch
    case      enum      register  typedef
    char      extern    return    union
    const     float     short     unsigned
    continue  for       signed    void
    default   goto      sizeof    volatile
    do        if        static    while

2.3 Identifier lexical rules:

    identifier :
        nondigit
        identifier nondigit
        identifier digit
    nondigit : one of
        _ a b c d e f g h i j k l m n o p q r s t u v w x y z
        A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    digit : one of
        0 1 2 3 4 5 6 7 8 9
    escape: \n, \r, \b, \0-7

Chapter 3 Syntax Analysis

3.1 Class CParser:
CParser defines CTreeNode, which is similar to the node type in the Tiny sample program:

    #define MAX_CHILDREN 3

    class CTreeNode
    {
    public:
        CTreeNode* child[ MAX_CHILDREN ]; // point to child node
        CTreeNode* father;                // point to father node
        CTreeNode* sibling;               // point to sibling node
        int lineno;
        NodeKind nodekind;
        union {
            StmtKind stmt;
            ExpKind exp;
        } kind;
        enum TokenType type;
        CString szName;
        CString szScope;                  // node function scope
        BOOL bArray;                      // is this an array declaration
        int iArraySize;                   // array size
    };

The syntax tree is built according to the grammar and its associated rules.
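
To make the tree construction concrete, the sketch below shows one common way to turn left-recursive rules such as rules 21 and 23 of the grammar in section 3.2 (additive_expression and term) into loops that build left-associative nodes. It assumes a recursive-descent style like that of the Tiny sample; the report does not show CParser's code, so this is an illustration under that assumption. Node is a cut-down stand-in for CTreeNode, and MiniParser, makeOp, and evaluate are hypothetical names. The input is a string of unsigned integers and operators without spaces.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

const int MAX_CHILDREN = 3;

// Simplified stand-in for CTreeNode: only the fields this sketch needs.
struct Node {
    std::unique_ptr<Node> child[MAX_CHILDREN];
    char op = 0;    // '+', '-', '*', '/' for operator nodes, 0 for a number
    int value = 0;  // used only by number nodes
};

// Hypothetical mini-parser; not the report's CParser.
class MiniParser {
public:
    explicit MiniParser(const std::string& text) : text_(text) {}

    // additive_expression -> additive_expression addop term | term
    std::unique_ptr<Node> additive() {
        std::unique_ptr<Node> left = term();
        while (peek() == '+' || peek() == '-') {
            char op = text_[pos_++];
            std::unique_ptr<Node> right = term();
            left = makeOp(op, std::move(left), std::move(right));
        }
        return left;
    }

    // term -> term mulop factor | factor (the `!` level is skipped here)
    std::unique_ptr<Node> term() {
        std::unique_ptr<Node> left = factor();
        while (peek() == '*' || peek() == '/') {
            char op = text_[pos_++];
            std::unique_ptr<Node> right = factor();
            left = makeOp(op, std::move(left), std::move(right));
        }
        return left;
    }

    // factor -> NUM (parentheses, var, and call are omitted in this sketch)
    std::unique_ptr<Node> factor() {
        std::unique_ptr<Node> n(new Node());
        while (std::isdigit(static_cast<unsigned char>(peek()))) {
            n->value = n->value * 10 + (text_[pos_++] - '0');
        }
        return n;
    }

private:
    // Current character, or '\0' at the end of the input.
    char peek() const { return pos_ < text_.size() ? text_[pos_] : '\0'; }

    // Builds an operator node with the operands in child[0] and child[1].
    static std::unique_ptr<Node> makeOp(char op, std::unique_ptr<Node> l,
                                        std::unique_ptr<Node> r) {
        std::unique_ptr<Node> n(new Node());
        n->op = op;
        n->child[0] = std::move(l);
        n->child[1] = std::move(r);
        return n;
    }

    std::string text_;
    std::size_t pos_ = 0;
};

// Evaluates the tree; used only to check its shape in this sketch.
int evaluate(const Node& n) {
    if (n.op == 0) return n.value;
    int a = evaluate(*n.child[0]);
    int b = evaluate(*n.child[1]);
    switch (n.op) {
        case '+': return a + b;
        case '-': return a - b;
        case '*': return a * b;
        default:  return b != 0 ? a / b : 0;  // '/'; dividing by zero yields 0 here
    }
}
```

Parsing "8-3-2" with additive() yields a root '-' node whose child[0] is another '-' node, that is, (8-3)-2. The loop replaces the left recursion while keeping the left-associative tree shape that the grammar rule implies.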

3.2 Grammar:
1. program -> declaration_list
2. declaration_list -> declaration_list declaration | declaration
3. declaration -> var_declaration | fun_declaration
4. var_declaration -> type_specifier ID(, ...)`;` | type_specifier ID `[` NUM `]`(, ...)`;`
5. type_specifier -> `int` | `void` | `char` (this step is actually handled in declaration_list())
6. fun_declaration -> type_specifier ID `(` params `)` compound_stmt
7. params -> param_list | `void` | empty (`void` is treated as empty)
8. param_list -> param_list `,` param | param
9. param -> type_specifier ID | type_specifier ID `[` `]`
10. compound_stmt -> `{` local_declarations statement_list `}` | expression_stmt
11. local_declarations -> local_declarations var_declaration | var_declaration
12. `read` `(` var `)` `;`
13. `write` `(` expression `)` `;`
14. `printf` `(` `"` STRING `"` `)` `;`
15. expression_stmt -> expression `;` | `;`
16. expression -> var `=` expression | logic1_expression
17. logic1_expression -> logic1_expression `||` logic2_expression | logic2_expression
18. logic2_expression -> logic2_expression `&&` simple_expression | simple_expression
19. simple_expression -> additive_expression relop additive_expression | additive_expression
20. relop -> `<=` | `<` | `>` | `>=` | `==` | `!=`
21. additive_expression -> additive_expression addop term | term
22. addop -> `+` | `-`
23. term -> term mulop logic3_expression | logic3_expression
24. mulop -> `*` | `/` | `%`
25. logic3_expression -> `!` logic3_expression | factor
26. factor -> `(` expression `)` | var | call | NUM
27. var -> ID | ID `[` expression `]`
28. call -> ID `(` args `)`
29. args -> args_list | empty
30. args_list -> args_list `,` expression | expression
31. sub_compoundstmt -> ID `:` | call `;` | expression_stmt
32. if_stmt -> `if` `(` expression `)` compound_stmt | `if` `(` expression `)` compound_stmt `else` compound_stmt
33. while_stmt -> `while` `(` expression `)` compound_stmt
34. for_stmt -> `for` `(` var `=` expression `;` expression `;` var `=` expression `)` compound_stmt
35. goto_stmt -> `goto` ID `;`
36. break_stmt -> `break` `;`
37. continue_stmt -> `continue` `;`
38. return_stmt -> `return` `;` | `return` expression `;`

3.3 Basic tree structures:
if statement: [figure omitted; node labels: if statement, expression, statement, statement]
while statement: [figure omitted; node labels: while statement, expression, statement]
for statement: [figure omitted; node labels: for statement, expression (three), statement (two)]
compound statement: [figure omitted; node labels: compound statement, declaration, statement, statement]

3.4 Supported statements and operations:
1) Data types: int, char, void; float is supported in PCode but not in 80x86 ASM
2) Statements: assignment (=), if, while, for, return, break, continue
3) Arithmetic operations: +, -, *, /
4) Relational operations: ==, >, <, >=, <=, !=
5) Logical operations: &&, ||, !
6) Function definitions and calls
7) Compound statements
8) Comments: C-style /* */ and C++-style //

Chapter 4 Building the Symbol Table

4.1 Helper classes:
(1) Class LineListRec: its main member is lineno, which records the line number at which a token (a variable or function name) is declared or used.