SAS BASE Key Points Notes
Accessing Data and Creating Data Structures

Topics:
1. Reading raw data files using INFILE and INPUT statements
2. Writing to a _NULL_ data set
3. Assigning and changing variable attributes
4. Importing a database table or data file into a SAS data set
5. Labeling variables
6. Reading an existing SAS data set
7. Restricting observations while reading data
8. Creating temporary and permanent SAS data sets
9. Exporting data to different files
10. Displaying the contents of a data set
11. Restricting the observations and variables processed in a SAS data set

1. Reading raw data files using INFILE and INPUT statements

1.1 Introduction

1.1.1 Common Step Boundary Keywords

DATA, PROC, CARDS, DATALINES, QUIT, RUN

1.1.2 Data Step Flow

    data sales;
       /* rawin is a fileref that must be assigned before this step runs */
       infile rawin;
       /* column input: each variable is followed by its column range */
       input name $1-10 division $12 years 15-16 sales 19-25;
    run;

    proc print data=sales;
    run;

Note: The use of RUN after each step is highly recommended.

A. The Compilation Phase

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. In this phase, SAS identifies the type and length of each new variable, and determines whether a type conversion is necessary for each subsequent reference to a variable. During the compile phase, SAS creates the following three items:

input buffer
    is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement. Note that this buffer is created only when the DATA step reads raw data. (When the DATA step reads a SAS data set, SAS reads the data directly into the program data vector.)

program data vector (PDV)
    is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation. Along with data set variables and computed variables, the PDV contains two automatic variables, _N_ and _ERROR_. The _N_ variable counts the number of times the DATA step begins to iterate. The _ERROR_ variable signals the occurrence of an error caused by the data during execution. The value of _ERROR_ is either 0 (indicating that no errors exist) or 1 (indicating that one or more errors have occurred). SAS does not write these variables to the output data set.

descriptor information
    is information that SAS creates and maintains about each SAS data set, including data set attributes and variable attributes. It contains, for example, the name of the data set and its member type, the date and time that the data set was created, and the number, names, and data types (character or numeric) of the variables.

B. The Execution Phase

By default, a simple DATA step iterates once for each observation that is being created. The flow of action in the execution phase of a simple DATA step is as follows:

1. The DATA step begins with a DATA statement. Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1.
2. SAS sets the newly created program variables to missing in the program data vector (PDV).
3. SAS reads a data record from a raw data file into the input buffer, or it reads an observation from a SAS data set directly into the program data vector. You can use an INPUT, MERGE, SET, MODIFY, or UPDATE statement to read a record.
4. SAS executes any subsequent programming statements for the current record.
5. At the end of the statements, an output, return, and reset occur automatically. SAS writes an observation to the SAS data set, the system automatically returns to the top of the DATA step, and the values of variables created by INPUT and assignment statements are reset to missing in the program data vector. Note that variables that you read with a SET, MERGE, MODIFY, or UPDATE statement are not reset to missing here.
6. SAS counts another iteration, reads the next record or observation, and executes the subsequent programming statements for the current observation.
7. The DATA step terminates when SAS encounters the end-of-file in a SAS data set or a raw data file.

Note: The steps above describe the default processing of the DATA step. You can code data-reading statements (such as INPUT or SET) or data-writing statements (such as OUTPUT) in any order in your program.

[Figure: Flow of Action in the DATA Step]

Diagnosing Errors in the Compilation Phase

Now that you know how a DATA step is processed, you can use that knowledge to correct errors. Errors that are detected during the compilation phase include:
- misspelled keywords and data set names
- missing semicolons
- unbalanced quotation marks
- invalid options

During the compilation phase, SAS can interpret some syntax errors (such as the keyword DATA misspelled as DAAT). If it cannot interpret the error, SAS
- prints the word ERROR followed by an error message in the log
- compiles but does not execute the step where the error occurred, and prints the following message to warn you:

    NOTE: The SAS System stopped processing this step because of errors.

Some errors are explained fully by the message that SAS prints; other error messages are not as easy to interpret. For example, because SAS statements are free-format, when you fail to end a SAS statement with a semicolon, SAS does not always detect the error at the point where it occurs.

Diagnosing Errors in the Execution Phase

As you have seen, errors can occur in the compilation phase, resulting in a DATA step that is compiled but not executed. Errors can also occur during the execution phase. When SAS detects an error in the execution phase, the following can occur, depending on the type of error:
- A note, warning, or error message is displayed in the log.
- The values that are stored in the program data vector are displayed in the log.
- The processing of the step either continues or stops.

1.2 Basic Forms of INPUT Statement

The most common way to create a new data set is to submit a DATA step. The INPUT statement describes the data that your new data set will contain. It is used to read data from an external source, or from lines contained in your SAS program.

1.2.1 List Input

Use list input to read data in which at least one blank space separates each data field. Missing values are represented by a dot (period). This is the simplest form of input (also called free-form or format-free input).

    DATA Census;
       INPUT State $ Pop;
       CARDS;
    NC 5.085
    SC 2.590
    VA 1.360
    MA 3.450
    PA .
    ;
    run;

1.2.2 Column Input

Use column input to read the following types of data. Variables are usually listed in the order in which they appear in the input data, although column input can read fields in any order.
- Character and numeric data
- Data values which are entered in fixed column positions
- Character values longer than eight cha
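As a minimal sketch of column input, the DATA step below reads fixed-position fields from in-stream records. The data set name work.staff, the variable names, the column ranges, and the data values are all made up for illustration and do not come from the original notes.

```sas
/* Hypothetical example: work.staff and every value below are illustrative. */
data work.staff;
   /* Column input: each variable name is followed by the columns it occupies. */
   /* name is character ($) and takes its length, 10, from columns 1-10.       */
   input name $ 1-10 division $ 12 years 15-16 sales 19-25;
   datalines;
ALEXANDRA  H   2   233.11
MARK       H   5   298.12
SARAH      S   6   301.21
;
run;

/* Print the result to check that each field landed in the intended variable. */
proc print data=work.staff;
run;
```

Because each field is located by its column range rather than by blank delimiters, the values must be aligned exactly in the records; shifting a value by one column changes what SAS reads.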