SAS

SAS (previously "Statistical Analysis System") is a software suite developed by SAS Institute for advanced analytics, multivariate analyses, business intelligence, data management, and predictive analytics.

1.What is SAS? What functions does it perform?

SAS stands for Statistical Analysis System, an integrated set of software products that supports:

• Information retrieval and data management

• Writing reports and graphics

• Statistical analysis, econometrics and data mining

• Business planning, forecasting and decision support

• Operations research and project management

• Quality Improvement

• Data Warehousing

• Application Development

2.What is the basic structure of the SAS programming environment?

The basic SAS programming environment consists of these windows:

• Program Editor

• Explorer Window

• Log Window

3.What is the basic syntax style in SAS?

To run a program successfully, you need the following basic elements:

• A semicolon at the end of every statement

• A data statement that defines your data set

• Input statement

• At least one space between each word or statement

• A run statement

For example: infile 'H:\StatHW\yourfilename.dat';
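The elements listed above can be put together in a minimal sketch. The data set name scores, its variables, and the inline values are made up for illustration, and DATALINES stands in for an INFILE statement so the example does not depend on an external file.

```sas
/* DATA statement: names the data set being created (hypothetical name) */
data scores;
   /* INPUT statement: name is character ($), score is numeric */
   input name $ score;
   /* Inline data lines stand in for an external file */
   datalines;
Ann 90
Bob 85
;
run;   /* RUN statement: ends the step */
```

Every statement ends with a semicolon, and the words within each statement are separated by at least one space.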

4.What is the DATA step?

The DATA step creates a SAS data set, which carries the data along with a “data dictionary.” The data
dictionary (the descriptor portion of the data set) holds information about the variables and their properties.

5.What is the PDV?

The Program Data Vector (PDV) is a logical area in memory where SAS builds a data set one observation
at a time. When a DATA step reads raw data from an external file, an input buffer is created at compile
time to hold a record from that file. The PDV is created after the input buffer.
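One way to look at the PDV is to write it to the log on every iteration. The sketch below assumes two made-up variables, id and score, read from inline data.

```sas
data _null_;
   input id score;
   /* Write every variable in the PDV to the log, including the
      automatic variables _N_ and _ERROR_, once per iteration */
   put _all_;
   datalines;
1 90
2 85
;
run;
```

The log then shows one line per observation, which makes it easy to see how the PDV is refilled on each pass through the DATA step.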

6.What data types does SAS have?

The data types in SAS are Numeric and Character.

7.Which SAS statement does not perform automatic conversions in comparisons?

In SAS, the WHERE statement does not perform automatic conversions in comparisons.

8.How can you debug and test your SAS program?

You can debug and test your SAS program by using OBS=0 and system options to trace the program
execution in the log.

9.What is the difference between the NODUPKEY and NODUP options?

NODUP compares all the variables in the data set, while NODUPKEY compares only the BY variables.
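The difference can be seen on a small example. The data set visits and its values are invented for illustration; NODUPRECS is the full name of the option that NODUP abbreviates.

```sas
data visits;
   input id visit $;
   datalines;
1 A
1 A
1 B
2 A
;
run;

/* NODUPKEY: keeps the first observation for each value of the BY variable */
proc sort data=visits out=by_key nodupkey;
   by id;
run;

/* NODUPRECS (alias NODUP): drops an observation only when every variable
   matches the previous observation written to the output data set */
proc sort data=visits out=by_record noduprecs;
   by id;
run;
```

With this input, by_key should contain two observations (one per id), while by_record should contain three, because only the exact repeat of "1 A" is removed.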

10.What validation tools are used in SAS?

For DATA steps: data dataset-name / debug; and data dataset-name / stmtchk;

For macros: options mprint mlogic symbolgen;
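The macro options can be tried with a small sketch. The macro show_rows is hypothetical and exists only to give the options something to trace.

```sas
/* Turn on macro tracing: generated code, macro logic, variable resolution */
options mprint mlogic symbolgen;

%macro show_rows(ds);   /* hypothetical macro for illustration */
   proc print data=&ds(obs=3);
   run;
%mend show_rows;

%show_rows(sashelp.class)

/* Turn tracing off again to keep later logs short */
options nomprint nomlogic nosymbolgen;
```

The log then shows the code the macro generated (MPRINT), the macro's execution flow (MLOGIC), and how &ds was resolved (SYMBOLGEN).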

11.What are PROC PRINT and PROC CONTENTS used for?

PROC PRINT displays the contents (the data values) of a SAS data set and is used to verify that the data
were read into SAS correctly. PROC CONTENTS displays information about a SAS data set, such as its
variables and their attributes.
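As a short illustration, both procedures can be run against sashelp.class, a sample data set that ships with SAS (the choice of data set is an assumption made for the example).

```sas
/* Show the first five observations to check the data values */
proc print data=sashelp.class(obs=5);
run;

/* Show the descriptor information: variables, types, lengths, and so on */
proc contents data=sashelp.class;
run;
```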

12.What is PROC SUMMARY used for?

PROC SUMMARY uses the same syntax as PROC MEANS and computes descriptive statistics on the numeric
variables in a SAS data set. Unlike PROC MEANS, it does not print its results by default.
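A minimal sketch of PROC SUMMARY, assuming the sample data set sashelp.class and a made-up output name class_stats:

```sas
/* Compute the mean and standard deviation of two numeric variables
   and store them in an output data set */
proc summary data=sashelp.class;
   var height weight;
   output out=class_stats mean= std= / autoname;
run;

/* Print the stored statistics */
proc print data=class_stats;
run;
```

The AUTONAME option lets SAS generate distinct names for the output statistics, so the mean and standard deviation of the same variable do not collide.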

13.What does PROC GLM do?

PROC GLM performs simple and multiple regression, analysis of variance (ANOVA), analysis of
covariance, multivariate analysis of variance, and repeated measures analysis of variance.

14.What are SAS informats?

SAS informats are used to read, or input, data from external files (known as flat files, ASCII files, text
files, or sequential files). The informat tells SAS how to read the data into SAS variables.

15.What categories are SAS informats placed in?

SAS informats are placed in three categories:

• Character informats: $INFORMATw.

• Numeric informats: INFORMATw.d

• Date/time informats: INFORMATw.
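One informat from each category might be used as follows. The data set name orders, its variables, and the sample records are invented for illustration; the colon modifier lets each informat read a space-delimited field.

```sas
data orders;
   /* $10.   : character informat
      comma8.: numeric informat that strips commas
      date9. : date informat for values such as 15JAN2020 */
   input customer :$10. amount :comma8. order_date :date9.;
   format order_date date9.;
   datalines;
Smith 1,250.50 15JAN2020
Jones 980 03FEB2020
;
run;
```

The FORMAT statement is only there so that order_date prints as a readable date rather than as a day count.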

16.What does the CATX function do?

The CATX function concatenates character strings, removes leading and trailing blanks, and inserts separators.
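A small sketch of CATX, using made-up values with deliberate leading and trailing blanks:

```sas
data _null_;
   first = '  Ann ';
   last  = ' Lee  ';
   /* Blanks around each value are removed and '-' is inserted between them */
   full = catx('-', last, first);
   put full=;   /* writes full=Lee-Ann to the log */
run;
```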

16.What is PROC GPLOT used for?

PROC GPLOT produces plots. Compared with PROC PLOT, it has more options and can create more colorful
and fancier graphics.

17.What is the one statement to set the criteria of data that can be coded in any step?

Options statement.

18.What is the effect of the OPTIONS statement ERRORS=1?

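ERRORS=1 limits to one the number of observations for which SAS prints complete data-error messages in the log. Separately, the automatic variable _ERROR_ has a value of 1 if there is an error in the data for that observation and 0 if there is not.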

19.What do the SAS log messages “numeric values have been converted to character” mean? What are the implications?

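It means that SAS automatically converted numeric values to character because a numeric value was used where a character value was expected, for example as an argument to a character function. The conversion uses the BEST12. format, so the result can carry leading blanks; an explicit PUT function avoids the surprise.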

20.Why is a STOP statement needed for the POINT= option on a SET statement?

Because POINT= reads only the specified observations, SAS cannot detect an end-of-file condition as it
would if the file were being read sequentially. Without a STOP statement, the DATA step could loop continuously.
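A minimal sketch of direct access with POINT=, assuming the sample data set sashelp.class and arbitrarily chosen observation numbers:

```sas
data pick;
   /* Read observations 2, 5, and 8 directly by number */
   do i = 2, 5, 8;
      set sashelp.class point=i;
      output;
   end;
   /* No end-of-file is ever detected with POINT=, so end the step explicitly */
   stop;
run;
```

The POINT= variable i is temporary and is not written to the output data set.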

21.How do you control the number of observations and/or variables read or written?

Use the FIRSTOBS= and OBS= options to control the observations, and the KEEP= and DROP= options to control the variables.
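These options can be combined as data set options on one SET statement. The sketch assumes the sample data set sashelp.class; the output name subset is made up.

```sas
data subset;
   /* Start at observation 5, stop after observation 10,
      and read only the variables name and age */
   set sashelp.class(firstobs=5 obs=10 keep=name age);
run;
```

OBS= gives the number of the last observation to process, not a count, so this step reads six observations.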

22.Approximately what date is represented by the SAS date value of 730?

31st December 1961
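The answer follows from the way SAS stores dates as the number of days since 1 January 1960 (day 0). 1960 is a leap year with 366 days, so day 366 is 1 January 1961, and 364 days later, day 730, is 31 December 1961. A quick check, as a sketch:

```sas
data _null_;
   d = 730;
   put d= date9.;          /* writes d=31DEC1961 */
   check = '31DEC1961'd;   /* date literal for the same day */
   put check=;             /* writes check=730 */
run;
```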

23.Identify statements whose placement in the DATA step is critical.

INPUT, DATA and RUN…

24.Does SAS ‘Translate’ (compile) or does it ‘Interpret’?

Compile

25.What does the RUN statement do?

When SAS encounters a RUN statement, it compiles and executes the preceding DATA or PROC step. If a
step is immediately followed by another DATA or PROC step, the next DATA or PROC statement marks the
step boundary, so the RUN statement can be omitted for every step except the last.

26.Why is SAS considered self documenting?

SAS is considered self-documenting because, when it creates a data set, it stores information about that
data set inside it, such as the creation date and time, the number of variables, and their labels. You can
view this information with PROC CONTENTS.

27.What are some good SAS programming practices for processing very large data sets?

Sort them only once, and use the FIRSTOBS= and OBS= options to work on a subset of observations.

28.What is the difference between functions and PROCs that calculate the same simple descriptive statistics?

Functions are used inside the DATA step and work on the same data set, computing a statistic across
variables within one observation. PROCs compute statistics across observations and can create new data
sets to hold the results.

29.If you were told to create many records from one record, show how you would do this using arrays and with PROC TRANSPOSE?

I would use PROC TRANSPOSE if there are few variables and arrays if there are many; the choice depends on the data.

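30.What is a method for assigning first.VAR and last.VAR to the BY group variable on unsorted data?

On unsorted data you cannot use FIRST. or LAST. directly, because they require a BY statement, which expects the data to be sorted by the BY variable. Sort the data with PROC SORT first; if the data are already grouped but not in sorted order, the NOTSORTED option on the BY statement can be used instead.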
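A minimal sketch of the sort-then-BY approach, assuming the sample data set sashelp.class and made-up output names:

```sas
/* Sort first so that BY-group processing is allowed */
proc sort data=sashelp.class out=class_sorted;
   by sex;
run;

data group_counts;
   set class_sorted;
   by sex;
   if first.sex then n = 0;   /* reset the counter at the start of each group */
   n + 1;                     /* sum statement: n is retained across iterations */
   if last.sex then output;   /* write one observation per group */
   keep sex n;
run;
```

The result has one observation per value of sex, holding the number of rows in that group.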

31.How do you debug and test your SAS program?

First, check the log for errors, warnings, and, in some cases, notes. You can also use the DATA step
debugger.

32.What other SAS features do you use for error trapping and data validation?

Check the log. For data validation, use PROC FREQ, PROC MEANS, or sometimes PROC PRINT to see what the
data look like.

33.How would you combine 3 or more tables with different structures?

Sort them by their common variables and combine them with a MERGE statement.

34.What areas of SAS are you most interested in?

BASE, STAT, GRAPH, ETS

35.Briefly describe 5 ways to do a “table lookup” in SAS.

Match Merging, Direct Access, Format Tables, Arrays, PROC SQL
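As one illustration, the format-table method can be sketched as follows. The format name $region, the state codes, and the data set customers are all invented for the example.

```sas
/* The lookup table is stored as a user-defined character format */
proc format;
   value $region
      'NY', 'NJ' = 'East'
      'CA', 'WA' = 'West'
      other      = 'Unknown';
run;

data customers;
   input state $;
   /* PUT applies the format, so the lookup needs no sort or merge */
   region = put(state, $region.);
   datalines;
NY
CA
TX
;
run;
```

Here NY maps to East, CA to West, and TX falls through to Unknown.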

36.What versions of SAS have you used (on which platforms)?

SAS 9.1.3, 9.0, and 8.2 on Windows and UNIX; SAS 7 and 6.12.

37.What are some good SAS programming practices for processing very large data sets?

Work on a sample by using the OBS= option or subsetting, comment the code, and use DATA _NULL_ when no output data set is needed.

38.What are some problems you might encounter in processing missing values? In Data steps? Arithmetic? Comparisons? Functions? Classifying data?

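Any arithmetic operation that involves a missing value produces a missing value, whereas functions such
as SUM and MEAN ignore missing arguments. In comparisons, a missing value is lower than any number.
Most SAS statistical procedures exclude observations with any missing variable values from an analysis.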
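The behaviour of missing values in arithmetic, functions, and comparisons can be checked with a short sketch (the variable names are made up):

```sas
data _null_;
   a = .;
   b = 5;
   total_op = a + b;       /* arithmetic operator: result is missing */
   total_fn = sum(a, b);   /* SUM ignores the missing argument: result is 5 */
   is_less  = (a < b);     /* missing sorts below any number: result is 1 */
   put total_op= total_fn= is_less=;
run;
```

The log shows total_op=. total_fn=5 is_less=1, along with a note that missing values were generated.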

39.How would you create a data set with 1 observation and 30 variables from a data set with 30 observations and 1 variable?

Using PROC TRANSPOSE

40.What is the difference between functions and PROCs that calculate the same simple descriptive statistics?

A PROC has a wider scope, and its results can be sent to a different data set. Functions usually
operate on the data set being processed in the DATA step.

41.If you were told to create many records from one record, show how you would do this using array and with PROC TRANSPOSE?

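With arrays: declare an array over the variables in the record, then use a DO loop with an OUTPUT statement
to write one observation per element. With PROC TRANSPOSE: use the VAR statement to name the variables
to transpose.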
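Both approaches can be sketched on the same small input. The data set wide, its variables q1 to q3, and the output names are assumptions made for illustration.

```sas
data wide;
   input id q1 q2 q3;
   datalines;
1 10 20 30
2 40 50 60
;
run;

/* Array approach: one output observation per array element */
data long_array;
   set wide;
   array q{3} q1-q3;
   do quarter = 1 to 3;
      value = q{quarter};
      output;
   end;
   keep id quarter value;
run;

/* PROC TRANSPOSE approach: VAR names the variables to turn into rows */
proc transpose data=wide out=long_tr(rename=(col1=value)) name=source;
   by id;
   var q1-q3;
run;
```

Each approach turns the two input records into six output records, three per id.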

42.What are _numeric_ and _character_ and what do they do?

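They are special variable lists: _NUMERIC_ refers to all numeric variables and _CHARACTER_ to all character variables in the data set, so a statement can read or write all of them at once.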

43.How would you create multiple observations from a single observation?

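Use the double trailing @@ on the INPUT statement, which holds the input record so that several observations can be read from it.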
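A minimal sketch of the double trailing @@, with a made-up data set scores whose single input line holds three name and score pairs:

```sas
data scores;
   /* @@ holds the line across iterations, so each pass of the DATA step
      reads the next name/score pair from the same record */
   input name $ score @@;
   datalines;
Ann 90 Bob 85 Cy 77
;
run;
```

The one input line produces three observations.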

44.For what purpose would you use the RETAIN statement?

The RETAIN statement holds the values of variables across iterations of the DATA step. Normally,
variables created in the DATA step are set to missing at the start of each iteration.

What is the order of evaluation of the operators + – * / ** ()?

Parentheses first, then **, then * and /, then + and –.
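The effect of RETAIN described above can be seen in a running total. The data set running and its values are invented for illustration.

```sas
data running;
   input amount;
   /* Keep total from one iteration to the next, starting at 0 */
   retain total 0;
   total = total + amount;
   datalines;
10
20
30
;
run;
```

The variable total takes the values 10, 30, and 60. Without the RETAIN statement it would be reset to missing at the start of every iteration, and every sum would be missing.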

45.How could you generate test data with no input data?

Using DATA _NULL_ and PUT statements.

46.How do you debug and test your SAS programs?

Using Obs=0 and systems options to trace the program execution in log.

47.What can you learn from the SAS log when debugging?

It displays the execution of the whole program and its logic. It also displays each error with its line
number so that you can find and edit the program.

48.What is the purpose of _error_?

It has only two values: 1 for an error and 0 for no error.

49.How can you put a “trace” in your program?

By using ODS TRACE ON

50.How does SAS handle missing values in: assignment statements, functions, a merge, an update, sort order, formats, PROCs?

In an assignment statement, a missing value is assigned as missing. In sort order, the ordinary numeric
missing value (.) is the second smallest value: only the special missing value ._ (underscore) sorts before it.

51.How do you test for missing values?

Using conditional statements such as IF-THEN/ELSE, WHERE, and SELECT.

52.How are numeric and character missing values represented internally?

Character missing values are represented as a blank (“ ”) and numeric missing values as a period (.).

53.Which date function advances a date, time, or datetime value by a given interval?

INTNX.
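A short sketch of INTNX, using an arbitrary start date:

```sas
data _null_;
   start = '15JAN2020'd;
   /* Default alignment: the beginning of the target interval */
   next_month = intnx('month', start, 1);
   /* 'same' alignment keeps the day of the month */
   same_day = intnx('month', start, 1, 'same');
   put next_month= date9. same_day= date9.;
   /* writes next_month=01FEB2020 same_day=15FEB2020 */
run;
```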

54.In the flow of DATA step processing, what is the first action in a typical DATA Step?

When you submit a DATA step, SAS first compiles it and then executes it to create a new SAS data set. The
first action is therefore the compilation phase, in which the input buffer and the PDV are created:

• Compilation phase

• Execution phase

55.What are SAS/ACCESS and SAS/CONNECT?

SAS/ACCESS provides access to data held in external databases such as Oracle, SQL Server, and MS Access.
SAS/CONNECT provides connections between SAS sessions on client and server machines.

56.What is the one statement to set the criteria of data that can be coded in any step?

OPTIONS Statement, Label statement, Keep / Drop statements.

57.What is the purpose of using the N=PS option?

The N=PS option creates a buffer in memory which is large enough to store PAGESIZE (PS) lines and
enables a page to be formatted randomly prior to it being printed.

58.What are the scrubbing procedures in SAS?

Proc Sort with nodupkey option, because it will eliminate the duplicate values.

59.What are the new features included in the new version of SAS, i.e., SAS 9.1.3?

  • The main advantages of version 9 are faster execution of applications and centralized access to data and support.
  • Many changes were made in version 9 compared with version 8. For example, version 9 supports format names longer than 8 bytes, which is not possible in version 8.
  • Numeric format names can be up to 32 characters long in version 9, compared with 8 in version 8.
  • Character format names can be up to 31 characters long in version 9.
  • Numeric informat names can be up to 31 characters long in version 9, compared with 8 in version 8.
  • Character informat names can be up to 30 characters long in version 9.
  • Three new informats are available in version 9 to convert various date, time, and datetime forms of data into a SAS date or SAS time.

60.How do you use IF-THEN/ELSE logic in PROC SQL?

Use a CASE expression:

PROC SQL;
SELECT WEIGHT,
   CASE
      WHEN WEIGHT BETWEEN 0 AND 50 THEN 'LOW'
      WHEN WEIGHT BETWEEN 51 AND 70 THEN 'MEDIUM'
      WHEN WEIGHT BETWEEN 71 AND 100 THEN 'HIGH'
      ELSE 'VERY HIGH'
   END AS NEWWEIGHT
FROM HEALTH;
QUIT;

61.How to remove duplicates using PROC SQL?

proc sql noprint;
   create table inter.Merged1 as
   select distinct * from inter.readin;
quit;