SAS (originally short for Statistical Analysis System) is an integrated software suite used for data analytics, business intelligence, predictive analytics, and data management. It is developed by SAS Institute and also provides a graphical point-and-click user interface for new users.

With SAS, you can perform these tasks:

- Access data in many formats, e.g., SAS tables, database files, and Microsoft Excel tables.
- Manage existing data to produce the data that you need.
- Analyze data using statistical techniques ranging from descriptive measures such as correlations, through logistic regression and mixed models, to advanced methods such as Bayesian hierarchical models and modern model selection.
- Present the results of the analysis as a meaningful report in multiple formats such as HTML, RTF, and PDF.

**Q1. What are the features of SAS?**

**Ans.** Following are the features of SAS:

- Strong data analysis abilities
- Flexible fourth-generation programming language (4GL)
- SAS Studio
- Support for various data formats
- Multiple report output formats
- Data encryption algorithms

**Q2. Explain some capabilities of the SAS framework.**

**Ans.** Following are some capabilities of the SAS framework:

- **Access:** SAS allows us to access data from various sources such as Excel files, SAS data sets, Oracle databases, and more.
- **Manage:** It allows us to subset data, create variables, and clean and validate data.
- **Analyze:** Once the data has been managed, SAS can analyze it, from simple statistics such as frequencies and averages to complex analyses such as forecasting and regression.
- **Present:** It helps to present the results as a list, a graphic report, or a summary. We can print these reports, publish them online, or write them to data files.

**Q3. Explain the basic structure of a Base SAS program.**

**Ans.** The basic structure of a SAS program consists of:

- DATA step, which retrieves and manipulates data.
- PROC step, which analyzes and reports on the data.

**Q4. Explain the difference between SAS functions and procedures.**

**Ans.** Functions expect argument values to be supplied across an observation in a SAS data set, whereas procedures expect one variable value per observation.

data average ;

set temp ;

avgtemp = mean( of T1-T24 ) ;

run ;

Here the arguments of the “mean” function are taken across an observation: for each observation, the function calculates the average of the values of T1 through T24.

proc sort ;

by month ;

run ;

proc means ;

by month ;

var avgtemp ;

run ;

Here, PROC MEANS calculates the average temperature by month, reading one avgtemp value from each observation. PROC SORT runs first because BY-group processing needs the data to be sorted by month.

**Q5. What does P-value signify about the statistical data?**

**Ans.** The p-value is the probability, assuming the null hypothesis is true, of obtaining a test result at least as extreme as the one observed. It always lies between 0 and 1. With the conventional significance level of 0.05:

- If the p-value > 0.05, it denotes weak evidence against the null hypothesis, which means the null hypothesis cannot be rejected.
- If the p-value < 0.05, it denotes strong evidence against the null hypothesis, which means the null hypothesis can be rejected.
- If the p-value is very close to 0.05, it is marginal, and the decision could go either way.

**Q6. If a variable contains special characters or letters, can it be a numeric data type?**

**Ans.** No, it will be defined as a character data type.

**Q7. When looking for data contained in a character string of 150 bytes, which function is the best to locate that data: scan, index, or indexc?**

**Ans.** The INDEX function. It searches a character expression for a string of characters and returns the position of the first occurrence.

| SAS Statements | Results |
| --- | --- |
| a='ABC.DEF (X=Y)'; b='X=Y'; x=index(a,b); put x; | 10 |

**Q8. Write code using PROC SORT on a data set containing State, District, and County as the primary variables, along with several numeric variables.**

**Ans.** Syntax:

Proc sort data=Dist_County;

By state district county;

Run;

**Q9. Explain the difference between the SAS sum function and using the “+” operator?**

**Ans.** In SAS, the SUM function ignores missing values and returns the sum of the non-missing arguments, whereas the “+” operator returns a missing value if any operand is missing.

Example:

data mydata;

input x y z;

cards;

33 3 3

24 3 4

24 3 4

. 3 2

23 . 3

54 4 .

35 4 2

;

run;

data mydata2;

set mydata;

a=sum(x,y,z);

p=x+y+z;

Run;

In this code, the value of p is missing for the 4th, 5th, and 6th observations, whereas a still holds the sum of the non-missing values.

Output:

a p

39 39

31 31

31 31

5 .

26 .

58 .

41 41

**Q10. How do you remove duplicate values in SAS?**

**Ans.** There are three methods to delete duplicate observations from a data set:

- By using the NODUPS option in PROC SORT

Proc sort data=SAS-Dataset nodups;

by var;

Run;

- By using an SQL query

Proc sql;

Create table New-SAS-Dataset as select distinct * from Old-SAS-Dataset;

Quit;

- By using FIRST. processing in a DATA step (temp must already be sorted by group)

Data temp_unique;

Set temp;

By group;

If first.group;

Run;
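When "duplicate" means a repeated key rather than a fully identical row, PROC SORT's NODUPKEY option is a common alternative. The sketch below is only an illustration: the data set temp, its variables, and the output name temp_nodupkey are made up for the example.

```sas
/* Hypothetical sample data: group A appears twice */
data temp;
   input group $ value;
   datalines;
A 1
A 5
B 2
;
run;

/* Keep the first observation for each value of the BY variable */
proc sort data=temp out=temp_nodupkey nodupkey;
   by group;
run;

proc print data=temp_nodupkey noobs;
run;
```

The OUT= option writes the de-duplicated rows to a new data set, so the original temp is left unchanged.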

**Q11. What is the length assigned to the target variable by the scan function?**

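**Ans.** In a DATA step, if the target variable has not already been given a length, older SAS releases assign it a length of 200. According to the SAS 9.4 documentation, the variable is instead given the length of the first argument.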

**Q12. What are the different types of functions in SAS?**

**Ans.** Following are some commonly used character functions in SAS:

- Scan
- Substr
- Trim
- Catx
- Index
- Tranwrd
- Find

**Q13. Explain the feature of TRANWRD function.**

**Ans.** The **TRANWRD** function replaces or removes all occurrences of a given substring within a character string. It does not remove trailing blanks from the target string or the replacement string.
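As a small illustration of TRANWRD, the sketch below replaces a title in a made-up sentence. The variable names and the sample text are assumptions for the example only.

```sas
data _null_;
   length text result $ 40;
   text = 'Mr. Smith met Mr. Jones';
   /* tranwrd(source, target, replacement): every 'Mr.' becomes 'Dr.' */
   result = tranwrd(text, 'Mr.', 'Dr.');
   put result=;   /* log shows: result=Dr. Smith met Dr. Jones */
run;
```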

**Q14. What is the output of the following program?**

data finance;

Amount=1000;

Rate=.075/12;

do month=1 to 12;

Earned+(amount+earned)*(rate);

output;

end;

run;

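**Ans.** The data set finance contains 12 observations. The OUTPUT statement runs once per iteration of the DO loop, so one observation is written for each month.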
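To see what those 12 observations hold, one option is to print the running interest after the DATA step. This is a sketch that reuses the program above and adds a PROC PRINT step; the 8.2 format is only a display choice.

```sas
data finance;
   Amount = 1000;
   Rate   = .075 / 12;
   do month = 1 to 12;
      /* Sum statement: Earned starts at 0 and is retained across iterations */
      Earned + (Amount + Earned) * Rate;
      output;   /* writes one observation per month */
   end;
run;

proc print data=finance noobs;
   var month Earned;
   format Earned 8.2;
run;
```

In the first month Earned is 1000 * 0.00625 = 6.25, and each later month adds interest on the growing balance.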

**Q15. How do you identify the number of iterations and specific conditions within a single do loop?**

**Ans.** The following code combines an iteration limit with a stopping condition in a single DO loop:

data work;

do i=1 to 20 until(Sum1>=20000);

Year+1;

Sum1+2000;

Sum1+Sum1*.10;

end;

Run;

In this code, the DO statement executes the loop until Sum1 is greater than or equal to 20,000 or until the loop has run 20 times, whichever comes first.
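To check which of the two limits ends the loop, a PUT statement can trace each pass in the log. The sketch below uses the same arithmetic as the loop above, with the Year counter omitted and DATA _NULL_ used so that no data set is created.

```sas
data _null_;
   do i = 1 to 20 until (Sum1 >= 20000);
      Sum1 + 2000;         /* deposit */
      Sum1 + Sum1 * .10;   /* 10 percent growth on the new balance */
      put i= Sum1=;        /* one log line per iteration */
   end;
run;
```

With these numbers Sum1 goes 2200, 4620, 7282, and so on, and first passes 20,000 on the seventh iteration (about 20,871.78), so the UNTIL condition ends the loop well before the limit of 20.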

**Q16. What is a Linear Regression in SAS?**

**Ans.** Linear regression is used to model the relationship between a dependent variable and one or more independent variables. If the score of a variable Y is predicted from the score of a second variable X, then X is called the predictor variable and Y the criterion variable.

Example: Relationship between two variables (horsepower and weight)

PROC SQL;

create table CARS1 as

SELECT invoice, horsepower, length, weight

FROM

SASHELP.CARS

WHERE make in ('Audi','BMW')

;

QUIT;

proc reg data = cars1;

model horsepower = weight ;

Run;

**Q17. Write code to print observations 5 through 10 from a data set.**

**Ans.** The FIRSTOBS= and OBS= data set options allow SAS to print observations 5 through 10 from the data set READIN.

proc print data = readin (firstobs=5 obs=10);

Run;

**Q18. Mention the methods to perform a “table lookup” in SAS.**

**Ans.** Following are the five methods to perform “table lookup” in SAS:

- Match Merging
- Format Tables
- Direct Access
- PROC SQL
- Arrays
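As one illustration of the "Format Tables" method, a user-defined format can act as the lookup table and the PUT function can perform the lookup. The format name $statefmt and the codes below are hypothetical.

```sas
/* Lookup table stored as a character format */
proc format;
   value $statefmt
      'NY'  = 'New York'
      'CA'  = 'California'
      other = 'Unknown';
run;

data _null_;
   code = 'CA';
   /* PUT applies the format, returning the matching label */
   name = put(code, $statefmt.);
   put name=;   /* log shows: name=California */
run;
```

A format lookup avoids sorting and merging, which is part of why it is often used for small, stable code tables.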

**Q19. What are the most common programming errors that occur in SAS?**

**Ans.** The most common programming errors in SAS are:

- Missing semicolon
- Not checking the log after submitting a program
- Unmatched quotation marks
- Invalid dataset option
- Invalid statement option
- Not using FSVIEW option vigorously
- Not using debugging techniques

**Q20. What is the feature of max() function in SAS?**

**Ans.** The MAX function returns the largest of its non-missing arguments. At least two arguments are required.

- x = max(1, 5, -2);

returns 5

- x = max(1, ., 6);

returns 6, because missing values are ignored

- x = max(-2, -8);

returns -2

- x = max(7, -3*1.5);

returns 7

**Q21. Explain the difference between VAR B1-B3 and VAR B1--B3?**

**Ans.** A single dash “-” specifies a range of consecutively numbered variables. A double dash “--” specifies all variables between the two named variables, in the order in which they are stored in the data set.

Example:

Data Set: ID NAME B1 B2 C1 B3

- B1-B3 would return B1 B2 B3
- B1--B3 would return B1 B2 C1 B3
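The two kinds of variable list can be compared directly with PROC PRINT. The sketch below builds a one-row data set with the variable order from the example; the data set name scores and the values are made up.

```sas
data scores;
   input ID NAME $ B1 B2 C1 B3;
   datalines;
1 Ann 10 20 30 40
;
run;

/* Numbered range list: prints B1 B2 B3 */
proc print data=scores noobs;
   var B1-B3;
run;

/* Name range list: prints B1 B2 C1 B3, following the stored order */
proc print data=scores noobs;
   var B1--B3;
run;
```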

**Q22. How do you limit the number of decimal places for a variable using PROC MEANS?**

**Ans.** You can limit the decimal places by using the **MAXDEC=** option.
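A minimal sketch of the option, assuming the SASHELP.CLASS sample data set that ships with SAS is available:

```sas
/* Report the statistics with at most one decimal place */
proc means data=sashelp.class maxdec=1;
   var height weight;
run;
```

MAXDEC= only affects how the statistics are displayed; the underlying values are not rounded.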

**Q23. Mention the default statistics that PROC MEANS produce?**

**Ans.** Following are the default statistics produced by PROC MEANS:

- N
- MIN
- MAX
- MEAN
- STD DEV

**Q24. Explain the condition where you code a SELECT construct instead of IF statements?**

**Ans.** When you have a long series of mutually exclusive conditions, particularly numeric comparisons, a SELECT group is usually clearer than a chain of IF-THEN or IF-THEN-ELSE statements. It can also reduce CPU time slightly.

The syntax for SELECT WHEN is as follows:

SELECT (condition);

WHEN (1) x=x;

WHEN (2) x=x*2;

OTHERWISE x=x-1;

END;

Example :

SELECT (str);

WHEN ('Sun') wage=wage*1.5;

WHEN ('Sat') wage=wage*1.3;

OTHERWISE DO;

wage=wage+1;

bonus=0;

END;

END;

**Q25. How to create a permanent SAS dataset?**

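**Ans.** Creating a permanent SAS data set takes two steps:

- Assign a library reference (libref) and, if needed, an engine with a LIBNAME statement.
- Create the data set using a two-level name (libref.dataset), so that it is stored in that library rather than in the temporary WORK library.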
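Put together, the two steps might look like the sketch below. The libref mylib, the folder path, and the data set name class_copy are placeholders; the path must point to a folder that exists and is writable on your system.

```sas
/* Step 1: assign a libref to a folder (placeholder path) */
libname mylib '/path/to/folder';

/* Step 2: a two-level name stores the data set in that library */
data mylib.class_copy;
   set sashelp.class;
run;
```

A data set created with a one-level name, such as data class_copy, goes to the WORK library by default and is deleted when the session ends.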