Saturday, October 30, 2010

SAS - Basic statements

SAS is a powerful statistical language which consists bascally of an extended SQL language and other statements. In this post we will summarize the basic ones.

In this page, from the UCLA, there is a basic SAS tutorial.

  • DATA statement
data NAME_DATA;
input VAR_1 VAR_2 VAR_3 $ VAR_4; /* $ indicates VAR_3 is alphanumeric */
cards;
/* start of data reading */
123 1 foo 234
124 2 foo2 221;
run;

  • SET statement (inside DATA)
data NAME_DATA;
set NAME_DATA_2;
run; /* NAME_DATA := NAME_DATA_2 */


  • SET statement with conditional (inside DATA)
data NAME_DATA;
set NAME_DATA_2 NAME_DATA_3;
/* concatenates name_data_2 and name_data_3 */
if VAR_1 = 'yes' then answer = 1;
else answer = 0;
/* NAME_DATA := NAME_DATA_2 U NAME_DATA_3 with the additional answer field in each row */
run;

  • PROC SQL
proc sql;
/* sql code like */
create table NAME_DATA as
select * from ANOTHER_TABLE group by VAR_1;
quit;


It is important to notice that the proc sql statement ends with quit, whereas all other instructions finish with run.

  • LEFT JOIN (inside PROC SQL)
LEFT JOIN instruction allows us to join two tables by some relation between them, BUT with the additional functionality that all rows of the first table (the left table) remain at least with one observation in the resulting table, even if there is no matching between rows of the two tables.

  • SELECT DISTINCT (inside SQL code)
SELECT DISTINCT allows you to ensure that the multiplicity of each element inside a table is one, by including its identifying fields into the select distinct command.

  • How to sort a table?
proc sort data = NAME_DATA;
by VAR_1, descending VAR_2;
quit;

proc sql;
select * from NAME_DATA order by VAR_1, VAR_2 desc;
quit;


Or using the assistent. Menu Data -> Query

  • Variable names: not special chars, only '_'. Name literals finish with n, like 'Hello World!'n

No comments:

Post a Comment