DM Stat-1 Articles

A Very Automatic Coding of Dummy Variables for Database Response Modeling
Bruce Ratner, Ph.D.

Qualitative variables, such as gender and marital status, can carry valuable information for the modeling process. However, most modeling techniques cannot directly accept the categorical values (e.g., male or female; married or single) of qualitative variables. Dummy variable coding is the method used to transform qualitative variables into numerical 0/1 "dummy" variables ready for the modeling process. Manually coding qualitative variables into dummy variables is a tedious task. This article provides a SAS program that creates dummy variables with very little manual effort. The program should be a welcome addition to the toolkit of data analysts who frequently work with qualitative data.
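To make the manual task concrete, here is a minimal sketch of hand-coded dummy variables. The data set name MANUAL, the three toy records, and the names GENDER_F and GENDER_M are assumptions chosen for illustration; they are not part of the program below.

```sas
/* Hand-coded dummy variables: a sketch on hypothetical toy data. */
data MANUAL;
  input ID GENDER $;
  /* In SAS a comparison evaluates to 1 when true and 0 when false. */
  GENDER_F = (GENDER = 'F');
  GENDER_M = (GENDER = 'M');
  /* A lone period in list input is read as a missing value. */
  datalines;
1 M
2 F
3 .
;
run;

proc print data=MANUAL;
run;
```

Every level of every qualitative variable needs its own assignment statement, and the list must be revised whenever a new level appears in the data. That is the bookkeeping the program in the illustration is meant to avoid.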

Illustration

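/* Sample data: ID in columns 1-2, GENDER in column 3, MARITAL in column 4. */
/* A blank in column 3 or 4 is a missing value. */
data IN;
input ID 2. GENDER $1. MARITAL $1.;
cards;
01MS
02MM
03M
04
05FS
06FM
07F
08 M
09 S
10MD
;
run;

/* Copy each variable and recode missing values to 'x', */
/* which serves as the reference level in PROC TRANSREG. */
data IN;
set IN;
GENDER_  = GENDER;  if GENDER  = ' ' then GENDER_  = 'x';
MARITAL_ = MARITAL; if MARITAL = ' ' then MARITAL_ = 'x';
run;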

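/* DESIGN requests the coded design matrix only; no model is fit. */
/* ZERO='x' makes 'x' the reference level, so it gets no dummy variable. */
proc transreg data=IN DESIGN;
model class (GENDER_ / ZERO='x');
output out = GENDER_ (drop = Intercept _NAME_ _TYPE_);
id ID;
run;
proc print data=GENDER_;
run;

/* Sort both data sets by ID, then merge the dummy variables back into IN. */
proc sort data=GENDER_ ;by ID;
proc sort data=IN ;by ID;
run;

data IN;
merge IN GENDER_ ;
by ID;
run;
proc print data=IN;
run;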

/* Repeat the same steps for MARITAL_. */
proc transreg data=IN DESIGN;
model class (MARITAL_ / ZERO='x');
output out=MARITAL_ (drop= Intercept _NAME_ _TYPE_);
id ID;
run;
proc print data=MARITAL_;
run;

/* Sort both data sets by ID, then merge the dummy variables back into IN. */
proc sort data=MARITAL_;by ID;
proc sort data=IN ;by ID;
run;

data IN;
merge IN MARITAL_;
by ID;
run;
proc print data=IN;
run;
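
The GENDER_ and MARITAL_ passes above differ only in the variable name, so the repeated steps could be wrapped in a macro. The following is a sketch under stated assumptions, not part of the original program: the macro name dummy_code, its parameters data=, id= and vars=, and the work data set _dum_ are all hypothetical, and each variable listed in vars= is assumed to have had its missing values recoded to 'x' already.

```sas
/* Hypothetical macro: runs the TRANSREG / sort / merge steps once per variable. */
%macro dummy_code(data=, id=, vars=);
  %local i var;
  %let i = 1;
  %let var = %scan(&vars, &i, %str( ));
  %do %while (%length(&var) > 0);

    /* Same call as in the illustration, with the variable name substituted. */
    proc transreg data=&data design;
      model class (&var / zero='x');
      output out=_dum_ (drop=Intercept _NAME_ _TYPE_);
      id &id;
    run;

    proc sort data=_dum_; by &id; run;
    proc sort data=&data; by &id; run;

    /* Attach the new dummy variables to the input data set. */
    data &data;
      merge &data _dum_;
      by &id;
    run;

    /* Move on to the next name in the list. */
    %let i = %eval(&i + 1);
    %let var = %scan(&vars, &i, %str( ));
  %end;
%mend dummy_code;

%dummy_code(data=IN, id=ID, vars=GENDER_ MARITAL_);
```

Called this way, the macro is intended to replace the two hand-written passes rather than to run after them. Adding a third qualitative variable then means appending one name to vars=.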




For more information about this article, call Bruce Ratner at 516.791.3544,
1 800 DM STAT-1, or e-mail at br@dmstat1.com.