Generating a Random Sample of Alphabet Letters: Why?
Bruce Ratner, Ph.D.
Data preparation can be defined as becoming acquainted with the data in order to understand what they tell you. You must 1) ensure there are no impossible or improbable values (e.g., an age of 120 years, or a boy named Sue, respectively), and 2) audit missing and zero values. The audit may call for imputing missing values. Importantly, data preparation also includes coming face-to-face with the shape of the data distribution, looking for 1) a clump: a mass of data (a spike) at a single value (often zero), or a quantity of data cohering into one body of indefinite shape; and 2) a gap: an intervening space between two adjacent values that are not consecutive. Effective data preparation includes spreading out the clumps, closing the gaps, and reshaping the data into the desirable and reliable bell-shaped curve.
The purpose of this article is to provide an unthought-of device for the data preparation tool kit: the SAS code for a random-alphabet function, which generates a uniform distribution of alphabet letters. I provide several illustrations of why this handy implement is a welcome addition to the data analyst’s tool kit. If you have an interesting data-prep application (not random passwords) of the random-alphabet function, I would appreciate hearing your idea. Please email me. Thanks.
SAS-code for a Random-alphabet Function
do i=1 to 10000;
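Only the loop header of the listing survives here. As a minimal sketch of how a random-alphabet generator could be written around that loop, the step below draws 10,000 letters, each with probability 1/26. It is an illustration under stated assumptions, not necessarily the original code: the data set name random_letters, the variables letter and k, and the seed value are all hypothetical.

```sas
/* Hypothetical sketch: data set name, variable names, and seed are illustrative. */
data random_letters;
   length letter $1;
   call streaminit(20240101);   /* fixed seed so the sample is reproducible */
   do i = 1 to 10000;
      /* rand('uniform') lies strictly between 0 and 1, */
      /* so k is an integer from 1 to 26                */
      k = 1 + floor(26 * rand('uniform'));
      letter = substr('ABCDEFGHIJKLMNOPQRSTUVWXYZ', k, 1);
      output;
   end;
   drop i k;
run;

/* Check uniformity: each letter should appear roughly 10000/26 (about 385) times */
proc freq data=random_letters;
   tables letter;
run;
```

Indexing into a literal string of letters, rather than converting character codes, keeps the sketch independent of the host's character encoding.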
Contact: 1 800 DM STAT-1, or e-mail at firstname.lastname@example.org.