sas - How to loop to get column name? - Stack Overflow

admin2025-04-25  2

The general idea is to get the column name, and calculate the mean and std one by one in SAS. But there are over 50 columns, and no pattern in their column names. I don't know how to get using SAS, but the idea I want is as below by using R:

Let's use a simple data example.

mydata<-data.frame(id=c(1:10),
class=c(1,1,1,1,1,2,2,2,2,2),
age=sample(30:100, 50, replace = TRUE),
bmi=sample(20:30, 50, replace = TRUE),
bmi_week4=sample(20:30, 50, replace = TRUE),
bmi_week8=sample(20:30, 50, replace = TRUE),
hba1c=sample(5:10, 50, replace = TRUE),
hba1c_week4=sample(5:10, 50, replace = TRUE),
hba1c_week8=sample(5:10, 50, replace = TRUE)
)

names<-colnames(mydata[,-c(1,2)])

for (i in 1:length(names)){
this_col=names[i]

#starting from here is the SAS code
  proc univariate
  data = mydata noprint;
  by class;
  var this_col;
  output out = this_col_out
  n = _n mean = _mean std = _std;
  run;

  # I get the format of mean±std for each variable, and save in output_&i.
  data output_&i.;
  set this_col_out;
  this_col=strip(put(_mean,7.2)) ||'±'|| strip(put(_std,8.2));
  run;
  
  #through transpose, I can get the format such as age: 64.5 ± 20.7
  proc transpose data=output_&i. out=w_output_&i.(drop=_label_);
  var this_col;
  run;
}

data all;
set w_output_:;
run;

So how to get the idea from R and integrate it into SAS? Thanks.

The general idea is to get the column name, and calculate the mean and std one by one in SAS. But there are over 50 columns, and no pattern in their column names. I don't know how to get using SAS, but the idea I want is as below by using R:

Let's use a simple data example.

mydata<-data.frame(id=c(1:10),
class=c(1,1,1,1,1,2,2,2,2,2),
age=sample(30:100, 50, replace = TRUE),
bmi=sample(20:30, 50, replace = TRUE),
bmi_week4=sample(20:30, 50, replace = TRUE),
bmi_week8=sample(20:30, 50, replace = TRUE),
hba1c=sample(5:10, 50, replace = TRUE),
hba1c_week4=sample(5:10, 50, replace = TRUE),
hba1c_week8=sample(5:10, 50, replace = TRUE)
)

names<-colnames(mydata[,-c(1,2)])

for (i in 1:length(names)){
this_col=names[i]

#starting from here is the SAS code
  proc univariate
  data = mydata noprint;
  by class;
  var this_col;
  output out = this_col_out
  n = _n mean = _mean std = _std;
  run;

  # I get the format of mean±std for each variable, and save in output_&i.
  data output_&i.;
  set this_col_out;
  this_col=strip(put(_mean,7.2)) ||'±'|| strip(put(_std,8.2));
  run;
  
  #through transpose, I can get the format such as age: 64.5 ± 20.7
  proc transpose data=output_&i. out=w_output_&i.(drop=_label_);
  var this_col;
  run;
}

data all;
set w_output_:;
run;

So how to get the idea from R and integrate it into SAS? Thanks.

Share Improve this question edited Jan 15 at 7:41 DarkBee 15.5k8 gold badges72 silver badges118 bronze badges asked Jan 15 at 7:27 RobinRobin 4591 gold badge3 silver badges17 bronze badges 1
  • 2 If you want to find the MEAN for a bunch of variables you can just list them all in the VAR statement of your PROC MEANS procedure call. It is very inefficient to read through the data multiple times instead of doing them all at once. – Tom Commented Jan 15 at 14:13
Add a comment  | 

3 Answers 3

Reset to default 5

You can use stackodsout to get a summary listing

ods output summary=want (drop=_control_);
proc means print data=have stackodsoutput sum mean std nway ;
  class class ;
run ;

will produce this output data set

You can call all variable names that you want to get statistics on by using PROC MEANS with the output statement and using the autoname option. For example:

proc means data=sashelp.cars noprint;
    by make;
    output out=stats
        mean =
        std  =
        / autoname
    ;
run;

I think you can make use of a macro program and macro variables, along with some macro functions, to accomplish this.

Maybe something like this is what you are looking for (or at least gets you nearer to the solution)?

I have loaded part of your data as an example

   data mydata;
    input id class age bmi bmi_week4 bmi_week8 hba1c hba1c_week4 hba1c_week8;
    cards;
    1     1  40  30        30        23     5           7           9
    2     1  43  29        21        21     8           6           8
    3     1  30  26        23        23     8           9          10
    4     1  82  21        25        22     6           7           9
    5     1  90  24        30        29    10           5           9
    6     2  97  23        27        28     6          10          10
    7     2  91  27        21        29     7           7           9
    8     2  46  24        30        26     8          10           5
    9     2  68  25        23        25     7           6          10
    10    2  51  24        24        28    10           9           8
    1     1  74  23        28        24    10          10           5
    2     1  39  28        20        27     9           6           9
    3     1  66  28        24        20     9           8           7
    4     1  74  30        29        25     9           8           6
    5     1  61  27        28        24     7          10           9
    6     2  67  23        29        23     8           9          10
    7     2  65  22        22        25    10           7          10
    8     2  62  20        24        24     8           7           8
    9     2  86  20        20        23     9           7           5
    10    2  77  30        23        25     8           7           5
    ;
    run;
    
    
Proc sql noprint; *this puts all column names in the macro variable names;
select name into :names separated by ' ' from
dictionary.columns
where upcase(memname) = 'MYDATA'
and upcase(libname)  = 'WORK';

%put &names;
    
    proc sort data = mydata; * you need to sort by class if you want to use by class later;
    by class;
    run;
    
    %macro singleuni();
    
    %let count=%sysfunc(countw(&names,%str( ))); *counts number of items in mv names;
    
    %DO i = 3 %to &count; *loops across all the elements listed in names, skipping the first two (id and class);
    
    
    %let VarInt= %scan(&names, &i); *takes the element i in names;
    
    proc univariate
      data = mydata noprint;
      by class;
      var &VarInt;   
      output out = &VarInt._out
      n = _n mean = _mean std = _std;
      run;
    
      data output_&i.;
      set &VarInt._out;
      this_col=strip(put(_mean,7.2)) ||'±'|| strip(put(_std,8.2));
      run;
      
      proc transpose data=output_&i. out=w_output_&i.;
      var this_col;
      run;
    %end;
    
    %mend;
    
    %singleuni();

Edit to answer comment:

If you want to put everything together and avoid the name column being truncated, just specify the length statement:

   data all;
    length _name_ $30; *this makes it;
    set w_output:;
    run;
转载请注明原文地址:http://anycun.com/QandA/1745595330a90942.html