sas - Transform a SQL query to a function applicable to a data columns - Stack Overflow

admin2025-05-02  0

I have a problem in SAS. I would like to create a function with the proc fcmp which uses this data:

DATA TAB_PARAM;
infile cards delimiter=',';
input CAT_A $ CAT_B $ ;
cards;
AA1,XXX
BB1,YYY
CC1,ZZZ
;
run;
DATA SOURCE;
infile cards delimiter=',';
input CAT_A $ CAT_B $ VALUE;
cards;
AA1,XXX,200
AA1,XXX,500
AA1,ZZZ,800
BB1,YYY,100
BB1,YYY,300
BB1,ZZZ,700
CC1,XXX,900
CC1,ZZZ,600
CC1,ZZZ,400
;
run;
%macro extraction(ca,cb);
proc sql noprint;
select distinct sum(value)as value into:  val 
from source
where CAT_A in ("&ca.") and CAT_B in ("&cb.")
;
quit;
%put val=&val.;
%mend;
%extraction(AA1,XXX);
PROC FCMP OUTLIB=work.funcs.sql; 
FUNCTION calc(ca $,cb $) ; 
rc = RUN_MACRO('extraction',ca,cb,val); 
RETURN(val);
ENDSUB;
run;
OPTIONS cmplib = work.funcs;
data _null_;
ca='BB1';
cb='YYY';
val=calc(ca,cb) ; 
run;
OPTIONS cmplib = work.funcs;
DATA TAB_PARAM_BIS; 
SET TAB_PARAM;
val=calc("CAT_A" , "CAT_B");
RUN;    

The problem is that the function always returns empty values.

Could you help me correct this program?

Thank you in advance for your help

I expect a table TAB_PARAM_BIS like that:

CAT_A   CAT_B   val
-------------------
AA1     XXX     700
BB1     YYY     400
CC1     ZZZ    1000

instead of:

CAT_A   CAT_B   val
-------------------
AA1     XXX     .
BB1     YYY     .
CC1     ZZZ     .

I have a problem in SAS. I would like to create a function with the proc fcmp which uses this data:

DATA TAB_PARAM;
infile cards delimiter=',';
input CAT_A $ CAT_B $ ;
cards;
AA1,XXX
BB1,YYY
CC1,ZZZ
;
run;
DATA SOURCE;
infile cards delimiter=',';
input CAT_A $ CAT_B $ VALUE;
cards;
AA1,XXX,200
AA1,XXX,500
AA1,ZZZ,800
BB1,YYY,100
BB1,YYY,300
BB1,ZZZ,700
CC1,XXX,900
CC1,ZZZ,600
CC1,ZZZ,400
;
run;
%macro extraction(ca,cb);
proc sql noprint;
select distinct sum(value)as value into:  val 
from source
where CAT_A in ("&ca.") and CAT_B in ("&cb.")
;
quit;
%put val=&val.;
%mend;
%extraction(AA1,XXX);
PROC FCMP OUTLIB=work.funcs.sql; 
FUNCTION calc(ca $,cb $) ; 
rc = RUN_MACRO('extraction',ca,cb,val); 
RETURN(val);
ENDSUB;
run;
OPTIONS cmplib = work.funcs;
data _null_;
ca='BB1';
cb='YYY';
val=calc(ca,cb) ; 
run;
OPTIONS cmplib = work.funcs;
DATA TAB_PARAM_BIS; 
SET TAB_PARAM;
val=calc("CAT_A" , "CAT_B");
RUN;    

The problem is that the function always returns empty values.

Could you help me correct this program?

Thank you in advance for your help

I expect a table TAB_PARAM_BIS like that:

CAT_A   CAT_B   val
-------------------
AA1     XXX     700
BB1     YYY     400
CC1     ZZZ    1000

instead of:

CAT_A   CAT_B   val
-------------------
AA1     XXX     .
BB1     YYY     .
CC1     ZZZ     .
Share Improve this question edited Jan 2 at 12:52 marc_s 757k184 gold badges1.4k silver badges1.5k bronze badges asked Jan 2 at 12:41 newcomer2025newcomer2025 212 bronze badges 1
  • Is there a reason to not use plain SQL with an inner join and group? – Richard Commented Jan 3 at 14:58
Add a comment  | 

3 Answers 3

Reset to default 0

Welcome :-) Good question.

To simplify things, I removed the positional parameters in the macro declaration, assuming you want to use this only in a PROC FCMP context.

The short answer to your question is that character variables passed via RUN_MACRO are initialized with both leading and trailing single quotation marks within the macro. Therefore, whenever RUN_MACRO passes a character variable, the Dequote Function is typically called via %Sysfunc to remove the single quotes.

See the edited code below.

%macro extraction();

   %let ca = %sysfunc(dequote(&ca.));
   %let cb = %sysfunc(dequote(&cb.));

   proc sql noprint;
   select distinct sum(value) as value into: val separated by ''
   from source
   where CAT_A in ("&ca.") and CAT_B in ("&cb.")
   ;
   quit;

%mend;

proc fcmp outlib = work.funcs.sql; 
   function calc(ca $,cb $) ; 
   rc = run_macro('extraction', ca, cb, val); 
   return(val);
   endfunc;
run;

options cmplib = work.funcs;

data tab_param_bis;; 
   set TAB_PARAM;
   val = calc(CAT_A , CAT_B);
run; 

Result:

CAT_A  CAT_B  val
AA1    XXX    700
BB1    YYY    400
CC1    ZZZ    1000

As @PeterClemmensen alluded to the issue is that RUN_MACRO does not pass the optional arguments you provide it to the macro call itself. Instead it creates macro variables that are external to the macro. Essentially GLOBAL macro variable although the scoping issues get complex with RUN_MACRO. So a call like this:

run_macro('extraction', ca, cb, val)

Essentially results in three %LET statements followed by a macro call. So something like:

%let ca=value1;
%let cb=value2;
%let val=value3;
%extraction;

By defining parameters to the macro using the same names as used in the call to RUN_MACRO you created LOCAL macro variables that hid access to the values of the external macro variables that RUN_MACRO used to store the values. (For a solution for how to retrieve values from such hidden macro variables use this %symget() macro definition.)

The wrinkle he discovered hiding in the EXAMPLE section of the documentation is that for character values the value is quoted with single quotes. With a little experimentation it is possible to find that for numeric values the BEST12. format is used to convert the number into a string that can be stored in the macro variable.

So for your example just change your macro to reference already existing macro variables instead of actual parameters. You can also assume the values are already quoted so they can be used directly to generate the SQL.

%macro extraction;
%* Inputs CA and CB ;
%* Outputs VAL ;
proc sql noprint;
select distinct sum(value) format=best32. 
  into :val trimmed
  from source
  where CAT_A = &ca and CAT_B = &cb
;
quit;
%mend;

The macro variable names used by the macro for inputs (and outputs) need to match the names used in the RUN_MACRO call in your function definition. In this case CA and CB are the inputs and VAL is the output.

FUNCTION calc(ca $,cb $) ; 
  rc = RUN_MACRO('extraction',ca,cb,val); 
  RETURN(val);
ENDSUB;

Finally change your data step to pass in the actual values of the variables, not their names, when calling your user defined function.

data TAB_PARAM_BIS; 
  set TAB_PARAM;
  val=calc(cat_a,cat_b);
run;  

Also note that RUN_MACRO() does not trim the trailing spaces from the character values you pass it. You can see this if you turn on MPRINT option:

MPRINT(EXTRACTION):   proc sql noprint;
MPRINT(EXTRACTION):   select distinct sum(value) format=best32. into :val trimmed from source
where CAT_A = 'AA1     ' and CAT_B = 'XXX     ' ;
MPRINT(EXTRACTION):   quit;
MPRINT(EXTRACTION):   proc sql noprint;
MPRINT(EXTRACTION):   select distinct sum(value) format=best32. into :val trimmed from source
where CAT_A = 'BB1     ' and CAT_B = 'YYY     ' ;
MPRINT(EXTRACTION):   quit;
MPRINT(EXTRACTION):   proc sql noprint;
MPRINT(EXTRACTION):   select distinct sum(value) format=best32. into :val trimmed from source
where CAT_A = 'CC1     ' and CAT_B = 'ZZZ     ' ;
MPRINT(EXTRACTION):   quit;

Which is fine for SAS queries, but if your actual dataset is in an external database that considers the trailing spaces significant you might need to add TRIM() function in the call.

data TAB_PARAM_BIS; 
  set TAB_PARAM;
  val=calc(trim(cat_a),trim(cat_b));
run;  

Uncomplicated, plain SQL

proc sql ;
  create table want as
  select source.cat_a, source.cat_b, sum(value) as result
  from source
  join param
    on source.cat_a = param.cat_a and source.cat_b = param.cat_b
  group source.cat_a, source.cat_b
;

SQL will deal with all the internal looping and per parameter application.

You could even use the SAS SQL proprietary keyword NATURAL to apply the join criteria

  create table want as
  select source.cat_a, source.cat_b, sum(value) as result
  from source
  natural join param
  group 1, 2
转载请注明原文地址:http://anycun.com/QandA/1746119695a91940.html