Once SASPy is configured, the first step is to start a SAS session from a Python environment. This launches a SAS session in the background that is available to run statistical analyses on any input data. After the session is started, the general order of steps is to send a Pandas data frame to SAS, submit SAS commands from the Python session, and retrieve statistical output or data from SAS into the Python environment. The approach is very similar to working with R and Python via the reticulate package.
# Data manipulation
import pandas as pd
# Module with a sample data set
import bambi as bmb
# Interface with SAS
import saspy
# Loads a custom function
from my_fx.utilities import format_pval_df
sas = saspy.SASsession(cfgname='autogen_winlocal')
SAS Connection established. Subprocess id is 21596
Load an example data set
data = bmb.load_data("sleepstudy")
data.head()
   Reaction  Days  Subject
0  249.5600     0      308
1  258.7047     1      308
2  250.8006     2      308
3  321.4398     3      308
4  356.8519     4      308
Send data to SAS
The next step in working with SASPy is to send a Pandas data frame to SAS. The command below sends the data frame “data” to the background SAS session. Before sending data to SAS, it is a good idea to double-check that dates are in a format SAS will read properly and that categorical values are recoded, where needed, to comply with SAS column and value conventions. By default, the data set will be named _df and will be found in the work library.
sas_data = sas.df2sd(data, verbose=False)
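The column-name check mentioned above can be done before the data frame is sent. The following is a minimal sketch, assuming the default SAS naming rules (names start with a letter or underscore, contain only letters, digits, and underscores, and are at most 32 characters long). The helper names to_sas_name and sas_safe_names are hypothetical and are not part of SASPy.

```python
import re


def to_sas_name(name, max_len=32):
    """Return a name that follows the default SAS variable naming rules."""
    # Replace anything that is not a letter, digit, or underscore
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", str(name).strip())
    # Names must start with a letter or an underscore
    if not cleaned or cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned[:max_len]


def sas_safe_names(columns):
    """Map each original column name to a unique SAS-safe name."""
    mapping = {}
    used = set()
    for col in columns:
        candidate = to_sas_name(col)
        base, n = candidate, 1
        # SAS names are case-insensitive, so compare in lower case
        while candidate.lower() in used:
            suffix = f"_{n}"
            candidate = base[:32 - len(suffix)] + suffix
            n += 1
        used.add(candidate.lower())
        mapping[col] = candidate
    return mapping


# Hypothetical column names, including two that SAS would reject by default
mapping = sas_safe_names(["Reaction", "Days", "Subject", "reaction time (ms)", "2nd visit"])
print(mapping)
```

The resulting dictionary can be passed to the data frame's rename method, for example data.rename(columns=mapping), before calling sas.df2sd(). The sleepstudy column names already comply, so this step changes nothing for the example data.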
Submit SAS commands
The main functions for submitting SAS commands on data that is available in the SAS session are sas.submit() and sas.submitLST(). The primary difference is that the LST version displays the log and any output in the viewer when working in Positron. I personally use the LST version to check that the SAS procedures are running correctly. Once they are, I remove the LST and extract the tables from SAS to display in a Quarto document. To save the output of SAS procedures, I use ods output statements, as in the example below.
# Use sas.submitLST() to display output in the viewer in an interactive session,
# but use sas.submit() when rendering a .qmd document.
c = sas.submit("""
ods output Tests3=type3_results;
proc mixed data = work._df;
    class Subject Days;
    model Reaction = Days;
    random intercept / subject = Subject;
run;
""")
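When rendering with sas.submit(), nothing is shown in the viewer, so it can help to check the log programmatically. sas.submit() returns a dictionary with the keys LOG and LST; the sketch below scans log text for lines that begin with ERROR or WARNING. The helper find_log_problems is hypothetical, and the log text is a made-up placeholder rather than real SAS output.

```python
def find_log_problems(log_text):
    """Return the log lines that start with ERROR or WARNING."""
    problems = []
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(("ERROR", "WARNING")):
            problems.append(stripped)
    return problems


# Placeholder log text for illustration; in a session this would be c["LOG"]
example_log = """\
NOTE: example note message
ERROR: example error message
NOTE: another example note
WARNING: example warning message
"""

problems = find_log_problems(example_log)
print(problems)
```

In a live session the call would be find_log_problems(c["LOG"]), and an empty list suggests the step ran without errors or warnings.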
Retrieve ods output tables from SAS
In the code chunk above, we set ods output to save the Type 3 tests of fixed effects to a table named type3_results. We can then retrieve that table from SAS into our Python environment. Once in the Python environment, the tables can be formatted to your liking and purpose. Here’s an example of how to format the Tests3 table using a custom function to format the p-values.
type3_results = sas.sasdata("type3_results", libref="work").to_df()
# Format the p-values
type3_results["ProbF"] = format_pval_df(type3_results["ProbF"])
# Round all numerical values, set the index, and display
type3_results.round(2).set_index("Effect")
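The format_pval_df function used above comes from a personal module, and its source is not shown here. As a rough idea of what such a helper might do, here is a hypothetical stand-in, assuming the goal is to print p-values to three decimals and to show very small values as "<0.001". It is not the original implementation.

```python
def format_pval(p, digits=3):
    """Format one p-value as text for a results table."""
    # Treat None and NaN (NaN is the only value not equal to itself) as missing
    if p is None or p != p:
        return ""
    threshold = 10 ** -digits
    # Values below the smallest printable value are shown as an inequality
    if p < threshold:
        return f"<{threshold:.{digits}f}"
    return f"{p:.{digits}f}"


# Made-up p-values for illustration
formatted = [format_pval(p) for p in [0.00004, 0.0312, 0.5, float("nan")]]
print(formatted)
```

A function like this could be applied to a column with the Pandas map method, for example type3_results["ProbF"].map(format_pval). Because the formatted column holds text, the later call to round(2) would only affect the remaining numeric columns.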