Python and SAS: Moving Data From SAS to Tableau
Advanced analytics is critical to the modern organization. However, with products like SAS, the data and information generated is often stored in proprietary data sets. This means that visibility to that data and related information is essentially locked up to everyone except the analysts themselves.
There are two distinct use cases for moving data out of proprietary data sets, such as SAS, and into a leading visualization tool like Tableau:
- Enhance the analyst’s exploratory capability via advanced visualization.
- Publish intuitive, visual reports to a broad audience.
The key is to transfer the data stored in SAS data sets into Tableau in an efficient manner. And there are several available, for example, you can:
- Store data in an open relational database that both SAS and Tableau can access directly.
- Export SAS data into a CSV and import the CSV data into Tableau, implemented as a manual process or automate the process using Python.
- Connect SAS data sources to Tableau file targets via Python lists and transport data in-memory.
While there are various methods to move data from proprietary data sets like SAS to Tableau, we believe leveraging Python affords more flexibility and options. To that end, outlined below is a very simple way to use open source software to move data from a SAS dataset (sas7bdat) to a Tableau file. Specifically, extracting SAS data to a csv file and then to TDE. It is not the most efficient method, but our goal for this blog is to merely introduce the concept of sharing data between SAS and Tableau using a small amount of coding.
Assuming that you have already installed Python 2.7, the key steps are outlined as follows:
Step 1: Prep
Download the Python API from Tableau and install.
Download the Python sas7bdat library and install.
Locate the Tableau Python code csv2tde.py and the related schema.ini file that should be located in the Tableau API samplesdirectory.
Copy both files into a working directory of your choice.
Copy a SAS dataset (*.sas7bdat) file into your working directory.
Step 2: Code Modification of csv2tde.py.
1. Open csv2tde.py with whatever Python editor and add the following library:
# Import SAS module
from sas7bdat import *
2. Locate the following lines of code (starts on line 60 in the version I used):
# Identify CSV input
if len(sys.argv) < 2 :
raise NameError('csv filename argument missing')
csvFile = sys.argv[1];
3. Replace (or comment out) the above code with the following code that reads your SAS data and creates a csv file:
# Read SAS file and instantiate the sas7bdat class
sas = SAS7BDAT('YOUR_SAS_FILE.sas7bdat')
# Create the SAS data into a CSV file
sas.convertFile('YOUR_SAS_FILE.csv', ',')
# Assign your CSV file to an internal variable that Tableau requires
csvFile = 'YOUR_SAS_FILE.csv'
Step 3: Modify the schema.ini file as needed.
Change the name of the csv file in brackets from ‘orders.csv’ to ‘YOUR_SAS_FILE.csv’ name.
List out the columns you are importing from your csv file and give them a target name to be used in Tableau.
Since many date types are problematic coming from SAS, it is often easier to import them into Tableau as CharString as opposed toDateTime or Date. Once in Tableau, it is merely a matter of changing the data type to date.
Step 4: Run your Python code as modified.
The previous steps provide you a quick exercise to understand the basics of moving SAS (specifically, sas7bdat files) data into a Tableau file using Python. The exercise exposes you to free Python libraries, Tableau APIs, and some coding techniques for creating Tableau files.
A version of this article was published on http://www.streamintegration.com.