Tag Archives: SAS

R for SAS and SPSS Users, A Review

If you’ve followed me for any length of time, you’ll know that I’m bullish on good data analysis. I am not a programmer by training but, through a variety of projects, I’ve found a solid understanding of data manipulation has served me well in my work as an accountant. My primary tool for 4 years has been SAS. It is an awesome, easy to learn language that has allowed me to look like a hero on numerous projects. However, starting next week, I’m moving to a company that doesn’t offer SAS as an option.

I didn’t want to lose the edge that a good analytics package offers, so I started looking around for an open source alternative. That’s when I found “R”.

Like SAS, R is a statistics package. It’s designed for statisticians to perform t-tests, calculate standard deviations, and all the other black-box stuff that gives auditors the warm-fuzzies. However, like SAS, it also has strong data manipulation abilities. There is even a package called sqldf, that allows users to write SQL statements in the R command line.

Not surprisingly, because the name “R” is so generic, it is nearly impossible to find good tutorials online through tradition searching. So, I decided it was time to buy a book. Fortunately, Robert Muenchen from University of Tenn, has written a book for people exactly like me: R for SAS and SPSS Users (ISBN-10: 0387094172).

Unlike a lot of programming books, Muenchen’s is packed with examples and real world applications. He walks readers through a particular scenario, then shows how to accomplish with SAS, SPSS and R. It’s like a Rosetta Stone for users needing to move from one language to the other. Overall a great help and I’d highly recommend it.

Mapping an MS Access Database to SAS Library

This post is for people, like myself, who assume things are harder than they usually are. Worked on an analysis yesterday where I needed to map an Access database to a SAS library so I could use the tables in some joins. My first thought was ODBC. I proceeded to setup the connection, only to find out my company didn’t give everyone that ability. Bummer. 15 minutes wasted.

Searched around and couldn’t find any other ideas. Then I thought, I’m going to try to just reference the database directly in the LIBNAME statement and see what happens. It worked. So, this post is to save people who, like myself, usually try to hard way first, only to find their way back to simplicity.

Example:
LIBNAME TEST "C:\test\database.accdb";

JoA: Dashboard Your Scorecard

The Journal of Accountancy had a good article last month on building Excel-based dashboards. It got the creative juices flowing for me, and I am thinking about integrating some dashboards into my daily work.

Modern BI tools like SAS and Business Objects can be used to automate feeds to predesigned Excel dashboards which can then be accessed by anyone over a shared network drive. The hardest (and most important) step is determining the appropriate metrics.

Dashboard reports created in Microsoft Excel are powerful, flexible and easy to design. In much the same way that an automobile dashboard graphically displays numerous measures of performance from the gas level to oil pressure, a computer dashboard presents critical data in a variety of visual formats. From this organized visual display, optimal business decisions can be made quickly and efficiently.

Should Accountants Be Required to Code

Sometimes, I wonder where I fit on the career spectrum. I am trained as a CPA and 90% of my daily work involves accounting in some form. However, I am also a programmer who geeks out when I find a new way to automate processes or transform data into knowledge.

My boss made a comment a few weeks ago in a meeting, “If you want to get Jared excited, ask him about two things: 1) His family or 2) SAS.” Sometimes, I feel like my two halves are exclusive. I don’t know how they fit together. But, then there are days when someone comes to me looking for information (which they have been told was impossible to get), and I can write some code to pull it together in an hour. Those are good days.

Here’s the thing. I think programming makes me a better accountant. There is something about starting with a blank slate, then outlining, writing, and debugging code that forces you to think logically.

It also helps me to work more efficiently. I had coffee with a former Deloitte colleague over the weekend. We reminisced about 80 hour workweeks and what we learned from those experiences. We concluded that if there is one thing our Big 4 experience taught us to be, it’s efficient. If you were not efficient, you drowned from the sheer volume of work. Efficiency meant I got to have dinner with my wife, instead of staring at workpapers.

There are certain tasks that humans are good at (recognizing patterns), but there are others where computers are far superior. Combing through thousands of records and joining two tables by hand is something I could do. But, I will probably miss something, and it would take days (if not weeks). OR, I could write something that looks like this and let the computer do it in less than a minute with 100% accuracy.

PROC SQL;
CREATE TABLE &Table. AS
SELECT A.*, B.*
FROM &Table1. A INNER JOIN &Table2. B
ON A.&Field1. = B.&Field1.;
QUIT;

Each day, accountants are increasingly asked to take on responsibilities that used to be confined to IT (report writing, data analysis, etc.). However, most of my peers rely on Excel for all but the most complicated analysis. How much time and effort is wasted in performing tasks that a computer should really be doing?

This week is Computer Science Education Week. While I don’t think all accountants need to learn C++, I think it’s time our colleges added basic courses in programming to accounting degrees. Like it or not, our profession is shaped by technology. Simple AIS and MIS courses just aren’t enough to prepare students for what is required of them in the workplace, and it will only get more complex from here.

SAS EG: Filtering a dataset via prompt manager variable

Nobody who reads this blog is likely a SAS programmer. However, on the off chance this would be useful to someone who finds it via Google, I wanted to post.

Currently, I am working on a project in SAS Enterprise Guide where I needed to allow the user to provide variable value(s) via prompt manager and then filter the resulting dataset based on the user’s input. After rubbing a few brain cells together, created a macro that does the trick. It works by embedding a loop inside of a WHERE statement in SQL. If the user selects one value the loop ends and returns that value into the WHERE statement. However, if multiple values are selected, the loop iterates until all values have been returned.

There’s probably a better way, but at least this works. Here’s a quick outline of the variables.

PRODUCTCATEGORY: This is the name of the prompt variable I setup. Replace it with your variable of choice.
IN: Input dataset
OUT: Output dataset
FIELD: Variable in the dataset on which the filtering is performed.


%MACRO PROCESS(IN,OUT,FIELD);
PROC SQL; CREATE TABLE &OUT. AS SELECT * FROM &IN.
WHERE &FIELD. IN (
%IF &PRODUCTCATEGORY_COUNT = 1 %THEN
"&PRODUCTCATEGORY";
%ELSE %DO I=1 %TO &PRODUCTCATEGORY_COUNT;
"&&PRODUCTCATEGORY&I"
%END;
);
QUIT;
%MEND;

DO Loops: Part Deux (It’s a pun. Get it?)

For Part II in this series of posts on DO Loops in SAS, I wanted to discuss an application slightly more complex than the previous example. Loops can not only be used to perform data correction (as shown in the previous post), but can also be very useful in creating new records where none exist.

We’ll discuss using order data again as our example.

Your client would like you to create a daily trend report showing YTD Quantity Sold by Product by Date. In our made-up table, let’s say for simplicity there are four fields: Order Number, Order Date, Product and Quantity. This fictitious company sells two products (Widgets and Super-Widgets) and follows a calendar fiscal year. The data might look something like this.

OrderNumber OrderDate Product Quanity
2135468 1/2/09 Widget 10
2135469 1/3/09 Widget 17
2135470 1/4/09 Super-Widget 50

You probably noticed, in its current state, the table above cannot generate this report. We don’t have an appropriate date field and quantities aren’t cumulative. The good news is that, utilizing a loop, with a few lines of code we can create a table that will serve our purpose. In order to generate the requested report, we need to create a table that looks like this. Notice that quantities are cumulative and we have a one record for each product every day.

RunDate Product Quantity
1/1/09 Widget 0
1/1/09 Super-Widget 0
1/2/09 Widget 10
1/2/09 Super-Widget 0
1/3/09 Widget 27
1/3/09 Super-Widget 0
1/4/09 Widget 27
1/4/09 Super-Widget 50

So how do we get from Point A to Point B? Good question. There are two steps. Let’s dive in.

 

STEP #1: It’s loop time.

Using a combination of a DO UNTIL loop and the OUTPUT statement, we can generate a new order record for every date that each order existed within the fiscal year. “What?” You might say. Take the first order (2135468). That order was placed on 1/2/09, which means it should be included in the cumulative quantity from 1/2/09 – 12/31/09. Therefore, we need to create 363 new records for that order which contain the dates 1/3 – 12/31, so when we summarize that quantity is included in the total. This can be accomplished by using the following code.


*Fiscal Year start date;
%LET BEGDTE = ‘01JAN2009’D;

*Date of current reporting;
%LET RPTDTE = ‘04JAN2009’D;

DATA TEST_2;
	SET TEST_1;
	RUNDATE = &RPTDTE.;
	DO WHILE (RUNDATE GE &BEGDTE.);
		IF ORDERDATE le RUNDATE;
		OUTPUT;
		RUNDATE = RUNDATE - 1;
	END;
RUN;


This will create a table that contains an individual record for every order on every date that order existed between the reporting date and the order date. If you have a large orders table this step could result in millions of records. Literally.

 

STEP #2: Summarize the data.
After that we can use a simple PROC SUMMARY to summarize the data and we have successfully moved from Point A to Point B.

PROC SUMMARY DATA=TEST_2 NWAY MISSING;
        CLASS RUNDATE PRODUCT;
	VAR QUANTITY;
        OUTPUT OUT=TEST_3 (DROP=_TYPE_ _FREQ_) SUM=;
RUN;


We can now generate the requested trend report and the client is happy. What more could you ask.

DO Loops: Useful Tool in SAS User’s Toolbox

DO Loops in SAS are outside the scope of what most beginning users feel comfortable tackling. However, they are a powerful, time-saving tool in the report writer’s toolbox. For those unfamiliar with programming, loops are a set of instructions that the programmer tells the computer to repeat until predefined condition(s) are true. It’s a simple concept but powerful in execution.

To illustrate, here is an example of where a loop might be used. Say, you have a table that contains order data. This particular company has decided that its Order Number is seven digits long and starts with “0000000”. The next order in sequence would be “0000001” and so forth. Well, the person who sent you the data decided it would be a good idea to put it into Microsoft Excel, which then reads the Order Number field as numeric and promptly removes all the leading zeros.

Perfect. (This has never happened to me…can you tell?) However, with three lines of code a DO Loop can easily correct this issue.

Here we go.

In SAS, there are two types of loops, DO WHILE and DO UNTIL. The main difference is when the logical condition is evaluated.

DO WHILE
The condition is evaluated at the top of the loop before the statements within the DO loop are executed. If the expression is true, the DO loop iterates. If the expression is false the first time it is evaluated, the DO loop does not iterate. Not even once.

DO UNTIL
The condition is evaluated at the bottom of the loop after the statements in the DO loop have been executed. If the expression is true, the DO loop does not iterate again. The DO loop always iterates at least once.

So, using our example above we want to use a DO WHILE loop, which will evaluate the condition before execution and then run if the condition is met. Here is our code. The loop statement is italicized in the middle of the data step.

 

DATA TEST_2;
   SET TEST_1;
      DO WHILE (LENGTH(COMPRESS(OrderNumber)) LT 7);
         OrderNumber="0"||OrderNumber;
      END;
RUN;

 

This code is performing the following steps.

1. Reads in the first record.
2. DO Loop looks at the Order Number field. If the length of the number is less than seven digits, then a “0” is appended to the front.
3. DO Loop checks the condition again. If the length is still less than seven digits, the loop iterates again. If not, the loop ends, the next record is read-in, and the process repeats itself.

This example gives you an idea of how DO Loops might be used, but their application can go far beyond simple data correction. Until next time. Happy coding.