"It’s not as scary as I thought!"
“It’s not as scary as I thought!”
I love working with my student workers.
There are many reasons why I love them but one of the reasons is because they agree with my thoughts on how work should be done in the office. They are willing to learn and practice what they have learned for the next project.
When I taught them how to correctly input data, structure the dataset, and utilize the things that I am going to share in this series— XLOOKUP, Pivot Table, and PowerQuery— one of the common reactions was “It’s not as scary as I thought!”
It seems to me that one of the common myths in higher education administration is this:
Having a data management/analysis mindset or learning data management tools is only for people who have “data” in their job titles.
Throughout this series, I would like to share how to make data-driven decisions for all college administrators regardless of their positions.
If you are wondering why you should set aside your work hours to learn these tools (or read this article), you will be able to answer why by yourself after witnessing how this would change your work.
For example, here’s an example. It may look like a list of students that your office manages!
The column YR is supposed to show the start term of each student. As you can tell, there was no rule or set format to note the term.
Fall 2017 can be written F17, F’17, F 17, FA 17 just to name a few depending on how you feel on that day.
When you read each row, you may have no problem- OK the student started his/her study back in F 17.
However, when the office needs to report the total number of students who started back in Fall 2017 or to see the enrollment trend, you need to manually count the number of students to add students who started in F 17, F17, F’17, etc.
Even with filters, it may take more time than it should. This is especially true if you have thousands of data.
"If you don’t have some sort of format to note each data, the dataset has no power or story to tell you. What is more, numbers or percentages may not be accurate which becomes unreliable data."
In short, a non-standardized dataset makes it challenging for the office to make informed decisions that include future recruitment, enrollment, and retention strategies.
Today’s takeaway: Standardize the data and be consistent
To put it simply, choose how you are going to write the term -Fall 2017 -and stick with it unless there’s an absolute need to change it.
Now most of us work as a team- the question is how can we ensure standardized data usage?
There are several ways to do this: you can create the drop-down menu/select list in Microsoft EXCEL and/or Google Spreadsheet.
What I did is I created the query in Slate CRM (Consumer Relationship Management) that returns the same value for each semester so the value (like start term in this case) can be the same all across the board from the application to the query and report.
Similarly, when it is inevitable to add more columns, I usually go all the way right and add columns there.
By doing so, the structure of the data is not touched even when all the previous datasets are added in one place.
With a few clicks or keyboard shortcuts, it will be much easier to get the historical enrollment numbers.
Before I move on to the next data management tip, I wanted to explain several ways to clean up the non-standardized data that you saw above.
How to clean up non-standardized data
Method 1. Use Filter or Replace in Microsoft Excel or Google Spreadsheet
The first method does not require any fancy knowledge.
What you do is click the Filter button or use keyboard shortcuts (Mac: cmd+F/ Window: ctrl +F) then click Replace to replace each not-standardized data with standardized data.
For example, you can find “F” and replace it with “Fall.” You will need to do the same for all variables like “Fa’ ” to “Fall” until you have just one format: “Fall 2017.”
Note that Find/Replace is space-sensitive meaning you would need to actually type “ “ to find all values with space next to the character (F in this case).
Likewise, you can filter each variable and change the values to standardized values.
The downside of utilizing filter/replace is it is a labor-heavy solution in which you need to do the same process for all sheets if you have more than one sheet to standardize (like I do!).
Method 2. Use PowerQuery
PowerQuery may sound like not-college-administrator-friendly but it can be your go-to solution to minimize the time-consuming labor if you lower down your guard just like I did!
There are many YouTubers who create step-by-step guides so I recommend watching some videos if you are new to PowerQuery.
1. Turn each sheet into the table and name each table. Make sure to hit the “Save” button before you launch PowerQuery
2. Click Data- From Table/Range to launch PowerQuery editor
3. To standardize the data, click Replace Values to replace the non-standardized data with standardized data
For example, replace “Sp 17” to “Spring 2017.”
*Note that it is space-sensitive. For example, “Fall “ needs to be updated to “Fall”
4. When the data clean up/ standardized process is done, click the Close and load to… button to close the PowerQuery editor
5. Ta-da! The new, standardized dataset is available for you to analyze.
Method 3. Use SQL
Lastly, the same result can be achieved with the SQL REPLACE function:
Screenshots will be added later. Stay tuned!
What are the real and practical benefits of having standardized datasets?
Let’s suppose that you need a list of students who graduated in the Spring 2017 term.
Before, you had to find all the same values in different formats to see Spring 2017 graduates.
After the standardization, you can simply filter the Spring 2017 out to get the list. It minimizes the possibility of human errors and makes datasets to be more accurate and reliable.
We explored three ways to perform data cleanup to get a standardized dataset. I would personally recommend using PowerQuery out of three options.
"What makes PowerQuery powerful (pun intended) is that you can utilize advanced buttons to navigate big data and perform advanced data management without having to memorize Excel formulas or SQL commands."
It may take some time to get a hang of it but it’s totally worth it.
Once my student worker showed me the code lines that did not work and told me “I tried until it works and I am trying to have the same attitude in my life.”
This specific student’s visa requests were denied three times. However, the student did not give up and eventually was able to come to the U.S. after the fourth visa interview.
This is what I hope to say throughout this series: it’s not as scary as it seems. In fact, essential tools will be your favorite work buddies sooner or later!
Let's review what we discussed today!
When you enter data (=create database), consider data entry as marriage, choose one among many, and stick with it as much as you can!
In my next post, the correct way to structure and manage datasets will be shared. In other words, we will look at why the following data management is wrong:
Did you find it helpful?
Click the button below to check out all the Professional series!
© copyright SEVIS SAVVY 2023