Throughout the Professional series, we had a look at the importance of standardizing data entry and having the primary key (anchor) in each dataset.
Standardized data values allow practitioners to listen to what data tells us. Similarly, the primary key helps us to cross-reference datasets efficiently which could help us make peace with colleagues as much as possible!
In the second post, we also took a look at how Chick-fil-A works to correctly structure the data (table). Correctly structured data do not have multiple headings, merged cells, or more than one value in a cell.
Based on the correctly structured data, tables and charts can be created which can reflect any updates in the raw database.
Now we are ready to fly even higher!
Have you ever heard of the frog in the well story? The frog thought the sky was the size of the well. It isn’t until the frog jumps out of the well that it realizes the sky is much bigger than what it saw.
Data is like that. Having your organization’s data and comparing it with bigger data can tell you more than just looking at your local data.
For example, if you are a higher education professional like me and if your school is one of the campuses in the state system, you can refer to the state system data to compare various data points.
Additionally, if you are in the International Education department like me, you may want to look at the Open Door data or SEVIS by Numbers to get insights on international recruitment, enrollment, and retention.
Likewise, if you are selling the book titled “The Frog in the Well,” you may want to look at the order data in the e-commerce platform to see how well your book is selling.
In short, looking at the bigger data helps you to make data-driven decisions.
In today’s post, I will show you how to analyze various datasets using API, and/or PDF files. Specifically, we will look at the State University of New York (SUNY) data that is publicly available.
For those who are not familiar with the SUNY system, SUNY is comprised of 64 campuses and it can be categorized into 4 different types of institutions:
- Community Colleges
- Comprehensive Colleges
- Doctoral Degree Granting Institutions
- Technology Colleges
How do I know all of this? I downloaded the report from data.ny.gov and utilized the pivot table to see the list of institutions in each category. Click the Learn More button below to learn step-by-step:
1. Go to data.ny.gov to get the “CSV for Excel” file of Headcount Enrollment by Student Level and Student Load by Institutions of the State University of New York: Beginning Fall 2011
2. Click the Insert- Pivot Table to add a Pivot Table. From there, drag College or Institution Type to the Field, College or Institution Name to the Rows, Year to Columns then Undergraduate Full-Time to Values then set the values as Average.
Here is the list of SUNY campus names in each institution type:
Community Colleges
Comprehensive Colleges
Doctoral Degree Granting Institutions
Technology Colleges
The reason why we looked into each institution type is that we don’t want to compare community college numbers with flagship universities’ numbers. If we do, we will get skewed views of the numbers.
Unfortunately, New York State is not allowing direct access to the dataset of international student enrollment numbers. The public can only play with visualized interactive data using Tableau and download it as PDF files.
We will take a look at how to download them as PDF files and then use PowerQuery to get the data that we need in the next post.
How to get datasets using API
For now, we will utilize the report named State University of New York (SUNY) Trends in Enrollment of Students by Race/Ethnicity and by SUNY Sector: Beginning Fall 2002 using API to get the non-resident alien percentage per each institutional type.
*Non-resident aliens may include international students and other nonimmigration visa holders.
Pulling the % of nonresident aliens for 4 different SUNY campus types will give us a base to compare it with local data to see if my school has a similar % of international students compared to other schools in the same category.
1. Go to the report, click API, and copy the JSON API Endpoint
2. There are two ways to call API
- Click Data- Get Data- From File – From JSON
- Click Data- From Web
3. Paste the API endpoint
- JSON: Paste the endpoint next to the File name and click Import
- Web: Paste the endpoint, select the URL and click Connect
4. API called a list of records
5. Under the Transform menu, click To Table
6. Use the dropdown menu to select what you need then click OK
7. Click the extension icon next to the column name
8. Make sure to uncheck the Use original column name as prefix and click OK
9. Click anywhere in the table and click Detect Data Type to change the data type of each column. Choosing the correct data type is a necessary step to use the pivot table.
10. If the data type is not automatically detected, click the icon before the column name and select Whole Number as the datasets are numbers
11. Click Close & load to then add the query as a table
12. Now that we got the dataset from the NYS government website using API, we will customize it to see non-resident alien enrollment numbers
13. Click Insert- Pivot Table to launch the Pivot Table
14. A pop-up window will automatically select the table range. You can set the location for the Pivot Table, I will add it to the New Worksheet
15. Move “institution_sector” to Filter, “term” to Rows, and “total_enrollment” and “non_resident_alien” to Values
16. To get the % of non-New York residents compared to the total enrollment, go to PivotTable Analyze- Fields, Items & Sets and then click Calculated Field
17. Type the Name of the Field and click each field to create the %. When you are done click OK
18. You can also add a Pivot Chart by clicking PivotTable Analyze- Pivot Chart. I went with a Combo chart style to show total enrollment numbers and non-resident alien numbers as bar graphs and % of non-NY resident students as a line graph. To do so, you need to check the Secondary Axis for % of students field.
SUNY enrollment data from Fall 2002 to Fall 2022
Here are the analyzed SUNY data from Fall 2002 to Fall 2022 that show the percentage of non-New York resident students:
Community Colleges
Comprehensive Colleges
Doctoral Degree Granting Institutions
Technology Colleges
In summary, PowerQuery can be used to get the real time datasets that are open to the public. Pivot Table is helpful for grouping and organizing data to see what I need. Last but not least, a Pivot Chart can be used to visualize the data to see the trend.
Here are some of the findings from the Pivot Table and Pivot Chart:
- Doctoral Degree Granting Institutions have the highest number of enrolled non-resident-alien students with an average of 12.85% of total enrolled students since the Fall 2002 term, followed by Comprehensive Colleges, Technology Colleges, and Community Colleges
- None of the SUNY institution types have a similar trend of total enrolled NY resident students and enrolled non-resident students.
- The percentage of enrolled non-NY-resident students at SUNY Technology Colleges decreased the most rapidly.
- Technology and Community Colleges recovered the percentage of enrolled non-resident students since the pandemic
- Doctoral Degree Granting Institutions have seen a somewhat steady increase in the percentage of enrolled non-resident-aliens
- The total enrollment decreased in the past 3 years regardless of the institution type
As I mentioned previously, this is not exactly what my office wanted to see as non-resident aliens include non-immigrant-visa holders other than student visa holders.
Therefore, in my next post, I will share how to convert PDF files to datasets using PowerQuery to get a closer look at the SUNY international enrollment data.
Did you find it helpful?
Click the button below to check out all the Professional series!
© copyright SEVIS SAVVY 2023