Data Analysis using Stata
The dissemination of research and policy analysis skills is part of the mandate of the Center for Development Policy (CDP). Providing practical data analysis skills is a first step toward creating a critical mass of competent policy analysts, researchers and data analysts. The goal of this course is to equip participants with the practical skills needed to manage and analyze quantitative data using Stata, a modern data analysis software package. Quantitative data analysis skills are a prerequisite for building a pool of experts capable of conducting research and policy-relevant analysis to inform evidence-based policy interventions. The course will focus on the following:
- Stata basics: starting and exiting Stata, checking, creating and changing the working directory, the Stata window, accessing the example datasets shipped with Stata (i.e. those used in the Stata manuals), manually inputting data, importing data of different formats into Stata and exporting data to other platforms;
- The grammar/language of Stata: functions, operators, qualifiers (e.g. the in and if qualifiers), do-files, log files, Stata routines (loops: an introduction to programming);
- Creating and changing variables: generate, egen, replace, recode, handling missing values, handling strings, labeling data sets, variables and data values, and working with date functions;
- Appending, joining, and merging data sets;
- Reshaping data: wide vs. long;
- Downloading data into Stata from online sources using the wbopendata, sdmxuse and freduse commands;
- Data visualization: creating and changing graphs (bar charts, horizontal bar charts (hbar), scatter plots, stem-and-leaf plots and others);
- Data cleaning: detecting outliers and other influential data points and how to remedy such issues;
- Data transformations, e.g. log-linearization, creation of dummy variables, sub-sample extraction using qualifiers, etc.;
- Describing and comparing distributions: frequency tables, kernel vs. normal density plots, and summary statistics;
- Introduction to simple and multiple regression: specification, estimation, and diagnostics;
- Specifying, estimating and interpreting models with categorical predictors and their possible interactions;
- Exporting Stata output (tables and graphs) to Microsoft Word and other formats, in the style normally seen in highly reputable journals.
By the end of the training, participants are expected to have learned how to handle raw data from quantitative surveys, prepare the data in a usable form that suits their research agenda, and carry out simple descriptive and inferential statistical analysis. Participants will also learn how to tabulate and visualize quantitative data and present outputs in policy reports in formats easily understandable to non-technical audiences such as policy makers.
✓ Policy analysts
✓ Data scientists
✓ Consultants
✓ Researchers
✓ University students, both undergraduate and graduate, who intend to use Stata for their dissertation writing
✓ PhD students who intend to carry out quantitative analysis using Stata

✓ Some basic knowledge of statistics and econometrics is helpful, though not mandatory
✓ A laptop with Stata installed; CDP will provide Stata to participants who do not have it
✓ Participants with an ongoing research project are encouraged to bring it, as it will be useful in discussions