Wednesday, February 24, 2010

Panel Data

This is Betsy at the RSC. I recently studied panel data analysis in a little more depth. Panel data consists of multiple entities (people, firms, states, and so on), each observed over multiple time periods. This is in contrast to time series data, which follows one entity over multiple time periods, and cross-sectional data, which covers multiple entities in a single time period. A panel is considered balanced if there is an observation for each entity at each time. This is not the same as having no missing data: if there is more than one variable, some variables may still have missing values. There are many different methods for analyzing panel data; this post deals specifically with fixed effects models in Stata.
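To make the idea of a balanced panel concrete, here is a minimal sketch in Python rather than Stata. The function name is_balanced and the (entity, year) keys are made up for illustration. It only checks that every entity shows up exactly once in every time period, and says nothing about missing values in the other variables.

```python
from itertools import product

def is_balanced(rows):
    """Return True if every entity appears exactly once in every time period."""
    entities = {e for e, t in rows}
    times = {t for e, t in rows}
    no_duplicates = len(rows) == len(set(rows))
    return no_duplicates and set(rows) == set(product(entities, times))

# Hypothetical (entity, year) keys from two small panels
balanced = [("A", 2008), ("A", 2009), ("B", 2008), ("B", 2009)]
unbalanced = [("A", 2008), ("A", 2009), ("B", 2008)]  # B has no 2009 row

print(is_balanced(balanced))
print(is_balanced(unbalanced))
```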

Panel data allows you to control for characteristics that are unobservable or unmeasurable, such as cultural differences across entities, as long as they do not change over time. You can also control for unobserved factors that change over time but are the same for every entity. Fixed effects models explore the relationship between the independent and dependent variables within each entity, allowing for differences between entities.
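In equation form, a fixed effects model with one independent variable and both entity and time effects is commonly written as below. The notation is just one common convention and an assumption of this sketch: y is the dependent variable, x the independent variable, i indexes entities, and t indexes time periods.

```latex
y_{it} = \alpha_i + \lambda_t + \beta x_{it} + \varepsilon_{it}
```

Here \alpha_i picks up everything about entity i that is constant over time, \lambda_t picks up everything about period t that is the same for all entities, and \beta is the within-entity relationship of interest.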

There are two basic approaches to fitting a fixed effects model. The first is to include a dummy variable for each entity and each time period. This controls for all time-invariant differences between entities and for time effects common to all entities. The command to do this in Stata is xi: regress dependent independent i.time i.entity. Stata creates the dummy variables for each entity and time (leaving one of each out as the base category) and includes them in the regression. Areg is another command that uses dummy variables. Areg absorbs one categorical variable, either entity or time, into the model. The coefficients on the other variables are the same, but the absorbed dummy variables are only produced internally: their individual coefficients are not reported, only a joint F test of their significance. Example syntax for areg is areg dependent independent, absorb(time). If you want to include both time and entity, absorb the one with more categories and include dummies for the other, for example xi: areg dependent independent i.time, absorb(entity).
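For anyone curious what the dummy variable regression does under the hood, here is a rough sketch in plain Python. It is not Stata's implementation. The helper names (solve, lsdv_slope) and the tiny dataset are made up, and the data are built with no noise so that the true slope on x is exactly 2.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def lsdv_slope(panel):
    """OLS of y on x plus entity and time dummies; returns the coefficient on x."""
    entities = sorted({e for e, t, x, y in panel})
    times = sorted({t for e, t, x, y in panel})
    rows, ys = [], []
    for e, t, x, y in panel:
        row = [1.0, float(x)]                                   # intercept and x
        row += [1.0 if e == k else 0.0 for k in entities[1:]]   # first entity is the base
        row += [1.0 if t == k else 0.0 for k in times[1:]]      # first period is the base
        rows.append(row)
        ys.append(float(y))
    p = len(rows[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)[1]

# Hypothetical data: y = entity effect + time effect + 2 * x, with no noise
entity_effect = {"A": 10.0, "B": 20.0, "C": 30.0}
time_effect = {1: 0.0, 2: 1.0, 3: 3.0}
x_values = {"A": [1, 2, 4], "B": [2, 5, 3], "C": [7, 1, 6]}
panel = [(e, t, x, entity_effect[e] + time_effect[t] + 2.0 * x)
         for e, xs in x_values.items() for t, x in zip((1, 2, 3), xs)]

slope = lsdv_slope(panel)
print(round(slope, 6))
```

Because the entity and time dummies soak up the entity and time effects, the coefficient on x comes back as the true slope, up to floating point rounding.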

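You can also use xtreg in Stata. Before you use xtreg you must declare the data as a panel dataset with the xtset command (xtset entity year). Then the syntax is xtreg dependent independent, fe. Rather than creating a dummy variable for each entity, this method relaxes the assumption of one intercept term and allows each entity its own. Note that the fe option handles the entity effects only, so to reproduce the coefficient estimates from the dummy variable regression above you still need to include the time dummies yourself, for example xi: xtreg dependent independent i.year, fe.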
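One standard way to get entity fixed effects without any dummy variables is the "within" transformation: subtract each entity's own mean from its x and y values, then run an ordinary regression on what is left. The sketch below illustrates that idea in plain Python. It is not Stata's code, and the function names and the small dataset are made up, with a true slope of exactly 2 and entity intercepts that rise with x.

```python
from collections import defaultdict

def ols_slope(xs, ys):
    """Slope from a simple regression of y on x with a single intercept."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

def within_slope(panel):
    """Slope after subtracting each entity's own means from x and y."""
    groups = defaultdict(list)
    for e, x, y in panel:
        groups[e].append((x, y))
    dx, dy = [], []
    for obs in groups.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        dx += [x - x_bar for x, _ in obs]
        dy += [y - y_bar for _, y in obs]
    return sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)

# Hypothetical data: y = entity intercept + 2 * x, with no noise
intercepts = {"A": 10.0, "B": 20.0, "C": 30.0}
x_values = {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]}
panel = [(e, float(x), intercepts[e] + 2.0 * x)
         for e, xs in x_values.items() for x in xs]

pooled = ols_slope([x for _, x, _ in panel], [y for _, _, y in panel])
within = within_slope(panel)
print(pooled, within)
```

In this made-up example the pooled regression, which forces one intercept on everyone, overstates the slope because entities with larger x also have larger intercepts. The within estimate recovers the slope of 2.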

Hopefully this helps you get started in analyzing panel data sets!
