October 4, 2017

Three Stage Sampling

Caveat emptor: This blog post has not been thoroughly checked for errors. One of IDinsight’s project teams is designing the sampling strategy for a large-scale household survey and is considering a three-stage sampling design in which they would first select districts, then villages (or urban wards), and then households. In addition, someone was asking about three-stage clustering for an RCT somewhere on Slack (I can’t seem to find the Slack post now), so I thought it might be useful to write a short post on three-stage designs. Read more

April 21, 2017

Simple Random Sampling vs. PPS Sampling

A question came up on one of our evaluations about whether we should use simple random sampling (SRS) or probability proportional to size (PPS) sampling when selecting villages (our primary sampling units) for a matching study. Under SRS, you randomly select primary sampling units (PSUs), each with equal probability, until you reach your desired sample size. With PPS sampling, you select each PSU with probability proportional to some measure of its size. PPS is often used in the first stage of a two-stage sampling design because if you use PPS to select PSUs and then select a fixed number of units (households in our case) per PSU in the second stage of sampling, the probability of selection will be identical for all units. Read more
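To see why PPS followed by a fixed take per PSU gives every unit the same chance of selection, here is a minimal sketch of the arithmetic. It assumes PSUs are drawn without replacement with first-stage inclusion probability m × N_i / N (m PSUs selected, N_i households in PSU i, N households overall) and that k households are then drawn by SRS within each selected PSU. The village sizes and the function name are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def overall_selection_probability(psu_sizes, n_psus, n_per_psu):
    """Return each PSU's household-level selection probability under
    PPS (stage 1) followed by a fixed SRS take per PSU (stage 2)."""
    total = sum(psu_sizes)
    probs = []
    for size in psu_sizes:
        # Stage 1: inclusion probability proportional to PSU size.
        first_stage = Fraction(n_psus * size, total)
        if first_stage > 1:
            raise ValueError("PSU too large: PPS inclusion probability exceeds 1")
        if n_per_psu > size:
            raise ValueError("cannot sample more households than the PSU contains")
        # Stage 2: a fixed number of households drawn by SRS within the PSU.
        second_stage = Fraction(n_per_psu, size)
        # The PSU size cancels, leaving n_psus * n_per_psu / total.
        probs.append(first_stage * second_stage)
    return probs


# Hypothetical village sizes (number of households), for illustration only.
village_sizes = [120, 300, 80, 500]
probs = overall_selection_probability(village_sizes, n_psus=2, n_per_psu=20)
for size, p in zip(village_sizes, probs):
    print(f"village with {size} households: P(household selected) = {p}")
```

Because the PSU size appears in the numerator of the first stage and the denominator of the second, it cancels, and every household ends up with probability m × k / N regardless of which village it lives in.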

February 15, 2017

Fixed Effects vs Difference-in-Differences

TL;DR: When you have longitudinal data, you should use fixed effects or ANCOVA rather than difference-in-differences, since a difference-in-differences specification will spit out incorrect variance estimates. If the data is from a randomized trial, ANCOVA is probably a better bet. In the past, trying to understand when to use fixed effects and when to use difference-in-differences (DiD) always made me feel like an idiot. It seemed like I was missing something really obvious that everyone else was getting. Read more

August 31, 2016

Web Scraping 101

More and more organizations are publishing their data on the web. This is great, but often websites don’t offer an option to download a clean and complete dataset from the site. In this situation, you have two options. First, you (or some unlucky intern) can hunker down and spend a week wearing out the ‘c’ and ‘v’ keys on your keyboard as you copy and paste ad nauseam from the website to an Excel spreadsheet. Read more

July 4, 2016

Multiple Hypothesis Testing

This week, I volunteered to read and summarize one of the articles for IDinsight’s tech team’s book club. The topic for this week is multiple hypothesis testing and the article I volunteered to summarize is “Multiple Inference and Gender Differences in the Effects of Early Intervention: A Reevaluation of the Abecedarian, Perry Preschool, and Early Training Projects” by Michael Anderson. Read more

© Doug Johnson 2020. Site design by Emir Ribic
