This is probably a bit too heavy for this blog – please refer to this for background; and since my friend Dan is the GOD of referencing, the links there will lead you to all you need to know.
I appreciate Emmerich and Dan for their spirited defence of RCTs, and having spent some time with CMF and J-PAL and the brilliant academics we worked with, my impressionable mind has been reasonably impressed. The C-GAP blog, Rodrik’s paper and the responses to it have, in my opinion, been largely academic, evaluating the cost-benefit of running these experiments. I will take a slightly more layman stance here.
On innovation, yes – RCTs are amazing! Imagine using a scientific trial to evaluate social programs, breaking the myth that in social programs all we could rely on were our gut feel, years of experience and maybe qualitative studies. The studies take time – that’s no issue; any good study will and should take time. The studies are costly – no problem, as long as there are donors with research funding. RCTs exclude some people from benefits – again no problem; no organization can serve everyone at the same time, and some tricky field issues can be resolved with clever design.
Where I have a problem is the way RCT researchers promote the method as the ultimate tool. Social experiments – what we call development interventions – that succeed in a particular context, let’s say Orissa, are these days not acknowledged as successful enough if they cannot be proven replicable. The romance with scale, replication and cost-effectiveness has brought damning judgements on many development projects, and many promising interventions have lost out on funding because of their seemingly non-replicable nature. Academic investigations into such studies seem to escape this scrutiny. To be fair, this is not just an RCT problem; it holds for other kinds of studies that attempt to judge social programs. Why would an MFI in Bihar, for instance, modify its operations because a study in Hyderabad puts out a certain result?
Secondly, researchers need to internalise that most practitioners are a different breed of people, unlike themselves and some academics-turned-policy-makers. Practitioners have their eyes and ears on the ground and are exposed to a large amount of information about different organisations and their programs in different parts of the country. From what I have seen, RCTs almost seem to assume that they examine a problem from scratch and in isolation. So when RCTs are promoted using results from the initial rounds of a study, one needs to exercise more restraint. For example, in the evaluation of an intervention where a south Indian MFI bundled a micro-insurance product with its micro-credit product, the researchers said there was no evidence that introducing the insurance product in any way adversely affected the composition of the MFI’s clients. In another study, on repayment schedules in a state in east India, researchers found that monthly repayment schedules did not seem to increase default rates. Both studies have been widely publicized and, I can imagine, must have amazed (and horrified) many practitioners at the conferences and seminars where they were presented.
However, the forcefulness of these conclusions tends to gloss over the fact that most of them are only preliminary findings; worse still, they look at only one small component of a program over a pre-defined period of time. It is obvious that human relationships change with time. My relationship with my banker will change the moment I discover hidden charges that were not explained to me previously, and will dramatically improve as soon as I am told that, as a reward for my excellent credit history, my subsequent loans will cost me less. My relationship with my insurance company will change if it takes them three weeks to process my claim, after having hassled me for over a week about proper documentation. In short, my behaviour towards an organization or program with which I have a transactional relationship is dynamic, and my social networks shape the nature and extent to which I influence others regarding that organization or program.
Also, the artificial separation of the organization from the program simply refuses to convince me. Especially in contexts where programs are made or marred by those implementing them, this separation is quite inexplicable. Unless we standardize implementers all over the world, we cannot study them the way RCTs propose. These studies spend a considerable amount of time and expertise determining sample sizes, emphasizing the law of large numbers; how would the same law work when it comes to the number of experiments? How many experiments would I need before I can say I have covered all types of organizations in India and have an answer for the standard prototype? I cannot even hope to achieve that for a state, let alone a country or a continent.
With this realization, I firmly believe one must temper the ‘conclusions’ these studies yield. It is true that governments and practitioners indulge in rhetoric – without some good rhetoric, they wouldn’t survive. Academics, however, ought to desist from such strong posturing, even if it is done to legitimize a particular methodology and out of the conviction that this is “the way”. Results promoted to practitioners and policy makers as methodologically sound, which are later revised, corrected or retracted (on studying the same program location for a longer duration, or on studying multiple locations and realizing that much of the earlier result is attributable to non-replicable forces), erode the credibility of the researcher; and since the methodology is promoted as the infallible hero, it is the RCT that is likely to take a beating. A far more modest study would have looked at a bunch of MFIs at the same time and remarked that in a majority of cases insurance schemes did seem to be doing badly, and that one might find reasons for this by looking into how programs are managed and implemented by each organization. By drawing on anthropological and sociological accounts of the same population over time, some further inferences could be made about why these events happen as they do. Not the most scientifically rigorous study, but possibly a better representation of reality than what an RCT with a single MFI will reveal.
Having proven that RCTs can study a snapshot better than any other technique, their proponents need to tell us how they can be made less costly and easier to handle, so that we can run multiple rounds of the same experiment, in multiple locations, at the same time. We will also need to know how to integrate existing knowledge of an area, its people and its interventions. And if RCTs are like incisions into a program or organisation’s body, we have to see how they can be made as painless and non-disruptive as possible. That would give them further credence, and then the non-standardization issue can be approached (and solved, I am sure, using some complicated equation). As I see it, converting socio-economic impacts into solvable mathematical equations is not the final frontier; being able to answer the ‘why’ and the ‘how’, and attaching those to the estimated impact, probably is…
Finally, as Dan points out, the debate on how good RCTs are is probably premature. I got caught up in it because I was an ‘insider’. If I were not, there is little chance I would worry much about these questions, least of all which study was better than the other; I would just worry in general about any evaluation and its impact on my work in the field…