Tax Experiments in Developing Countries: A Critical Review and Reflections on Feasibility

This CDI Practice Paper by Giulia Mascagni provides a critical assessment of the literature on tax experiments to date. It examines the main conceptual, methodological and data-related challenges, and provides practical reflections on how to move forward in low- and middle-income countries where this type of research is still underdeveloped. It offers a guide for practitioners on the main challenges in quantitative research on tax compliance and on the methods used to tackle them, which may be of interest for evaluation research more generally.


Introduction
In recent years, the international community has widely recognised that taxation is crucial in ensuring sustainable development, the consolidation of democratic institutions, and in paving the way for independence from foreign aid in the long run. Despite this increasing attention, low- and middle-income countries still face many obstacles to increasing tax revenues, both domestic and international (for a review, see Moore 2013). On the domestic side, tax mobilisation is particularly hard when government transparency and accountability are lacking. While most European countries have gone through processes of state-building that included bargaining between states and citizens over revenues, such processes are still at an early stage in many low-income countries, particularly in Africa. In the absence of a social contract, whereby citizens pay taxes and expect the state to provide goods and services in return, they may not trust the government to use their resources wisely. This results in low voluntary compliance, which makes tax mobilisation even harder and more costly since it requires a higher reliance on coercion. However, public administrations in many African countries suffer from capacity constraints related to, amongst others, the difficulty in recruiting and retaining skilled staff, the low or inefficient use of modern technologies and the lack of adequate financial resources. Consequently, they are not always able to fully monitor taxpayers and to enforce tax laws. This in turn may further reduce compliance by fuelling the perception that tax authorities are weak and do not have the ability to detect and punish evaders. Therefore, tax compliance is related to many of the common obstacles to tax mobilisation, such as low accountability and capacity constraints, and it is one of the major problems faced by tax administrations.
One way in which the research community can help governments mobilise additional tax revenue is by understanding these constraints better and by evaluating possible ways to overcome them. However, the evaluation of drivers and constraints to tax compliance suffers from a number of methodological issues related mainly to the difficulty of measurement. Obviously, it is difficult to get honest answers about dishonest behaviour. Still, recent developments in econometric methods and in the availability of data have created new opportunities to bring this field to the frontier of research. On the one hand, the view that randomisation is a gold standard in applied economic research, with all its merits and drawbacks, has contributed to the widespread adoption of more rigorous methods. On the other hand, the establishment of revenue authorities (RAs) in many low- and middle-income countries and the digitalisation of tax records present a new opportunity to apply such methods in these countries. In this context, tax experiments can be a tool to study tax compliance and overcome some of the common challenges in this area of research, as well as having the potential to support policymakers by achieving a better understanding of the drivers of compliance.

Tax experiments share the same methodological foundations as broader quantitative evaluation research. To a certain extent, they are located in the tradition related to randomisation, to which randomised controlled trials (RCTs) also belong. While randomisation certainly has merit, one of the most important being bringing more rigour to quantitative evaluation methods, it has also attracted criticism (see for example Deaton 2009 and Basu 2013). One of the main criticisms of RCTs concerns the external applicability of results and the possibility of scaling them up from the village to the national level. Related to this issue, RCTs are criticised for not being able to give answers to the broad but fundamental questions that policymakers are interested in. While tax experiments share some of these drawbacks, they can alleviate the problem of scalability by using large and nationally representative data sets. In this sense, they can constitute cases from which the broader evaluation community can learn, even if it may not always be possible to apply the same methods in other sectors.
The objective of this CDI Practice Paper is to give a critical assessment of the literature on tax experiments to date: what the main conceptual, methodological and data-related challenges are, how they have been tackled, and practical reflections on how to move forward in low- and middle-income countries where this type of research is still underdeveloped. By doing this, it offers a guide for practitioners on the main challenges in quantitative research on tax compliance and on the methods used to tackle them, which may be of interest for evaluation research more generally.

Critical review of tax experiments: methods and issues
This section focuses on the most recent tax experiments (TEs), although this literature dates back a few decades. While a comprehensive review of early TEs is not provided here, it is useful to highlight a few common traits that lay the foundations for more recent studies. Initially, tax experiments were largely carried out in artificial lab settings that allowed the main variables of interest to be controlled more easily than in real life or survey-based research. By doing this, researchers could capture actual behavioural responses rather than just perceptions of taxpaying behaviour, which is what surveys typically capture. The early TEs traditionally focused on economic elements determining taxpayer behaviour, both in terms of probability to evade and extent of evasion, by testing the model of Allingham and Sandmo (1972). This model is positioned in the tradition of the economics of crime, where dishonest behaviour (in this case, non-compliance) is determined by the expected returns from tax evasion. These weigh the costs, determined by the probability of being caught and the sanctions applied, against the gains, which increase when tax rates and taxable income increase. These 'traditional' factors are still very relevant today, particularly in low-income countries, where the capacity to enforce tax laws is still a major problem. In addition, early TEs also started shedding light on other factors, such as social norms, moral motives, emotions and institutions. However, they suffered from clear drawbacks, most notably the use of students rather than real taxpayers, small samples and the artificiality of the lab environment. The latter is particularly important: participants know they are part of a study, that there are no real-life consequences to their behaviour, and that they will not bear the monetary consequences or gains from their actions. As a result, their behaviour in the lab may differ from what they would actually do in real life. These drawbacks made early TEs not totally fit for providing policy recommendations.
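The deterrence logic described above can be illustrated with a minimal sketch. It assumes a risk-neutral taxpayer and a penalty charged on each concealed unit of income in place of the tax saved; the function and parameter names are hypothetical, and the figures in the note below are purely illustrative rather than estimates from any study.

```python
def expected_gain_per_unit_evaded(tax_rate, audit_probability, penalty_rate):
    """Expected gain from concealing one unit of income (risk-neutral taxpayer).

    Assumption: if the evasion goes undetected the taxpayer keeps the tax
    saved (tax_rate); if it is detected, the taxpayer pays penalty_rate on
    the concealed unit, so the net position is tax_rate - penalty_rate.
    """
    gain_if_undetected = tax_rate
    gain_if_detected = tax_rate - penalty_rate
    return ((1 - audit_probability) * gain_if_undetected
            + audit_probability * gain_if_detected)


def evasion_pays(tax_rate, audit_probability, penalty_rate):
    """True when the expected gain from concealing income is positive."""
    return expected_gain_per_unit_evaded(
        tax_rate, audit_probability, penalty_rate) > 0
```

Under these assumptions, a tax rate of 0.3, an audit probability of 0.1 and a penalty rate of 0.6 give an expected gain of about 0.24 per concealed unit, so evasion pays in expectation. Raising the perceived audit probability to 0.6 turns the expected gain negative, which is the channel that 'threat of audit' treatments try to exploit.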
Most of these drawbacks were addressed by a more recent stream of research: large-scale field experiments, which allow for greater realism and are therefore better positioned to provide policy advice. These studies have three main characteristics that distinguish them from previous experiments. Firstly, they are embedded in real taxpaying situations. Typically, the participants are real taxpayers who are not aware they are part of a study and who are observed through the data they release to the tax administration via their tax returns. Secondly, they involve a large number of taxpayers that are usually representative of the whole population, making it easier to translate the findings into policy at the national level. Thirdly, these experiments require a close collaboration with the RA, not only because the data are obtained from tax returns, but also because it is typically the tax authority that communicates with taxpayers and potentially influences their behaviour. Therefore, the interface for taxpayers is the RA rather than a researcher, thus making the experiment real for participants. By doing this, large field experiments can tackle many of the challenges of more traditional TEs, while still being part of the same tradition.
Although large-scale field TEs solve many of the drawbacks identified in previous experiments, they also present new challenges. While in the lab context it is possible to observe both real income and reported income, tax returns only contain information on declared income without indicating the extent of under-reported income. This problem can be partially solved by the use of appropriate research design and econometric techniques based on the principle of randomisation. Put simply, the study would start by dividing a given sample of taxpayers into two groups that are comparable, in that they have similar characteristics, such as location, income, age and employment status. One of the groups would be used as a control group, while the other (or others) is 'treated' with the intervention that is being studied. For example, in the typical TE, the treatment is a letter that the RA sends to taxpayers along with their bill, which underlines a specific determinant of compliance, such as the threat of an audit or information about the social importance of taxpaying. By analysing the changes in reported income in the two groups before and after the treatment, the researcher can draw conclusions about the effects of these treatments on compliance. In this example, the difference between the changes in reported income in the control group and treatment group is taken as indicative of changes in compliance. In other words, if the treated group increases reported income more than the control group, keeping other characteristics fixed, this difference can be attributed to a change in compliance. In this way, it is possible to draw conclusions about evasion even if the actual extent of under-reported income cannot be observed.
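The comparison described above can be sketched in a few lines of Python. This is a simplified illustration rather than the estimator used in any of the studies reviewed here: the record layout (keys 'treated', 'before' and 'after') and both function names are assumptions made for the example, and a real analysis would also control for taxpayer characteristics and report standard errors.

```python
import random
from statistics import mean


def assign_treatment(taxpayer_ids, seed=0):
    """Randomly pick half of the taxpayers to receive the letter."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(taxpayer_ids)
    rng.shuffle(shuffled)
    return set(shuffled[:len(shuffled) // 2])


def difference_in_differences(records):
    """Mean change in reported income, treated group minus control group.

    Each record is a dict with keys 'treated' (bool), 'before' and 'after'
    (reported income before and after the letters were sent).
    """
    def mean_change(group):
        return mean(r["after"] - r["before"] for r in group)

    treated = [r for r in records if r["treated"]]
    control = [r for r in records if not r["treated"]]
    return mean_change(treated) - mean_change(control)
```

Because assignment is random, the two groups should on average experience the same changes in true income, so a positive difference in reported income can be read as an improvement in compliance among those who received the letter.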
For example, the pioneering study by Slemrod, Blumenthal and Christian (2001) was based on a sample of 22,368 tax returns from taxpayers in Minnesota (US). The treatment in this study was a letter from the Department of Revenue informing the taxpayer that they had been selected for a close examination of their tax return, and that if any irregularities were found, past tax returns would be scrutinised as well. This treatment increased the perceived probability of audit, and allowed an analysis of the effect of threats of audit on compliance. The main result was that the letters led to a significant increase in reported income. To gain a more nuanced understanding of how this effect came about, Slemrod et al. stratified their sample along two dimensions: income (low, middle, high) and opportunity to evade (low, high). They found that the strongest effect occurred for low- and middle-income taxpayers with high opportunity to evade. For high-income taxpayers, there seemed to be a perverse effect, whereby the letter reduced rather than increased reported income. One possible explanation of this is that high-income taxpayers can afford professional tax consultants, and can therefore find legal ways to reduce their taxable income in response to the letter. Similar effects are highly likely in many low-income countries where high-quality tax advisers are scarce and only a small portion of the population, namely rich individuals and large firms, can afford to pay for their services.
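A stratified comparison of this kind can be sketched as follows. The code is an illustration of the general idea, not a reconstruction of the analysis in Slemrod et al. (2001): the record keys ('income_band', 'opportunity', 'treated', 'before', 'after') and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def effect_by_stratum(records):
    """Treated-minus-control mean change in reported income per stratum.

    A stratum is a pair (income_band, opportunity). Strata that lack either
    treated or control taxpayers are skipped, since no comparison is possible.
    """
    changes = defaultdict(lambda: {True: [], False: []})
    for r in records:
        stratum = (r["income_band"], r["opportunity"])
        changes[stratum][r["treated"]].append(r["after"] - r["before"])

    return {
        stratum: mean(groups[True]) - mean(groups[False])
        for stratum, groups in changes.items()
        if groups[True] and groups[False]
    }
```

Reporting one estimate per stratum makes it possible to see opposite-signed responses, such as a positive effect in one income band and a negative effect in another, that a single pooled average would hide.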
Blumenthal, Christian and Slemrod (1998) used a very similar design, focusing on the effect of moral appeals to stimulate voluntary compliance, but they failed to find support that such appeals matter for compliance. Again, the sample was stratified along the dimensions of income and opportunity to evade. Torgler (2004) confirmed this finding in a field experiment in collaboration with a local tax administration in Switzerland, expecting that the effect of moral appeals and social norms might be higher in a tight and localised community. However, he found that moral suasion has almost no effect on compliance. Similarly, Fellner, Sausgruber and Traxler (2013) studied evasion in the context of TV licences in Austria. They found that while threat letters had a strong effect, moral and social appeals did not affect compliance. These results are echoed in Castro and Scartascini (2013) in the case of property taxes in a municipality of Argentina. In line with the results of Slemrod et al. (2001), they argue that the effect of their messages is likely to be heterogeneous across different groups, based for instance on past compliance behaviour and wealth. For example, their equity message did not affect those who did not comply in the past, while it affected compliance negatively for those who did, as they might have revised their perception of evasion upwards after the message.
However, before drawing conclusions about the lack of effectiveness of moral factors, at least three caveats are due. Firstly, it is important to remember that lab studies in the early TE tradition have shown the importance of moral and social factors. Secondly, experiments can only draw conclusions on the effect of a specific treatment in a specific context, not on whether moral appeals are effective generally. For example, Ali, Fjeldstad and Sjursen (2014) demonstrate, using survey data, that there is a link between compliance and service delivery, but it depends on specific services that differ across countries. Thirdly, while moral and social factors almost certainly affect the level of compliance, they may have a smaller effect on variations in compliance. The reason is that the latter relies mostly on changes in behaviour of those who do not comply, and decide to start doing so in response to one of the treatments. As suggested by the results of Castro and Scartascini (2013), evaders are likely to care less about moral and social factors, therefore being potentially less responsive to moral appeals. Indeed, some research has revealed a strong and positive effect of messages about social norms and public goods. For example, Hallsworth et al. (2014) find that these factors affect payments of outstanding taxes due to the RA. Notably, by looking at the timeliness of outstanding payments, the authors are able to circumvent the issue of not being able to observe evasion. In addition, Bott et al. (2014) confirm that moral appeals matter in the case of Norway, where they result in doubling the average foreign income reported with respect to a base letter.
Another way to get round the problem of studying compliance when true income is not observed is to use audit data. By doing this, it is possible to get a more accurate measure of under-reporting, as declared income would be known from the tax return and real income from the audit. This method is used by Kleven et al. (2011) in studying a random sample of 42,800 taxpayers from Denmark, using data from both tax returns and randomised audits. First of all, the authors show that tax compliance is generally high despite relatively high tax rates. While this may be expected in the context of northern European countries, the situation may be very different in low-income countries. In fact, an important use of administrative tax data is to start shedding light on and quantifying differences in levels of compliance across countries as a preliminary step to the analysis of its drivers. The treatment used in Kleven et al. (2011) was a letter threatening an audit and the sample was stratified according to the reporting environment. In particular, the authors separated those who self-report their income and those who are subject to third-party reporting. They found that the letter had a positive effect on compliance, which was entirely driven by changes in self-reported income. Therefore, they conclude that taxpayers comply because they are unable to cheat (due to third-party reporting), rather than unwilling to do so. Pomeranz (2013) underlines the importance of information and reporting systems in the case of the VAT in Chile. She finds that the response to an increased probability of audit amongst firms is largely driven by transactions that are not covered by the paper trail generated by the VAT, namely sales to final consumers rather than transactions between firms. This is because the VAT paper trail makes it easier to detect evasion and therefore has a deterrence effect.
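When audit data are available, under-reporting can be measured directly rather than inferred. The sketch below shows one simple way of doing so and of comparing reporting environments; the record keys ('declared', 'audited', 'reporting') and the function names are assumptions for illustration and are not taken from Kleven et al. (2011).

```python
from statistics import mean


def underreporting_rate(declared_income, audited_income):
    """Share of audited (true) income that was missing from the tax return.

    Returns 0.0 when the audit finds no positive income or when the
    taxpayer declared at least as much as the audit found.
    """
    if audited_income <= 0:
        return 0.0
    return max(audited_income - declared_income, 0) / audited_income


def mean_rate_by_reporting_type(returns):
    """Average under-reporting rate for each reporting environment.

    Each item in `returns` is a dict with keys 'declared', 'audited' and
    'reporting' (for example 'self' or 'third_party').
    """
    grouped = {}
    for r in returns:
        rate = underreporting_rate(r["declared"], r["audited"])
        grouped.setdefault(r["reporting"], []).append(rate)
    return {kind: mean(rates) for kind, rates in grouped.items()}
```

Splitting the average by reporting environment mirrors the distinction drawn above between self-reported income and income subject to third-party reporting.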

Opportunities and challenges of TEs in low- and middle-income countries
The overview of TEs provided in the previous section reveals a notable gap regarding low- and middle-income countries. Some studies are ongoing in such countries and the results have not yet been published, for instance the study in Montevideo discussed below. In addition, it is important to recognise that some studies have been successfully completed and published. This is the case in particular of studies from Latin America, including for example Pomeranz (2013) on Chile and Castro and Scartascini (2013) on Argentina, with some examples also from Asia (such as Khan, Khwaja and Olken 2014). Nonetheless, this gap reflects an actual lack of evidence, particularly in Africa. There is currently no large-scale field experiment available for any African country, although researchers have been able to use administrative data on a few occasions. This section outlines the main opportunities and challenges in filling this gap, and argues that the main challenges can be overcome through research design and a good understanding of the local context.

Opportunities
As previously mentioned, TEs have several methodological advantages. First of all, they can tackle one of the common scepticisms around randomisation, namely the issue of scaling up. Field TEs are often carried out using a large sample of taxpayers that is nationally representative, as opposed to smaller scale experiments that may only involve one or more villages in a specific region of a country. By doing this, field TEs are better positioned to influence policy because the results are more readily scaled up to the national level. Secondly, TEs provide the opportunity to analyse compliance even if evasion is not directly observable. Data from randomised audits offer the most accurate measure of the extent of evasion, but they are not always available. However, even by relying only on tax returns data, it is possible to draw conclusions about compliance by observing differences in reporting behaviour between the treatment and control groups. This method is particularly suitable in a situation where it is virtually impossible to change the actual conditions, such as the tax rate, the actual probability of audit, or sanctions, for a selected group of people. However, by changing perceptions about the probability of detection or about the equity of the tax system through the use of letters, one can observe responses in compliance. The underlying idea is that taxpayers' behaviour does not only depend on changes in actual factors, but also on their perceptions and on information available to them (Castro and Scartascini 2014). These methodological advantages are made possible by the availability of data from tax returns. Notably, the use of administrative data also allows the production of rigorous studies with much lower budgets than the typical RCT, since RAs routinely collect these data. In this sense, TEs are a low-cost tool to evaluate RA initiatives and to find effective ways to increase tax revenues. For example, they can show that a simple and relatively cheap intervention, such as the 'threat of audit' letter, can increase tax liabilities by 12 per cent (Slemrod et al. 2001). By doing this, TEs can help tax authorities to disentangle which factors matter most for increasing compliance, even if in most cases all factors coexist to various extents. Governments have different preferences and views on how to improve compliance, from aggressive enforcement to voluntary compliance. However, both RAs and taxpayers are likely to benefit from a system where tax payments are made voluntarily based on a relation of trust, rather than coercively at a higher individual and social cost. The research community can help governments move in that direction by demonstrating evidence of the behavioural responses to this multitude of factors, including but also beyond deterrence. This requires a certain degree of buy-in from RAs, which underlines once again the importance of a close collaboration throughout the whole process.
TEs can also be useful as tools to evaluate specific policy initiatives that were already undertaken by RAs. For example, a team of researchers is carrying out an evaluation of a randomised policy intervention in Montevideo, Uruguay, where tax holidays are assigned to 'good' taxpayers using a lottery system (see Dunning et al. 2014). The initiative is aimed at promoting positive incentives for compliance, as opposed to the negative ones that may have occurred after a tax amnesty. Tax authorities often experiment with similar initiatives and are constantly looking for new strategies to increase tax revenues. However, such initiatives are only seldom evaluated, thus creating inefficiencies as the RA cannot clearly distinguish between those that work and those that do not. In this context, researchers can contribute by rigorously evaluating existing initiatives and indicating what the most effective policy options are. In the Montevideo example, they can establish whether positive incentives are indeed an effective tool to increase compliance.
Finally, TEs are a good case for research that can potentially be both of high academic quality and policy relevant. The latter is particularly ensured by the close collaboration required with the RA, which would ideally have a prominent role from the early stages, particularly in the research design phase. This close collaboration may also generate positive effects for the RA, such as hands-on capacity building on research methods and the establishment of partnerships with international research organisations, in addition to using the results to gain a better understanding of compliance and to identify the most effective ways to tackle it.

Challenges
It is likely that the single biggest obstacle in carrying out this type of research is the lack of data, or the unwillingness of some RAs to share it. This is particularly likely to be an issue in Africa, where RAs are relatively new and often have limited experience of collaboration with the research community. Moreover, until a few years ago many RAs in low-income countries did not have the necessary level of modernisation and digitalisation to make tax records available. However, today most countries, including in Africa, have modern RAs that can certainly benefit from increased collaboration with the research community and more rigorous evaluations. Some RAs have embraced this view and started sharing data with researchers in an anonymised format and in compliance with all the relevant laws. In the African context, South Africa is pioneering, with a large research department within SARS (South African Revenue Service) undertaking a great amount of work using data from tax records. For example, a recent study looked at the tax system's progressivity and its impact on the economy (Steyn and Jordaan 2012). Similarly, the Ethiopian Revenue and Customs Authority has recently provided data to researchers studying the impact of registration machines on VAT payments. This existing work is stimulating the interest of other tax administrations, and it may act as a catalyst for other RAs' involvement in this field. Importantly, it shows that the available data can be used to produce sound and relevant research. Although issues remain around data quality and consistency, they may not be so acute as to prevent any research from being done.
As far as methods are concerned, although TEs can tackle some of the main issues, a number of caveats are called for. For example, letter treatments may end up having unintended effects. Let us take the example of a 'threat' letter, signalling an increase in the probability of audit. If taxpayers have always perceived a high probability of audit, they would not respond by changing their behaviour. Similarly, evaders may believe that the tax authority is unable to uncover irregularities in their tax returns, especially in developing countries where administrative capacity may be low. Therefore, they would maintain their dishonest behaviour even after receiving the letter.
Another example is a 'social norms' letter that informs taxpayers that 90 per cent of the community complies with the law and only a minority evades. Taxpayers may believe, prior to receiving the letter, that almost everyone complies. So the treatment may increase their perception of evasion, leading them to adjust their behaviour to align with the community and thus comply less rather than more. In other words, the final result of treatments depends on previous beliefs, as they affect the taxpayers' response to the message they receive (Castro and Scartascini 2014).
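The dependence on prior beliefs can be made concrete with a deliberately stylised sketch. It assumes that a taxpayer moves towards whatever the community is perceived to do, which is a simplification of the argument above rather than a model taken from the literature; the function name and return labels are hypothetical.

```python
def expected_response(prior_belief, stated_compliance):
    """Direction in which a norm-following taxpayer is expected to move.

    prior_belief: share of the community the taxpayer believed to comply
        before receiving the letter (between 0 and 1).
    stated_compliance: share reported in the 'social norms' letter.
    """
    shift = stated_compliance - prior_belief
    if shift > 0:
        return "comply more"   # compliance is more common than expected
    if shift < 0:
        return "comply less"   # evasion is more common than expected
    return "no change"         # the letter confirms the prior belief
```

A taxpayer who believed that 99 per cent of the community complied and is told that 90 per cent do would, under this assumption, be classified as likely to comply less, which is the unintended effect described above.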
In principle, good research design can help in preventing some of these issues. However, this requires some degree of commitment and flexibility from the RA to make sure that the research project is successful. There may be circumstances, particularly in low-income countries, where the RA may be interested and willing to collaborate in the research, but it is still constrained in several ways, for example in terms of skilled employees, administrative capacity or IT systems. This implies that their collaboration with researchers can represent a significant burden, for example in making data available in an adequate and anonymised format. Rather than being an insurmountable obstacle, this issue requires a realistic and pragmatic approach, as well as adequate planning of research activities to take these specific constraints into account.
In addition to these general constraints, there may be more practical challenges. For example, the message on the letter should be salient, clear and concise. The standard format of the RA letters to taxpayers may or may not satisfy these requirements. If the message is not designed properly, for example because the local authority is unable to change its standard format, then the treatment may not have any effect as 'inattentive taxpayers' may not even be aware of it. In other words, 'information matters but how you present it does too' (Castro and Scartascini 2014: 10). This issue is fully preventable, for example through the use of pictures, well-designed messages, or separate letters to make the message more salient. However, there may be cases where the tax authority is unable or unwilling to adopt these measures. In other cases, the specific constraints faced by the RA can be very basic and simple; for example, it may not be possible in all countries to post letters to physical addresses. In this case, researchers may explore the possibility of delivering the message by other means or using text messages.

Similarly, clarity and simplicity are required in the communication of research results to policymakers. Tax experiments are sometimes complex, and the results may be difficult to distil down to simple policy messages. However, policymakers need precisely that, as well as clear indications on what may work. A related issue regards the fact that TEs are able to draw conclusions only on a specific treatment, with a specific design, and in a given context. This can result in experiments falling short of giving the sort of 'big picture' answers that policymakers often want. This issue was raised, in the more general context of RCTs, by a World Bank blog that underlined that much analysis is done on simple policy tweaks that are sometimes isolated from the broader context. One of the studies subject to this criticism was precisely a tax experiment conducted in Punjab (Khan et al. 2014), which analysed the effect of different incentives for tax collectors to raise more revenue. The study was criticised for abstracting from specific labour market factors that tax collectors face, and from the possibility that inspectors would, with time, find ways to circumvent these incentives and make money out of them.
Related to the need to be closely connected to the local context, it is very important to take into account social and cultural factors. Researchers need to be aware of and respect social sensitivities, not only in tailoring the content and form of letters but also in deciding who receives them. For example, Castro and Scartascini (2014) underline how it may be appropriate to exclude very poor neighbourhoods from experiments, to avoid sending 'threat' messages to the poorest and most vulnerable people who may face the hard choice of feeding their children or paying tax.

Conclusions
This paper has reviewed recent TEs and discussed the opportunities and challenges for expanding this type of research in developing countries. While it is far from being fully exhaustive, it leads to four concluding points. Firstly, experimental research on tax compliance has been advancing fast in recent years, allowing it to be better placed to offer policy-relevant solutions than earlier TEs. From a policy perspective, this makes TEs more attractive as a relatively low-cost tool that RAs can use to evaluate existing initiatives or to identify effective prospective measures. The fact that the methods have already been used in Europe and the US provides a good opportunity to learn from the existing body of work.
Secondly, despite these improvements there is still a gap as far as low-income countries are concerned, and particularly on the African continent. While this is due to relatively recent developments in the modernisation and digitalisation of tax administrations, there is now an opportunity to expand this field of research to new countries. Thirdly, when applying the methods to developing countries it is necessary to keep in mind the specific challenges that were highlighted in the previous section. In addition to data availability and quality, one of the most critical elements is the role of the local authorities. This is a particular challenge in countries where there is not a long tradition of collaboration between RAs and the research community, and where capacity constraints are high. The degree and quality of local authorities' involvement is a key element in determining both success and failure of the experiment. It also influences the usefulness of the results, i.e. whether they are likely to be picked up by policymakers or not. One possible way to ensure RA commitment is for research to be demand-driven. In other words, it should respond to a specific knowledge gap highlighted by the RA or a specific demand for evaluation of a current or prospective policy initiative. In this way, the TE can be built into the process of decision-making as a source of knowledge and evidence.
Finally, most of the difficulties described above can potentially be addressed by good research design that is both in line with the local context and conducive to rigorous evidence. The recent digitalisation of RAs makes it feasible to expand TE research in new contexts, which is fully in line with the policy priority given to tax revenue mobilisation in developing countries. Therefore, TEs have the potential to become a useful evaluation tool to better understand compliance and to improve the effectiveness of tax administrations' actions to tackle it.

Centre for Development Impact (CDI)
The Centre is a collaboration between IDS (www.ids.ac.uk) and ITAD (www.itad.com).
The Centre aims to contribute to innovation and excellence in the areas of impact assessment, evaluation and learning in development. The Centre's work is presently focused on:
(1) Exploring a broader range of evaluation designs and methods, and approaches to causal inference.
(2) Designing appropriate ways to assess the impact of complex interventions in challenging contexts.
(3) Better understanding the political dynamics and other factors in the evaluation process, including the use of evaluation evidence.
This CDI Practice Paper was written by Giulia Mascagni. The opinions expressed are those of the author and do not necessarily reflect the views of IDS or any of the institutions involved. Readers are encouraged to quote and reproduce material from issues of CDI Practice Papers in their own publication. In return, IDS requests due acknowledgement and quotes to be referenced as above.
© Institute of Development Studies, 2015 ISSN: 2053-0536 AG Level 2 Output ID: 317
Institute of Development Studies, Brighton BN1 9RE, UK T +44 (0) 1273 915637 F +44 (0) 1273 621202 E ids@ids.ac.uk W www.ids.ac.uk