Policy analysts continually seek new insights from new sources and types of data, and they look for the “best available data”. This is important for improving policy analysis, but sometimes analysts are more interested in publishing results than in understanding exactly what is in the underlying data. Sages know that databases need to be reviewed carefully before great weight is placed on the subsequent analysis. A database might be the “best available” and still have significant limitations that should caveat any analytical results.
Running regressions is fun, but an analyst should know what the underlying data includes. Interpreting regression results requires a thorough understanding of what the included variables measure and what key observations are missing from the data.
In the case of Base Erosion and Profit Shifting (BEPS) analyses, the OECD Action 11 report on Measuring and Monitoring BEPS spent its first chapter highlighting the caveats and limitations of existing databases for analyzing BEPS. All of the databases have limitations, although company-level data has many advantages for analyzing BEPS issues. Several empirical studies have used company financial data from Bureau van Dijk’s ORBIS (global) or AMADEUS (European) databases. Whilst possibly the “best available”, the ORBIS financial report data is not truly global and is incomplete in many respects.
The BEPS Action 13 Country-by-Country reports for the largest MNEs, which contain actual tax data rather than financial data, will significantly improve the data available to tax agency researchers. Until 2019 at the earliest, the ORBIS or AMADEUS databases may remain the “best available.” However, economists from the US Joint Committee on Taxation have demonstrated what can be done with comprehensive data on MNEs and their subsidiaries drawn from one country’s tax records (Dowd, Landefeld and Moore, 2015, see link).
We would caution analysts to caveat results from the ORBIS database carefully. As the following table from the BEPS Action 11 report shows, the ORBIS database is highly skewed toward European companies and European headquarters, and a careful look reveals the absence of many significant tax entities that have been shown to engage in BEPS behaviors.
Because the ORBIS database includes only publicly available financial information, it lacks financial information on most unconsolidated subsidiaries of US-headquartered companies, which file consolidated financial reports. It includes some entities in tax havens, but generally without any detailed financial information. When evaluating an available database, it is also important to check some of the largest and most important entities against other available data. For example, we were not surprised to find that the ORBIS database did not include Apple Sales International for many years, and that when a record for the entity was added it contained just the name of the entity, not financial information. Yet it was clear from parliamentary inquiries (and confirmed by the recently released EU illegal state-aid ruling involving Apple’s taxes in Ireland) that Apple Sales International would be a significant MNE observation in estimating the potential tax revenue lost through BEPS.
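The two checks described above, how skewed the coverage is and whether key entities are present with usable financials, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names (`name`, `hq_region`, `revenue`) and the entity names are all hypothetical, not drawn from ORBIS or any real database.

```python
from collections import Counter

# Hypothetical extract: names, regions and revenues are invented for illustration.
records = [
    {"name": "Alpha Holdings", "hq_region": "Europe", "revenue": 120.0},
    {"name": "Beta Manufacturing", "hq_region": "Europe", "revenue": 80.0},
    {"name": "Gamma Trading", "hq_region": "Europe", "revenue": None},
    {"name": "Delta Corp", "hq_region": "North America", "revenue": 300.0},
    {"name": "Epsilon Sales International", "hq_region": "Europe", "revenue": None},
]

# Hypothetical list of entities that outside sources suggest should matter.
key_entities = ["Delta Corp", "Epsilon Sales International", "Zeta Finance"]


def region_shares(rows):
    """Return each headquarters region's share of all records."""
    counts = Counter(row["hq_region"] for row in rows)
    total = sum(counts.values())
    return {region: n / total for region, n in counts.items()}


def audit_key_entities(rows, names):
    """Classify each key entity as ok, name-only, or missing from the data."""
    by_name = {row["name"]: row for row in rows}
    report = {}
    for name in names:
        row = by_name.get(name)
        if row is None:
            report[name] = "missing"
        elif row["revenue"] is None:
            report[name] = "name only, no financials"
        else:
            report[name] = "ok"
    return report


shares = region_shares(records)
audit = audit_key_entities(records, key_entities)
print(shares)
print(audit)
```

A check like this does not repair the data. It only makes the gaps visible so that they can be reported alongside any regression results.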
The fact that there may be several million observations for analysis does not overcome the fundamental limitation of data that is incomplete, unrepresentative and missing key companies. However, incompleteness and unrepresentativeness do not prevent good analysis from being done when results are reported with appropriate limitations. Clearly policy analysts should also follow the Hippocratic oath: “do no harm”. And analysts should be humble about empirical findings based on best available, but compromised, data.
Similarly, an empirical finding is sometimes described as “robust” because the underlying data has been tortured in multiple ways, or because a number of researchers using the same databases have found similar results. Such claims should not be believed if the underlying data is incomplete and unrepresentative.
Working with the best available data is important for continued progress in analyzing BEPS and other issues. But the old adage of garbage-in, garbage-out still applies. Another adage could be: cavalier use of “best available data” could be BAD analysis.
Tom Neubig and Bob Cline