Further thoughts on data and policy indicators apropos two recent papers on procurement regulation & competition: comments re Tas (2019a & 2019b)

The EUI Robert Schuman Centre for Advanced Studies’ working papers series has two interesting recent additions on the economic analysis of procurement regulation and its effects on competition, efficiency and value for money. Both papers are by BKO Tas.

The first paper: ‘Bunching Below Thresholds to Manipulate Public Procurement’ explores the effects of a contracting authority’s ‘bunching strategy’ to seek to exercise more discretion by artificially estimating the value of future contracts just below the thresholds that would trigger compliance with EU procurement rules. This paper is relevant to the broader discussion on the usefulness and adequacy of current EU (and WTO GPA) value thresholds (see eg the work of Telles, here and here), as well as on the regulatory decisions that EU Member States face on whether to extend the EU rules to ‘below-threshold’ contracts.

The second paper: ‘Effect of Public Procurement Regulation on Competition and Cost-Effectiveness’ uses the World Bank’s ‘Benchmarking Public Procurement’ quality scores to empirically test the positive effects of improved regulation quality on competition and value for money, measured as increases in the number of bidders and the probability that procurement price is lower than estimated cost. This paper is relevant in the context of recent discussions about the usefulness or not of procurement benchmarks, and regarding the increasing concern about reduced number of bids in EU-regulated public tenders.

In this blog post, I reflect on the methodology and insights of both papers, paying particular attention to the fact that both papers build on datasets and/or indexes (TED, the WB benchmark) that I find rather imperfect and unsuitable for this type of analysis (regarding TED, in the context of the Single Market Scoreboard for Public Procurement (SMPP) that builds upon it, see here; regarding the WB benchmark, see here). Therefore, not all criticisms below are to the papers themselves, but rather to the distortions that skewed, incomplete or misleading data and indicators can have on more refined analysis that builds upon them.

Bunching Below Thresholds to Manipulate Procurement (Tas: 2019a)

It is well-known that the EU procurement rules are based on a series of jurisdictional triggers and that one of them concerns value thresholds—currently regulated in Arts 4 & 5 of Directive 2014/24/EU. Contracts with an estimated value above those thresholds are subjected to the entire EU procurement regulation, whereas contracts of a lower value are solely subjected to principles-based requirements where they are of ‘cross-border interest’. Given the obvious temptation/interest in keeping procurement shielded from EU requirements, the EU Directives have included an anti-circumvention rule aimed at preventing Member States from artificially splitting contracts in order to keep their award below the relevant jurisdictional thresholds (Art 5(3) Dir 2014/24). This rule has been interpreted expansively by the Court of Justice of the European Union (see eg here).

‘Bunching Below Thresholds to Manipulate Public Procurement’ examines the effects of a practice that would likely infringe the anti-circumvention rule, as it assesses a strategy of ‘bunching estimated costs just below thresholds’ ‘to exercise more discretion in public procurement’. The paper develops a methodology to identify contracting authorities ‘that have higher probabilities of bunching estimated values below EU thresholds’ (ie manipulative authorities) and finds that ‘[m]anipulative authorities have significantly lower probabilities of employing competitive procurement procedure. The bunching manipulation scheme significantly diminishes cost-effectiveness of public procurement. On average, prices of below threshold contracts are 18-28% higher when the authority has an elevated probability of bunching.’ These are quite striking (but perhaps not surprising) results.

The paper employs a regression discontinuity approach to determine the likelihood of bunching. In order to do that, the paper relies on the TED database. The paper is certainly difficult to read and hardly intelligible for a lawyer, but there are some issues that raise important questions. One concerns the author's (mis)understanding of how the WTO GPA and the EU procurement rules operate, in particular when the paper states that ‘Contracts covered by the WTO GPA are subject to additional scrutiny by international organizations and authorities (sic). Accordingly, contracts covered by the WTO GPA are less likely to be manipulated by EU authorities’ (p. 12). This is simply an uncritical transplant of considerations made by the authors of a paper that examined procurement in the Czech Republic, where the relevant threshold between EU-covered and non-EU-covered procurement would make sense. Here, the distinction between WTO GPA and EU-covered procurement simply makes no sense, given that WTO GPA and EU thresholds are coordinated. This alone raises some issues concerning the tests designed by the author to check the robustness of the hypothesis that bunching leads to inefficiency in procurement expenditure.
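To make the intuition concrete, a naive bunching check can be sketched as follows. This is emphatically not the paper's regression discontinuity design: the threshold figure, band width and function name are my own illustrative assumptions.

```python
# Naive bunching check: compare how many contract value estimates fall in a
# narrow band just below a threshold versus just above it. An excess mass
# below (a ratio well above 1) is the kind of pattern bunching analysis
# looks for. The threshold and band are illustrative, not the paper's.
THRESHOLD = 144_000  # EUR; illustrative figure only
BAND = 10_000        # width of the comparison window on each side

def bunching_ratio(estimated_values):
    below = sum(1 for v in estimated_values if THRESHOLD - BAND <= v < THRESHOLD)
    above = sum(1 for v in estimated_values if THRESHOLD <= v < THRESHOLD + BAND)
    return below / above if above else float("inf")

# With invented data: two estimates just below the threshold and one just
# above it give a ratio of 2.0.
```

A real test would, as in the paper, model a counterfactual density of estimated values and assess the statistical significance of any excess mass, rather than comparing raw counts.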

Another issue concerns the way in which the author equates open procedures to a ‘first price auction mechanism’ (which they are not exactly) and dismisses other procedures (notably, the restricted procedure) as incapable of ensuring value for money or, more likely, as representative of a higher degree of discretion for the contracting authority—which is a highly questionable assumption.

More importantly, I am not sure that the author understood what is in the TED database and, crucially, what is not there (see section 2 of Tas (2019a) for methodology and data description). Albeit not very clearly, the author presents TED as a comprehensive database of procurement notices—ie, as if 100% of procurement expenditure by Member States was recorded there. However, in the specific context of bunching below thresholds, the TED database is very likely to be incomplete.

Contracting authorities tendering contracts below EU thresholds are under no obligation to publish a contract notice (Art 49 Dir 2014/24). They could publish voluntarily, in particular in the form of a voluntary ex ante transparency (VEAT) notice, but that would make no sense from the perspective of a contracting authority that seeks to avoid compliance with EU rules by bunching (ie manipulating) the estimated contract value, as that would expose it to potential litigation. Most authorities that are bunching their procurement needs (or, in simple terms, avoiding compliance with the EU rules) will not be reflected in the TED database at all, or will not be identified by the methodology used by Tas (2019a), as they will not have filed any notices for contracts below thresholds.

How is it possible that TED includes notices regarding contracts below the EU thresholds, then? Well, this is anybody’s guess, but mine is that a large proportion of those notices will be linked to countries with a tradition of full transparency (over-reporting), or to contracts where there are doubts about the potential cross-border interest (sometimes assessed over-cautiously), or will simply contain mistakes, with the estimated value of the contract erroneously indicated as below thresholds.

Even if my guess were incorrect and all notices for contracts with a value below thresholds were accurate and justified by the existence of a potential cross-border interest, the database cannot be considered complete. One of the issues raised (imperfectly) by the Single Market Scoreboard (indicator [3] publication rate) is the relatively low level of procurement that is advertised in TED compared to the (putative/presumptive) total volume of procurement expenditure by the Member States. Without information on the conditions of the vast majority of contract awards (below thresholds, unreported, etc), any analysis of potential losses of competitiveness / efficiency in public expenditure (due to bunching or otherwise) is bound to be misleading.

Moreover, Tas (2019a) is premised on the hypothesis that procurement below EU thresholds allows for significantly more discretion than procurement above those thresholds. However, this hypothesis fails to recognise the variety of transposition strategies at Member State level. While some countries have opted for less stringent below EU threshold regimes, others have extended the EU rules to the entirety of their procurement (or, perhaps, to contracts of much lower values than the EU thresholds, with the exception of some class of ‘micropurchases’). This would require the introduction of a control that could refine Tas’ analysis and distinguish those cases of bunching that do lead to more discretion from those that do not (at least formally)—which could perhaps separate price effects derived from national-only transparency from those of more legally dubious manoeuvring.

In my view, regardless of the methodology and the math underpinning the paper (which I am in no position to assess in detail), once these data issues are taken into account, the story the paper tries to tell breaks down: there are important shortcomings in its empirical strategy that raise significant doubts about the strength of its findings—assessed not against the information in TED, but against the (largely unknown, unrecorded) reality of procurement in the EU.

I have no doubt that there is bunching in practice, and that the intuition that it raises procurement costs must be right, but I have serious doubts about the possibility to reliably identify bunching or estimate its effects on the basis of the information in TED, as most culprits will not be included in it, and the effects of below-threshold (national-only) competition will mostly not be accounted for.

(Good) Regulation, Competition & Cost-Effectiveness (Tas: 2019b)

It is also a very intuitive hypothesis that better regulation should lead to better procurement outcomes and, consequently, that more open and robust procurement rules should lead to more efficiency in the expenditure of public funds. As mentioned above, Tas (2019b) explores this hypothesis and seeks to empirically test it using the TED database and the World Bank’s Benchmarking Public Procurement (in its 2017 iteration, see here). I will not repeat my misgivings about the use of the TED database as a reliable source of information. In this second part, I will solely comment on the use of the WB’s benchmark.

The paper relies on four of the WB’s benchmark indicators (one of them constructed by Djankov et al (2017)): the ‘bid preparation score, bid and contract management score, payment of suppliers score and PP overall index’. The paper includes a useful table with these values (see Tas (2019b: Table 4)), which allows the author to rank the countries according to the quality of their procurement regulation. The findings of Tas (2019b) are thus entirely dependent on the quality of the WB’s benchmark and its ability to capture (and distinguish) good procurement regulation.

In order to test the extent to which the WB’s benchmark is a good input for this sort of analysis, I have compared it to the indicator that results from the European Commission’s Single Market Scoreboard for Public Procurement (SMSPP, in its 2018 iteration). The comparison is rather striking …

Source: own elaboration.

Clearly, both sets of indicators are based on different methodologies and measure relatively different things. However, they are both intended to express relevant regulators’ views on what constitutes ‘good procurement regulation’. In my view, both of them fail to do so for reasons already given (see here and here).

The implication for work such as Tas (2019b) is that the reliability of the findings—regardless of the math underpinning them—is as weak as the indicators they are based on. Likely, plugging the SMSPP into the same methods instead of the WB’s index would yield very different results—perhaps, that countries with very low quality of procurement regulation (as per the SMSPP index) achieve better economic results, which would not be a popular story with policy-makers…  and the results with either index would also be different if the algorithms were fed not by TED, but by a more comprehensive and reliable database.

So, the most that can be said is that attempts to empirically show effects of good (or poor) procurement regulation remain doomed to fail or, in perhaps less harsh terms, doomed to tell a story based on a very skewed, narrow and anecdotal understanding of procurement and an incomplete recording of procurement activity. Believe those stories at your own peril…

Data and procurement policy: some thoughts on the Single Market Scoreboard for public procurement

There is a growing interest in the use of big data to improve public procurement performance and to strengthen procurement governance. This is a worthy endeavour and, like many others, I am concentrating my research efforts in this area. I have not been doing this for too long. However, soon after one starts researching the topic, a preliminary conclusion clearly emerges: without good data, there is not much that can be done. No data, no fun. So far so good.

It is thus a little discouraging to confirm that, as is widely accepted, there is no good data architecture underpinning public procurement practice and policy in the EU (and elsewhere). Consequently, there is a rather limited prospect of any real implementation of big data-based solutions, unless and until there is a significant investment in the creation of a proper data foundation that can enable advanced analysis and policy-making. Adopting the Open Contracting Data Standard for the European Union would be a good place to start. We could then discuss to what extent the data needs to be fully open (hint: it should not be, see here and here), but let’s save that discussion for another day.

What a recent twitter thread has reminded me is that there is a bigger downside to the existence of poor data than being unable to apply advanced big data analytics: the formulation of procurement policy on the basis of poor data and poor(er) statistical analysis.

This reflection emerged on the basis of the 2018 iteration of the Single Market Scoreboard for Public Procurement (the SMSPP), which is the closest the European Commission is getting to data-driven policy analysis, as far as I can see. The SMSPP is still work in progress. As such, it requires some close scrutiny and, in my view, strong criticism. As I will develop in the rest of this post, the SMSPP is problematic not solely in the way it presents information—which is clearly laden with implicit policy judgements of the European Commission—but, more importantly, due to its inability to inform either cross-sectional (ie comparative) or time series (ie trend) analysis of public procurement policy in the single market. Before developing these criticisms, I will provide a short description of the SMSPP (as I understand it).

The Single Market Scoreboard for Public Procurement: what is it?

The European Commission has developed the broader Single Market Scoreboard (SMS) as an instrument to support its effort of monitoring compliance with internal market law. The Commission itself explains that the “scoreboard aims to give an overview of the practical management of the Single Market. The scoreboard covers all those areas of the Single Market where sufficient reliable data are available. Certain areas of the Single Market such as financial services, transport, energy, digital economy and others are closely monitored separately by the responsible Commission services“ (emphasis added). The SMS organises information in different ways, such as by stage in the governance cycle; by performance per Member State; by governance tool; by policy area or by state of trade integration and market openness (the latter two are still work in progress).

The SMS for public procurement (SMSPP) is an instance of SMS by policy area. It thus represents the Commission’s view that the SMSPP is (a) based on sufficiently reliable data, as it is fed from the database resulting from the mandatory publications of procurement notices in the Tenders Electronic Daily (TED), and (b) a useful tool to provide an overview of the functioning of the single market for public procurement or, in other words, of the ‘performance’ of public procurement, defined as a measure of ‘whether purchasers get good value for money‘.

The SMSPP determines the overall performance of a given Member State by aggregating a number of indicators. Currently, the SMSPP is based on 12 indicators (it used to be based on a smaller number, as discussed below): [1] Single bidder; [2] No calls for bids; [3] Publication rate; [4] Cooperative procurement; [5] Award criteria; [6] Decision speed; [7] SME contractors; [8] SME bids; [9] Procedures divided into lots; [10] Missing calls for bids; [11] Missing seller registration numbers; [12] Missing buyer registration numbers. As the SMSPP explains, the addition of these indicators results in the measure of ‘overall performance’, which

is a sum of scores for all 12 individual indicators (by default, a satisfactory performance in an individual indicator increases the overall score by one point while an unsatisfactory performance reduces it by one point). The 3 most important are triple-weighted (Single bidder, No calls for bids and Publication rate). This is because they are linked with competition, transparency and market access–the core principles of good public procurement. Indicators 7-12 receive a one-third weighting. This is because they measure the same concepts from different perspectives: participation by small firms (indicators 7-9) and data quality (indicators 10-12).
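As a sketch, the aggregation described in that passage can be reconstructed as follows. The weights follow the quoted narrative; the code itself is my own reconstruction, not the Commission's.

```python
from fractions import Fraction

# Weights per the quoted description: indicators 1-3 triple-weighted,
# 4-6 unit weight, 7-12 one-third weight (exact fractions avoid rounding).
WEIGHTS = [3] * 3 + [1] * 3 + [Fraction(1, 3)] * 6

def overall_performance(scores):
    """scores: 12 values, +1 satisfactory, -1 unsatisfactory, 0 neutral."""
    if len(scores) != 12:
        raise ValueError("expected 12 indicator scores")
    return sum(w * s for w, s in zip(WEIGHTS, scores))

# A country satisfactory on all 12 indicators would score 3*3 + 3 + 6/3 = 14.
```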

The most recent snapshot of overall procurement performance is represented in the map below, which would indicate that procurement policy is rather dysfunctional—as most EEA countries do not seem to be doing very well.

Source: European Commission, 2018 Single Market Scorecard for Public Procurement (based on 2017 data).

In my view, this use of the available information is very problematic: (a) to begin with, because the data in TED can hardly be considered ‘sufficiently reliable‘. The TED database has problems of various sorts because it is constructed from data self-declared by the contracting authorities of the Member States, which makes its content very heterogeneous and difficult to analyse, including significant problems of under-inclusiveness, definitional fuzziness and the lack of filtering of errors—as recognised, repeatedly, in the methodology underpinning the SMSPP itself. This should make one take the results of the SMSPP with more than a pinch of salt. However, these are not all the problems implicit in the SMSPP.

More importantly: (b) the definition of procurement performance and the ways in which the SMSPP seeks to assess it are far from universally accepted. They are rather judgement-laden and reflect the policy biases of the European Commission without making this sufficiently explicit. This issue requires further elaboration.

The SMSPP as an expression of policy-making: more than dubious judgements

I already criticised the Single Market Scoreboard for public procurement three years ago, mainly on the basis that some of the thresholds adopted by the European Commission to establish whether countries performed well or poorly in relation to a given indicator were not properly justified or backed by empirical evidence. Unfortunately, this remains the case and the Commission is yet to make a persuasive case for its decision that eg, in relation to indicator [4] Cooperative procurement, countries that aggregate 10% or more of their procurement achieve good procurement performance, while countries that aggregate less than 10% do not.

Similar issues arise with other indicators, such as [3] Publication rate, which measures the value of procurement advertised on TED as a proportion of national Gross Domestic Product (GDP). It is given threshold values of more than 5% for good performance and less than 2.5% for poor performance. The Commission considers that this indicator is useful because ‘A higher score is better, as it allows more companies to bid, bringing better value for money. It also means greater transparency, as more information is available to the public.’ However, this is inconsistent with the fact that the SMSPP methodology stresses that the indicator is affected by the ‘main shortcoming … that it does not reflect the different weight that government spending has in the economy of a particular’ Member State (p. 13). It also fails to account for different economic models where some Member States can retain a much larger in-house capability than others, as well as failing to reflect other issues such as fiscal policies, etc. Moreover, the SMSPP includes a note that says that ‘Due to delays in data availability, these results are based on 2015 data (also used in the 2016 scoreboard). However, given the slow changes to this indicator, 2015 results are still relevant.‘ I wonder how it is possible to establish that there are ‘slow changes’ to the indicator when there is no more current information. On the whole, this is clearly an indicator that should be dropped, rather than included with such a phenomenal number of (partially hidden) caveats.
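For concreteness, the mechanics of indicator [3] can be restated in code. The cut-offs are those quoted above; the function name and the 'neutral' label for the middle band are my own assumptions.

```python
# Indicator [3] 'Publication rate': value of procurement advertised on TED
# as a share of GDP, scored against the stated cut-offs (>5% good, <2.5%
# poor). Both inputs must be in the same currency units.
def publication_rate_score(ted_value, gdp):
    rate = 100 * ted_value / gdp  # percentage of GDP
    if rate > 5:
        return "satisfactory"
    if rate < 2.5:
        return "unsatisfactory"
    return "neutral"
```

Stating it this way also makes the criticism plain: the denominator is GDP rather than total procurement expenditure, so two countries advertising the same share of their actual procurement can receive different scores simply because government spending weighs differently in their economies.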

On the whole, then, the SMSPP and a number of the indicators on which it is based are reflective of the implicit policy biases of the European Commission. In my view, it is disingenuous to try to save this by simply stressing that the SMSPP and its indicators

Like all indicators, however, they simplify reality. They are affected by country-specific factors such as what is actually being bought, the structure of the economies concerned, and the relationships between different tendering options, none of which are taken into account. Also, some aspects of public procurement have been omitted entirely or covered only indirectly, e.g. corruption, the administrative burden and professionalism. So, although the Scoreboard provides useful information, it gives only a partial view of EU countries' public procurement performance.

I would rather argue that, in these conditions, the SMSPP is not really useful. In particular, because it fails to enable analysis that could offer some valuable insights even despite the shortcomings of the underlying indicators: first, a cross-sectional analysis by comparing different countries under a single indicator; second, a trend analysis of evolution of procurement “performance” in the single market and/or in a given country.

The SMSPP and cross-sectional analysis: not fit for purpose

This criticism is largely implicit in the previous discussion, as the creation of indicators that are not reflective of ‘country-specific factors such as what is actually being bought, the structure of the economies concerned, and the relationships between different tendering options’ by itself prevents meaningful comparisons across the single market. Moreover, a closer look at the SMSPP methodology reveals that there are further issues that make such cross-sectional analysis difficult. To continue the discussion concerning indicator [4] Cooperative procurement, it is remarkable that the SMSPP methodology indicates that

[In previous versions] the only information on cooperative procurement was a tick box indicating that "The contracting authority is purchasing on behalf of other contracting authorities". This was intended to mean procurement in one of two cases: "The contract is awarded by a central purchasing body" and "The contract involves joint procurement". This has been made explicit in the [current methodology], where these two options are listed instead of the option on joint procurement. However, as always, there are exceptions to how uniformly this definition has been accepted across the EU. Anecdotally, in Belgium, this field has been interpreted as meaning that the management of the procurement procedure has been outsource[d] (e.g. to a legal company) -which explains the high values of this indicator for Belgium.

In simple terms, what this means is that the data point for Belgium (and any other country?) should have been excluded from analysis. In contrast, the SMSPP presents Belgium as achieving a good performance under this indicator—which, in turn, skews the overall performance of the country (which is, by the way, one of the few achieving positive overall performance… perhaps due to these data issues?).

This should give us some pause before we decide to give any meaning to cross-country comparisons at all. Additionally, as discussed below, we cannot (simply) rely on year-on-year comparisons of the overall performance of any given country.

The SMSPP and time series analysis: not fit for purpose

Below is a comparison of the ‘overall performance’ maps published in the last five iterations of the SMSPP.

Source: own elaboration, based on the European Commission’s Single Market Scoreboard for Public Procurement for the years 2014-2018 (please note that this refers to publication years, whereas the data on which each of the reports is based corresponds to the previous year).

One would be tempted to read these maps as representing a time series and thus as allowing for trend analysis. However, that is not the case, for various reasons. First, the overall performance indicator has been constructed on the basis of different (sub)indicators in different iterations of the SMSPP:

  • the 2014 iteration was based on three indicators: bidder participation; accessibility and efficiency.

  • the 2015 SMSPP included six indicators: single bidder; no calls for bids; publication rate; cooperative procurement; award criteria and decision speed.

  • the 2016 SMSPP also included six indicators. However, compared to 2015, the 2016 SMSPP omitted ‘publication rate’ and instead added an indicator on ‘reporting problems’.

  • the 2017 SMSPP expanded to 9 indicators. Compared to 2016, the 2017 SMSPP reintroduced ‘publication rate’ and replaced ‘reporting problems’ with indicators on ‘missing values’, ‘missing calls for bids’ and ‘missing registration numbers’.

  • the 2018 SMSPP, as mentioned above, is based on 12 indicators. Compared to 2017, the 2018 SMSPP has added indicators on ‘SME contractors’, ‘SME bids’ and ‘procedures divided into lots’. It has also deleted the indicator ‘missing values’ and disaggregated the ‘missing registration numbers’ into ‘missing seller registration numbers’ and ‘missing buyer registration numbers’.

It is plain that there are no two consecutive iterations of the SMSPP based on comparable indicators. Moreover, the way that the overall performance is determined has also changed. While the SMSPP for 2014 to 2017 established the overall performance as a ‘deviation from the average’ of sorts, whereby countries were given ‘green’ for overall marks above 90% of the average mark, ‘yellow’ for overall marks between 80 and 90% of the average mark, and ‘red’ for marks below 80% of the average mark; in the 2018 SMSPP, ‘green’ indicates a score above 3, ‘yellow’ indicates a score below 3 and above -3, and ‘red’ indicates a score below -3. In other words, the colour coding for the maps has changed from a measure of relative performance to a measure of absolute performance—which, in fairness, could be more meaningful.
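The change can be made concrete by restating the two colour-coding schemes side by side. This is my own reconstruction from the descriptions above; behaviour exactly at the cut-offs is not specified in the Scoreboard, so the boundary handling here is an assumption.

```python
def colour_relative(score, average):
    # 2014-2017 scheme: performance relative to the cross-country average.
    if score > 0.9 * average:
        return "green"
    if score >= 0.8 * average:
        return "yellow"
    return "red"

def colour_absolute(score):
    # 2018 scheme: fixed cut-offs on the aggregate score itself.
    if score > 3:
        return "green"
    if score < -3:
        return "red"
    return "yellow"

# The same underlying performance can change colour across iterations simply
# because the scheme changed, which is one reason the maps cannot be read as
# a time series.
```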

As a result of these (and, potentially, other) issues, the SMSPP is clearly unable to support trend analysis, either at single market or country level. However, despite the disclaimers in the published documents, this remains a risk (to the extent that anyone really engages with the SMSPP).

Overall conclusion

The example of the SMSPP does not augur very well for the adoption of data analytics-based policy-making. This is a case where, despite acknowledging shortcomings in the methodology and the data, the Commission has pressed on, seemingly on the premise that ‘some data (analysis) is better than none’. However, in my view, this is the wrong approach. To put it plainly, the SMSPP is rather useless. However, it may create the impression that procurement data is being used to design policy and support its implementation. It would be better for the Commission to stop publishing the SMSPP until the underlying data issues are corrected and the methodology is streamlined. Otherwise, the Commission is simply creating noise around data-based analysis of procurement policy, and this can only erode its reputation as a policy-making body and the guardian of the single market.


New edition of Public Procurement Indicators published by the European Commission -- some odd UK numbers

In late December 2016, the European Commission published its Public Procurement Indicators 2015. The statistical information included in this report shows some interesting trends, such as the general increase of procurement expenditure in the EU in 2015 -- which was up by almost 7% from 2014, to reach a total of €450.21 billion -- as well as the continued trend of concentration of procurement expenditure that results from aggregation and/or centralisation of procurement at Member State level.

Regarding the trend towards greater concentration of procurement expenditure in large awards, it is interesting to note that 'at EU level more than one third of the value ... is awarded through contract award notices of 100 million euros or more. This relative concentration of procurement, in large awards, is extremely remarkable in the UK and to a lesser extent in Poland and France. On the opposite side Germany and France concentrate a large fraction of the value procured in the works sector in the smaller size awards'.

This seems surprising because projects of more than €100 million may be relatively common in works (ie infrastructure), as well as large framework agreements for common use equipment (notably, IT hardware), but services contracts of that size would have seemed much less common at first thought (although it is possible that IT expenditure is moving from goods to services as cloud computing and other services are 'virtualised'). In any case, the fact that the trend is much stronger in the UK than in the rest of the EU (combined) strikes me as odd.

Indeed, as the Commission's report stresses, the UK leads the statistics for the award of very large contracts (ie those of a value over €100 million), both for works (66%) and, possibly more remarkably, for services (70%) -- with a smaller but still very significant lead on goods (52%). What is worth emphasising is that the UK's figures are 10 times the magnitude of those for any other Member State (and around 50 to 100 times those of most Member States) both for works and services, and that they double the figures for any other Member State in goods (while still being around 50 to 100 times those of most Member States).

A recalculation of the figures concerning very large contracts excluding data for the UK shows that 22% of procurement expenditure at EU27 level is awarded through contracts of €100 million or more for works and services, and 25% for goods (and, anecdotally, it should be taken into account that a significant part of the latter is attributable to Italy's Consip centralised procurement activities). Thus, the fact that the UK alone can move total figures up by 11% (ie a deviation of 50% from the EU27 statistics) seems quite striking.
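The arithmetic of that recalculation can be sketched as follows, with invented placeholder figures rather than the report's data:

```python
# Share of procurement value awarded through very large (>= EUR 100m)
# contracts, computed with and without one country. All figures below are
# invented placeholders for illustration, not the Commission's data.
def large_contract_share(large_by_country, total_by_country, exclude=()):
    large = sum(v for k, v in large_by_country.items() if k not in exclude)
    total = sum(v for k, v in total_by_country.items() if k not in exclude)
    return 100 * large / total

large = {"UK": 60.0, "rest": 40.0}    # value in very large awards (invented)
total = {"UK": 100.0, "rest": 200.0}  # total awarded value (invented)

with_uk = large_contract_share(large, total)             # ~33.3%
without_uk = large_contract_share(large, total, {"UK"})  # 20.0%
```

Even with made-up numbers, the sketch shows how a single outlier country can pull the aggregate share far above the figure for the remaining countries, which is exactly the pattern the report attributes to the UK.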

Moreover, in general, the UK shows a disproportionately high share of contracts advertised in TED (and the estimated value of its contracts is larger throughout the value scale, with far fewer small contracts and far more large contracts than the EU average, which has an effect on the obligation to publish notices). This is particularly noticeable when compared to other EU countries with large procurement expenditure -- eg in 2015 the UK advertised an estimated 37% of its public expenditure, whereas Germany advertised less than 10% (see graphs below).

In view of these statistical divergences, a closer look at the UK numbers seems necessary in order to try to understand this trend towards concentrated expenditure through very large contracts. However, there is no detailed information in the report on the basis of which to carry out a qualitative analysis.

Many hypotheses are imaginable, such as the possibility that very large centralised contracts are tendered (for example, by the Crown Commercial Service) but are not necessarily executed to a large percentage of their estimated value, or that the UK is actually significantly more centralised in terms of procurement than other Member States, particularly in services. Each of these possibilities invites speculation--for instance, about the reliability of statistical information that could include awarded but unexecuted procurement value (which may be extremely relevant in the imminent Brexit-related negotiations, as well as for any reevaluation of GPA coverage both for the EU and the UK) or, under the second scenario, about the drivers for such significant differences in centralisation volumes across Member States and about the possibility of centralising the procurement of services in a way that still allows for the proper provision of (public) services to end users.

Either way, I find these issues most intriguing. If unexecuted (contracted) expenditure is indeed included in the statistics, more work should go into the collection of actual information and the publication of raw data that allows for refined analysis--ideally, in relation to the 2016 version of the public procurement indicators.


World Bank's "Benchmarking Public Procurement 2017"

The World Bank has recently published its report Benchmarking Public Procurement 2017, where it presents a 'cross-country analysis in 180 economies on issues affecting how private sector does business with the government. The report covers two thematic pillars: the procurement process and complaint review mechanisms'.

The information is structured around eight main indicators, which cover the following areas:

  1. Needs assessment, call for tender, and bid preparation: The indicators assess the quality, adequacy, and transparency of the information provided by the procuring entity to prospective bidders.
  2. Bid submission phase: The indicators examine the requirements that suppliers must meet in order to bid effectively and avoid having their bid rejected.
  3. Bid opening, evaluation, and contract award phase: The indicators measure the extent to which the regulatory framework and procedures provide a fair and transparent bid opening and evaluation process, as well as whether, once the best bid has been identified, the contract is awarded transparently and the losing bidders are informed of the procuring entity’s decision.
  4. Content and management of the procurement contract: The indicators focus on several aspects during the contract execution phase related to the modification and termination of the procurement contract, and the procedure for accepting the completion of works.
  5. Performance guarantee: The indicators examine the existence and requirements of the performance guarantee.
  6. Payment of suppliers: The indicators focus on the time and procedure needed for suppliers to receive payment during the contract execution phase.
  7. Complaints submitted to the first-tier review body: The indicators explore the process and characteristics of filing a complaint before the first-tier review body.
  8. Complaints submitted to the second-tier review body: The indicators assess whether the complaining party can appeal a decision before a second-tier review body and, if so, the cost, time and characteristics of such a review.

The report aims to make progress in the much needed collection of more information, particularly of statistical nature, about the procurement systems that exist around the world. In its own words, '[i]t aims to promote evidence-based decision making by governments and to build evidence in areas where few empirical data have been presented so far. As researchers recognize, “the comparison of different forms of regulation and quantitative measurement of the impact of regulatory changes on procurement performance of public entities will help reduce the costs of reform and identify and disseminate best practices.”' [with reference to Yakovlev, Tkachenko, Demidova & Balaeva, 'The Impacts of Different Regulatory Regimes on the Effectiveness of Public Procurement' (2015) 38 (11) International Journal of Public Administration 796-814].

The report also recognises some of its main substantive and methodological limitations (see p.26). However, even taking those into account, the benchmarking exercise seems rather imperfect, with limited potential to inform policy-making and reform. A couple of examples illustrate why.

First, in terms of the methodology for the scoring of procurement systems, I am not sure I understand the logic for the award of points or the scale used to weight the different criteria. For instance, when assessing the accessibility of the procurement process, procurement systems are awarded 1 point if bidders are required to register on a government registry of suppliers, and 0 points if there is no registration requirement. To my mind, this is contrary to what logic would dictate because a system that does not require previous or additional registration is more open than one that does.

Similarly, when assessing the existence and requirements for the provision of bid securities, procurement systems get a score of 1 for each option they provide across a range of questions: whether a bid security or a bid declaration is required; whether the bid security amount is capped at a certain percentage of the contract value or of the submitted bid, or at a certain flat amount; whether suppliers can choose the form of the bid security instrument; and, if bidders are required to post a bid security instrument, whether there is a time frame for the procuring entity to return it. Additionally, procurement systems are awarded up to 1 further point depending on the forms of bid security instrument they accept: cash deposit, bank guarantee and insurance guarantee (1/3 of a point each). This means that systems with more flexibility in the way they regulate bid securities will get higher scores (which is fair enough), but that systems that do not require bid securities at all will get no points. This, for instance, makes the UK (50 points) lag behind Spain (94 points) on this indicator, despite the fact that the UK is recorded as having no bid security requirement and Spain as requiring a bid security proportionate to the value of the contract (I am not assessing this information which, at least in the case of Spain, requires some nuances). Once again, this is contrary to what logic would dictate because procurement systems that do not require bid securities are more open and accessible (particularly to SMEs).
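A minimal sketch of that scoring logic (with illustrative question names, not the World Bank's exact wording) shows why a system with no bid security requirement necessarily bottoms out at zero:

```python
# Sketch of the bid-security scoring logic described above: 1 point per
# affirmative yes/no answer, plus 1/3 of a point per accepted form of bid
# security instrument (max 1 further point). Question names are illustrative.

QUESTIONS = [
    'security_or_declaration_required',
    'amount_capped_as_percentage_or_flat',
    'supplier_chooses_instrument_form',
    'time_frame_to_return_instrument',
]
INSTRUMENTS = ['cash_deposit', 'bank_guarantee', 'insurance_guarantee']

def bid_security_score(answers: dict) -> float:
    """Points for one procurement system's bid-security answers."""
    points = sum(1 for q in QUESTIONS if answers.get(q))
    accepted = answers.get('accepted_instruments', ())
    points += sum(1/3 for i in INSTRUMENTS if i in accepted)
    return points

# A system with no bid security requirement answers 'no' everywhere, so it
# scores 0 -- below a more restrictive system that requires a security but
# regulates it flexibly.
no_security = bid_security_score({})
flexible_security = bid_security_score({
    'security_or_declaration_required': True,
    'amount_capped_as_percentage_or_flat': True,
    'supplier_chooses_instrument_form': True,
    'time_frame_to_return_instrument': True,
    'accepted_instruments': ('cash_deposit', 'bank_guarantee',
                             'insurance_guarantee'),
})
print(no_security, flexible_security)  # 0 vs 5.0
```

The sketch makes the inversion explicit: openness (no security required) is structurally penalised relative to well-regulated restriction, which is the logical flaw identified above.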

Second, in terms of the comparisons that can be made with the scores as published, I am not sure that the way the information is presented can actually help understand the drivers of different scores for different countries. Most points are awarded on the basis of a yes/no answer to given questions. Given that some questions are rather open-ended or simply confusing (eg the question concerning 'Criteria for bid evaluation' asks whether the procurement system includes "Price and other qualitative elements", but all procurement systems get a score of 1 regardless of the answer), their ability to allow for comparisons is minimal. Moreover, the individual scoring for each criterion is not provided, which prevents direct comparisons even where questions are narrower and actually award different scores to different answers.

Overall, sadly, I am afraid that the report Benchmarking Public Procurement 2017 can only be seen as a first step towards creating a useful system and scoring matrix to benchmark all public procurement systems in the world. I would think that this is achievable, particularly because the field work of information collection is already in place (unless the information was collected as direct responses to the questionnaire linked to the scoring rule), and that the published version of the report could be significantly improved solely on the basis of a better analysis of the raw information collected by the World Bank team. On that point, it is a shame that this information is not published, and I would invite the World Bank to reconsider and publish the database of raw information, so that more specific proposals on how to improve the scoring method can be developed without the need to collect additional information.

Some recent indicators of public procurement in the EU

The European Commission has published some indicators on the evolution of public procurement in the EU up to December 2014 (most recent available data). There are two sets of indicators worth having a look at.

Public Procurement Performance

First, the Commission (DG Grow) has published indicators on public procurement performance in the Member States, which provide a comparative view of the countries' adherence to 'good procurement' as measured by 6 simplified indicators -- or, in other words, indicators aimed at measuring 'the extent to which purchasers obtain good value for money'. The creation of a single 'quick-look' indicator seems appealing. However, some attention to the way in which the indicator is calculated may raise issues as to its usefulness.

Source: European Commission.

In that regard, it is worth mentioning that the Commission has created 6 discrete indicators: [1] One Bidder; [2] No Calls for Bids; [3] Aggregation; [4] Award Criteria; [5] Decision Speed; and [6] Reporting Quality (details available here). Interestingly, in order to construct the 'Overall Performance' indicator (used in the map above), the Commission uses a 'weighted average of all the performance indicators. Triple weight is given to most important indicators: One Bidder and No Calls for Bids.' Given this methodology, the Commission is careful to indicate that

Like all indicators, however, these indicators simplify reality. They are affected by country-specific factors such as the composition of procurement, the structure of the economies concerned, and the relationships between different tendering options, none of which are taken into account. Also, some aspects of public procurement are omitted entirely or covered only indirectly - for instance corruption, administrative burden and professionalism. Thus, although the Scoreboard provides very useful information, it gives only a partial view of EU countries' public procurement performance.
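The weighted aggregation the Commission describes can be sketched as follows, assuming equal unit weights on the four remaining indicators (the indicator values themselves are hypothetical):

```python
# Sketch of the 'Overall Performance' aggregation described above: a weighted
# average of the six indicators, with triple weight on 'One Bidder' and
# 'No Calls for Bids'. Country scores below are hypothetical.

WEIGHTS = {
    'one_bidder': 3,
    'no_calls_for_bids': 3,
    'aggregation': 1,
    'award_criteria': 1,
    'decision_speed': 1,
    'reporting_quality': 1,
}

def overall_performance(scores: dict) -> float:
    """Weighted average of the six performance indicators."""
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS) / total_weight

# Hypothetical country scores on each indicator (eg +1 good, 0 neutral, -1 poor)
scores = {'one_bidder': 1, 'no_calls_for_bids': -1, 'aggregation': 1,
          'award_criteria': 0, 'decision_speed': 1, 'reporting_quality': 0}
print(overall_performance(scores))  # (3 - 3 + 1 + 0 + 1 + 0) / 10 = 0.2
```

Note how the two triple-weighted indicators dominate the composite: a country can do well on four of the six criteria and still see its overall score largely determined by the other two, which underlines why the choice of weights is itself a policy judgement.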

In my opinion, this is a valuable first step towards developing performance indicators in public procurement. However, the 'qualitative policy judg[e]ment on what is good practice' behind some of the criteria is questionable. For instance, the rationale behind criterion [3] Aggregation is that 'Buying in bulk often leads to better prices and also offers an opportunity to exchange know-how. While not every type of purchase can benefit from aggregation, excessively low aggregation levels mean that an opportunity is probably being missed. Aggregation measures the proportion of procedures with more than one public buyer.'

This is by no means clear, given the difficulty in assessing the net economic effects of procurement aggregation [see A Sanchez-Graells and I Herrera Anchustegui, 'Impact of Public Procurement Aggregation on Competition: Risks, Rationale and Justification for the Rules in Directive 2014/24', in R Fernandez & P Valcarcel (eds), Centralizacion de compras publicas (Madrid, Civitas, 2016) 129-163]. Moreover, the reasons that led the Commission to assign a positive value to the indicator when Member States aggregate 10% or more of their procurement expenditure seem completely arbitrary.

Ultimately, the use of such an indicator may push Member States towards excessive aggregation of demand (particularly through procurement centralisation; see the discussion of the UK CCS' strategy below), which seems to be a policy drive of the European Commission that may well create excessive difficulties [particularly when cross-border collaboration is involved, as discussed in A Sanchez-Graells, 'Collaborative Cross-Border Procurement in the EU: Future or Utopia?'].

Therefore, great care needs to be exercised to avoid creating indicators that may trigger specific policy options with doubtful beneficial net effects.

Evolution of public procurement markets

Second, the Commission has also published raw indicators of the volume of procurement subject to the EU rules in 2014. These serve to provide a broad overview of the evolution of EU public procurement markets in recent years.

There are two results I find interesting. At a general level, the 'estimate of total general government public procurement expenditure (TGGPPE), excluding utilities and defence, was 1,931.5 billion euros in 2014, 2.7 % higher than in 2013, continuing the increased trend of recent years'. However, there are great national disparities that still reflect the effects of the economic crisis, with 'countries like Spain, Italy or Cyprus ... with their TGGPPE the minimum in the last four years'.

And, at a country level, I find it remarkable that, overall, the UK publishes larger contracts than the EU average (see graph below). This issue is linked to the discussion on aggregation above because, '[t]he concentration of procurement in large notices is outstanding in the UK, particularly in the procurement of services, where the UK alone accounts for 84 % of the total value procured at EU level in awards of more than 100 million euros' (emphasis added).

Source: European Commission. Graph represents the distribution of contract award notices in logarithmic scale in million Euros. The dashed-blue line represents EU distribution. 


Qualitatively, it is worth stressing that this is, at least in large part, the immediate result of the enormous framework agreements for services contracts tendered by the Crown Commercial Service (CCS) in recent years. However, this strategy has led to significant operational problems and the CCS is moving away from such large service frameworks, in favour of alternative procurement strategies.

Also from a qualitative perspective, analysing this data would require access to details on whether these contracts are adequately split into lots, eg so as to ensure SME access to procurement markets in the UK. If they are not, this could be an indicator that UK markets are relatively more geared towards large suppliers than those in the rest of the EU, which would be a worrying situation and definitely not in line with declared policy goals.

Therefore, once more, care needs to be exercised in the extrapolation of any policy implications derived from such high-level quantitative indicators.