
How ‘Dirty-Data’ Undermines Procurement Strategy … And The Benefits Of Putting It Right.

Quality spend data has always been the Achilles heel of procurement teams trying to take a more strategic approach to spend management. Jonathan Dutton FCIPS explains how that has been changing in the third of a series of articles for SUPPLY CLUSTERS.

Savings tend to zero over time. Especially with indirect expenditure. Your 6th tender for stationery savings? Good luck with that …

So, increasingly, procurement teams are now constantly on the hunt for new ways to save. New addressable spend categories. New opportunities.

Scouring the P2P data from the General Ledger is a good place to look for clues. Analyse the data well enough and hopefully, it can reveal all.

Data woes

The first data you normally request is a simple list of suppliers in descending order of spend. Alas, the first thing you notice when examining P2P data is that it is wrong. Even a casual glance can often reveal data worries. Where is the supplier we bought from only six weeks ago? Why are they not on the list? Why is Telstra listed nine times? Why does the sum of the spend not add up to the figure we were expecting – not even close? How can I trust this data?

The old adage “rubbish in, rubbish out” describes human error. Put simply, sloppy data input and poor system compliance mean poor reporting outputs. The causes include bad typing, skipped non-mandatory fields, misreading the question and bad screen design. Others are assuming every supplier is a new one, entering an incorrect name, or not taking the time to search properly for the correct supplier name.

A common problem is duplicate suppliers. Often this is caused by incorrect nomenclature. For example: Supplier Name, Supplier Name Plc, Supplier Name Pty Ltd, Supplier Name Australia, Supplier Name NZ, and so on … your actual supplier’s name entered in different formats as well as for each subsidiary. Your database thinks they are different suppliers. One public sector agency in a state government jurisdiction recently reduced their supplier database from a historic 55,000 to 15,000 mostly from de-duping suppliers entered incorrectly [see CASE STUDY].
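As a rough illustration of the nomenclature problem, the sketch below collapses name variants onto one key by stripping legal-form and regional suffixes. The suffix list and the function names `normalise` and `group_duplicates` are assumptions made for this example, not part of any particular P2P system. Whether subsidiaries should really be merged with the parent is a business decision the code cannot make.

```python
import re
from collections import defaultdict

# Hypothetical suffix list for illustration; extend it for your own data.
SUFFIXES = {"plc", "pty", "ltd", "limited", "inc", "australia", "nz"}

def normalise(name: str) -> str:
    """Lower-case, replace punctuation with spaces, then strip trailing suffix words."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while len(words) > 1 and words[-1] in SUFFIXES:
        words.pop()
    return " ".join(words)

def group_duplicates(names):
    """Map each normalised key to the raw names that collapse onto it."""
    groups = defaultdict(list)
    for raw in names:
        groups[normalise(raw)].append(raw)
    return dict(groups)

suppliers = [
    "Supplier Name", "Supplier Name Plc", "Supplier Name Pty Ltd",
    "Supplier Name Australia", "SUPPLIER NAME NZ", "Other Vendor Ltd",
]
groups = group_duplicates(suppliers)
for key, raw_names in groups.items():
    print(key, "<-", raw_names)
```

Any key with more than one raw name is a candidate duplicate for a person to review before the records are merged.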

Unreliable reporting

Using ‘dirty-data’ to run reports obviously runs the risk of poor and unreliable reporting. Yet this is often exacerbated by system design and reporting suites. Even major system installs (certainly older ones) come with a surprisingly limited range of standard reports. Running bespoke reports can be a tricky business. Labels and menus on data fields are confusing – what do they all really mean? Usually only experienced and skilled systems operators can run reports on complex systems.

If your reports are from a dashboard or portal facility (through the cloud or not) but call on data sources from other systems, or legacy systems, the scope for confusion and error grows exponentially. The application programming interface (API) bridges between systems are not infallible, especially if they were made to order, or are older versions. They can sometimes distort data or misdirect it. And, again, compromise reporting.

Data lakes

Exporting data from a range of systems into “data lakes” (next-gen data storage facilities) is one way to manage data. In theory, it makes it easier to de-dupe dodgy data and clean it up. But it can also risk confusing data patterns, bring more duplication and make reporting or pattern deduction more difficult, not easier. It also presents security issues (all your eggs in one basket), compromises data privacy (who can see what?) and can reduce data down to its raw state (harder to manage). Yet data-lake creation, management and mining is not a quick process.

But it certainly puts all your data in one place, and common sense suggests that is an easier path to making true sense of it all. It can also enable technology to do more of the work of rationalising your dirty-data.

Finding the Pareto

The real benefit of having all your data in one spot is the opportunity to identify patterns. The first pattern to look for is the Pareto analysis – 80% of the spend with 20% of the suppliers? This is a common pattern in spend data, though rarely as stark as 80/20. More like 70/30 or 60/40 even.
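The Pareto check can be sketched in a few lines: sort suppliers by spend, take the top 20%, and see what share of total spend they hold. The supplier names and figures below are made up for illustration, and `pareto_share` is a hypothetical helper, not a standard function.

```python
def pareto_share(spend_by_supplier, supplier_fraction=0.2):
    """Return the share of total spend held by the top `supplier_fraction` of suppliers."""
    spends = sorted(spend_by_supplier.values(), reverse=True)
    top_n = max(1, round(len(spends) * supplier_fraction))
    total = sum(spends)
    return sum(spends[:top_n]) / total if total else 0.0

# Made-up annual spend per supplier, for illustration only.
spend = {
    "A": 500_000, "B": 200_000, "C": 100_000, "D": 80_000, "E": 50_000,
    "F": 30_000, "G": 20_000, "H": 10_000, "I": 6_000, "J": 4_000,
}
share = pareto_share(spend)
print(f"Top 20% of suppliers hold {share:.0%} of spend")
```

Running the same function with different values of `supplier_fraction` shows how steep or flat the curve is in your own data.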

Establishing this core spend pattern is the essential key to actionable insights for procurement teams setting out to take a more strategic approach to their spend management to drive wider benefits:


Regional Health Department uses spend data to forge aligned procurement strategy

A state health department identified some 55,000 suppliers listed on P2P general ledger going back years. This figure was reduced to 15,000 suppliers through a data clean-up and a dedupe process. All suppliers were then tiered into 4 priorities initially through a manual process:

  • Tier 1 – Top 29 identified as genuinely strategic suppliers critical to daily operations

  • Tier 2 – The top 260 suppliers with > $1m expenditure per annum

  • Tier 3 – The 350 suppliers key to services, especially regionally

  • Tier 4 – The other 14,650 ‘commodity’ suppliers – with many options or alternatives

This was a work process that took almost one year to complete. Yet fairly obvious procurement strategies followed the revelations from the mass data:

  • First, invest in the Top 29 strategic suppliers – build closer strategic relationships, get them to open offices in state, and innovate for long-term more affordable healthcare

  • Second, deal with the top 260 suppliers earning over $1m and use SRM approach to drive innovation and other commercial & health benefits

  • Third, work to ensure security of supply and efficient supply chains with all top 350 key suppliers – especially regionally throughout the state

  • Fourth, persuade over 14,000+ commodity suppliers to work more in-state rather than inter-state to drive greater local economic benefit

Using Artificial Intelligence

Advances in A.I., as well as machine learning and robotic process automation (RPA), have become game-changing in just the last few years.

For procurement, perhaps the ‘next step’ evolution was for A.I. to help clean dirty-data. For example, tidying address fields enables de-duping of suppliers. With a cleaner list, running the data against a commercially available database of companies yields a wealth of new reliable information such as the United Nations Standard Products and Services Code (UNSPSC) – a taxonomy of products and services designed for eCommerce. It is a four-level hierarchy coded as an eight-digit number, with an optional fifth level adding two more digits. Version 16, released in 2014, contained over 50,000 commodities.
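To make the code structure concrete, here is a minimal sketch that splits a UNSPSC code into its two-digit levels (segment, family, class, commodity, plus the optional business function). The `parse_unspsc` helper and the sample codes are placeholders invented for this example; the sample digits are not meant to correspond to any real commodity.

```python
from typing import NamedTuple, Optional

class UnspscCode(NamedTuple):
    segment: str                      # level 1
    family: str                       # level 2
    class_: str                       # level 3
    commodity: str                    # level 4
    business_function: Optional[str]  # optional level 5

def parse_unspsc(code: str) -> UnspscCode:
    """Split an 8- or 10-digit code string into two-digit levels."""
    digits = code.strip()
    if not digits.isdigit() or len(digits) not in (8, 10):
        raise ValueError(f"expected 8 or 10 digits, got {code!r}")
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    fifth = pairs[4] if len(pairs) == 5 else None
    return UnspscCode(pairs[0], pairs[1], pairs[2], pairs[3], fifth)

# Placeholder code, not a real commodity.
parsed = parse_unspsc("12345678")
print(parsed)
```

Grouping spend by `segment` or `family` gives the broad category view, while the full commodity level supports the detailed analysis.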

This approach matches suppliers to spend categories and is break-through intel for spend analysis. Cross-matching other simple data (geography, internal dept. cost codes, total spend by category/supplier, payment schedules, compliance registers) creates those enticing graphical representations and dashboards (sexy bubble-maps) we see overlaid on maps of Australia, your strategic procurement Kraljic Matrix or on your organisation’s site map. The insights can be quite revelatory.

Yet much of this is, nowadays, fairly standard spend-analysis work – just made possible by cleaner data and UNSPSC (Level 4) coding. Adding external data sources (HR data, weather patterns, demographics) is how BIG DATA earned its reputation for great insights.

Where A.I. really works for procurement teams is at the next level: creating actionable insights from previously muddled or old spend data. Yet it does much more than that. It extrapolates data trends, calculates probabilities, makes predictions and matches data sources to offer competing narratives of your expenditure.

Clever algorithms search data in a non-linear way. They contextualise the data (this is a shoe distributor, this product is size 9, it must be a shoe), learn as they go, then offer conclusions, well presented in an easy to understand way.

Plainly, A.I. points out stark contradictions and inconsistencies in your spend. For example, A.I. can today generate these obvious questions from well-presented data:

  • Why do we pay different prices to a supplier for the same products purchased by our different sites?

  • Why are we 60% paid-up on a contract that is, as yet, only 30% delivered?

  • Why do we pay some suppliers within 7 days, others in 90 days?

  • Why is each of our top three contracts operating under 65% confirmed compliance?
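The first question in the list above does not need sophisticated A.I. once the data is clean. A minimal sketch, assuming invoice lines with hypothetical `supplier`, `product`, `site` and `unit_price` fields and made-up values, flags any supplier and product pair bought at different prices by different sites:

```python
from collections import defaultdict

def price_discrepancies(lines):
    """Return {(supplier, product): {site: unit_price}} where sites pay different prices.

    If a site has several lines for the same item, the last one seen wins.
    """
    prices = defaultdict(dict)
    for line in lines:
        key = (line["supplier"], line["product"])
        prices[key][line["site"]] = line["unit_price"]
    return {k: v for k, v in prices.items() if len(set(v.values())) > 1}

# Made-up invoice lines for illustration only.
invoice_lines = [
    {"supplier": "Acme", "product": "Gloves", "site": "North", "unit_price": 4.10},
    {"supplier": "Acme", "product": "Gloves", "site": "South", "unit_price": 5.25},
    {"supplier": "Acme", "product": "Masks", "site": "North", "unit_price": 0.80},
    {"supplier": "Acme", "product": "Masks", "site": "South", "unit_price": 0.80},
]
flagged = price_discrepancies(invoice_lines)
for (supplier, product), by_site in flagged.items():
    print(supplier, product, by_site)
```

The same pattern (group, compare, flag) applies to the payment-terms and paid-versus-delivered questions, given the relevant fields.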

These WHY questions are actionable procurement insights. The answers to these questions make tangible commercial benefits from a strategic approach to procurement. They make a difference. Unlike that 6th stationery tender.

Jonathan Dutton FCIPS was the founding CEO of CIPSA, the procurement peak body in this region, until 2013. He now works as an independent management consultant specialising in procurement and has a non-executive role at Supply Clusters.

26th April 2019

