2014-10
It is hard to avoid noticing that data analysis is a very hot topic, not just in the world of audit and risk, but in information technology and business overall.
“Big Data” has of course been a popular term for several years. In a previous post, I referred to the longstanding roles that ACL and auditors have played in actually putting big data concepts into practice.
Text Analytics, like Big Data, is another popular term that is gaining a lot of attention. The two have much in common. In some ways, text analytics is just a subset of Big Data, in that it also typically involves deriving valuable insights from massive amounts of data to help drive business improvements of various types.
What distinguishes text analytics is primarily its focus on “unstructured data”: the application of data analysis to “free-form” data typically contained in, for example, emails and social media (e.g., tweets and Facebook posts). The terms “unstructured” and “free-form” are often not strictly accurate. In practice, every email message and social media posting has some form of structure, just not one as rigid as that of, say, a vendor address or telephone number data element, which typically has a fixed length and a defined data type.
While a Twitter post is definitely constrained in terms of length, a post on Facebook or LinkedIn, or an email, can vary far more in its length and in the data types contained in a specific data element.
This makes data analysis of the content more complex. But the principles are not that different from the ways that text analysis has been used for audit, risk management, and compliance for many years.
Take, for example, fraud detection or regulatory compliance with anti-bribery and anti-corruption requirements, such as those of the Foreign Corrupt Practices Act (FCPA) or the UK Bribery Act. The use of data analysis for both of these purposes typically involves examining vast amounts of data for indicators of a problem.
ACL functions such as FIND() and the various string functions allow for powerful and sophisticated analysis of textual data. The analysis may be simple, such as looking for suspect keywords like “facilitation” in a reference field for a payment or journal entry, as part of an anti-bribery and anti-corruption test. It could also be as simple as looking for matches of names or account numbers between employee and vendor databases in order to detect fraudulent schemes. Or it can be a lot more complex, dealing with large variable-length records and looking for various combinations of text strings throughout a complex database.
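To make the two simpler tests above concrete, here is a minimal sketch in Python. It is not ACL script and does not reproduce how FIND() works; it only illustrates the same idea with ordinary string matching. The keyword list, the field names (reference, account, name), and the function names are all hypothetical, chosen for illustration.

```python
import re

# Hypothetical keyword list; a real test would use terms chosen by the audit team.
SUSPECT_KEYWORDS = ["facilitation", "gift", "expedite"]


def flag_suspect_payments(payments, keywords=SUSPECT_KEYWORDS):
    """Return payments whose free-text reference contains any suspect keyword."""
    # Case-insensitive substring match on any of the keywords.
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [p for p in payments if pattern.search(p["reference"])]


def normalize(value):
    """Lowercase the value and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", value.lower())


def match_employees_to_vendors(employees, vendors, field):
    """Return (employee, vendor) pairs sharing the same normalized field value."""
    index = {}
    for employee in employees:
        key = normalize(employee[field])
        if key:  # skip blank values so empty fields never match each other
            index.setdefault(key, []).append(employee)
    return [
        (employee, vendor)
        for vendor in vendors
        for employee in index.get(normalize(vendor[field]), [])
    ]


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    payments = [
        {"id": 1, "reference": "Office supplies Q3"},
        {"id": 2, "reference": "Facilitation fee - customs"},
    ]
    employees = [{"name": "Jane Doe", "account": "12-3456-789"}]
    vendors = [{"name": "Acme Ltd", "account": "123456789"}]

    print(flag_suspect_payments(payments))
    print(match_employees_to_vendors(employees, vendors, "account"))
```

Normalizing values before comparing them is a design choice: it lets the account number "12-3456-789" match "123456789" despite the different formatting. A plain substring search will also flag harmless words that happen to contain a keyword, so the results are leads for review rather than findings.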
My point is that many of the ways that ACL data analysis tools are used for audit and risk management purposes are actually very good examples of text analytics. The IT world seems excited about the topic as if it were something very new and heavily technology-dependent. While powerful technology is critical, text analytics is certainly not something new within the ACL audit and risk world.
This is just another example of how auditors around the world, or at least an impressive group among them, have been leaders in their use of data analysis for many years, and without resorting to buzzwords that make an area seem more complex and newer than it really is!
(Source: ACL Blog)