Sunday, 7 January 2018

Do You Believe in AI Fairy Tales?
Automatic speech transcription, self-driving cars, a computer program beating the world champion Go player, and computers learning to play video games and achieving better results than humans. Astonishing results that make you wonder what Artificial Intelligence (AI) can achieve now and in the future. Futurist Ray Kurzweil predicts that by 2029 computers will have human-level intelligence and that by 2045 computers will be smarter than humans, the so-called “Singularity”. Some of us are looking forward to that, others think of it as their worst nightmare. In 2015 several top scientists and entrepreneurs called for caution over AI as it could be used to create something that cannot be controlled. Scenarios envisioned in movies like 2001: A Space Odyssey or The Terminator, in which AI turns against humans, violating Asimov’s first law of robotics, are not the ones we’re looking forward to. The question is whether these predictions and worries about the capabilities of AI, now or in the future, are realistic or just fairy tales.

What is AI?

AI is usually defined as the science of making computers do things that require intelligence when done by humans. To get a computer to do things, it requires software. To let a computer do smart things, it needs algorithms. Today the most common algorithms used in AI are Supervised learning, Transfer learning, Unsupervised learning and Reinforcement learning. Note that the nowadays popular term Deep Learning is just a form of Supervised learning using (special forms of) Neural Nets. Supervised learning takes both input and output data (labelled data) and uses algorithms to create computer models that are able to predict the correct label for new input data. Typical applications are image recognition, facial recognition, automatic transcription of audio (speech to text) and automatic translation. Supervised learning takes a lot of data: about 50,000 hours of audio are required to train a speech transcription system that performs at human level. Transfer learning is similar to Supervised learning, but stores knowledge gained while solving one problem and applies it to a different but related problem. For example, applying knowledge gained while learning to recognise cars to recognise trucks. Unsupervised learning doesn’t use labelled data and tries to find patterns in data. There are, however, few if any successful practical applications of Unsupervised learning. Reinforcement learning also doesn’t use labelled data but uses feedback mechanisms to let the computer programme “learn” how to improve its behaviour. Reinforcement learning is used in AlphaGo (the programme that beat the Go world champion) and in teaching computers to play video games. Reinforcement learning is even more data hungry than the other AI techniques. Besides playing (video) games there are no practical applications of Reinforcement learning yet.

What makes AI successful?

As Andrew Ng, Coursera co-founder and Adjunct Professor at Stanford University, indicates, the most successful applications of AI in practice use supervised learning. He estimates that 99% of the economic value created today with AI comes from this approach. The AI-supported optimisation of ad placements on webpages is by far the most successful in terms of the additional revenue it generates for its users. Very little economic value is created with the remaining techniques, despite the high level of attention these have had in the media. Today’s “rise” of AI may have struck you as a surprise. A couple of years ago we were not even aware of the practical usability of AI, let alone imagined that we would have AI on our phone (Siri) or in our house (Alexa) supporting us with everyday tasks. However, AI is nothing new; it has been researched since the 1960s. The current leading algorithm used to estimate Deep Learning neural networks, backpropagation, was popularised by Geoffrey Hinton in 1986, but has its roots somewhere in the 1960s. Lack of data and computational power made the algorithm impractical. This has changed as the availability of (labelled) data has grown tremendously and, more importantly, computing power has increased significantly with the introduction of GPU computing. These two factors are the key reasons for AI to be successful today. So it’s not research-driven progress, but engineering-driven progress. Still, for the best performing supervised learning applications, supercomputers or High Performance Computing (HPC) systems are required because huge neural nets need to be constructed and estimated. To illustrate, the distributed version of Google’s AlphaGo programme, as described in its 2016 Nature paper, ran on 1202 CPUs and 176 GPUs. Many experts, among them Rodney Brooks, roboticist and AI researcher, question whether much progress can be expected, as computational power is not expected to increase much further.
Therefore, it could be that we're not at the beginning of an AI revolution, but at the end of one.

What can we expect from AI in the future?

Browsing through the newspapers and other media, the number of stories on the achievements of AI and how it will impact the world is huge. Futurist predictions about what AI will allow us to do in the future are mind-boggling. Will we really be able to upload our mind to a computer and live forever, or learn Kung Fu like Neo in The Matrix? Most of these predictions state that AI will increase in power quickly, assuming it is driven by an exponential law of progress similar to Moore’s law. This is doubtful, as for AI to acquire the predicted powers it not only requires faster computers, it also requires smarter and more capable software and algorithms. Trouble is, research progress doesn’t follow a law or pattern and therefore can’t be predicted. Deep Learning took 30 years to deliver value. Many AI researchers see it as an isolated event. As Rodney Brooks says, there is no “law” that dictates when the next breakthrough in AI will happen. It can be tomorrow, but it can also take 100 years. I think most futurists make the same prediction mistake as many of us do. We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run (Roy Amara’s law). Take for example computers. When they were introduced in the 1950s there was widespread fear that they would take over all jobs. Now, 60 years later, most jobs are still there, new jobs have been created due to the introduction of computers, and we have applications of computers we never even imagined.

As Niels Bohr said many years ago: “Predictions are hard, especially if they are about the future”. This also applies to predicting how Artificial Intelligence will develop in the coming years. AI today is capable of performing very narrow tasks well, but the success is very brittle. Change the rules of the task slightly and it needs to be retrained and tuned all over again. For sure there will be progress, and more of the activities we do will get automated. Andrew Ng has a nice rule of thumb for it: any mental activity that takes about a second of thought from a human will get automated with AI. This will impact jobs, but at a much slower rate than many predict. This will give us the time to learn how to safely design and use this technology, similar to the way we learned to use computers. So, when we are realistic about what AI can do in the future, there is no need to get too excited or upset. Sit back and enjoy Hollywood’s AI doomsday movies and other fairy tales about AI. If you have the time, I recommend reading some of the work AI researchers publish, for example Rodney Brooks, Andrew Ng, John Holland, or scholars like Jaron Lanier or Daniel Dennett.

Sunday, 5 November 2017

Averaging Risk is Risking the Average
To assure public safety, regulatory agencies like the ACM in the Netherlands or Ofgem in the UK monitor the performance of gas and electric grid operators in areas like costs, safety and the quality of their networks. The regulator compares the performance of the grid operators and decides on incentives to stimulate improvements in these areas. The difficulty with these comparisons is that grid operators use different definitions and/or methodologies to calculate performance, which complicates a like-for-like comparison of, for example, asset health, criticality or risk across the grid operators. In the UK this has led to a new concept for risk calculations, the concept of monetised risk. In calculating monetised risk not only the probability of failure of the asset is used; the probability of the consequence of a failure and its financial impact are also taken into account. The question is whether this new method delivers more insightful risk estimations to allow for a better comparison among grid operators. Also, will it support fair risk trading among asset groups or the development of improved risk mitigation strategies?

The cost-risk trade-off that grid operators need to make is complex. Costly risk-reducing adjustments to the grid need to be weighed against the rise in the cost of operating the network and therefore the rates consumers pay for using the grid. For making the trade-off, an estimate of the probability of failure of an asset is required. In most cases, specific analytical models are developed to estimate these probabilities. Using pipeline characteristics like type of material and age, and data on the environment the pipeline is in (e.g. soil type, temperature and humidity), pipeline-specific failure rate models can be created. Results from inspections of the pipeline can be used to further calibrate the model. Due to the increased analytics maturity of grid operators, these models are becoming more common. Grid operators are also starting to incorporate these failure rate models in the creation of their maintenance plans.

Averaging the Risk

As you can probably imagine, there are many ways of constructing failure rate models. This makes it difficult for a regulator to compare reported asset conditions from the grid operators, as these estimates could have been based on different assumptions and modelling techniques. That is why, in the UK at least, the 4 major gas distribution networks (GDN) agreed to standardise the approach. In short, the method can be described as follows.
  1. Identify the failure modes of each asset category/sub group in the asset base and estimate the probability of failure for each identified failure mode.
  2. For each failure mode, identify the consequences of the failure, including the probability of each consequence occurring.
  3. For each consequence, estimate the monetary impact.
  4. By summing over all failure modes and consequences, calculate a probability-weighted estimate of monetised risk for an asset category/sub group. Summing over all asset categories/sub groups gives a total level of monetised risk for the grid.
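
The four steps above can be sketched in a few lines of Python. This is only a minimal illustration of the probability-weighted sum, not the official GDN methodology: the class names, asset groups, probabilities and costs are all made-up assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Consequence:
    probability: float  # P(consequence occurs | failure mode occurs)
    cost: float         # monetary impact of the consequence


@dataclass
class FailureMode:
    probability: float               # P(failure mode occurs)
    consequences: List[Consequence]


def monetised_risk(failure_modes: List[FailureMode]) -> float:
    """Probability-weighted cost, summed over failure modes and consequences."""
    return sum(
        mode.probability * cons.probability * cons.cost
        for mode in failure_modes
        for cons in mode.consequences
    )


# Hypothetical asset groups; every number below is invented for illustration.
asset_groups = {
    "steel_mains": [
        FailureMode(0.02, [Consequence(0.9, 10_000), Consequence(0.1, 500_000)]),
    ],
    "pe_mains": [
        FailureMode(0.01, [Consequence(1.0, 20_000)]),
    ],
}

risk_per_group = {name: monetised_risk(modes) for name, modes in asset_groups.items()}
total_risk = sum(risk_per_group.values())

for name, risk in risk_per_group.items():
    print(f"{name}: {risk:,.0f}")
print(f"total monetised risk: {total_risk:,.0f}")
```

Note that the result is a single number per asset group and for the grid as a whole; nothing about the spread of possible outcomes survives the summation.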

This new standardised way of calculating risks makes the performance evaluation much easier and also allows for a more in-depth comparison. For more details on the method, see the official documentation.

An interesting part of this new way of reporting risk is the explicit and standardised way of modelling asset failure, consequence of asset failure and cost of the consequence. This is similar to how a consolidated financial statement of a firm is created. Therefore, you could interpret it as a consolidated risk statement. But can risks of individual assets or asset groups be aggregated in the described way and provide a meaningful estimate of the total actual risk? The approach described above sums the estimated (or weighted average) risk for each asset category/sub group, so it’s an estimate of the average risk for the complete asset base. However, risk management is not about looking at the average risk, it’s about extreme values. Those who have read Sam Savage’s The Flaw of Averages or Nassim Taleb’s The Black Swan will know what I’m talking about.

Risking the Average

Risks are characterised by extreme outcomes, not averages. To be able to analyse extreme values, a probability distribution of the outcome you’re interested in is required. Averaging reduces the distribution of all possible outcomes to a point estimate, hiding the spread and likelihood of all possible outcomes. Also, averaging risks ignores the dependence between the identified failure modes or consequences. To illustrate, let’s assume that we have 5 pipelines, each with a probability of failure of 20%. There is only one consequence (probability = 1) with a monetary impact of 1,000,000. The monetised risk per pipeline then becomes 200,000 (= 0.20 * 1,000,000); for the total grid it is equal to 1,000,000. If we take the dependence of the failures into account, then there is a 20% probability of all pipes failing when these are fully correlated events, and a 0.032% chance (0.20^5) of all pipes failing when they are fully independent. The probability-weighted financial impact of all five pipes failing together (a loss of 5,000,000) then ranges from 1,000,000 in the fully correlated case to 1,600 in the fully independent case. That’s quite a range, which isn’t visible in the monetised risk approach.
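
The arithmetic of the 5-pipeline example can be checked with a short Python sketch. It compares the two extreme dependence assumptions from the text (fully independent versus fully correlated failures); the variable and function names are my own.

```python
from math import comb

N_PIPES = 5
P_FAIL = 0.20
IMPACT = 1_000_000  # monetary impact of a single pipeline failure

# Monetised risk: the same single number under both dependence assumptions.
monetised_risk = N_PIPES * P_FAIL * IMPACT

# Fully independent failures: the number of failed pipes is binomial.
independent = {
    k: comb(N_PIPES, k) * P_FAIL**k * (1 - P_FAIL) ** (N_PIPES - k)
    for k in range(N_PIPES + 1)
}

# Fully correlated failures: either no pipe fails or all of them do.
correlated = {0: 1 - P_FAIL, N_PIPES: P_FAIL}


def expected_impact(distribution):
    """Mean financial impact for a distribution over the number of failures."""
    return sum(prob * k * IMPACT for k, prob in distribution.items())


for name, dist in (("independent", independent), ("correlated", correlated)):
    p_all_fail = dist[N_PIPES]
    print(
        f"{name}: mean impact {expected_impact(dist):,.0f}, "
        f"P(all fail) {p_all_fail:.5f}, "
        f"weighted impact of all failing {p_all_fail * N_PIPES * IMPACT:,.0f}"
    )
```

Both cases have the same mean impact, equal to the monetised risk, while the probability of losing all five pipelines at once differs by a factor of 625.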

Regulators must assess risk in many different areas. Banking has been top of mind in the past years, but industries like Pharma and Utilities have also had a lot of attention. How a regulator decides to measure and assess risk is very important. If risks are underestimated, this could impact society (like a banking crisis, deaths due to the approval of unsafe drugs or an increase in injuries due to pipeline failures). If risks are overestimated, costly mitigation might be imposed, again impacting society with high costs. The above example shows that the monetised risk approach is insufficient, as it estimates risk with averages, whereas in risk mitigation the extreme values are much more important. What then is a better way of aggregating these uncertainties and risks than just averaging them?

Monte Carlo Simulation

The best way to better understand the financial impact of asset failure is to construct a probability density function of all possible outcomes using Monte Carlo simulation and, based on that distribution, make the trade-off between costs and risk. Monte Carlo simulation has proven its value in many industries and in this case will provide what we need. Using Sam Savage’s free tools, the above hypothetical example of 5 pipelines can be modelled and the distribution of financial impact analysed. In just a few minutes the cumulative distribution function (CDF) of the financial impact for the 5-pipeline case shown below can be created. Remember that the monetised risk calculation resulted in a risk level equal to the average, 1,000,000.

From the graph it immediately follows that P(Financial Impact <= Monetised Risk) = 33%. It implies that P(Financial Impact > Monetised Risk) = 1 - 33% = 67%. So, there is a 67% chance that the financial impact of pipe failures will be higher than the calculated monetised risk. Therefore we’re taking a serious risk by using the averaged asset risks. Given the objective of better comparison of grid operator performance and enabling risk trading between asset groups, the monetised risk method is too simple, I would say. By averaging the risks, the distribution of financial impact is rolled up into one number, leaving you no clue as to what the actual distribution looks like (see also Sam Savage’s The Flaw of Averages). A better way would be to set an acceptable “risk threshold” (say 95%) and use the estimated CDF to determine the corresponding financial impact.
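
For readers who want to experiment themselves, here is a minimal Monte Carlo sketch in plain Python. It is not the model behind the graph in this post: it assumes independent failures and a fixed impact of 1,000,000 per failed pipeline, so its numbers need not match the graph. All function names are my own.

```python
import random


def simulate_impacts(n_trials, n_pipes=5, p_fail=0.20, impact=1_000_000, seed=42):
    """Simulate total financial impact, assuming independent pipeline failures."""
    rng = random.Random(seed)
    impacts = []
    for _ in range(n_trials):
        failures = sum(rng.random() < p_fail for _ in range(n_pipes))
        impacts.append(failures * impact)
    return impacts


def empirical_cdf(samples, x):
    """Fraction of simulated outcomes that are less than or equal to x."""
    return sum(s <= x for s in samples) / len(samples)


def quantile(samples, level):
    """Smallest simulated impact covering roughly the given share of outcomes."""
    ordered = sorted(samples)
    index = min(len(ordered) - 1, int(level * len(ordered)))
    return ordered[index]


impacts = simulate_impacts(100_000)
mean_impact = sum(impacts) / len(impacts)
p_no_failure = empirical_cdf(impacts, 0)
impact_at_95 = quantile(impacts, 0.95)

print(f"mean impact:        {mean_impact:,.0f}")
print(f"P(no failure):      {p_no_failure:.3f}")
print(f"impact at 95% level: {impact_at_95:,.0f}")
```

Under these assumptions the simulated mean sits close to the monetised risk of 1,000,000, about a third of the scenarios show no failure at all, and the impact at the 95% level corresponds to three failed pipelines, i.e. three times the monetised risk figure.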

This approach would also allow for better comparison of grid operators by creating a cumulative distribution for all of them and plotting them together into one graph (See example below). In a similar way risk mitigations can be evaluated and comparisons made between different asset groups, allowing for better informed risk trading.

Standardising the way in which asset failures and consequences of failures are estimated and monetised definitely is a good step towards a comparable way to measure risk. But risks should not be averaged in the way the monetised risk approach suggests. There are better ways, which will provide insight into the whole distribution of risk. Given the available tools and computing power, there is no reason not to do so. It will improve our insight into the risks we face and help us find the best mitigation strategies to reduce public risks.

Friday, 27 January 2017

Want to get value from Data Science? Keep it simple and focussed!

What is the latest data science success story you have read? The one from Walmart? Maybe a fascinating result from a Kaggle competition? I’m always interested in these stories, wanting to understand what has been achieved, why it was important and what the drivers for success were. Although the buzz on the potential of data science is very strong, the number of stories on impactful practical applications of data science is still not very large. The Harvard Business Review recently published an article explaining why organisations are not getting value from their data science initiatives. Although there are many more reasons than those mentioned in the article, one key reason for many initiatives to fail is a disconnect between the business goals and the data science efforts. Also, the article states that data scientists tend to keep fine-tuning their models instead of taking on new business questions, slowing down the speed at which business problems are analysed and solved.

Seduced by inflated promises, organisations have started to mine their data with state-of-the-art algorithms, expecting that it will be turned into gold instantly. This expectation that technology will act as a philosopher’s stone makes data science comparable to alchemy. It looks like science, but it isn’t. Most of the algorithms fail to deliver value as they can’t provide an explanation as to why things are happening, nor provide actionable insights or guidance for influencing the phenomena being investigated. To illustrate, take the London riots in 2011. Since the 2009 G20 summit, the UK police have been gathering and analysing a lot of social media data, but still they were not able to prevent the 2011 riots from happening, nor track and arrest the rioters. Did the police have too little data or a lack of computing or algorithmic power? No, millions have been spent. Despite all the available technology the police were unable to make sense of it all. I see other organisations struggle with the same problem, trying to make sense of their data. Although I’m a strong proponent of using data and mathematics (and as such data science) for answering business questions, I do believe that technology alone can never be sufficient to provide an answer. The same goes for the amount, diversity and speed of the data.

Inference vs Prediction

Let’s investigate the disconnect between the business goals and the data science efforts as mentioned in the HBR article. Many of today’s data science initiatives result in predictive models. In a B2C context these models are used to predict whether you’re going to click on an ad, buy a suggested product, churn, commit fraud or default on a loan. Although a lot of effort goes into creating highly accurate predictions, the question is whether these predictions really create business value. Most organisations require a way to influence the phenomenon being predicted rather than the prediction itself. This will allow them to decide on the appropriate actions to take. Therefore, understanding what makes you click, buy, churn, default or commit fraud is the real objective. Understanding what influences human behaviour requires a different approach from creating predictions: it requires inference. Inference is a statistical, hypothesis-driven approach to modelling and focusses on understanding the causality of a relationship. Computer science, the core of most data science methods, focusses on finding the best model to fit the data and doesn’t focus on understanding why. Inferential models provide the decision maker with guidance on how to influence customer behaviour, and thus value can be created. This might better explain the disconnect between business goals and the analytics efforts as reported in the HBR article. For example, knowing that a call positively influences customer experience and prevents churn for a specific type of customer gives the decision maker the opportunity to plan such a call. Prediction models can’t provide these insights, but will provide the expected number of churners or who is most likely to churn. How to react to these predictions is left to the decision maker.

Keep it simple!

The second reason for failure mentioned in the HBR article is that data scientists put a lot of effort into improving the predictive accuracy of their models instead of taking on new business questions. The reason mentioned for this behaviour is the huge effort required to get the data ready for analysis and modelling. A consequence of this tendency is that it increases model complexity. Is this complexity really required? From a user’s perspective, complex models are more difficult to understand and therefore also more difficult to adopt, trust and use. For easy acceptance and deployment, it is better to have understandable models. Sometimes this is even a legal requirement, for example in credit scoring. A best practice I apply in my work as a consultant is to balance the model accuracy against the accuracy required for the decision to be made, the analytics maturity of the decision maker and the accuracy of the data. This also applies to data science projects. For example, targeting the receivers of your next marketing campaign requires less accuracy than having a self-driving car find its way to its destination. Also, you can’t make predictions that are more accurate than your data. Most data are uncertain, biased, incomplete and contain errors; when you have a lot of data this becomes even worse. This will negatively influence the quality and applicability of the model based on this data. In addition, research shows that the added value of more complex methods is marginal compared to what can be achieved with simple methods. Simple models already catch most of the signal in the data, enough in most practical situations to base a decision on. So, instead of creating a very complex and highly accurate model, it is better to test various simple ones. They will capture the essence of what is in the data and speed up the analysis.
From a business perspective, this is exactly what you should ask your data scientists to do: come up with simple models fast and, if required for the decision, use the insights from these simple models to direct the construction of more advanced ones.

The question “How to get value from your data science initiative?” has no simple answer. There are many reasons why data science projects succeed or fail, the HBR article only mentions a few. I’m confident that the above considerations and recommendations will increase the chances of your next data science initiative to be successful. Can’t promise you gold however, I’m no alchemist.

Thursday, 13 October 2016

The Error in Predictive Analytics

We are all well aware of the predictive analytical capabilities of companies like Netflix, Amazon and Google. Netflix predicts the next film you are going to watch, Amazon shortens delivery times by predicting what you are going to buy next, and Google even lets you use their algorithms to build your own prediction models. Following the predictive successes of Netflix, Google and Amazon, companies in telecom, finance, insurance and retail have started to use predictive analytical models and have developed the analytical capabilities to improve their business. Predictive analytics can be applied to a wide range of business questions and has been a key technique in search, advertising and recommendations. Many of today's applications of predictive analytics are in the commercial arena, focusing on predicting customer behaviour. First steps in other businesses are being taken. Organisations in healthcare, industry and utilities are investigating what value predictive analytics can bring. In these first steps much can be learned from the experience the front-running industries have in building and using predictive analytical models. However, care must be taken, as the context in which predictive analytics has been used is quite different from the new application areas, especially when it comes to the impact of prediction errors.

Leveraging the data

It goes without saying that the success of Amazon comes from, besides the infinite shelf space, its recommendation engine. The same holds for Netflix. According to McKinsey, 35 percent of what consumers purchase on Amazon and 75 percent of what they watch on Netflix comes from algorithmic product recommendations. Recommendation engines work well because there is a lot of data available on customers, products and transactions, especially online. This abundance of data is why there are so many predictive analytics initiatives in sales & marketing. The main objective of these initiatives is to predict customer behaviour, like which customer is likely to churn or buy a specific product/service, which ads will be clicked on or what marketing channel to use to reach a certain type of customer. In these types of applications predictive models are created using either statistical techniques (like regression, probit or logit) or machine learning techniques (like random forests or deep learning). With the insights gained from using these predictive models many organisations have been able to increase their revenues.

Predictions always contain errors!

Predictive analytics has many applications; the above-mentioned examples are just the tip of the iceberg. Many of them will add value, but it remains important to stress that the outcome of a prediction model will always contain an error. Decision makers need to know how big that error is. To illustrate, in using historic data to predict the future you assume that the future will have the same dynamics as the past, an assumption which history has proven to be dangerous. The 2008 financial crisis is proof of that. Even though there is no shortage of data nowadays, there will be factors that influence the phenomenon you’re predicting (like churn) that are not included in your data. Also, the data itself will contain errors, as measurements always include some kind of error. Last but not least, models are always an abstraction of reality and can't contain every detail, so something is always left out. All of this will impact the accuracy and precision of your predictive model. Decision makers should be aware of these errors and the impact they may have on their decisions.

When statistical techniques are used to build a predictive model, the model error can be estimated; it is usually provided in the form of confidence intervals. Any statistical package will provide them, helping you assess the model quality and its prediction errors. In the past few years other techniques have become popular for building predictive models, for example algorithms like deep learning and random forests. Although these techniques are powerful and able to provide accurate predictive models, they typically do not provide confidence intervals (or error bars) for their predictions out of the box. So there is often no way of telling how accurate or precise the predictions are. In marketing and sales, this may be less of an issue. The consequence might be that you call the wrong people or show an ad to the wrong audience. The consequences can however be more severe. You might remember the offensive auto-tagging by Flickr, labelling images of people with tags like “ape” or “animal”, or the racial bias in predictive policing algorithms.


Where is the error bar?

The point that I would like to make is that when adopting predictive modelling, be sure to have a way of estimating the error in your predictions, both in accuracy and in precision. In statistics this is common practice and helps improve models and decision making. Models constructed with machine learning techniques usually only provide point estimates (for example, the probability of churn for a customer is some percentage), which provide little insight into the accuracy or precision of the prediction. When using machine learning it is possible to construct error estimates (see for example the research of Michael I. Jordan), but it is not common practice yet. Many analytical practitioners are not even aware of the possibility. Especially now that predictive modelling is being used in environments where errors can have a large impact, this should be top of mind for both the analytics professional and the decision maker. Just imagine your doctor concluding that your liver needs to be taken out because a predictive model estimates a high probability of a very nasty disease. Wouldn’t your first question be how certain he or she is about that prediction? So, my advice to decision makers: only use outcomes of predictive models if accuracy and precision measures are provided. If they are not there, ask for them. Without them, a decision based on these predictions comes close to a blind leap of faith.
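
One generic way to put an error bar on a model's prediction is the percentile bootstrap: refit the model on many resampled versions of the data and look at the spread of the resulting predictions. The sketch below illustrates the idea with a simple straight-line fit on synthetic data. It is a textbook technique, not the specific research referred to above, and the data, names and parameters are all assumptions made for illustration. The interval only reflects the uncertainty in the fitted model, not the noise in an individual new observation.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bootstrap_interval(xs, ys, x_new, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the fitted prediction at x_new."""
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # a line cannot be fitted through a single x value
        slope, intercept = fit_line(bx, by)
        predictions.append(slope * x_new + intercept)
    predictions.sort()
    lower = predictions[int((1 - level) / 2 * len(predictions))]
    upper = predictions[int((1 + level) / 2 * len(predictions)) - 1]
    return lower, upper


# Synthetic data, invented for illustration: a linear trend plus noise.
data_rng = random.Random(1)
xs = [float(i) for i in range(30)]
ys = [2.0 + 0.5 * x + data_rng.gauss(0.0, 1.0) for x in xs]

slope, intercept = fit_line(xs, ys)
point_prediction = slope * 15.0 + intercept
lower, upper = bootstrap_interval(xs, ys, x_new=15.0)
print(f"prediction at x=15: {point_prediction:.2f} "
      f"(95% bootstrap interval {lower:.2f} to {upper:.2f})")
```

The same resample-and-refit loop works for any model that can be retrained cheaply, which is why it is a reasonable first thing to ask for when a prediction arrives without an error bar.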

Wednesday, 3 August 2016

Airport Security, can more be done with less?

One of the main news items of the past few days is the increased level of security at Amsterdam Schiphol Airport and the additional delays it has caused travellers, both incoming and outgoing. Extra security checks are being conducted on the roads around the airport, and additional checks are being performed inside the airport as well. Security checks have increased after the authorities received reports of a possible threat. We are in the peak of the holiday season, in which around 170,000 passengers per day arrive, depart or transfer at Schiphol Airport. With these numbers of people, the authorities for sure want to do their utmost to keep us safe, as always. This intensified security puts the military police (MP) and security officers under stress, however, as more needs to be done with the same number of people. It will be difficult for them to keep up the increased number of checks for long. Additional resources will be required, for example from the military. The question is, does security really improve through these additional checks, or could a more differentiated approach offer more security (lower risk) with less effort?

How has airport security evolved?

If I take a plane to my holiday destination … I need to take off my coat, my shoes and my belt, get my laptop and other electronic equipment out of my bag, separate the chargers and batteries, hand in my excess liquids, empty my pockets, and step through a security scanner. This takes time, and with an increasing number of passengers waiting times will increase. We all know these measures are necessary to keep us safe, but it is not a very enjoyable start to a trip abroad. These measures have been adopted to prevent earlier attacks from happening again and have resulted in the current rule-based system of security checks. Over the years the number of security measures has increased enormously (see for example the timeline on the TSA website), making it a resource-heavy activity which can’t be continued in the same way in the near future. A smarter way is needed.

Risk Based Screening

At present most airports apply the same security measures to all passengers, a one-size-fits-all approach. This means that low risk passengers are subject to the same checks as high risk passengers. This implies that changes to the security checks can have an enormous impact on the resource requirements. Introducing a one minute additional check by a security officer for all passengers at Schiphol requires 354 additional security officers to check 170,000 passengers. A smarter way would be to apply different measures to different passenger types: high risk measures to high risk passengers and low risk measures to low risk passengers. This risk based approach is at the foundation of SURE! (Smart Unpredictable Risk Based Entry), a concept introduced by the NCTV (the National Coordinator for Security and Counterterrorism). Consider this: what is more threatening, a non-threat passenger with banned items (pocket knife, water bottle) or a threat passenger with bad intentions (and no banned items)? I guess you will agree that the latter is the more threatening one, and this is exactly what risk based screening focusses on. A key component in risk based security is deciding what security measures to apply to which passenger, taking into account that attackers will adapt their plans when additional security measures are installed.
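
The staffing figure above can be reproduced with a few lines of Python. The 8-hour shift per officer is my assumption; it is the value under which one extra minute for 170,000 passengers works out to 354 officers. The 10% share of high risk passengers in the second calculation is purely hypothetical.

```python
def extra_officers(passengers, extra_minutes_per_passenger, shift_hours=8):
    """Officers needed per day to absorb an additional check.

    Assumes each officer works one shift of `shift_hours` (an assumption)
    and spends all of it on the additional check.
    """
    extra_hours = passengers * extra_minutes_per_passenger / 60
    return extra_hours / shift_hours


PASSENGERS_PER_DAY = 170_000

# One-size-fits-all: every passenger gets the extra one-minute check.
all_passengers = extra_officers(PASSENGERS_PER_DAY, 1)

# Risk based: only a (hypothetical) 10% of passengers gets the extra check.
HIGH_RISK_SHARE = 0.10  # illustrative assumption, not a real figure
high_risk_only = extra_officers(PASSENGERS_PER_DAY * HIGH_RISK_SHARE, 1)

print(f"extra officers, all passengers: {round(all_passengers)}")
print(f"extra officers, high risk only: {round(high_risk_only)}")
```

The required staffing scales linearly with the number of passengers that receive the extra check, which is where the resource gain of risk based screening comes from.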

Operations Research helps safeguard us

The concept of risk based screening makes sense as scarce resources like security officers, MPs and scanners are utilised better. In the one size fits all approach a lot of these resources are used to screen low risk passengers and as a consequence fewer resources are available for detecting high risk passengers. Still, even with risk based screening trade-offs must be made as resources will remain scarce. Also decisions need to be made in an uncertain and continuously changing environment, with little, false or no information. Sound familiar? This is exactly the area where Operations Research shines. Decision making under uncertainty can for example be supported by simulation, Bayesian belief networks, and Markov decision and control theory models. Using game theoretic concepts the behaviour of attackers can be modelled and incorporated, leading to the identification of new and robust counter measures. Queuing theory and waiting line models can be used to analyse various security check configurations (for example centralised versus decentralised, and yes centralised is better!) including the required staffing. This will help airports to develop efficient and effective security checks, limiting the impact on passengers while achieving the highest possible risk reduction. These are but a small number of examples where OR can help; there are many more.
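To give a feel for the centralised versus decentralised comparison, here is a minimal sketch using the textbook M/M/c queue (Erlang C formula). It assumes Poisson arrivals, exponential screening times and passengers splitting evenly over the checkpoints; the arrival rate, service rate and lane counts are made up for illustration and are not Schiphol figures.

```python
from math import factorial


def erlang_c(servers, offered_load):
    """Probability that an arriving passenger has to wait in an M/M/c queue."""
    if offered_load >= servers:
        raise ValueError("unstable queue: offered load must be below the number of servers")
    tail = offered_load**servers / factorial(servers) * servers / (servers - offered_load)
    head = sum(offered_load**k / factorial(k) for k in range(servers))
    return tail / (head + tail)


def mean_wait(arrival_rate, service_rate, servers):
    """Mean time spent waiting in line (excluding the check itself)."""
    load = arrival_rate / service_rate
    return erlang_c(servers, load) / (servers * service_rate - arrival_rate)


# Hypothetical numbers: 6.4 passengers per minute in total, 1 passenger per minute per lane.
decentralised = mean_wait(6.4 / 4, 1.0, 2)  # four checkpoints with two lanes each
centralised = mean_wait(6.4, 1.0, 8)        # one checkpoint with all eight lanes
print(f"decentralised: {decentralised:.2f} min, centralised: {centralised:.2f} min")
```

With the same number of lanes and the same total demand, the pooled checkpoint gives the shorter average wait, because a lane never sits idle while passengers queue at another checkpoint.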

Some of the concepts of risk based security checks resulting from the SURE! programme have already been put into practice. Schiphol is working towards centralised security and recently opened the security check point of the future for passengers travelling within Europe. It’s good to know that the decision making rigour comes from Operations Research, resulting in effective, efficient and passenger friendly security checks.

Thursday, 21 July 2016

Towards Prescriptive Asset Maintenance

Every utility deploys capital assets to serve its customers. During the asset life cycle an asset manager must repeatedly make complex decisions with the objective to minimise asset life cycle cost while maintaining high availability and reliability of the assets and networks. Avoiding unexpected outages, managing risk and maintaining assets before failure are critical goals to improve customer satisfaction. To better manage asset and network performance utilities are starting to adopt a data driven approach. With analytics they expect to lower asset life cycle cost while maintaining high availability and reliability of their networks. Using actual performance data, asset condition models are created which provide insight into how assets deteriorate over time and what the driving factors of deterioration are. With these insights forecasts can be made of the future asset and network performance. These models are useful, but lack the ability to effectively support the asset manager in designing a robust and cost effective maintenance strategy.

Asset condition models allow for the ranking of assets based on their expected time to failure. Within utilities it is common practice to use this ranking in deciding which assets to maintain. Starting with the assets with the shortest time to failure, assets are selected for maintenance until the budget available for maintenance is exhausted. This prioritisation approach will ensure that the assets most prone to failure are selected for maintenance; however, it will not deliver the maintenance strategy with the highest overall reduction of risk. Also the approach can’t effectively handle constraints in addition to the budget constraint, for example constraints on manpower availability, precedence constraints on maintenance projects, or required materials or equipment. Therefore a better way to determine a maintenance strategy is required, taking into account all these decision dimensions. More advanced analytical methods, like mathematical optimisation (=prescriptive analytics), will provide the asset manager with the required decision support.

In finding the best maintenance strategy the asset manager could, instead of making a ranking, list all possible subsets of maintenance projects that are within budget and calculate the total risk reduction of each subset. The best subset of projects to select would be the subset with the highest overall risk reduction (or any other measure). This way of selecting projects also allows for additional constraints, like required manpower, required equipment or spare parts, or time dependent budget limits, to be taken into account. Subsets that do not fulfil these requirements are simply left out. Also, subsets could be constructed in such a manner that mandatory maintenance projects are included. With a small number of projects this way of selecting projects would be possible: 10 projects would lead to 1024 (=2^10) possible subsets. But with large numbers this is not possible: a set of 100 potential projects would lead to about 1.27*10^30 (=2^100) possible subsets, which would take too much time, if possible at all, to construct and evaluate. This is exactly where mathematical optimisation proves its value, because it allows you to implicitly construct and evaluate all feasible subsets of projects, fulfilling not only the budget constraint but any other constraint that needs to be included. Selecting the best subset is achieved by using an objective function which expresses how you value each subset. Using mathematical optimisation assures the best possible solution will be found. Mathematical optimisation has proven its value many times in many industries, including utilities, and in many disciplines, including maintenance. MidWest ISO for example uses optimisation techniques to continuously balance energy production with energy consumption, including the distribution of electricity in their networks. Other asset heavy industries like petrochemicals use optimisation modelling to identify cost effective, reliable and safe maintenance strategies.
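The difference between ranking and subset selection can be shown on a toy example. The sketch below uses four hypothetical projects (the names, costs, risk reductions and times to failure are invented) and plain enumeration of all subsets. Enumeration is the brute force idea described above, not mathematical optimisation itself; a real model would be handed to a solver that searches the subsets implicitly.

```python
from itertools import combinations

# Hypothetical projects: (name, cost, risk_reduction, years_to_failure)
PROJECTS = [
    ("A", 70, 40, 1),
    ("B", 50, 45, 2),
    ("C", 40, 40, 3),
    ("D", 30, 35, 4),
]
BUDGET = 100


def select_by_ranking(projects, budget):
    """Walk the ranking (shortest time to failure first), adding what still fits."""
    chosen, spent = [], 0
    for project in sorted(projects, key=lambda p: p[3]):
        if spent + project[1] <= budget:
            chosen.append(project)
            spent += project[1]
    return chosen


def select_by_enumeration(projects, budget):
    """Evaluate every subset within budget and keep the highest risk reduction."""
    best, best_value = [], 0
    for size in range(len(projects) + 1):
        for subset in combinations(projects, size):
            cost = sum(p[1] for p in subset)
            value = sum(p[2] for p in subset)
            # Further constraints (manpower, equipment, ...) would be checked here.
            if cost <= budget and value > best_value:
                best, best_value = list(subset), value
    return best


def total_risk_reduction(selection):
    return sum(p[2] for p in selection)


ranked = select_by_ranking(PROJECTS, BUDGET)
best = select_by_enumeration(PROJECTS, BUDGET)
print([p[0] for p in ranked], total_risk_reduction(ranked))  # ['A', 'D'] 75
print([p[0] for p in best], total_risk_reduction(best))      # ['B', 'C'] 85
```

In this toy case the ranking spends the full budget on A and D, while the subset B and C costs less and removes more risk.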

In improving their asset maintenance strategies, utilities’ best next step is to adopt mathematical optimisation. It allows them to leverage the insights from their asset condition models and turn these insights into value adding maintenance decisions. Compared to their current rule based selection of maintenance projects, in which they can only evaluate a limited number of alternatives, they can significantly improve, as mathematical optimisation lets them evaluate trillions of (possibly all) alternative maintenance strategies within seconds. Although “rules of thumb”, “politics” and “intuition” will always provide a solution that is “good”, mathematical optimisation assures that The Best solution will be found.

Tuesday, 19 July 2016

Big Data Headaches
Data driven decision making has proven to be key for organisational performance improvements. This stimulates organisations to gather data, analyse it and use decision support models to improve their decision making speed and quality. With the rapid decline in cost of both storage and computing power, there are nearly no limitations to what you can store or analyse. As a result organisations have started building data lakes and invested in big data analytics platforms to store and analyse as much data as possible. This is especially true in the consumer goods and services sector, where big data technology can be transformative as it enables a very granular analysis of human activity (up to the personal level). With these granular insights companies can personalise their offerings, potentially increasing revenue by selling additional products or services. This allows for new business models to emerge and is changing the way of doing business completely. As the potential of all this data is huge, many organisations are investing in big data technology expecting plug and play inference to support their decision making. The big data practice however is something different and is full of rude awakenings and headaches.

That big data technology can create value is proven by the fact that companies like Google, Facebook and Amazon exist and do well. Surveys from Gartner and IDC show that the number of companies adopting big data technology is increasing fast. Many of them want to use this technology to improve their business and start using it in an exploratory manner. When asked about the results they get from their analysis many of them respond that they experience difficulty in getting results due to data issues, others report difficulty getting insights that go beyond preaching to the choir. Some of them even report disappointment as their outcomes turn out to be wrong when put into practice. Many times the lack of experienced analytical talent is mentioned as a reason for this, but there is more to it. Although big data has the potential to be transformative, it also comes with fundamental challenges which when not acknowledged can cause unrealistic expectations and disappointing results. Some of these challenges are even unsolvable at this time.

Even if there is a lot of data, it can’t be used properly

To illustrate some of these fundamental challenges, let’s take an example of an online retailer. The retailer has data on its customers and uses it to identify generic customer preferences. Based on the identified preferences offers are generated and customers targeted. The retailer wants to increase revenue and starts to collect more data on the individual customer level. The retailer wants to use the additional data to create personalised offerings (the right product, at the right time, for the right customer, at the right price) and to make predictions about future preferences (so the retailer can restructure its product portfolio continuously). In order to do so the retailer needs to find out what the preferences of its customers are and what drives their buying behaviour. This requires constructing and testing hypotheses based on the customer attributes gathered. In the old situation the number of available attributes (like address, gender, past transactions) was small. Therefore only a small number of hypotheses (for example “women living in a certain part of the city are inclined to buy a specific brand of white wine”) needed to be tested to cover all possible combinations. However with the increase in the number of attributes, the number of combinations of attributes that are to be investigated increases exponentially. If in the old situation the retailer had 10 attributes per customer, a total of 1024 (=2^10) possible combinations needed to be evaluated. However when the number of attributes increases to say 500 (which in practice is still quite small), the number of possible combinations of attributes increases to 3.27*10^150 (=2^500). This exponential growth causes computational issues as it becomes impossible to test all possible hypotheses even with the fastest available computers. The practical way around this is to significantly reduce the number of attributes taken into account. This will leave much of the data unused and many possible combinations of attributes untested, therefore reducing the potential to improve. This might also cause much of the big data analysis results to be too obvious.
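The growth in the number of attribute combinations is easy to check. The sketch below counts the subsets of n attributes as 2^n; the test rate of one billion hypotheses per second is a hypothetical figure, chosen only to show how hopeless exhaustive testing becomes.

```python
def attribute_combinations(n_attributes):
    """Number of subsets that can be formed from n attributes, i.e. 2^n."""
    return 2 ** n_attributes


SECONDS_PER_YEAR = 365 * 24 * 3600
TESTS_PER_SECOND = 10**9  # hypothetical speed, for illustration only

for n in (10, 500):
    combos = attribute_combinations(n)
    years = combos // (TESTS_PER_SECOND * SECONDS_PER_YEAR)
    print(f"{n} attributes: {combos:.2e} combinations, about {years:.1e} years to test them all")
```

For 10 attributes the work is done in a fraction of a second; for 500 attributes the number of years needed is itself a number with more than a hundred digits.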

The larger the data set, the stronger the noise

There is another problem with analysing large amounts of data. With the increase in the size of the data set, all kinds of patterns will be found, but most of them are going to be just noise. Recent research has provided proof that as data sets grow larger they have to contain arbitrary correlations. These correlations appear due to the size, not the nature, of the data, which indicates that most of the correlations will be spurious. Without proper practical testing of the findings, this could cause you to act upon a phantom correlation. Testing all the detected patterns in practice is impossible as the number of detected correlations will increase exponentially with the data set size. So even though you have more data available you’re worse off, as too much information behaves like very little information. Besides the increase of arbitrary correlations in big data sets, testing the huge number of possible hypotheses is also going to be a problem. To illustrate, using a significance level of 0.05, testing 50 hypotheses on the same data will give at least one significant result with a 92% chance.

P(at least one significant result) = 1 − P(no significant results) = 1 − (1 − 0.05)^50 ≈ 92%

This implies that we will find an increasing number of statistically significant results due to chance alone. As a result the number of False Positives will rise, potentially causing you to act upon phantom findings. Note that this is not only a big data issue, but a small data issue as well. In the above example we already need to test 1024 hypotheses with 10 attributes.
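The 92% figure, and what it means for the 1024 hypotheses of the small data example, can be verified directly. Note that the formula treats the tests as independent and all null hypotheses as true, which is a simplification when the tests are run on the same data.

```python
def prob_false_positive(n_tests, alpha=0.05):
    """Chance of at least one significant result among n independent tests of true nulls."""
    return 1 - (1 - alpha) ** n_tests


def expected_false_positives(n_tests, alpha=0.05):
    """Average number of significant results produced by chance alone."""
    return n_tests * alpha


print(round(prob_false_positive(50), 3))  # 0.923
print(expected_false_positives(1024))     # 51.2
```

So with 10 attributes and 1024 hypotheses you should expect around 51 “significant” findings even if there is nothing to find at all.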

Data driven decision making has nothing to do with the size of your data

So, should the above challenges stop you from adopting data driven decision making? No, but be aware that it requires more than just some hardware and a lot of data. Sure, with a lot of data and enough computing power significant patterns will be detected even if you can’t identify all the patterns that are in the data. However, not many of these patterns will be of any interest as spurious patterns will vastly outnumber the meaningful ones. Therefore, with the increase in size of the available data the skill level for analysing the data also needs to grow. In my opinion data and technology (even a lot of it) are no substitute for brains. The smart way to deal with big data is to extract and analyse the key information embedded in “mountains of data” and to ignore most of the rest. You could say that you first need to trim down the haystack to better locate where the needle is. What remains are collections of small amounts of data that can be analysed much better. This approach will prevent you from getting a big headache from your big data initiatives and will improve both the speed and quality of data driven decision making within your organisation.