AI Governance – Foundation For Trustworthy AI

Chatbots, conversational systems, banking apps, smartphones, self-driving cars and even home devices like Alexa are rapidly becoming part of our daily lives. The mainstreaming of AI across a broad variety of industries, from healthcare and financial services to autonomous vehicles and manufacturing, is driving ethical and governance concerns. Given the opacity of this technology and the pervasiveness of its reach, increasingly fearful reactions have taken root in the public consciousness.

To address this widening trust gap, industry leaders must step forward to educate the public on what enterprise AI is realistically capable of, and what it is not.  In this article BGV Partners Anik Bose and Venkat Raghavan, and Zelros CEO Christophe Bourguignat, explore the theme of Trustworthy AI as the foundation for implementing AI Governance.  They look at the challenges presented by AI Model Sprawl and the explosion of Data Clouds, and identify avenues where growth, investment and innovation will likely coalesce.  Trustworthy AI, they conclude, presents a necessary framework for the responsible and safe development of this game-changing technology. 

BGV is elevating awareness and thought leadership around this topic in partnership with industry leaders like Samsung, NXP Semiconductors, ARM, Tech Mahindra and Global Corporate Venturing by co-sponsoring a specialized award on Ethical AI for the Extreme Tech Challenge, a global startup competition focused on “tech for good.”  To learn more about the competition and the Specialized Ethical AI award please visit www.extremetechchallenge.org

AI Model Sprawl and The Widening Trust Gap

The early days of AI experimentation are over. Whether Google Search or Netflix recommendations, automatic email correction or virtual assistants, Consumer AI is already entrenched, invisible and pervasive in our daily lives. Enterprise AI is now going mainstream across a number of vertical use cases. A back-of-the-napkin calculation suggests that today's modern enterprises are running 25+ AI models at once. That number will quickly scale to 50, then 75, and onward along a steep trajectory. As more and more enterprises leverage AI-based systems, the integrity of those AI models and the impact of their decisions and recommendations become increasingly consequential, both in breadth, cutting across an incredibly wide array of industries, and in depth, making automated, data-driven decisions at an exponential rate without human intervention.

In this brave new world it is critical to ensure that AI-powered engines behave with the same level of accuracy and fairness across protected populations and customer sub-groups (e.g. young/old, rural/urban, customers from different geographies). It is equally important to avoid using sensitive variables in models (such as racial or ethnic origin or political opinions) and to ensure ongoing monitoring for verification, explainability and stability over time. This is particularly crucial in a regulated industry like insurance, where legal compliance is required, for example with the Insurance Distribution Directive (IDD) in Europe.
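
To make the sub-group requirement concrete, here is a minimal sketch of what such a check might look like in practice. It is an illustration only, not any vendor's implementation; the data frame, column names and segmentation variable are hypothetical.

import pandas as pd

def subgroup_fairness_report(df: pd.DataFrame, group_col: str,
                             y_true: str = "outcome", y_pred: str = "prediction") -> pd.DataFrame:
    """Compare accuracy and positive-decision rate across customer sub-groups."""
    rows = []
    for group, sub in df.groupby(group_col):
        rows.append({
            group_col: group,
            "n": len(sub),
            "accuracy": (sub[y_true] == sub[y_pred]).mean(),
            "positive_rate": sub[y_pred].mean(),  # demographic-parity style metric
        })
    report = pd.DataFrame(rows)
    # Flag sub-groups whose positive rate deviates strongly from the overall rate.
    report["parity_gap"] = (report["positive_rate"] - df[y_pred].mean()).abs()
    return report.sort_values("parity_gap", ascending=False)

# Illustrative usage: subgroup_fairness_report(scored_customers, group_col="age_band")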

The introduction of AI into loan origination and financing paints a clearer picture of what's at stake. While the rise of algorithmic lending is helping companies raise loan production without raising delinquency or default rates, a bias in the underlying AI model risks hurting, rather than helping, low-income and minority borrowers. “African Americans may find themselves the subject of higher-interest credit cards simply because a computer has inferred their race,” claims Nicol Turner Lee, a fellow at the Brookings Institution. Machine learning underwriting models have so many data points from which to make inferences that race, socio-economic background, and a host of other variables could influence a loan decision without the lenders even knowing. As long as the system’s operators cannot explain the set of signals used to derive the model’s outcome, the algorithm cannot be “fixed.” Bias in the outcomes may not even be the fault of the algorithm itself, but of those who develop it, or those who feed data into the model in the first place. Trustworthy AI cannot emerge so long as the model remains a black box.
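
As a hint of how operators might begin to open that black box, the sketch below probes a synthetic underwriting model with permutation importance to surface which input signals most influence its decisions, including a hypothetical "zip_cluster" feature standing in for a proxy variable that could encode protected attributes. This is a generic technique sketch on made-up data, not a description of any lender's system.

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5_000
# Synthetic applicant features: credit score, income, and a geographic proxy.
X = np.column_stack([
    rng.normal(650, 60, n),        # credit_score
    rng.lognormal(10.5, 0.4, n),   # income
    rng.integers(0, 20, n),        # zip_cluster (potential proxy variable)
])
y = (X[:, 0] + 0.002 * X[:, 1] - 60 * (X[:, 2] > 15) + rng.normal(0, 30, n)) > 700

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier().fit(X_tr, y_tr)

# Permutation importance: how much does shuffling each feature hurt accuracy?
result = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
for name, score in zip(["credit_score", "income", "zip_cluster"], result.importances_mean):
    print(f"{name:>12}: {score:.3f}")

If a feature with no legitimate underwriting rationale carries a meaningful share of the model's predictive power, that is exactly the kind of signal an AI governance review needs to surface before deployment.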

How Data Sets Contribute to Bias

Data sets may contribute to bias when an AI model’s training set contains unbalanced or unrepresentative data. If the data is more representative of some classes of people than others, the model’s predictions may be systematically worse for the underrepresented classes. Additionally, any learning from the prediction data in deployed models can amplify that bias. To aggravate the predicament, a model’s bias may drift over time: after a model has been deployed, the distribution of the data it sees may differ from the distribution of its training dataset, and this change can introduce bias over time. In these cases, operators of the AI engine need to monitor the bias metrics of a deployed model continuously and adjust for this drift. There are a number of techniques to mitigate bias and improve fairness (the details of which are outside the scope of this paper); however, the imperative for a consistent methodology backed by an AI governance framework is clear.
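
The sketch below illustrates, under the assumption that production predictions arrive in batches alongside a sub-group label, how such continuous bias-drift monitoring might be wired up. The column names, tolerance and alerting logic are illustrative only.

import pandas as pd

def positive_rates(df: pd.DataFrame, group_col: str, pred_col: str) -> pd.Series:
    """Share of positive model decisions per sub-group."""
    return df.groupby(group_col)[pred_col].mean()

def bias_drift_alerts(train_df: pd.DataFrame, live_df: pd.DataFrame,
                      group_col: str = "segment", pred_col: str = "approved",
                      tolerance: float = 0.05) -> pd.DataFrame:
    """Flag sub-groups whose live positive rate drifted beyond the tolerance."""
    baseline = positive_rates(train_df, group_col, pred_col)
    live = positive_rates(live_df, group_col, pred_col)
    drift = (live - baseline).abs().rename("abs_drift")
    report = pd.concat([baseline.rename("baseline"), live.rename("live"), drift], axis=1)
    report["alert"] = report["abs_drift"] > tolerance
    return report

# In an operational pipeline this check would run on every scoring batch, with an
# alert triggering retraining or one of the bias-mitigation techniques noted above.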

BGV portfolio company Zelros (AI recommendations for insurance products) has been a pioneer in introducing the first standard for ethical, enterprise-grade AI, three years ahead of the industry. This standard mitigates model drift, guarantees explainability and supports model fairness, and is now fully integrated in their platform. The company will soon release a new capability to mitigate biases detected in fairness reports through synthetic data generation. As AI models proliferate across a broader swath of enterprise use cases, industry standards will need to keep pace.

The Explosion of Data Clouds and Our Increasing Vulnerability 

The AI Model Sprawl comes hand in hand with an explosion of Data Clouds as data keeps pouring into cloud and SaaS platforms. Cloud-native, data-first applications need to upload, process and analyze massive amounts of data in Data Clouds. Surging demand for analytics-ready data that distills business insights and delivers revenue-generating services is powering the next phase of enterprise application innovation, driven by AI. There are over 400 million SaaS datasets siloed globally, according to Snowflake. Data Clouds eliminate these silos, democratizing data access across cloud and SaaS platforms and enabling developers to build rich applications such as CRM services, sales and marketing automation, product analytics and business intelligence platforms.

AI-based applications that leverage cloud data and algorithmic learning to execute critical business functions, however, leave us exposed to adversarial attacks, which attempt to fool AI models by exploiting structural vulnerabilities within these systems. Many traditional security vulnerabilities, such as code injection and cross-site attacks, occur through the submission of maliciously crafted input, so deceptive inputs are nothing new. When AI models sit behind a variety of applications, however, the stakes are fundamentally raised: all of those applications can be attacked through maliciously crafted input fed to the underlying models.

Because AI systems do not simply process data but make decisions, deceptive inputs present a much broader and deeper threat than they did traditionally. David Danks describes a few cases in IEEE Spectrum: “Perhaps the most widely-discussed attack cases involve image classification algorithms that are deceived into ‘seeing’ images in noise, or are easily tricked by pixel-level changes so they classify, say, a turtle as a rifle. Similarly, game-playing systems that outperform any human (e.g., AlphaGo) can suddenly fail if the game structure or rules are even slightly altered in ways that would not affect a human. Autonomous vehicles that function reasonably well in ordinary conditions can, with the application of a few pieces of tape, be induced to swerve into the wrong lane or speed through a stop sign.” Adversarial AI poses risks of data poisoning, online system manipulation, transfer learning attacks and breaches of data confidentiality, to name a few. Given the pervasiveness of the decisions in question, it’s not too difficult to imagine these attacks leading to life-threatening consequences.
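
To make the mechanism tangible, the sketch below applies the well-known fast gradient sign idea to a simple logistic-regression classifier trained on synthetic data. It is not drawn from any of the incidents above, and real attacks target far larger models, but it shows how a small, targeted perturbation of an input can push a model toward the wrong decision.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 20))
true_w = rng.normal(size=20)
y = (X @ true_w > 0).astype(int)

clf = LogisticRegression(max_iter=1000).fit(X, y)

# Take one sample and perturb it in the direction that increases the loss.
# For logistic regression the input gradient of the cross-entropy loss is (p - y) * w.
x, label = X[0], y[0]
p = clf.predict_proba(x.reshape(1, -1))[0, 1]
grad = (p - label) * clf.coef_[0]
epsilon = 0.25
x_adv = x + epsilon * np.sign(grad)

print("original prediction:   ", clf.predict(x.reshape(1, -1))[0])
print("adversarial prediction:", clf.predict(x_adv.reshape(1, -1))[0])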

The Pace of Change

It took twenty-six years for cybersecurity practices to move from academia to commercialization, but Trustworthy AI should emerge much faster. From the first malicious hack at MIT in 1961 (when Tech Model Railroad Club members hacked their high-tech train sets to modify their functions) to the introduction of the first commercial antivirus program by McAfee in 1987, an entire industry had to awaken to the threats, invest in new solutions, and adopt best practices. Today, enterprises spend billions of dollars every year on cybersecurity solutions, and those investments require little justification. We envision a similar adoption cycle in the AI Governance space, only on a dramatically shorter time horizon.

Although it takes six months to validate and release an AI model, on average one in three models is abandoned or shelved due to lack of transparency or an unfavorable risk-reward profile relative to standard technologies. Statista estimates that over $7B worth of AI model investments are currently locked up due to gaps in model governance. These figures highlight the cost, brand and liability issues associated with a lack of AI governance controls. Adversarial AI, by contrast, has received less attention outside of academia. Although the research community has been studying Adversarial AI for some years now, there appears to be an awakening of sorts beyond that small niche, as evidenced by the 30X spike in published papers on the topic since 2018 (Source: Nicholas Carlini).

Increasing headlines around AI governance issues are raising awareness of the scaffolding required to build that automated data landscape, and of the vulnerabilities therein. While Adversarial AI is in its infancy, we believe it may well take more high-profile attacks to capture headlines, generate press and stimulate further interest in this space before broader commercial investment efforts take off.

Conclusions

We are optimistic that innovations in AI Governance and Adversarial AI are poised to see growth and commercial traction over the next 10 years, for several reasons: 

  • An independent AI Governance layer is required in the lifecycle between training, deployment and operations, to establish a set of guardrails that improve trust and confidence in AI systems. 
  • While the debate around ethical AI is only in its nascent stages, it is moving swiftly, and regulation is at the doorstep.  Already, the EU has issued a White Paper on Artificial Intelligence, and will put forward a horizontal regulatory proposal on the topic this year. We expect the US to follow the example being set by the EU.  
  • Large technology companies like Google, Microsoft and IBM, amongst others, are keenly aware of the need to improve the trustworthiness of AI at their enterprises and are setting up AI ethics boards within their organizations.  Awareness of the challenge, therefore, has already reached critical levels amongst key audiences.  
  • The attack surface for adversarial AI, while new, is substantial and worthy of focus for innovation, given that $40B was invested in AI in 2019 alone. Contrast this with an aggregate $45B of total investment made into cybersecurity over the 16-year period from 2003 to 2019. 
  • Last but not least, early VC funding has begun to trickle into these areas. Arthur.ai raised a $15M Series A to tackle AI model monitoring and performance optimization, bias detection, and explainability. BGV portfolio company Zelros sets a new standard in AI Governance covering model fairness, explainability and model drift. Recently, Resistant AI raised $2.75 million in seed funding to develop an artificial intelligence system that protects algorithms from automated attacks. 

AI governance is rapidly becoming a must-have, and adoption will only accelerate further with future regulatory pressures. Consequently, we are bullish on the prospects for this new frontier, and we look forward to seeing how this investment and innovation landscape takes shape.