Harsh criticism of the scientific foundations of AI is accumulating, and the trend accelerated sharply in 2020. This may delay AI deployment by a few years, but it will do the technology (and you) good.

The year now ending has seen many AI successes. But it has also brought heavy criticism, especially of the science and the underlying base technology.

The Dark Side of AI

The Analyst Syndicate has already reported on these findings. Here is a partial list, sorted by increasing severity:

  • scientific reports by Machine/Deep Learning researchers overstate results and are too often irreproducible (this is less worrying than it sounds: the problem is widespread, and the reproducibility of AI research is only slightly worse than that of Computer Science overall, which in turn is roughly twice as good as the reproducibility of the scientific research backing drugs and medical treatments);
  • not unlike rule-based AI systems of the 1980s, modern systems can occasionally fail abruptly and without warning (the so-called ‘edge cases’);
  • AI systems can perform poorly when deployed in production contexts that do not fully reflect the training data set. For example, ‘learning’ algorithms can use unintended shortcut strategies which, while superficially successful, can fail under slightly different circumstances (a toy illustration follows this list);
  • other fundamental flaws may undermine Machine Learning more radically, such as the recently unveiled ‘underspecification’ problem.
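
To make the shortcut issue concrete, here is a minimal, hypothetical sketch (synthetic data, with scikit-learn's LogisticRegression standing in for any learner): a spurious feature tracks the label during training, the model leans on it, and accuracy drops once that correlation disappears in production.

```python
# Hypothetical sketch of shortcut learning: the model exploits a spurious feature
# that correlates with the label in training but not in production.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n, shortcut_tracks_label):
    # 'signal' is the genuine but weak predictive feature; 'shortcut' is an artefact
    # (background colour, watermark, hospital tag) that happens to track the label.
    y = rng.integers(0, 2, n)
    signal = y + rng.normal(0.0, 1.0, n)
    if shortcut_tracks_label:
        shortcut = y + rng.normal(0.0, 0.1, n)   # near-perfect proxy during training
    else:
        shortcut = rng.normal(0.0, 1.0, n)       # proxy breaks down in production
    return np.column_stack([signal, shortcut]), y

X_train, y_train = make_data(5000, shortcut_tracks_label=True)
model = LogisticRegression().fit(X_train, y_train)

print("accuracy, training-like data:", model.score(*make_data(5000, True)))
print("accuracy, production data:   ", model.score(*make_data(5000, False)))
# The first score is high, the second drops sharply: the model relied on the
# shortcut rather than on the weak genuine signal.
```
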
Not all evil comes to harm

I regard these criticisms as welcome symptoms of AI’s long, ongoing maturation.

Like edge cases and shortcut learning, the underspecification problem strikes deep; in fact, possibly deeper. It tells us that the same algorithm can ‘learn’ different lessons after every new run on the same training set and, as a consequence, behave in unexpected ways at production (‘inference’) time. While it is currently discussed only in supervised-learning settings, an effect on other styles of ML cannot be ruled out at this point.
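
A minimal sketch of what this can look like in practice, under assumed synthetic data and with scikit-learn's MLPClassifier as a stand-in learner: two runs that differ only in random initialization reach nearly identical validation accuracy, yet can disagree on mildly shifted inputs.

```python
# Hypothetical illustration of underspecification: same data, same pipeline,
# different random seeds -> equally good validation scores, yet different
# behaviour under a small distribution shift.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(42)
X = rng.normal(size=(2000, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)                  # only two features truly matter

X_train, y_train = X[:1500], y[:1500]
X_val, y_val = X[1500:], y[1500:]
X_shifted = X_val + rng.normal(0.0, 0.5, X_val.shape)    # mild production-time shift

models = [
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=seed)
    .fit(X_train, y_train)
    for seed in (0, 1)
]

for seed, m in zip((0, 1), models):
    print(f"seed {seed}: validation accuracy = {m.score(X_val, y_val):.3f}")

disagreement = np.mean(models[0].predict(X_shifted) != models[1].predict(X_shifted))
print("fraction of shifted inputs where the two models disagree:", disagreement)
# The validation scores are essentially the same, but the two models need not give
# the same answers off-distribution: training did not pin down which internal
# solution was learned.
```
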

Possible workarounds are already outlined in the still-unreviewed November 2020 paper that fully described the problem (and that has already been alluded to in the past).

This is encouraging: it means the flaw will start being remedied over the next few years. However, the proposed solutions seem to require a substantial increase in the cost of training AI systems and, most crucially, heavy architectural reforms that could shake up the industry.

As important figures such as computer scientists Judea Pearl and Melanie Mitchell or cognitive scientist Gary Marcus have been suggesting for a decade, a call seems to be mounting even from within the Machine Learning / Deep Learning community: ‘learning’ models should no longer be merely fine algorithms that optimize complex mathematical functions; they must incorporate more domain knowledge.

Impact on the industry

Provided the underspecification problem has been correctly framed, this suggests that AI will evolve toward vertical, industry-specific software, which I believe will take the form of domain-aware application-development frameworks that client organizations use with the help of intermediary service providers.

AI is likely about to abandon the dream of generalized supervised-learning algorithms that can be trained to do anything.

Most organizations that have “adopted AI” have never actually handled it directly: they simply use embedded-AI solutions, as typically happens with voice assistants, user authentication, recommendations for online customers, AI-enhanced analytics, or AI-enhanced product lifecycle management.

The next wave will come when user organizations can customize AI engines, especially Machine/Deep Learning, to their specific needs. This entails acquiring full control of training and the ability to complement the neural network with more context knowledge than examples alone. For instance, in a real-time Quality Control or Predictive Maintenance application, effective software needs to know more than whether a picture shows deterioration in a product’s component: it also needs to know how that component relates to other components and to the product as a whole.
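
What “more context than examples” could mean is sketched below in a deliberately simplified form; the names (ASSEMBLY_GRAPH, defect_score, inspect) and the dummy scores are purely illustrative assumptions, not any real product API. A learned per-component defect score is combined with explicit knowledge of how components relate inside the product.

```python
# Hypothetical sketch: combine a learned defect score with explicit domain knowledge
# (an assembly graph) in a quality-control check. All names and values are illustrative.
from typing import Dict, List

# Domain knowledge: which components each component structurally depends on.
ASSEMBLY_GRAPH: Dict[str, List[str]] = {
    "valve": ["seal", "housing"],
    "seal": [],
    "housing": ["frame"],
    "frame": [],
}

def defect_score(component: str) -> float:
    """Stand-in for a trained vision model's defect probability for one component."""
    return 0.9 if component == "seal" else 0.1     # dummy values for the sketch

def inspect(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Flag components that look defective, plus any component that structurally
    depends on a defective one, even if its own image looks fine."""
    flagged = {c for c, s in scores.items() if s >= threshold}
    for component, depends_on in ASSEMBLY_GRAPH.items():
        if any(dep in flagged for dep in depends_on):
            flagged.add(component)
    return sorted(flagged)

scores = {c: defect_score(c) for c in ASSEMBLY_GRAPH}
print(inspect(scores))   # ['seal', 'valve']: the valve is flagged because its seal is suspect
```

The point is not the toy rule itself, but that the structural knowledge lives outside the trained model and constrains its output.
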

Satisfying these customization needs will slow the current spread of AI projects, because three to four years will be needed to architect the first new solutions and to transform the business models through which “AI” is delivered to end-user organizations: no longer just embedded Machine Learning / Deep Learning, but native tools as well, as is customary today among high-tech companies and perhaps another 3% of organizations.

So what?

Scientific labs and a tiny minority of user organizations will continue to exploit “AI” to keep producing marvelous technical applications.

But the vast majority would rather wait for reliable, usable technology that does not require top-notch scientific competence to achieve business results.

That shift will boost the market for AI as a software tool placed directly in the hands of companies and society; today, its use is essentially confined to embedded-AI products.