Black-Box and White-Box Models towards Explainable AI, by Orhan G. Yalçın
These were all the rage in the 1980s, with organisations clamouring to build their own expert systems, and they remain a useful part of AI today. However, this approach also required much manual effort from experts tasked with deciphering the chains of reasoning that connect various symptoms to diseases or purchasing patterns to fraud. This downside is not a big issue when deciphering the meaning of children's stories or linking common knowledge, but it becomes more expensive with specialized knowledge. And mind you, this is a basketball: a simple, spherical object that retains its shape regardless of the viewing angle.
NetHack probably seemed to many like a cakewalk for deep learning, which has mastered everything from Pong to Breakout to (with some aid from symbolic algorithms for tree search) Go and Chess. But in December, a system based on pure symbol manipulation crushed the best deep-learning entries by a score of 3 to 1: a stunning upset. Usama Fayyad, chairman of the technology and strategic consulting firm Open Insights, described machine learning as the iterative improvement of adaptive algorithms based on training data. For instance, in the shape example I started this article with, a neuro-symbolic system would use a neural network's pattern-recognition capabilities to identify objects. What is new is that the latest crop of generative AI apps sounds more coherent on the surface. But this combination of humanlike language and coherence is not synonymous with human intelligence, and there is currently great debate about whether generative AI models can be trained to have reasoning ability.
- Researchers must therefore look beyond deep learning, he argues, and combine it with classical, or symbolic, AI—systems that encode knowledge and are capable of reasoning.
- A developer of a machine learning system creates a model and then "trains" it by providing it with many examples (see the sketch after this list).
- Although the synthetic proof lengths are skewed towards shorter proofs, a small number of them still have lengths up to 30% longer than the hardest problem in the IMO test set.
- This is important because all AI systems in the real world deal with messy data.
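To make the "trains it by providing many examples" idea concrete, here is a minimal sketch in plain Python, assuming nothing about any particular framework: gradient descent fits a tiny linear model to a handful of invented (input, label) pairs.

```python
# A minimal sketch of "training by example": gradient descent fits a
# linear model y = w*x + b to a handful of toy (input, label) pairs.
# The data and learning rate are illustrative, not from any real system.

examples = [(1.0, 3.1), (2.0, 5.0), (3.0, 7.2), (4.0, 8.9)]  # roughly y = 2x + 1

w, b = 0.0, 0.0
lr = 0.01

for epoch in range(2000):
    for x, y in examples:
        pred = w * x + b
        err = pred - y          # how far off the model is on this example
        w -= lr * err * x       # nudge parameters to shrink the error
        b -= lr * err

print(f"learned w={w:.2f}, b={b:.2f}")  # converges near w = 2.0, b = 1.1
```

The "model" here is just two numbers, but the loop is the same shape as in any supervised learner: predict, measure the error, adjust.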
Theorem proving is difficult for learning-based methods because training data of human proofs translated into machine-verifiable languages are scarce in most mathematical domains. Geometry stands out among olympiad domains because it has very few proof examples in general-purpose mathematical languages such as Lean (ref. 9), owing to translation difficulties unique to geometry (refs. 1,5). Geometry-specific languages, on the other hand, are narrowly defined and thus unable to express many human proofs that use tools beyond the scope of geometry, such as complex numbers (Extended Data Figs. 3 and 4). Overall, this creates a data bottleneck, causing geometry to lag behind in recent progress that uses human demonstrations (refs. 2,3,4). Current approaches to geometry, therefore, still rely primarily on symbolic methods and human-designed, hard-coded search heuristics (refs. 10–14).
Algebraic reasoning
AlphaGeometry is a neuro-symbolic system that uses a neural language model, trained from scratch on our large-scale synthetic data, to guide a symbolic deduction engine through infinite branching points in challenging problems. On a test set of 30 latest olympiad-level problems, AlphaGeometry solves 25, outperforming the previous best method that only solves ten problems and approaching the performance of an average International Mathematical Olympiad (IMO) gold medallist. Notably, AlphaGeometry produces human-readable proofs, solves all geometry problems in the IMO 2000 and 2015 under human expert evaluation and discovers a generalized version of a translated IMO theorem in 2004. Research in automated theorem proving has a long history dating back to the 1950s (refs. 6,42,43), resulting in highly optimized first-order logic solvers such as E (ref. 44) or Vampire (ref. 45).
Previous work has shown that while pre-trained models (without instruction tuning) can, to some extent, follow flipped labels presented in-context, instruction tuning degraded this ability. However, when combined, symbolic AI and neural networks can establish a solid foundation for enterprise AI development. In the 1960s and 1970s, technological advances inspired researchers to investigate the relationship between machines and nature. They believed that symbolic techniques would eventually result in an intelligent machine, which was viewed as their discipline’s long-term objective. “Hybrid AI is a compromise. It turns out that deep learning, for all its power, is not universally better,” said Liberty Mutual’s Gorlin.
Can AI pose a threat to human life and social stability?
Even if you take a million pictures of your cat, you still won’t account for every possible case. A change in the lighting conditions or the background of the image will change the pixel value and cause the program to fail. Many of the concepts and tools you find in computer science are the results of these efforts. Symbolic AI programs are based on creating explicit structures and behavior rules.
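To illustrate what "explicit structures and behavior rules" look like in code, here is a minimal sketch; the rules, facts and labels are invented for the example.

```python
# A minimal sketch of the symbolic approach: knowledge is written down
# as explicit if-then rules over symbols, not learned weights.

rules = [
    ({"has_fur", "says_meow"}, "cat"),
    ({"has_feathers", "can_fly"}, "bird"),
]

def classify(observed_facts: set[str]) -> str:
    for conditions, label in rules:
        if conditions <= observed_facts:   # all rule conditions hold
            return label
    return "unknown"

print(classify({"has_fur", "says_meow", "sleeps_all_day"}))  # -> cat
```

The strength and the weakness are the same thing: every rule is legible and auditable, but someone has to write each one by hand.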
We present novel measures that indicate how close a formula is to a derivable formula when the model is provably non-derivable, and we calculate the values of these measures using our reasoning system. In earlier work combining machine learning with reasoning, Marra et al. (ref. 12) use a logic-based description to constrain the output of a GAN neural architecture for generating images. Scott et al. (ref. 13) and Ashok et al. (ref. 14) combine machine-learning tools and reasoning engines to search for functional forms that satisfy prespecified constraints.
What’s missing in deep neural networks?
While LLMs can provide impressive results in some cases, they fare poorly in others. Improvements in symbolic techniques could help to efficiently examine LLM processes to identify and rectify the root cause of problems. The excitement within the AI community lies in finding better ways to tinker with the integration between symbolic and neural network aspects.
An expert system is a computer program that uses artificial intelligence (AI) technologies to simulate the judgment and behavior of a human or an organization with expertise and experience in a particular field. This new model enters the realm of complex reasoning, with implications for physics, coding, and more. The most famous remains the Turing Test, in which a human judge interacts, sight unseen, with both humans and a machine and must try to guess which is which. Two others, Ben Goertzel's Robot College Student Test and Nils J. Nilsson's Employment Test, seek to practically test an AI's abilities by seeing whether it could earn a college degree or carry out workplace jobs.
In a benchmarking test of 30 Olympiad geometry problems, AlphaGeometry solved 25 within the standard Olympiad time limit. For comparison, the previous state-of-the-art system solved 10 of these geometry problems, and the average human gold medalist solved 25.9 problems. Knowledge graph embedding (KGE) is a machine learning task of learning a latent, continuous vector space representation of the nodes and edges in a knowledge graph (KG) that preserves their semantic meaning. This learned embedding representation of prior knowledge can be applied to and benefit a wide variety of neuro-symbolic AI tasks. One task of particular importance is known as knowledge completion (i.e., link prediction) which has the objective of inferring new knowledge, or facts, based on existing KG structure and semantics. These new facts are typically encoded as additional links in the graph.
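To make link prediction concrete, the sketch below scores candidate triples with a TransE-style distance, one common KGE method (the passage above does not commit to a specific model); the tiny embeddings are hand-made for illustration rather than learned from a graph.

```python
# A minimal sketch of knowledge completion with a TransE-style scoring
# function: a triple (head, relation, tail) is plausible when the
# head embedding plus the relation embedding lands near the tail.

import numpy as np

entity = {
    "Paris":  np.array([0.9, 0.1]),
    "France": np.array([1.0, 1.0]),
    "Berlin": np.array([0.1, 0.9]),
}
relation = {"capital_of": np.array([0.1, 0.9])}

def score(h: str, r: str, t: str) -> float:
    # Lower distance = more plausible triple.
    return float(np.linalg.norm(entity[h] + relation[r] - entity[t]))

# Link prediction: rank candidate tails for (Paris, capital_of, ?).
candidates = sorted(entity, key=lambda t: score("Paris", "capital_of", t))
print(candidates[0])  # -> France
```

In a real system the vectors come from training on the existing graph, and the new top-ranked links are the "new facts" the passage describes.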
Once developers settle on a way to represent the world, they apply a particular neural network to generate new content in response to a query or prompt. Techniques such as GANs and variational autoencoders (VAEs) — neural networks with a decoder and encoder — are suitable for generating realistic human faces, synthetic data for AI training or even facsimiles of particular humans. Google’s search engine is a massive hybrid AI that combines state-of-the-art deep learning techniques such as Transformers and symbol-manipulation systems such as knowledge-graph navigation tools. This is a reality that many of the pioneers of deep learning and its main component, artificial neural networks, have acknowledged in various AI conferences in the past year. Geoffrey Hinton, Yann LeCun, and Yoshua Bengio, the three “godfathers of deep learning,” have all spoken about the limits of neural networks.
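Returning to the VAEs mentioned at the start of this paragraph: for a sense of their encoder/decoder structure, here is a forward-pass-only sketch with random, untrained weights. It shows the data flow only, not a working generator.

```python
# A minimal sketch of the VAE data flow: an encoder maps an input to a
# latent distribution, a sample is drawn via the reparameterization
# trick, and a decoder reconstructs from that sample. Weights are
# random and untrained; shapes and flow are the point here.

import numpy as np

rng = np.random.default_rng(0)
x = rng.random(8)                     # toy input vector

W_enc = rng.normal(size=(4, 8))       # encoder: 8 inputs -> latent stats
mu, log_var = W_enc[:2] @ x, W_enc[2:] @ x

eps = rng.normal(size=2)              # reparameterization trick:
z = mu + np.exp(0.5 * log_var) * eps  # sample z ~ N(mu, sigma^2)

W_dec = rng.normal(size=(8, 2))       # decoder: latent -> reconstruction
x_hat = W_dec @ z

print(x_hat.shape)                    # (8,), matching the input shape
```

Generation then amounts to sampling z directly and running only the decoder, which is why the decoder half is what produces the "new content" described above.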
Learn more
We use the Meliad library (ref. 35) for transformer training with its base settings. The transformer has 12 layers, an embedding dimension of 1,024, eight attention heads and an inter-attention dense layer of dimension 4,096 with ReLU activation. Overall, the transformer has 151 million parameters, excluding embedding layers at its input and output heads. Our customized tokenizer is trained in 'word' mode using SentencePiece (ref. 36) and has a vocabulary size of 757. We limit the maximum context length to 1,024 tokens and use T5-style relative position embedding (ref. 37). Sequence packing (refs. 38,39) is also used because more than 90% of our sequences are under 200 tokens in length.
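For reference, the hyperparameters quoted above can be collected into a single configuration object; the field names below are illustrative and are not the Meliad library's actual API.

```python
# The stated AlphaGeometry transformer hyperparameters, gathered into
# one config object. Field names are illustrative, not Meliad's API.

from dataclasses import dataclass

@dataclass
class TransformerConfig:
    num_layers: int = 12
    embed_dim: int = 1024
    num_heads: int = 8
    ffn_dim: int = 4096              # inter-attention dense layer, ReLU
    vocab_size: int = 757            # SentencePiece, 'word' mode
    max_context: int = 1024          # tokens; T5-style relative positions
    total_params: int = 151_000_000  # excluding input/output embeddings

config = TransformerConfig()
```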
Further reading: "Enterprise hybrid AI use is poised to grow", TechTarget, 2 March 2022.
We’ve had to drastically increase compute in order to unlock new deep-learning abilities. If you ask DALL-E to create a Roman sculpture of a bearded, bespectacled philosopher wearing a tropical shirt, it excels. If you ask it to draw a beagle in a pink harness chasing a squirrel, sometimes you get a pink beagle or a squirrel wearing a harness. It does well when it can assign all the properties to a single object, but it struggles when there are multiple objects and multiple properties. The attitude of many researchers is that this is a hurdle for DL — larger for some, smaller for others — on the path to more human-like intelligence. Machine learning systems are also strictly bound to the context of their training examples, which is why they’re called narrow AI.
Symbolic deduction engines, on the other hand, are based on formal logic and use clear rules to arrive at conclusions. They are rational and explainable, but they can be “slow” and inflexible – especially when dealing with large, complex problems on their own. Neuro-symbolic AI is a synergistic integration of knowledge representation (KR) and machine learning (ML) leading to improvements in scalability, efficiency, and explainability. The topic has garnered much interest over the last several years, including at Bosch where researchers across the globe are focusing on these methods. In this short article, we will attempt to describe and discuss the value of neuro-symbolic AI with particular emphasis on its application for scene understanding.
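A minimal sketch of the deduction-engine idea follows: apply explicit inference rules to known facts until a fixpoint is reached. The transitive-parallelism rule is a toy stand-in for a real geometry rule set.

```python
# Forward chaining to a fixpoint: reapply rules until no new fact
# appears. Every conclusion is traceable to the rule that produced it,
# which is what makes this style of engine explainable.

facts = {("parallel", "AB", "CD"), ("parallel", "CD", "EF")}

def step(facts: set) -> set:
    new = set(facts)
    # Rule: parallelism is transitive.
    for (p1, a, b) in facts:
        for (p2, c, d) in facts:
            if p1 == p2 == "parallel" and b == c and a != d:
                new.add(("parallel", a, d))
    return new

while True:
    grown = step(facts)
    if grown == facts:       # fixpoint: the deductive closure is complete
        break
    facts = grown

print(facts)  # now includes ('parallel', 'AB', 'EF')
```

The "slow and inflexible" complaint shows up here too: the nested loops over all fact pairs are exactly the kind of exhaustive search that blows up on large problems.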
They can do some superficial logical reasoning and problem solving, but it really is superficial at the moment. But perhaps we should be surprised that they can do anything beyond natural language processing. They weren’t designed to do anything else, so anything else is a bonus — and any additional capabilities must somehow be implicit in the text that the system was trained on. They were not wrong—extensions of those techniques are everywhere (in search engines, traffic-navigation systems, and game AI).
This opens up additional possibilities for the symbolic engine to continue searching for a proof. This cycle continues, with the language model adding helpful elements and the symbolic engine testing new proof strategies, until a verifiable solution is found. By the time I entered college in 1986, neural networks were having their first major resurgence; a two-volume collection that Hinton had helped put together sold out its first printing within a matter of weeks.
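The alternation just described can be sketched as a loop; both components below are stubs standing in for the real language model and symbolic engine, so this is a shape of the algorithm rather than an implementation.

```python
# A sketch of the neuro-symbolic cycle: the symbolic engine exhausts
# its deductions; if the goal is still unproven, the language model
# proposes an auxiliary construction and the cycle repeats.

def symbolic_closure(premises: set) -> set:
    """Stand-in for the deduction engine: returns all derivable facts."""
    derived = set(premises)
    if "midpoint M of BC" in premises:
        derived.add("goal")          # pretend this construction unlocks the proof
    return derived

def language_model_suggest(premises: set) -> str:
    """Stand-in for the neural model proposing a helpful construction."""
    return "midpoint M of BC"

def prove(premises: set, goal: str = "goal", max_rounds: int = 5) -> bool:
    for _ in range(max_rounds):
        facts = symbolic_closure(premises)
        if goal in facts:
            return True              # verifiable proof found
        premises = premises | {language_model_suggest(premises)}
    return False

print(prove({"triangle ABC"}))  # -> True after one suggested construction
```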
Parallel to the development of discriminative models, the development of generative neural networks was proceeding. These models have the unique ability to create new content after being trained on large sets of existing examples. “Human interpretation and labeling are essential for learning systems ranging from machine-learned ranking in a core web search engine to autonomous vehicle training.”
Examples include reading facial expressions, detecting that one object is more distant than another and completing phrases such as "bread and…". As artificial intelligence (AI) continues to evolve, the integration of diverse AI technologies is reshaping industry standards for automation. AI in automation is impacting every sector, including financial services, healthcare, insurance, automotive, retail, and transportation and logistics, and is expected to boost GDP for local economies by around 26% by 2030, according to PwC.
Clips on social media of the latest machinations from Boston Dynamics, a US-based robotics company, are often accompanied by jokey comments about a looming machine takeover. This brings us to the growing concern about the amount of misinformation online, and how AI is being used to generate it. Google's rival to ChatGPT, called Bard, had an embarrassing debut this month when a video demo of the chatbot showed it giving the wrong answer to a question about the James Webb Space Telescope. This is a story about greed, ignorance, and the triumph of human curiosity.
In the 1980s, AI scientists tried this approach with expert systems, rule-based programs that tried to encode all the knowledge of a particular discipline such as medicine. Expert systems were successful for very narrow domains but failed as soon as they tried to expand their reach and address more general problems. They also required huge efforts by computer programmers and subject matter experts. Using these four ingredients and the algorithm described in the main text, one can generate synthetic data for any target domain. As shown in our paper, there are non-trivial engineering challenges in building each ingredient. For example, current formalizations of combinatorics are very nascent, posing challenges to (1) and (2).
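The recipe alluded to here can be sketched as a loop over three hypothetical stand-in helpers: sample random premises, deduce their consequences symbolically, and trace each consequence back to the premises that imply it. None of the helpers below is the real ingredient; they only show how the pieces fit together.

```python
# A sketch of the synthetic-data loop: sample premises, deduce, then
# trace each derived fact back to its minimal supporting premises.
# All three helpers are toy stand-ins for the real ingredients.

import random

def sample_premises(rng):            # ingredient: a premise sampler
    return {rng.choice(["p", "q", "r"]) for _ in range(2)}

def deduce(premises):                # ingredient: a deduction engine
    facts = set(premises)
    if {"p", "q"} <= facts:
        facts.add("s")               # toy rule: p and q together yield s
    return facts

def traceback(fact, premises):       # ingredient: minimal-proof extraction
    return premises & {"p", "q"} if fact == "s" else set()

rng = random.Random(0)
dataset = []
for _ in range(100):
    prem = sample_premises(rng)
    for fact in deduce(prem) - prem:
        dataset.append((sorted(traceback(fact, prem)), fact))  # (proof premises, conclusion)

print(dataset[:3])
```

Each (premises, conclusion) pair becomes one training example, which is how the approach sidesteps the shortage of human-written proofs.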
Integrating neural networks with symbolic AI systems should bring a heightened focus on data privacy, fairness and bias prevention. This emphasis arises because neuro-symbolic AI combines vast data with rule-based reasoning, potentially amplifying biases present in the data or the rules. Both symbolic and neural network approaches date back to the earliest days of AI in the 1950s. On the symbolic side, the Logic Theorist program in 1956 helped solve simple theorems.
Now, though, a new study from six Apple engineers shows that the mathematical “reasoning” displayed by advanced large language models can be extremely brittle and unreliable in the face of seemingly trivial changes to common benchmark problems. To study the effect of beam search on top of the language model, we reduced the beam size and search depth separately during proof search and reported the results in Extended Data Fig. We find that, with a beam size of 8, that is, a 64 times reduction from the original beam size of 512, AlphaGeometry still solves 21 problems.
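For readers unfamiliar with the knob being ablated here, below is a generic beam-search sketch (not AlphaGeometry's implementation): at each step, only the `beam_size` highest-scoring partial sequences survive, so shrinking the beam trades coverage for compute.

```python
# Generic beam search: expand every surviving sequence by every token,
# score the results, and keep only the best `beam_size` of them.
# The toy scorer stands in for a language model's log-probabilities.

def toy_log_prob(seq: tuple, token: str) -> float:
    # Stand-in scorer: prefers tokens close to 'a', penalizes length.
    return -abs(ord(token) - ord("a")) - 0.1 * len(seq)

def beam_search(vocab, beam_size: int, depth: int):
    beams = [((), 0.0)]                    # (sequence, cumulative score)
    for _ in range(depth):
        candidates = [
            (seq + (tok,), score + toy_log_prob(seq, tok))
            for seq, score in beams
            for tok in vocab
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]     # prune to the beam width
    return beams

print(beam_search(["a", "b", "c"], beam_size=2, depth=3)[0])
```

That AlphaGeometry still solves 21 problems with the beam cut from 512 to 8 suggests the language model's top suggestions are usually the useful ones.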