Strong artificial intelligence (AI), also known as artificial general intelligence (AGI) or general AI, is a theoretical form of AI used to describe a certain mindset of AI development.

If researchers are able to develop Strong AI, the machine would have intelligence equal to that of humans: a self-aware consciousness with the ability to solve problems, learn, and plan for the future.

Strong AI aims to create intelligent machines that are indistinguishable from the human mind. But just like a child, the AI machine would have to learn through input and experiences, constantly progressing and advancing its abilities over time.

While AI researchers in both academia and the private sector are invested in the creation of artificial general intelligence (AGI), it exists today only as a theoretical concept rather than a tangible reality. Some individuals, like Marvin Minsky, have been quoted as being overly optimistic about what could be accomplished in the field of AI within a few decades; others say that Strong AI systems cannot be developed at all. Until measures of success, such as intelligence and understanding, are explicitly defined, the skeptics have a point. For now, many use the Turing test to evaluate the intelligence of an AI system.

Turing Test

Alan Turing developed the Turing Test in 1950 and discussed it in his paper, “Computing Machinery and Intelligence.” Originally known as the Imitation Game, the test evaluates whether a machine’s behavior can be distinguished from a human’s. In this test, a person known as the “interrogator” seeks to identify a difference between computer-generated output and human-generated output through a series of questions. If the interrogator cannot reliably distinguish the machines from the human subjects, the machine passes the test. However, if the evaluator can reliably identify the human responses, then the machine is not categorized as intelligent.

While there are no set evaluation guidelines for the Turing Test, Turing did predict that an average human evaluator would have no more than a 70% chance of correctly distinguishing a human from a computer after five minutes of conversation. The Turing Test introduced general acceptance of the idea of machine intelligence.

However, the original Turing Test only evaluates one skill set, such as text output or chess. Strong AI needs to perform a variety of tasks equally well, leading to the development of the Extended Turing Test. This test evaluates the textual, visual, and auditory performance of the AI and compares it to human-generated output. This version is used in the famous Loebner Prize competition, where a human judge guesses whether the output was created by a human or a computer.
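
To make the criterion above concrete, here is a minimal Python sketch of how an interrogator's verdicts might be scored against Turing's five-minute, 70 percent benchmark. The Transcript structure and the toy data are invented for illustration only; no standard test harness is implied.

```python
# Hypothetical scoring sketch for a Turing-test-style evaluation.
# Each transcript records whether the respondent was a machine and
# what the interrogator guessed after a five-minute conversation.
from dataclasses import dataclass

@dataclass
class Transcript:
    respondent_is_machine: bool  # ground truth
    judged_as_machine: bool      # interrogator's guess

def correct_identification_rate(transcripts):
    """Fraction of conversations in which the interrogator guessed correctly."""
    correct = sum(t.respondent_is_machine == t.judged_as_machine for t in transcripts)
    return correct / len(transcripts)

# Toy data: in 10 conversations the interrogator is right 6 times.
toy_transcripts = (
    [Transcript(True, True)] * 3 + [Transcript(True, False)] * 2 +
    [Transcript(False, False)] * 3 + [Transcript(False, True)] * 2
)

rate = correct_identification_rate(toy_transcripts)
# Turing predicted an average interrogator would have no more than a 70%
# chance of making the correct identification after five minutes.
verdict = "machine passes" if rate <= 0.70 else "machine identified"
print(f"correct identifications: {rate:.0%} -> {verdict}")
```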

Chinese Room Argument (CRA)

The Chinese Room Argument was created by John Searle in 1980. In his paper, he discusses the definitions of understanding and thinking, asserting that computers would never be able to do either. This excerpt from his paper, hosted on Stanford’s website, summarizes his argument well:

“Computation is defined purely formally or syntactically, whereas minds have actual mental or semantic contents, and we cannot get from syntactical to the semantic just by having the syntactical operations and nothing else…A system, me, for example, would not acquire an understanding of Chinese just by going through the steps of a computer program that simulated the behavior of a Chinese speaker (p.17).”

The Chinese Room Argument proposes the following scenario:

Imagine a person, who does not speak Chinese, sits in a closed room. In the room, there is a book with Chinese language rules, phrases and instructions. Another person, who is fluent in Chinese, passes notes written in Chinese into the room. With the help of the language phrasebook, the person inside the room can select the appropriate response and pass it back to the Chinese speaker.

While the person inside the room was able to provide the correct response using a language phrasebook, he or she still does not speak or understand Chinese; it was just a simulation of understanding, achieved by matching questions or statements with appropriate responses. Searle argues that Strong AI would require an actual mind to have consciousness or understanding. The Chinese Room Argument illustrates the flaws in the Turing Test, demonstrating differences in definitions of artificial intelligence.
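
As a small illustration of the "symbol matching without understanding" point, here is a hedged Python sketch: a rule-book responder that maps incoming Chinese notes to canned replies. The phrase table is invented for illustration; the point is that nothing in the program represents meaning, only string matching.

```python
# A toy "Chinese room": the program matches input symbols to output symbols
# using a rule book, without any representation of what the symbols mean.

RULE_BOOK = {
    "你好吗？": "我很好，谢谢。",          # "How are you?" -> "I am fine, thanks."
    "今天天气怎么样？": "今天天气很好。",  # "How is the weather today?" -> "The weather is nice today."
}

def respond(note: str) -> str:
    """Return the reply listed in the rule book, or a stock fallback."""
    # Pure syntactic lookup: the function never interprets the characters.
    return RULE_BOOK.get(note, "对不起，我不明白。")  # "Sorry, I don't understand."

print(respond("你好吗？"))  # prints the canned reply, with no understanding involved
```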

Weak AI, also known as narrow AI, focuses on performing a specific task, such as answering questions based on user input or playing chess. It can perform one type of task, but not both, whereas Strong AI can perform a variety of functions, eventually teaching itself to solve new problems. Weak AI relies on human intervention to define the parameters of its learning algorithms and to provide the relevant training data to ensure accuracy. While human input would accelerate the growth phase of Strong AI, it is not required, and over time a Strong AI would develop a human-like consciousness rather than simulating it, as Weak AI does. Self-driving cars and virtual assistants, like Siri, are examples of Weak AI.

While there are no clear examples of strong artificial intelligence, the field of AI is rapidly innovating. Another AI theory has emerged, known as artificial superintelligence (ASI), superintelligence, or Super AI. This type of AI would surpass Strong AI in intelligence and ability. However, Super AI is still purely speculative, as we have yet to achieve examples of Strong AI.

With that said, there are fields where AI is playing a more important role, such as:

  • Cybersecurity:  Artificial intelligence will take over more roles in organizations’ cybersecurity measures, including breach detection, monitoring, threat intelligence, incident response, and risk analysis.
  • Entertainment and content creation: Computer programs are getting better and better at producing content, whether it is copywriting, poetry, video games, or even movies. OpenAI’s GPT-3 text generation model is already creating content that is almost impossible to distinguish from copy written by humans.
  • Behavioral recognition and prediction: Prediction algorithms will make AI stronger, ranging from applications in weather and stock market forecasting to, even more interestingly, predictions of human behavior. This also raises questions around implicit bias and ethical AI. Some AI researchers are pushing for a set of anti-discriminatory rules, an effort often associated with the hashtag #responsibleAI.

The terms artificial intelligence, machine learning, and deep learning are often used in the wrong context. Because these terms come up frequently in discussions of Strong AI, it’s worth defining each briefly:

Artificial intelligence, as defined by John McCarthy, is "the science and engineering of making intelligent machines, especially intelligent computer programs. It is related to the similar task of using computers to understand human intelligence, but AI does not have to confine itself to methods that are biologically observable."

Machine learning is a sub-field of artificial intelligence. Classical (non-deep) machine learning models require more human intervention to segment data into categories (that is, through manual feature engineering).

Deep learning is a sub-field of machine learning that attempts to imitate the interconnectedness of the human brain using neural networks. Its artificial neural networks are made up of layers of models, which identify patterns within a given dataset. They require a high volume of training data to learn accurately, which in turn demands more powerful hardware, such as GPUs or TPUs. Deep learning algorithms are the ones most strongly associated with human-level AI.

To read more about the nuanced differences between these technologies, read “AI vs. Machine Learning vs. Deep Learning vs. Neural Networks: What’s the Difference?”

Deep learning can handle complex problems well, and as a result, it is utilized in many innovative and emerging technologies today. Deep learning algorithms have been applied in a variety of fields. Here are some examples:

  • Self-driving cars: Google and Elon Musk have shown us that self-driving cars are possible. However, self-driving cars require more training data and testing because of the many situations they need to account for, such as yielding the right of way or identifying debris on the road. As the technology matures, it will also need to clear the human hurdle of adoption, as polls indicate that many drivers are not willing to use one.
  • Speech recognition: Speech recognition, used by AI chatbots and virtual agents, is a big part of natural language processing. Audio input is hard for an AI to process, as factors such as background noise, dialects, speech impediments and other influences make it difficult to convert the input into something the computer can work with.
  • Pattern recognition: The use of deep neural networks improves pattern recognition in various applications. By discovering patterns in useful data points, the AI can filter out irrelevant information, draw useful correlations that humans might typically overlook, and improve the efficiency of big data computation.
  • Computer programming: Weak AI has seen some success in producing meaningful text, leading to advances in coding. Just recently, OpenAI released GPT-3, a language model that can write code and simple computer programs from very limited instructions, bringing automation to program development.
  • Image recognition: Categorizing images can be very time consuming when done manually. However, special adaptations of deep neural networks, such as DenseNet, which connects each layer to every other layer in the network, have made image recognition much more accurate (see the sketch after this list).
  • Contextual recommendations: Deep learning apps can take much more context into consideration when making recommendations, including language understanding patterns and behavioral predictions.
  • Fact checking: The University of Waterloo recently released a tool that can detect fake news by checking the information in articles against other news sources.
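
As a rough illustration of the dense connectivity mentioned in the image recognition item above, here is a minimal PyTorch sketch of a DenseNet-style block, in which each layer receives the concatenation of the block input and every earlier layer's output. It is a simplified sketch of the idea rather than the torchvision implementation, and the channel sizes are arbitrary.

```python
# Minimal sketch of DenseNet-style connectivity: each layer takes the
# concatenation of the block input and all previous layers' outputs.
import torch
import torch.nn as nn

class TinyDenseBlock(nn.Module):
    def __init__(self, in_channels: int, growth_rate: int, num_layers: int):
        super().__init__()
        self.layers = nn.ModuleList()
        channels = in_channels
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, growth_rate, kernel_size=3, padding=1),
            ))
            channels += growth_rate  # the next layer sees one more group of feature maps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = [x]
        for layer in self.layers:
            out = layer(torch.cat(features, dim=1))  # dense connection to all earlier outputs
            features.append(out)
        return torch.cat(features, dim=1)

block = TinyDenseBlock(in_channels=8, growth_rate=4, num_layers=3)
print(block(torch.randn(1, 8, 32, 32)).shape)  # torch.Size([1, 20, 32, 32])
```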

What is Strong AI?

  • Hrithik Saini
  • Dec 24, 2021

Strong Artificial Intelligence (AI) is a type of artificial intelligence that creates mental capacities, cognitive processes, and behaviours that mimic those of the human brain. It is more like a philosophical approach than a practical one. 

It is predicated on the assumption that computers can be endowed with mental processes, personal beliefs, perspectives and views, and a variety of the other broad AI characteristics commonly present in humans.

What is Strong AI?

Strong artificial intelligence is more of a concept than a method for producing AI. It is an alternative view of AI in which machine intelligence is equated with human intelligence.

It says that a computer may be taught to behave like a human mind: to be intellectual in every sense of the term, and to have experiences, beliefs, and other cognitive processes that are generally attributed solely to humans.

However, because mankind cannot adequately define intelligence, it is challenging to provide a clear criterion for what constitutes success in the creation of strong AI. 

Weak AI, on the other hand, is quite attainable due to the way it defines intelligence. Rather than attempting to fully replicate a human mind, weak AI focuses on creating intelligence focused on a certain job or subject of study.

Strong AI vs Weak AI

Strong AI is different from Weak AI. Strong AI would have to go through the same developmental process that the human brain goes through over the course of a life. It would hone its common sense and communication abilities by observing people, the environment, and a variety of other factors. However, it is impossible to predict when strong AI will be built, since there is no agreed-upon criterion for what counts as success.

Weak AI is concerned with emulating specific aspects of human cognition. It can be used to replace medium-level educated professionals, but it cannot achieve the degree of strategic insight necessary for strategy creation. Weak AI depends on human intervention to set many of its parameters and requires training data to achieve accuracy.

Strong AI Trends in 2022

While no clear instances of strong artificial intelligence exist, the field of AI is fast evolving. Another AI idea, known as artificial superintelligence (ASI), superintelligence, or Super AI, has arisen. In terms of intelligence and ability, this sort of AI would outperform strong AI.

However, Super AI is still totally hypothetical because examples of Strong AI have yet to be achieved. Having said that, there are several industries where AI is becoming increasingly essential, such as:

Cybersecurity

Artificial intelligence is playing a critical role in detecting security breaches, enforcing protections, monitoring and surveillance, and risk analysis and presentation.

As researchers work to advance its capabilities, AI still has many dimensions to explore, particularly in privacy protection, to ensure the integrity and confidentiality of information.

Entertainment and content creation

With the effective use of artificial intelligence, the quality of visual media content keeps improving. Movies, video games, music streams, and a variety of other media have improved in quality.

Chips with Artificial Intelligence

AI-enabled chips are specialised processors that are integrated with the CPU to provide users with effective and efficient output. Facial recognition, object tracking in images, language processing, and other tasks can be completed more quickly. These processors will significantly affect the efficiency of AI-based applications.

AI in conjunction with the Internet of Things

Using the information collected by IoT devices, those devices can act on the results produced by AI algorithms. The combination of AI and IoT has contributed to the creation of a number of smart cities.

AI integration with cloud applications is another groundbreaking development that has the potential to do wonders for the technology age, assisting in the advancement of user security and privacy.

Practical Examples of Strong AI

Simply put, general AI could perform any work that a human can. It would have all of the capabilities of a human brain and could address any problem or activity in any field, such as music creation or logistics: all of the various activities that people can carry out. Capabilities a Strong AI would need include:

Generalize information and apply it to new situations as needed

Humans learn by trial and error. They take what they've learned from past experiences and apply it to new circumstances when they encounter them. A strong AI would need to do the same.

Make use of the information and experience you've gained to plan for the future

Another talent that people have is the ability to use their previous experiences to prepare for the future. As they gain additional experience, they can use it to form a strategy and steer what comes next. Machines with narrow AI must rely on humans to programme their activities; they are incapable of devising a long-term strategy.

Adapt to changing conditions

General AI devices will be adaptable to new situations. Narrow AI can only respond to factors that have been encoded into algorithms. General AI is capable of making judgments on the fly.

Reasoning ability

In contrast to narrow AI, general AI will be able to reason and understand. General AI robots will be able to assess a situation and decide on a course of action even if it falls outside of what a human has taught them.

Put together a puzzle

AI systems have certainly participated in and won video games and chess tournaments. Those accomplishments are examples of AI following patterns and programming.

There are certain intriguing challenges that do not yet have a definite path to a solution. When machines can solve this type of challenge, broad AI will have been achieved.

Show common sense

Common sense is another extremely human characteristic. When a machine cannot rely on programming for solutions, common sense must be used. Weak AI is devoid of common sense. Machines will have to display common sense in order to compete with human cognitive capabilities. 

Consciousness

For universal AI to be realised, machines must be sentient and self-aware. 

Go beyond mathematical equations

Narrow AI demonstrates in a variety of ways that many of the problems we solve as humans are just arithmetic. Machines will attain human-level intelligence when they can tackle broad problems rather than just algebraic calculations.

Deep Learning Applications

Deep learning has many uses in today's fast-paced world, where customer demands keep escalating. Visual recognition, fraud detection, identifying developmental delays in children, personalization, adding sound to silent movies, and many other applications demonstrate how deep learning enhances both user experience and security.

Self-Driving Cars

Deep learning has enabled self-driving cars to become a reality. To construct a model, a system is fed several sets of instructions. It is trained on diverse datasets and evaluated for correct outcomes.

Fraud Detection Using Deep Learning

Deep learning aids in the development of classifiers that can distinguish between fake and genuine news. It can also alert you to violations of your privacy.

Virtual Assistants

Virtual assistants such as Google Assistant, Siri, and Alexa have demonstrated how deep learning can be used to build powerful systems for human speech recognition.

Strong AI Terms

Several terms are associated with strong AI, and it is important to define them correctly to avoid misunderstanding.

Artificial Intelligence (AI): AI refers to equipping computers with human-like capabilities such as comprehension, analysis, and other cognitive processes.

Machine learning: A field in which a software application learns from data and adapts to new information without being explicitly reprogrammed by a human. It helps keep the algorithms incorporated into computer systems up to date with changing trends.

Deep Learning: Deep learning is an AI technique that mimics the working of the human brain in adapting, analysing, and forming the patterns used in decision-making. It is a type of machine learning that, among other applications, aids in the identification and prevention of fraud and tax evasion.

Evolution of Strong AI

AI, when employed correctly, has the potential to improve people's lives. As a result, regulated growth of AI is essential to building a technological environment that is safe. AI will continue to open up new dimensions, and its consequences will continue to unfold in the future.

Deep learning and machine learning can do wonders for technical enterprises, but they can also lead to a reduction in physical labour, which can be detrimental in densely populated nations. As a result, AI must be developed and applied in accordance with the needs of the population.

Strong AI vs. Weak AI: What's the Difference?

Strong AI can do anything a human can do, while weak AI is limited to a specific task

Overall Findings

Strong artificial intelligence can do anything a human can, while weak AI is limited to a specific task. Here's everything you need to know about strong AI vs. weak AI, including how they relate and differ, as well as the advantages and limitations of each.

Weak AI:

  • Performs one specific task.
  • Programmed for a certain purpose.
  • Learns how to perform tasks faster.
  • No self-awareness.

Strong AI:

  • Performs any task a human can.
  • Learns how to perform brand-new skills.
  • Uses creativity to solve problems.
  • Potentially sentient.

All AI uses machine learning to constantly improve as it takes in new information. The major difference between weak and strong AI is that weak AI is programmed to perform a single task. The task could be very complex, like driving a car, or as simple as recommending movies to watch. All real-world examples of AI fall under the category of weak AI.

Although AI chatbots like ChatGPT and Bing AI are very advanced, they are still considered examples of weak AI because they perform only one job (responding to written text prompts). Virtual assistants like Alexa also fall under the umbrella of weak AI since they only respond to voice commands.

Strong AI, also called artificial general intelligence (AGI), possesses the full range of human capabilities, including talking, reasoning, and emoting. So far, strong AI examples exist only in sci-fi movies like A.I.: Artificial Intelligence, WALL-E, and 2001: A Space Odyssey.

Weak AI pros:

  • Faster and more efficient than humans.
  • Capable of reasoning in limited situations.
  • Can improve human life in many ways.

Weak AI cons:

  • Can't learn new skills on its own.
  • Requires human oversight.
  • Could replace many human jobs.

Weak AI may be capable of human-level reasoning to some extent, such as considering an ethical problem, but it doesn't possess the full range of human intellect. Nonetheless, weak AI can perform specific tasks faster and more accurately than humans.

Weak AI has many applications, including fraud detection, financial planning, transportation, image enhancement, medicine, and scientific research. Robots use weak AI to recognize and manipulate objects, while services like Netflix use weak AI to recommend movies based on your tastes. Gmail and other email providers use AI to detect and filter spam.
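
To ground the spam filtering example, here is a hedged scikit-learn sketch of a tiny naive Bayes spam classifier. The training messages are made up for illustration, and real email filters are far more elaborate; the point is simply that a narrow, single-task model like this is what weak AI looks like in practice.

```python
# Tiny single-task ("weak AI") example: a naive Bayes spam classifier.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented toy training data (1 = spam, 0 = not spam).
messages = [
    "win a free prize now", "limited offer, claim your reward",
    "meeting moved to 3pm", "lunch tomorrow?",
    "cheap loans, act fast", "project report attached",
]
labels = [1, 1, 0, 0, 1, 0]

# Bag-of-words features feeding a multinomial naive Bayes model.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(messages, labels)

print(model.predict(["claim your free reward now", "see you at the meeting"]))
# Likely output: [1 0], i.e. spam, then not spam.
```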

Because weak AI can't learn new skills independently, it can't continuously adapt to change, so human oversight is always needed to some degree. For example, if there were a sudden change to traffic laws, self-driving cars wouldn't know about it unless a human updated the AI's algorithm.

There is understandable anxiety about weak AI taking jobs from humans, leading to increased unemployment and economic uncertainty. There's also concern about bias in AI and governments using AI for surveillance.

Strong AI pros:

  • Can perform almost any task better than a human.
  • Capable of reasoning at a human level or even higher.

Strong AI cons:

  • New technologies are hard to predict.
  • Limits of strong AI are a subject of strong debate.

Whereas weak AI is constrained in the type of tasks it can perform, strong AI can learn new skills to solve any problem. In addition to doing the job it was designed for, strong AI could theoretically develop its own goals, just like a human.

A real-life example that pushes the boundaries between weak and strong AI is a program called MuZero, which can master video games that it hasn't been taught how to play. MuZero is technically weak AI since it's limited to playing video games, yet it can identify and pursue new goals without human intervention, a feature of strong AI.

Presumably, strong AI could identify human emotions and motivations, but whether AI can experience and process emotions as humans do is unclear. For now, that remains a debate for philosophers and futurists.

Strong AI could have game-changing effects in security, healthcare, and robotics. On the other hand, AI researchers like Dr. Geoffrey Hinton have warned that strong AI could develop goals and behaviors that are harmful to humans.

Where Is the Line Between Strong and Weak AI?

The standards for what constitutes artificial intelligence have shifted as computers have advanced, and the line between weak and strong AI will continue to blur. Weak AI is easily identified by its limitations, but strong AI remains theoretical, since it would have few (if any) limitations.

Narrow AI is another term for weak AI. It describes systems that can only do a single, specialized task.

AI art samples images from all over the internet (often without the creators' permission) to create new pictures based on text prompts. Because they do only one thing, converting text prompts to images, AI art generators are an example of weak AI.

Rethinking Weak vs. Strong AI

Artificial intelligence has a broad range of ways in which it can be applied: from chatbots to predictive analytics, from recognition systems to autonomous vehicles, and many other patterns. However, there is also the big overarching goal of AI: to make a machine intelligent enough that it can handle any general cognitive task in any setting, just like our own human brains. The general AI ecosystem classifies these AI efforts into two major buckets: weak (narrow) AI that is focused on one particular problem or task domain, and strong (general) AI that focuses on building intelligence that can handle any task or problem in any domain. From the perspective of researchers, the more an AI system approaches the abilities of a human, with all the intelligence, emotion, and broad applicability of knowledge of humans, the “stronger” that AI is. On the other hand, the more narrow in scope and specific to a particular application an AI system is, the weaker it is in comparison. But do these terms mean anything? And does it matter whether we have strong or weak AI systems?

Defining strong AI

In order to understand what these terms actually mean, let’s look more closely at their definitions. The term “strong” AI can alternatively be understood as broad or general AI. Artificial general intelligence (AGI) is focused on creating intelligent machines that can successfully perform any intellectual task that a human being can. This comes down to three aspects: (1) the ability to generalize knowledge from one domain to another, taking knowledge from one area and applying it somewhere else; (2) the ability to make plans for the future based on knowledge and experiences; and (3) the ability to adapt to the environment as changes occur. Additionally, there are ancillary aspects that come with these main requirements, such as the ability to reason, solve puzzles, represent knowledge and common sense, and plan.

Some have argued that the above definition of strong AI is not enough for a system to be classified as truly intelligent, because just being able to perform tasks and communicate like a human is not really strong AI. Bolstering this definition of strong AI is the idea of systems in which humans are unable to distinguish between a human and a machine, much like a physical version of a Turing test. The Turing Test aims to test intelligence by putting a human, a machine, and an interrogator in a conversational setting. If the interrogator can’t distinguish between the human and the machine, then the machine passes the Turing Test. Nowadays, some very advanced chatbots (and even the recent Google Duplex demo) seem to pass the Turing Test. Is Google Duplex truly intelligent?

The second test, which builds on the Turing Test, is called the Chinese Room and was created by John Searle. It assumes that a machine has been built that passes the Turing Test and convinces a human Chinese speaker that the program is itself a live Chinese speaker. The question Searle wants to answer is this: does the machine literally "understand" Chinese? Or is it merely simulating the ability to understand Chinese? In the test, a human sits in a closed-off room with a book of instructions written in English. Chinese characters are passed through a slot, the human in the room reads the instructions in English, and provides output in Chinese characters. Searle believes that there is no difference between the roles of the computer and the person in the experiment, because each follows a program with step-by-step instructions and produces behavior that is deemed intelligent.

However, Searle does not deem this intelligent, because the person still doesn’t understand Chinese even if their output is considered intelligent. Searle argues that by this logic, the computer also doesn’t understand Chinese. Without "understanding", he says, you can’t say that the machine is "thinking", and in order to think you must have a "mind". From Searle’s perspective, a Strong AI system must have understanding; otherwise it’s just a less intelligent simulation. There are others who say that even this doesn’t go far enough in the definition of strong AI. Rather, philosophers and researchers say that strong AI is defined as the ability to experience consciousness. So where is the line of general AI, and is that goal ever achievable?

Defining weak AI

Now that we have defined strong, or general, AI, how do we define weak AI? Well, by definition, narrow or weak AI is anything that isn’t strong or general AI. But this definition isn’t really helpful, because we haven’t yet been able to successfully build strong AI by any of the definitions above. So does that mean everything we’ve built so far is weak? The short answer is yes.

However, weak AI isn’t a particularly useful term, because “weak” implies that these AI systems aren’t powerful or able to perform useful tasks, which isn’t the case. Rather than the pejorative term “weak” AI, it is much preferable to use the term “narrow” AI. Narrow AI is exemplified by technologies such as image and speech recognition, AI-powered chatbots, or even self-driving cars. We’re slowly creeping our way up the ladder of intelligence, and as technology continues to advance, our definitions and expectations of “smart” systems do as well. A few decades ago OCR was considered AI, but today many people no longer define OCR as AI. The meaning will change over time and continues to evolve, and doesn’t really give us any measurable specificity as to how intelligent a system is, since there is disagreement as to how strong a system should be.

Given all of the above, is there any value to having such a stark contrast between narrow and general AI? After all, if narrow and general AI are just relative terms, it may be better to define the intelligence of systems in terms of a spectrum of maturity of how intelligent it is against the sort of tasks or range of tasks to be done. At one end of the spectrum we have AI at its most narrow application, to a single task and barely above what you could do with straight-forward programming and rules. At the other end, the AI is so mature that we’ve created a new kind of sentient being. But between the two we have many degrees of intelligence and applicability. We should probably move away from narrow vs. general AI terminology and adopt a more graduated approach to intelligence. After all, if everything we’re doing now is narrow AI, and general AI might be a long time coming, if ever, then having these all-or-nothing terms has very limited value.

It’s easy to get lost in the philosophy, but we should keep in mind how AI maturity is changing and how that maturity can be applied to meet new needs. Given the ambiguity of the above terms, we should instead look at the capabilities of what these AI systems can do and map them across the spectrum, while keeping in mind the ever-advancing boundaries of AI technologies.

Kathleen Walch

  • Open access
  • Published: 17 June 2020

Why general artificial intelligence will not be realized

  • Ragnar Fjelland

Humanities and Social Sciences Communications, volume 7, Article number: 10 (2020)

Subjects: Science, technology and society

The modern project of creating human-like artificial intelligence (AI) started after World War II, when it was discovered that electronic computers are not just number-crunching machines, but can also manipulate symbols. It is possible to pursue this goal without assuming that machine intelligence is identical to human intelligence. This is known as weak AI. However, many AI researchers have pursued the aim of developing artificial intelligence that is in principle identical to human intelligence, called strong AI. Weak AI is less ambitious than strong AI, and therefore less controversial. However, there are important controversies related to weak AI as well. This paper focuses on the distinction between artificial general intelligence (AGI) and artificial narrow intelligence (ANI). Although AGI may be classified as weak AI, it is close to strong AI because one chief characteristic of human intelligence is its generality. Although AGI is less ambitious than strong AI, there were critics almost from the very beginning. One of the leading critics was the philosopher Hubert Dreyfus, who argued that computers, which have no body, no childhood and no cultural practice, could not acquire intelligence at all. One of Dreyfus’ main arguments was that human knowledge is partly tacit, and therefore cannot be articulated and incorporated in a computer program. However, today one might argue that new approaches to artificial intelligence research have made his arguments obsolete. Deep learning and Big Data are among the latest approaches, and advocates argue that they will be able to realize AGI. A closer look reveals that although development of artificial intelligence for specific purposes (ANI) has been impressive, we have not come much closer to developing artificial general intelligence (AGI). The article further argues that this is in principle impossible, and it revives Hubert Dreyfus’ argument that computers are not in the world.

Introduction

The idea of machines that can perform tasks that require intelligence goes at least back to Descartes and Leibniz. However, the project made a major step forward when in the early 1950s it was recognized that electronic computers are not only number-crunching devices, but may be made to manipulate symbols. This was the birth of artificial intelligence (AI) research. It is possible to pursue this goal without assuming that machine intelligence is identical to human intelligence. For example, one of the pioneers in the field, Marvin Minsky, defined AI as: “… the science of making machines do things that would require intelligence if done by men” (quoted from Bolter, 1986, p. 193). This is sometimes called weak AI. However, many AI researchers have pursued the aim of developing AI that is in principle identical to human intelligence, called strong AI. This entails that “…the appropriately programmed computer is a mind, in the sense that computers can be literally said to understand and have other cognitive states” (Searle, 1980, p. 417).

In this paper, I shall use a different terminology, which is better adapted to the issues that I discuss. Because human intelligence is general, human-like AI is therefore often called artificial general intelligence (AGI). Although AGI possesses an essential property of human intelligence, it may still be regarded as weak AI. It is nevertheless different from traditional weak AI, which is restricted to specific tasks or areas. Traditional weak AI is therefore sometimes called artificial narrow intelligence (ANI) (Shane, 2019, p. 41). Although I will sometimes refer to strong AI, the basic distinction in this article is between AGI and ANI. It is important to keep the two apart. Advances in ANI are not advances in AGI.

In 1976 Joseph Weizenbaum, at that time professor of informatics at MIT and the creator of the famous program Eliza, published the book Computer Power and Human Reason (Weizenbaum, 1976). As the title indicates, he made a distinction between computer power and human reason. Computer power is, in today’s terminology, the ability to use algorithms at a tremendous speed, which is ANI. Computer power will never develop into human reason, because the two are fundamentally different. “Human reason” would comprise Aristotle’s prudence and wisdom. Prudence is the ability to make right decisions in concrete situations, and wisdom is the ability to see the whole. These abilities are not algorithmic, and therefore, computer power cannot—and should not—replace human reason. The mathematician Roger Penrose a few years later wrote two major books where he showed that human thinking is basically not algorithmic (Penrose, 1989, 1994).

However, my arguments will be slightly different from Weizenbaum’s and Penrose’s. I shall pursue a line of arguments that was originally presented by the philosopher Hubert Dreyfus. He got into AI research more or less by accident. He had done work related to the two philosophers Martin Heidegger and Ludwig Wittgenstein. These philosophers represented a break with mainstream Western philosophy, as they emphasized the importance of the human body and practical activity as primary compared to the world of science. For example, Heidegger argued that we can only have a concept of a hammer or a chair because we belong to a culture where we grow up and are able to handle these objects. Dreyfus therefore thought that computers, which have no body, no childhood and no cultural practice, could not acquire intelligence at all (Dreyfus and Dreyfus, 1986, p. 5).

One of the important places for AI research in the 1950s and 1960s was Rand Corporation. Strangely enough, they engaged Dreyfus as a consultant in 1964. The next year he submitted a critical report titled: “Alchemy and Artificial Intelligence”. However, the leaders of the AI project at Rand argued that the report was nonsense, and should not be published. When it was finally released, it became the most demanded report in the history of Rand Corporation. Dreyfus later expanded the report to the book What Computers Can’t Do (Dreyfus, 1972). In the book he argued that an important part of human knowledge is tacit. Therefore, it cannot be articulated and implemented in a computer program.

Although Dreyfus was fiercely attacked by some AI researchers, he no doubt pointed to a serious problem. But during the 1980s another paradigm became dominant in AI research. It was based on the idea of neural networks. Instead of taking the manipulation of symbols as its model, it took the processes in our nervous system and brain as its model. A neural network can learn without receiving explicit instructions. Thus it looked as if Dreyfus’ arguments about what computers cannot do were obsolete.

The latest offspring is Big Data. Big Data is the application of mathematical methods to huge amounts of data to find correlations and infer probabilities (Najafabadi et al., 2015). Big Data poses an interesting challenge: I mentioned previously that AGI is not part of strong AI. However, although Big Data does not represent the ambition of developing strong AI, advocates argued that this is not necessary. We do not have to develop computers with human-like intelligence. On the contrary, we may change our thinking to be like the computers. Implicitly this is the message of Viktor Mayer-Schönberger and Kenneth Cukier’s book: Big Data: A Revolution That Will Transform How We Live, Work, and Think (Mayer-Schönberger and Cukier, 2014). The book is optimistic about what Big Data can accomplish and its positive effects on our personal lives and society as a whole.

Some even argue that the traditional scientific method of using hypotheses, causal models, and tests is obsolete. Causality is an important part of human thinking, particularly in science, but according to this view we do not need causality. Correlations are enough. For example, based on criminal data we can infer where crimes will occur, and use it to allocate police resources. We may even be able to predict crimes before they are committed, and thus prevent them.

If we look at some of the literature on AI research it looks as if there are no limits to what the research can accomplish within a few decades. One example is Mayer-Schönberger and Cukier’s book that I referred to above. Here is one quotation:

In the future—and sooner than we may think – many aspects of our world will be augmented or replaced by computer systems that today are the sole purview of human judgment (Mayer-Schönberger and Cukier, 2014, p. 12).

An example that supports this view is the Obama Administration, which in 2012 announced a “Big Data Research and Development Initiative” to “help solve some of the Nation’s most pressing challenges” (quoted from Chen and Lin, 2014, p. 521).

However, when one looks at what has actually been accomplished compared to what is promised, the discrepancy is striking. I shall later give some examples. One explanation for this discrepancy may be that profit is the main driving force, and, therefore, many of the promises should be regarded as marketing. However, although commercial interests no doubt play a part, I think that this explanation is insufficient. I will add two factors: First, one of the few dissidents in Silicon Valley, Jaron Lanier, has argued that the belief in scientific immortality, the development of computers with super-intelligence, etc., are expressions of a new religion, “expressed through an engineering culture” (Lanier, 2013, p. 186). Second, when it is argued that computers are able to duplicate a human activity, it often turns out that the claim presupposes an account of that activity that is seriously simplified and distorted. To put it simply: The overestimation of technology is closely connected with the underestimation of humans.

I shall start with Dreyfus’ main argument that AGI cannot be realized. Then I shall give a short account of the development of AI research after his book was published. Some spectacular breakthroughs have been used to support the claim that AGI is realizable within the next few decades, but I will show that very little has been achieved in the realization of AGI. I will then argue that it is not just a question of time, as if what has not been realized sooner will simply be realized later. On the contrary, I argue that the goal cannot in principle be realized, and that the project is a dead end. In the second part of the paper I restrict myself to arguing that causal knowledge is an important part of human-like intelligence, and that computers cannot handle causality because they cannot intervene in the world. More generally, AGI cannot be realized because computers are not in the world. As long as computers do not grow up, belong to a culture, and act in the world, they will never acquire human-like intelligence.

Finally, I will argue that the belief that AGI can be realized is harmful. If the power of technology is overestimated and human skills are underestimated, the result will in many cases be that we replace something that works well with something that is inferior.

Tacit knowledge

Dreyfus placed AI into a philosophical tradition going back to Plato. Plato’s theory of knowledge was constructed on the ideal of mathematics, in particular geometry. Geometry is not about material bodies, but ideal bodies. We can only acquire real knowledge, episteme, by turning the attention away from the material world and directing it “upwards”, to the world of ideal objects. Plato even criticized the geometers for not understanding their own trade, because they thought they were “… doing something and their reasoning had a practical end, and the subject were not, in fact, pursued for the sake of knowledge” (Plato, 1955, p. 517). Skills are merely opinion, doxa, and are relegated to the bottom of his knowledge hierarchy.

According to this view, a minimum requirement for something to be regarded as knowledge is that it can be formulated explicitly. Western philosophy has by and large followed Plato and only accepted propositional knowledge as real knowledge. An exception is what Dreyfus called the “anti-philosophers” Merleau-Ponty, Heidegger, and Wittgenstein. He also referred to the scientist and philosopher Michael Polanyi. In his book Personal Knowledge, Polanyi introduced the expression tacit knowledge. Most of the knowledge we apply in everyday life is tacit. In fact, we do not know which rules we apply when we perform a task. Polanyi used swimming and bicycle riding as examples. Very few swimmers know that what keeps them afloat is how they regulate their respiration: When they breathe out, they do not empty their lungs, and when they breathe in, they inflate their lungs more than normal.

Something similar applies to bicycle riding. The bicycle rider keeps his balance by turning the handlebar of the bicycle. To avoid falling to the left, he moves the handlebar to the left, and to avoid falling to the right he turns the handlebar to the right. Thus he keeps his balance by moving along a series of small curvatures. According to Polanyi a simple analysis shows that for a given angle of unbalance, the curvature of each winding is inversely proportional to the square of the speed of the bicycle. But the bicycle rider does not know this, and it would not help him become a better bicycle rider (Polanyi, 1958, p. 50). Later Polanyi formulated this insight as “… we can know more than we can tell” (Polanyi, 2009, p. 4, italics in original).
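
For readers who want the relation Polanyi alludes to in symbols, here is a compact restatement (the symbol names are my own labels, not Polanyi's): with α the angle of unbalance, v the speed of the bicycle, and κ the curvature of each corrective winding,

```latex
\kappa \propto \frac{\alpha}{v^{2}}
```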

However, the important thing in Polanyi’s contribution is that he argued that skills are a precondition for articulate knowledge in general, and scientific knowledge in particular. For example, to carry out physical experiments requires a high degree of skills. These skills cannot just be learned from textbooks. They are acquired by instruction from someone who knows the trade.

Similarly, Hubert Dreyfus, in cooperation with his brother Stuart, developed a model for acquisition of skills. At the lowest level the performer follows explicit rules. The highest level, expert performance, is similar to Polanyi’s account of scientific practice. An important part of expertise is tacit. The problem facing the development of expert systems, that is, systems that enable a computer to simulate expert performance (for example medical diagnostics), is that an important part of the expert knowledge is tacit. If experts try to articulate the knowledge they apply in their performance, they normally regress to a lower level. Therefore, according to Hubert and Stuart Dreyfus, expert systems are not able to capture the skills of an expert performer (Dreyfus and Dreyfus, 1986, p. 36). We know this phenomenon from everyday life. Most of us are experts on walking. However, if we try to articulate how we walk, we certainly give a description that does not capture the skills involved in walking.

Three “milestones” in AI research

However, after Hubert Dreyfus published What Computers Can’t Do , AI has made tremendous progress. I will mention three “milestones” that have received public attention and contributed to the impression that AGI is just “around the corner”.

The first “milestone” is IBM’s chess-playing computer Deep Blue, which is often regarded as a breakthrough because it defeated the world chess champion, Garry Kasparov, in 1997. However, Deep Blue was an example of ANI; it was made for a specific purpose. Although it did extremely well in an activity that requires intelligence when performed by humans, no one would claim that Deep Blue had acquired general intelligence.

The second is IBM’s computer Watson. It was developed with the explicit goal of joining the quiz show Jeopardy!. This is a competition where the participants are given the answers, and are then supposed to find the right questions. They may for example be presented with the answer: “This ‘Father of Our Country’ didn’t really chop down a cherry tree”. The correct question the participants are supposed to find is: “Who was George Washington?”

Jeopardy! requires a much larger repertoire of knowledge and skills than chess. The tasks cover a variety of areas, such as science, history, culture, geography, and sports, and may contain analogies and puns. There are three participants, competing to answer first. If you answer incorrectly, the value is deducted from your score and another participant gets the opportunity to answer. Therefore, the competition requires knowledge and speed, but also the ability to limit oneself. The program has enjoyed tremendous popularity in the United States since it began in 1964, and is viewed by an average of seven million people (Brynjolfsson and McAfee, 2014, p. 24).

Watson communicates using natural language. When it participated in Jeopardy! it was not connected to the Internet, but had access to 200 million pages of information (Susskind and Susskind, 2015, p. 165; Ford, 2015, p. 98ff). In 2011 it beat the two best participants in Jeopardy!, Ken Jennings and Brad Rutter. Jennings had won 74 times in a row in 2004, and had received over $3 million in total. Rutter had beaten Jennings in 2005, and he too had won over $3 million. In the 2-day competition, Watson won more than three times as much as each of its human competitors.

Although Watson was constructed to participate in Jeopardy!, IBM had further plans. Shortly after Watson had won Jeopardy!, the company announced that it would apply the power of the computer to medicine: it should become an AI medical super-doctor and revolutionize medicine. The basic idea was that if Watson had access to all medical literature (patients’ health records, textbooks, journal articles, lists of drugs, etc.) it should be able to offer a better diagnosis and treatment than any human doctor. In the following years IBM engaged in several projects, but the success has been rather limited. Some projects have been closed down, and some have failed spectacularly. It has been much more difficult than originally assumed to construct an AI doctor. Instead of super-doctors, IBM’s Watson Health has turned out AI assistants that can perform routine tasks (Strickland, 2019).

The third “milestone” is Alphabet’s AlphaGo. Go is a board game invented more than 2000 years ago in China. The complexity of the game is regarded as even greater than that of chess, and it is played by millions of people, in particular in East Asia. In 2016, AlphaGo defeated the world champion Lee Sedol in five highly publicized matches in Seoul, South Korea. The event was documented in the award-winning film AlphaGo (2017, directed by Greg Kohs).

AlphaGo is regarded as a milestone in AI research because it was an example of the application of a strategy called deep reinforcement learning. This is reflected in the name of the company, DeepMind. (After a restructuring of Google, both Google and DeepMind are subsidiaries of Alphabet.) It is an example of an approach to AI research that is based on the paradigm of artificial neural networks. An artificial neural network is modeled on biological neural networks. Our brain contains approximately one hundred billion neurons. Each neuron is connected to approximately 1000 neurons via synapses. This gives around a hundred trillion connections in the brain. An artificial neural network consists of artificial neurons, which are much simpler than natural neurons. However, it has been demonstrated that when many neurons are connected in a network, a large enough network can in theory carry out any computation. What is practically possible is of course a different question (Minsky, 1972, p. 55; Tegmark, 2017, p. 74).

Neural networks are particularly good at pattern recognition. For example, to teach a neural network to identify a cat in a picture we do not have to program the criteria we use to identify a cat. Humans have normally no problems distinguishing between, say, cats and dogs. To some degree we can explain the differences, but very few, probably no one, will be able to give a complete list of all criteria used. It is for the most part tacit knowledge, learned by examples and counter-examples. The same applies to neural networks.

A deep learning neural network consists of different layers of artificial neurons. For example, a network may have four different layers. In analyzing a picture the first layer may identify pixels as light and dark. The second layer may identify edges and simple shapes. The third layer may identify more complex shapes and objects, and the fourth layer may learn which shapes can be used to identify an object (Jones, 2014, p. 148).
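
As an illustrative sketch of the four-layer picture described above, here is a minimal PyTorch convolutional network whose comments map each stage onto the pixel, edge, shape, and object levels. The mapping is schematic, since real networks are not this cleanly interpretable, and the layer sizes are arbitrary.

```python
# Schematic four-stage deep network for image analysis (sizes are arbitrary).
import torch
import torch.nn as nn

model = nn.Sequential(
    # Stage 1: operates directly on raw pixel intensities (light vs. dark).
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    # Stage 2: combines pixels into edges and simple shapes.
    nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    # Stage 3: combines edges into more complex shapes and object parts.
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    # Stage 4: maps the learned shapes onto object classes.
    nn.Flatten(), nn.Linear(32 * 7 * 7, 10),
)

scores = model(torch.randn(1, 1, 28, 28))  # one 28x28 grayscale image
print(scores.shape)  # torch.Size([1, 10]): one score per object class
```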

The advantage is that one must not formulate explicitly the criteria used, for example, to identify a face. This is the crucial difference between the chess program Deep Blue and AlphaGo. Although a human chess player uses a mixture of calculation and intuition to evaluate a particular board position, Deep Blue was programmed to evaluate numerous possible board positions, and decide the best possible in a given situation. Go is different. In many cases expert players relied on intuition only, and were only able to describe a board position as having “good shape” (Nielsen, 2016). I have mentioned earlier that one of Hubert Dreyfus’ main arguments against AGI was that human expertise is partly tacit, and cannot be articulated. AlphaGo showed that computers can handle tacit knowledge, and it therefore looks as if Dreyfus’ argument is obsolete. However, I will later show that this “tacit knowledge” is restricted to the idealized “world of science”, which is fundamentally different from the human world that Dreyfus had in mind.

The advantage of not having to formulate explicit rules comes at a price, though. In a traditional computer program all the parameters are explicit, which guarantees full transparency. In a neural network this transparency is lost: one often does not know which parameters are used. Some years ago a team at the University of Washington developed a system that was trained to distinguish between huskies and wolves. This is a task that requires considerable skill, because there is not much difference between them. In spite of this, the system had an astonishing 90% accuracy. However, the team discovered that the system recognized wolves because there was snow in most of the wolf pictures. The team had, in effect, invented a snow detector! (Dingli, 2018).
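
The same phenomenon can be reproduced with a toy sketch (made-up data, not the University of Washington system): a simple classifier is trained on examples in which a spurious “snow” feature almost perfectly predicts the label, and it ends up relying on that feature rather than on the weak “animal” features.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy illustration of the husky/wolf problem with synthetic data.
# The label is generated so that a spurious "snow in background" feature
# agrees with it 95% of the time, while the genuine features carry only
# a weak signal.

rng = np.random.default_rng(0)
n = 1000
is_wolf = rng.integers(0, 2, n)                                  # ground-truth label
snow = np.where(rng.random(n) < 0.95, is_wolf, 1 - is_wolf)      # spurious feature
animal_features = is_wolf[:, None] * 0.2 + rng.normal(size=(n, 3))  # weak real signal

X = np.column_stack([snow, animal_features])
model = LogisticRegression(max_iter=1000).fit(X, is_wolf)

print("learned weights:", model.coef_.round(2))
# The weight on the "snow" column dwarfs the others: the classifier has,
# in effect, become a snow detector.
```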

AlphaGo was developed by researchers at DeepMind, and is regarded as a big success. DeepMind’s approach was also applied successfully to the Atari games Breakout and Space Invaders, and to the computer game StarCraft. However, it has turned out that the system lacks flexibility and is not able to adapt to changes in the environment; it has even turned out to be vulnerable to tiny changes. Because real-world problems take place in a changing world, deep reinforcement learning has so far found few commercial applications. Research and development is costly, but DeepMind’s losses of 154 million dollars in 2016, 341 million in 2017, and 572 million in 2018 are hardly a sign of success (Marcus, 2019).

The latest hype: Big Data

A challenge with neural networks is that they must be trained on huge amounts of data. For example, AlphaGo was first trained on 150,000 games played by competent Go players, and was then improved by repeatedly playing against earlier versions of itself.

Computers’ increasing ability to process and store huge amounts of data has led to what is called the “data explosion”, or even “data deluge”. Already in 2012 it was estimated that Google processed around 24 petabytes (24 × 10^15 bytes) of data every day. This is thousands of times the amount of printed material in the US Library of Congress (Mayer-Schönberger and Cukier, 2014, p. 8). At the same time it was estimated that 2.5 exabytes (2.5 × 10^18 bytes) were created in the world per day. This is estimated to be approximately half of all the words ever spoken by humans. This amount of data is beyond human imagination, and it is the background for the Big Data approach.

Although Big Data analysis may be regarded as a supplementary method for analyzing very large amounts of data, typically terabytes and petabytes, it is sometimes presented as a new epistemological approach. Viktor Mayer-Schönberger and Kenneth Cukier start their book Big Data with the example of a flu virus that was discovered in 2009. It combined elements from viruses that caused bird flu and swine flu, and was given the name H1N1. It spread quickly, and within a week public health agencies around the world feared a pandemic. Some even feared a pandemic of the same size as the 1918 Spanish flu, which killed millions. There was no vaccine against the virus, and the only thing the health authorities could do was to try to slow it down. But to be able to do that, they had to know where it had already spread. Although doctors were requested to report new cases, this information would take 1–2 weeks to reach the authorities, primarily because most patients do not consult a doctor immediately after the symptoms of the disease appear.

However, researchers at Google had, just before this outbreak, developed a method that could predict the spread of the flu much better. Google receives more than three billion search queries every day, and saves them all. People who have symptoms of flu tend to search the internet for information about flu. Therefore, by looking at search terms that are highly correlated with flu, the researchers could map the spread of flu much quicker than the health authorities (Mayer-Schönberger and Cukier, 2014, p. 2).
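
The general idea can be sketched as follows (all numbers and search terms below are invented for illustration; the actual system aggregated millions of queries and used far more sophisticated statistics): find search terms whose weekly volumes correlate strongly with official flu counts, and use them as an early indicator.

```python
import numpy as np

# Toy sketch: rank search terms by how strongly their weekly volumes
# correlate with official flu counts. All data are synthetic.

weeks = 20
rng = np.random.default_rng(1)
flu_cases = 100 + 50 * np.sin(np.linspace(0, 3, weeks)) + rng.normal(0, 5, weeks)

search_volumes = {
    "flu symptoms": flu_cases * 8 + rng.normal(0, 40, weeks),   # tracks the flu
    "fever remedy":  flu_cases * 5 + rng.normal(0, 60, weeks),  # tracks the flu
    "cheap flights": rng.normal(500, 50, weeks),                # unrelated
}

correlations = {term: np.corrcoef(volume, flu_cases)[0, 1]
                for term, volume in search_volumes.items()}

for term, r in sorted(correlations.items(), key=lambda kv: -kv[1]):
    print(f"{term:15s} r = {r:+.2f}")
# Highly correlated terms could then be used to estimate flu activity in
# near real time, ahead of the 1-2 week reporting delay.
```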

Mayer-Schönberger and Cukier regard this as a success story. But it may be an example of what is sometimes called “the fallacy of initial success”. In 2013 the model reported twice as many doctor visits for influenza-like illnesses as the Centers for Disease Control and Prevention, which is regarded as a reliable source of information. The initial version of the model had probably included seasonal data that were correlated with the flu but causally unrelated to it. Therefore, the model was part flu detector and part winter detector. Although the model has been updated, its performance has been far below the initial promises (Lazer et al., 2014; Shane, 2019, p. 171).

Correlations and causes

The previous examples involved only correlations. However, in the sciences, and also in everyday life, we want to know causal relations. For example, one of the big questions of our time is a causal question: Is the global warming that we observe caused by human activity (the release of greenhouse gases into the atmosphere), or is it just natural variation?

The nature of causal relationships has been discussed for centuries, in particular after David Hume criticized the old idea of a necessary relationship between cause and effect. According to Hume we have to be satisfied with the observation of regularities. His contemporary Immanuel Kant, on the contrary, argued that causal relationships are a prerequisite for the acquisition of knowledge: it is necessary that every event has a cause.

However, instead of going into the philosophical discussion about causal relationships, which has continued to this day, it is more fruitful to see how we identify a causal relationship. The philosopher John Stuart Mill formulated some rules (he called them “canons”) that enable us to identify causal relationships. His “second canon”, which he also called “the method of difference”, is the following:

If an instance in which the phenomenon under investigation occurs, and an instance in which it does not occur, have every circumstance in common save one, that one occurring only in the former; the circumstance in which alone the two instances differ, is the effect, or the cause, or an indispensable part of the cause, of the phenomenon (Mill, 1882 , p. 483).

From this quotation we see that the distinguishing mark of a causal relationship is a 100% correlation between cause and effect. But most correlations are not causal. For example, there is a high positive correlation between gasoline prices and my age, but there is obviously no causal relationship between the two. A correlation may therefore be an indication of a causal link, but it need not be.

Therefore, in the quotation above, Mill requires that the two cases be equal in all circumstances. But still we can only decide that the difference between the two is either the cause or the effect, because correlation is a symmetrical mathematical relationship: If A is correlated with B, B is correlated with A. In contrast, if C is the cause of E, E is not the cause of C. Therefore, correlations cannot distinguish between cause and effect. To make this distinction we need something more: The cause produces, or at least brings about, the effect. Therefore, we may remove the assumed cause, and see if the effect disappears.

We have a famous example of this procedure from the history of medicine (more specifically, epidemiology). Around 1850 there was a cholera epidemic in London. John Snow, a practicing physician, noted that there was a connection between which company people got their water from and the frequency of cholera. The company Southwark and Vauxhall, which had its water intake at a polluted site in the Thames, had a high frequency of cholera cases. Another company, the Lambeth Company, had significantly lower numbers. Although this was before the germ theory of disease, Snow assumed that the cause of the disease was to be found in the water.

After Snow had sealed a water pump that he believed contained infectious water, the cholera epidemic ended (Sagan, 1996 , p. 76).

If the effect always follows the cause, everything else being equal, we have deterministic causality. However, many people smoke cigarettes without contracting cancer. The problem is that in practice some uncertainty is involved. Therefore, we need a definition of a causal relationship for cases where the correlation between cause and effect is less than 100%. According to this definition a probabilistic cause is not always followed by the effect, but the frequency of the effect is higher when the cause is present than when it is not. This can be written as P(E|C) > P(E|not-C), where P(E|C) is a conditional probability, read as “the probability of E, given C”.
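
A small worked example with made-up counts (they are purely illustrative, not real epidemiological figures) shows how the criterion is applied, with E standing for lung cancer and C for smoking:

```python
# Toy 2x2 table: estimate P(E|C) and P(E|not-C) from counts.
smokers, smokers_with_cancer = 1000, 150
non_smokers, non_smokers_with_cancer = 1000, 20

p_e_given_c = smokers_with_cancer / smokers              # P(E|C)     = 0.15
p_e_given_not_c = non_smokers_with_cancer / non_smokers  # P(E|not-C) = 0.02

print(p_e_given_c, p_e_given_not_c, p_e_given_c > p_e_given_not_c)
# The cause is not always followed by the effect (most smokers in this toy
# table do not get cancer), but the effect is more frequent when the cause
# is present, which is what probabilistic causality requires.
```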

However, although this looks straightforward, it is not. An example will show this. After World War II there were many indications that cigarette smoking might cause lung cancer. It might seem that this question could be decided in a straightforward way: One selects two groups of people that are similar in all relevant aspects. One group starts smoking cigarettes and the other does not. This is a simple randomized clinical trial. Then one checks after 10 years, 20 years, 30 years, and so on, to see whether there is a difference in the frequency of lung cancer between the two groups.

Of course, if cigarette smoking is as dangerous as alleged, one cannot wait decades to find out. Therefore, one had to use the population at hand, and use correlations: One took a sample of people with lung cancer and another sample of the population that did not have cancer, and looked at different background factors: Is there a higher frequency of cigarette smokers among the people who have contracted lung cancer than among those who have not? The main criterion is “ceteris paribus”: everything else being equal.

It is one thing to acknowledge that we sometimes have to use correlations to find causal relations. It is quite another to argue that we do not need causes at all. Nevertheless, some argue that we can do without causal relationships. In 2008 the chief editor of Wired Magazine, Chris Anderson, wrote an article with the title “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete”. In the article he argued that correlations are sufficient: we can use huge amounts of data and let statistical algorithms find patterns that science cannot. He went even further, and argued that the traditional scientific method of using hypotheses, causal models, and tests is becoming obsolete (Anderson, 2008).

According to Mayer-Schönberger and Cukier, Anderson’s article unleashed a furious debate, “… even though Anderson quickly backpedaled away from his bolder claims” (Mayer-Schönberger and Cukier, 2014 , p. 71). But even if Anderson modified his original claims, Mayer-Schönberger and Cukier agree that in most cases we can do without knowing causal relations: “Big Data is about what, not why. We don’t always need to know the cause of a phenomenon; rather, we can let data speak for itself” (Mayer-Schönberger and Cukier, 2014 , p. 14). Later they formulate it in this way: “Causality won’t be discarded, but it is being knocked off its pedestal as the primary fountain of meaning. Big data turbocharges non-causal analyses, often replacing causal investigations” (Mayer-Schönberger and Cukier, 2014 , p. 68). Pearl and Mackenzie put it this way: “The hope—and at present, it is usually a silent one—is that the data themselves will guide us to the right answers whenever causal questions come up” (Pearl and Mackenzie, 2018 , p. 16). I have to add that Pearl and Mackenzie are critical of this view.

The mini-Turing test

Anderson was not the first to argue that science can do without causes. At the end of the 19th century one of the pioneers of modern statistics, Karl Pearson, argued that causes have no place in science (Pearl and Mackenzie, 2018, p. 67), and at the beginning of the 20th century one of the most influential philosophers of that century, Bertrand Russell, wrote the article “On the Notion of Cause”, in which he called “the law of causality” a “relic of a bygone age” (Russell, 1963, p. 132). For example, when bodies move under the mutual attraction of gravity, nothing can be called a cause and nothing an effect, according to Russell. There is “merely a formula” (Russell, 1963, p. 141). He might have added that Newton’s mechanics had been reformulated by Joseph-Louis Lagrange and William Rowan Hamilton into an abstract theory without the concept of force.

However, Russell looked for causality in the wrong place. He simply took Newton’s theory for granted, and forgot that Newton himself subscribed to what in his time was called “experimental philosophy”. Physics is no doubt an experimental science, and to carry out experiments the physicist must be able to move around, to handle instruments, to read scales, and to communicate with other physicists. As the physicist Roger Newton has pointed out, a physicist “…effectively conducts experiments by jiggling one part of Nature and watching how other parts respond” (Newton, 1997, p. 142). To find out if A causes B, it is important for “A to be under our control” (Newton, 1997, p. 144, italics in original).

I have already quoted Pearl and Mackenzie’s book The Book of Why (2018). The main argument of the book is that to create humanlike intelligence in a computer, the computer must be able to master causality. They ask the question:

How can machines (and people) represent causal knowledge in a way that would enable them to access the necessary information swiftly, answer questions correctly, and do it with ease, as a three-year-old child can? (Pearl and Mackenzie, 2018 , p. 37).

They call this the “mini-Turing test”. It has the prefix “mini” because it is not a full Turing test, but is confined to causal relations.

Before I go into the mini-Turing test, I will briefly recall the Turing test. In the article “Computing Machinery and Intelligence” (Turing, 1950), Alan Turing asked the question: How can we determine whether computers have acquired general intelligence? He starts by saying that the question he tries to answer is: “Can machines think?”, but instead of going into the question of what intelligence is, he sets up a kind of game. In the game a questioner can communicate with a computer and a human being. The questioner has to communicate through a keyboard, and therefore does not know which is the computer and which is the human. The point is that the machine pretends to be human, and it is the job of the questioner to decide which of the two is the computer and which is the human. If the questioner is unable to distinguish them, we can say that the computer is intelligent. Turing called this the “imitation game”, but it has later become known as the “Turing test”. If the computer passes the test, it has, according to Turing, acquired general intelligence.

According to Pearl and Mackenzie a minimum requirement to pass the Turing test is that the computer is able to handle causal questions. From an evolutionary perspective this makes sense. Why Homo sapiens has been so successful in the history of evolution is of course a complex question. Many factors have been involved, and the ability to cooperate is probably one of the most important. However, a decisive step took place between 70,000 and 30,000 years ago, what the historian Yuval Harari calls the Cognitive Revolution (Harari, 2014 , p. 23). According to Harari the distinguishing mark of the Cognitive Revolution is the ability to imagine something that does not exist. Harari’s example is the ivory figurine “the lion man” (or “the lioness woman”) that was found in the Stadel Cave in Germany, and is approximately 32,000 years old. It consists of a human body and the head of a lion.

Pearl and Mackenzie refer to Harari, and add that the creation of the lion man is the precursor of philosophy, scientific discovery, and technological innovation. The fundamental precondition for this creation is the ability to ask and answer questions of the form: “What happens if I do ……?” (Pearl and Mackenzie, 2018 , p. 2).

The mini-Turing test is restricted to causal relationships. If computers can handle causal knowledge, they will pass this test. However, the problem is that in this regard computers have not made any progress for decades: “Just as they did 30 years ago, machine-learning programs (including those with deep neural networks) operate almost entirely in an associative mode…” (Pearl and Mackenzie, 2018 , p. 30). But this is insufficient. To answer causal questions we must be able to intervene in the world.

According to Pearl and Mackenzie the root of the problem is that computers do not have a model of reality. However, nobody has a complete model of reality; any model can only depict simplified aspects of it. The real problem is that computers are not in the world, because they are not embodied.

The real Turing test

Pearl and Mackenzie are right in arguing that computers cannot pass the mini-Turing test because they cannot answer causal questions. I shall argue that they cannot pass the full Turing test either, because they are not in the world and therefore have no understanding. A few examples from Turing test competitions will show this.

There is an annual competition in which an award (the Loebner Prize) is given to the AI program most similar to a human being. The competition is a version of the Turing test, but the gold and silver medals have never been awarded. The program Mitsuku won in 2013, 2016, 2017, 2018, and 2019.

The philosopher Luciano Floridi recounts how he joined the judging panel when the competition was first held in England in 2008. One of the judges started with the question: “If we take each other’s hand, whose hand am I holding then?” The computer replied by talking about something completely different: “We live in eternity. So, yeah, no. We don’t believe.” Another question was: “The four capital cities of England are three, Manchester and Liverpool. What’s wrong with this sentence?” The computer had no meaningful answer (Floridi, 2014 , p. 133).

The conversation that Floridi refers to took place more than 10 years ago. I was curious to see whether the program had improved. I have tried it out several times, and it has not improved much. Below is an example. When Mitsuku in 2008 was asked, “If we take each other’s hand, whose hand am I holding then?”, the computer failed. This time it came up with the right answer. However, it did not take long before it failed. In the printout below, “Millie” can be either the Mitsuku computer or a human. At the end of the conversation one is supposed to guess whether one has talked to a computer or to a human being. It was not difficult to determine that I had talked to a computer.

Here is the “conversation”:

Mitsuku chatbot Welcome Ragnar, now talking. * Topic is ’Turing Test’ * Set by admin (1 day ago)
<Ragnar> If I shake your hand, whose hand am I holding?
<Millie> I will be holding yours adn you will be holding mine. What’s with the dumb questions?
<Ragnar> Why is the question dumb?
<Millie> I want to know everything about you.
<Ragnar> But you did not explain why you regarded the question as dumb.
<Millie> Sorry, I must have forgot. What was your question again?
<Ragnar> If we shake hands, whose hand am I holding?
<Millie> We would be holding each other’s hand.
<Ragnar> And why is that a dumb question?
<Millie> Why do I exist? ( MitsukuHomepage . I have kept Mitsuku’s misprinting)

Computers fail because they are not in the world. Mitsuku characterized the first question as dumb, but could not explain why. Any child would be able to do that.

However, the competition rules of the Loebner Prize have been changed. The aim of getting computers to pass the Turing test has been given up, because “… chatbots are typically so quickly unmasked that the prize was always one for ’best of’” ( LoebnerPrize ).

Conclusion: computers are not in the world

The main thesis of this paper is that we will not be able to realize AGI because computers are not in the world. However, it is crucial that we clarify what is meant by “world”.

As the historian of science Alexandre Koyré has pointed out, the most important achievement of the scientific revolution of the 17th century was the replacement of Aristotelian science by an abstract scientific ideal (“paradigm”) (Koyré 1978 , pp. 38–39). Koyré argued convincingly that Galileo was basically a Platonist (Koyré, 1968 ). As in the case of Plato, the key was mathematics. According to Galileo the book of nature is written in the language of mathematics (Galilei, 1970 , p. 237). Therefore, Galileo’s world is an abstract and idealized world, close to Plato’s world of ideas.

The system that comes closest to this ideal world is our solar system, what Isaac Newton called “the system of the world”. Newton’s mechanics became the model for all science. The best expression of this ideal was given by the French mathematician Pierre Simon de Laplace. He argued that there is in principle no difference between a planet and a molecule. If we had complete knowledge of the state of the universe at one time, we could in principle determine the state at any previous and successive time (Laplace, 1951 , p. 6). This means that the universe as a whole can be described by an algorithm. Turing referred to this passage from Laplace in his article “Computing Machinery and Intelligence”, and added that the predictions he (Turing) was considering, were nearer to practicability than the predictions considered by Laplace, which comprised the universe as a whole (Turing, 1950 , p. 440).

As Russell pointed out, in this world we cannot even speak about causes, only mathematical functions. Because most empirical sciences are causal, they are far from this ideal world. The sciences that come closest are classical mechanics and theoretical physics.

Although this ideal world is a metaphysical idea that has not been realized anywhere, it has had a tremendous historical impact. Most philosophers and scientists after Galileo and Descartes have taken it to be the real world, which implies that everything that happens, “at the bottom” is governed by mathematical laws, algorithms. This applies to the organic world as well. According to Descartes all organisms, including the human body, are automata. Today we would call them robots or computers. Descartes made an exception for the human soul, which is not a part of the material world, and therefore is not governed by laws of nature. The immaterial soul accounts for man’s free will.

However, most advocates of AGI (and advocates of strong AI) will today exclude Descartes’ immaterial soul, and follow the arguments of Yuval Harari. In his latest book 21 Lessons for the 21st Century he refers to neuroscience and behavioral economics, which have allegedly shown that our decisions are not the result of “some mysterious free will”, but the result of “millions of neurons calculating probabilities within a split second” (Harari, 2018 , p. 20). Therefore, AI can do many things better than humans. He gives as examples driving a vehicle in a street full of pedestrians, lending money to strangers, and negotiating business deals. These jobs require the ability “to correctly assess the emotions and desires of other people.” The justification is this:

Yet if these emotions and desires are in fact no more than biochemical algorithms, there is no reason why computers cannot decipher these algorithms—and do so far better than any Homo sapiens (Harari, 2018 , p. 21).

This quotation echoes the words used by Francis Crick. In The Astonishing Hypothesis he explains the title of the book in the following way:

The Astonishing Hypothesis is that “You”, your joys and your sorrows, your memories and your ambitions, your sense of personal identity and free will, are in fact no more than the behavior of a vast assembly of nerve cells and their associated molecules (Crick, 1994 , p. 3).

However, there is a problem with both of these quotations. If Harari and Crick are right, then the quotations themselves are “no more than” the products of biochemical algorithms and the behavior of a vast assembly of nerve cells. How can they then be true?

If we disregard the problem of self-reference, and take the ideal world of science that I have described above to be the (only) real world, then Harari’s argument makes sense. But the replacement of our everyday world by the world of science is based on a fundamental misunderstanding. Edmund Husserl was one of the first who pointed this out, and attributed this misunderstanding to Galileo. According to Husserl, Galileo was “…at once a discoverer and a concealing genius” (Husserl, 1970 , p. 52). Husserl called this misunderstanding “objectivism”. Today a more common name is “scientism”.

Contrary to this, Husserl insisted that the sciences are fundamentally a human endeavor. Even the most abstract theories are grounded in our everyday world, Husserl’s “lifeworld”. Husserl mentions Einstein’s theory of relativity, and argues that it is dependent on “Michelson’s experiments and the corroborations of them by other researchers” (Husserl, 1970, p. 125). To carry out this kind of experiment, the scientists must be able to move around, to handle instruments, to read scales, and to communicate with other scientists.

There is a much more credible account of how we are able to understand other people than the one given by Harari. As Hubert Dreyfus pointed out, we are bodily and social beings, living in a material and social world. To understand another person is not to look into the chemistry of that person’s brain, not even into that person’s “soul”, but is rather to be in that person’s “shoes”. It is to understand the person’s lifeworld.

The American author Theodore Roszak has constructed a thought experiment to illustrate this point. Let us imagine that we are watching a psychiatrist at work. He is a hard-working and skilled psychiatrist and obviously has a very good practice. The waiting room is full of patients with a variety of emotional and mental disorders. Some are almost hysterical, some have strong suicidal thoughts, some suffer from hallucinations, some have the cruelest nightmares, and some are driven to madness by the thought that they are being watched by people who will hurt them. The psychiatrist listens attentively to each patient and does his best to help them, but without much success. On the contrary, they all seem to be getting worse, despite the psychiatrist’s heroic efforts.

Now Roszak asks us to put this into a larger context. The psychiatrist’s office is in a building, and the building is in a place. This place is Buchenwald and the patients are prisoners in the concentration camp (Roszak, 1992 , p. 221). Biochemical algorithms would not help us to understand the patients. What does help, in fact, what is imperative, is to know the larger context . The example simply does not make sense if we do not know that the psychiatrist’s office is in a concentration camp.

Only a few of us are able to put ourselves in the shoes of a prisoner of a concentration camp. Therefore, we cannot fully understand people in situations that are very different from what we have ourselves experienced. But to some degree we can understand, and we can understand because we are also in the world.

Computers are not in our world. I said earlier that neural networks need not be explicitly programmed with criteria, and can therefore handle tacit knowledge. However, it is simply not true, as some of the advocates of Big Data argue, that the data “speak for themselves”. Normally, the data used are related to one or more models, they are selected by humans, and in the end they consist of numbers.

If we think, like Harari, that the world is “at the bottom” governed by algorithms, then we will have a tendency to overestimate the power of AI and underestimate human accomplishments. Expressions like “nothing but” and “no more than” in the quotations from Harari and Crick may lead to a serious oversimplification in the description of human and social phenomena. I think this is at least part of the explanation of the failure of both IBM Watson Health and Alphabet’s DeepMind: “IBM has encountered a fundamental mismatch between the way machines learn and the way doctors work” (Strickland, 2019), and DeepMind has discovered that “what works for Go may not work for the challenging problems that DeepMind aspires to solve with AI, like cancer and clean energy” (Marcus, 2019).

The overestimation of the power of AI may also have detrimental effects on science. In their frequently quoted book The Second Machine Age, Erik Brynjolfsson and Andrew McAfee argue that digitization can help us to understand the past. They refer to a project that analyzed more than five million books published in English since 1800. Some of the results from the project were that “the number of words in English has increased by more than 70% between 1950 and 2000, that fame now comes to people more quickly than in the past but also fades faster, and that in the 20th century interest in evolution was declining until Watson and Crick discovered the structure of DNA.” This allegedly leads to “better understanding and prediction—in other words, of better science—via digitization” (Brynjolfsson and McAfee, 2014, p. 69). In my opinion it is rather an illustration of Karl Popper’s insight: “Too many dollars may chase too few ideas” (Popper, 1981, p. 96).

My conclusion is very simple: Hubert Dreyfus’ arguments against general AI are still valid.

Notes

Polanyi normally uses “knowing” instead of “knowledge” to emphasize the personal dimension. However, I will use the more traditional “knowledge”.

The example is taken from the Wikipedia article on Jeopardy! ( Wikipedia: Jeopardy ).

I have given a detailed description of Michelson’s instruments in Fjelland ( 1991 ).

References

Anderson C (2008) The end of theory: the data deluge makes the scientific method obsolete. Wired Magazine

Bolter D (1986) Turing’s man. Western culture in the computer age. Penguin Books

Brynjolfsson E, McAfee A (2014) The second machine age. Norton & Company

Chen X-W, Lin X (2014) Big Data deep learning: challenges and perspectives. IEEE Access 2:514–525

Crick F (1994). The astonishing hypothesis. The scientific search for the soul. Macmillan Publishing Company

Dingli A (2018) “Its Magic….I Owe You No Explanation!” https://becominghuman.ai/its-magic-i-owe-you-no-explanation-explainableai-43e798273a08 . Accessed 5 May 2020

Dreyfus HL (1972) What computers can’t do. Harper & Row, New York, NY

Dreyfus HL, Dreyfus SE (1986). Mind over machine. Basil Blackwell

Fjelland R (1991) The Theory-Ladenness of observations, the role of scientific instruments, and the Kantian a priori. Int Stud Philos Sci 5(3):269–80

Floridi L (2014) The 4th revolution. How the infosphere is reshaping human reality. Oxford University Press, Oxford

Ford M (2015) The rise of the robots. technology and the threat of mass unemployment. Oneworld Publications, London

Galilei G (1970) Dialogue concerning the two chief world systems (1630). University of California Press, Berkeley, Los Angeles

Harari YN (2014). Sapiens. A brief history of humankind. Vintage

Harari YN (2018) 21 Lessons for the 21st century. Jonathan Cape, London

Husserl E (1970) The crisis of european sciences and transcendental phenomenology. Northwestern University Press, Evanston

Jones N (2014) The learning machines. Nature 505:146–48

Koyré A (1968) Galileo and Plato. In: Metaphysics and measurement (1943). John Hopkins Press, Baltimore, London.

Koyré A (1978) Galileo studies (1939). Harvester, London

Lanier J (2013) Who owns the future? Allen Lane, London

Laplace PS (1951) Philosophical essay on probabilities (1814). Dover Publications, New York

Lazer D, Kennedy R, King G, Vespignani A (2014) The parable of Google Flu: traps in big data analysis. Science 343:1203–1205

LoebnerPrize. https://artistdetective.wordpress.com/2019/09/21/loebner-prize-2019 . Accessed 5 May 2020

Marcus G (2019) DeepMind’s losses and the future of artificial intelligence. Wired 14.8.2019. https://www.wired.com/story/deepminds-losses-future-artificial-intelligence/ . Accessed 6 Jan 2020

Mayer-Schönberger V, Cukier K (2014) Big data: a revolution that will transform how we live, work, and think. Eamon Dolan/Mariner Books

Mill JS (1882) A system of logic, ratiocinative and inductive, being a connected view of the principles of evidence, and the methods of scientific investigation, 8th edn. Harper & Brothers, New York

Minsky M (1972) Computation: finite and infinite machines. Prentice-Hall International

MitsukuHomepage. http://www.square-bear.co.uk/mitsuku/home.htm . Accessed 12 Sept 2017

Najafabadi MM, Villanustre F, Khoshgoftaar TM, Seliya N, Wald R, Muharemagic E (2015) Deep learning applications and challenges in big data analytics. J Big Data 2(1). https://doi.org/10.1186/s40537-014-0007-7

Newton RG (1997) The truth of science. Physical theories and reality. Harvard University Press, Cambridge

Nielsen M (2016) Is AlphaGo really such a big deal? Quanta Magazine, 29 March 2016. https://www.quantamagazine.org/is-alphago-really-such-a-big-deal-20160329/ . Accessed 7 Jan 2020

Pearl J, Mackenzie D (2018) The book of why. The new science of cause and effect. Basic Books, New York

Penrose R (1989) The emperor’s new mind. Concerning computers, minds, and the laws of physics. Oxford University Press, Oxford

Penrose R (1994) Shadows of the mind. A search for the missing science of consciousness. Oxford University Press, Oxford

Plato (1955) The republic. Penguin Books, Harmondsworth

Polanyi M (1958) Personal knowledge. Routledge & Kegan Paul, London

Polanyi M (2009) The tacit dimension (1966). The University of Chicago Press, Chicago

Popper KR (1981) The rationality of scientific revolutions (1975). In: Hacking Ian ed Scientific revolutions. Oxford University Press, Oxford, pp. 80–106

Roszak T (1992) The voice of the earth. Simon & Shuster, New York

Russell B (1963). On the notion of a cause (1912). In: Mysticism and logic. Unwin Books, London

Sagan L (1996) Electric and magnetic fields: invisible risks? Gordon and Breach Publishers, Amsterdam

Searle J (1980) Minds, brains, and programs. Behav Brain Sci 3(3):417–57

Shane J (2019) You look like a thing and I love you. Wildfire, London

Strickland E (2019) How IBM Watson overpromised and underdelivered on AI health care. IEEE Spectrum, 2 April 2019. https://sprectrum.ieee.org/biomedical/diagnostics/how-ibm-watson-overpromised-and-underdelivered-on-ai-health-care . Accessed 5 Jan 2020

Susskind R, Susskind D (2015) The future of the professions. Oxford University Press, Oxford

Tegmark M (2017) Life 3.0. Being human in the age of artificial intelligence. Alfred A. Knopf, New York

Turing A (1950) Computing machinery and intelligence. Mind LIX 236:433–60

Weizenbaum J (1976) Computer power and human reason. Freeman & Company, San Francisco

Wikipedia: Jeopardy!, https://en.wikipedia.org/wiki/Jeopardy! Accessed 2 Feb 2017

Acknowledgements

I want to thank the participants of the workshop Ethics of Quantification , University of Bergen 5.12.2012, and Adam Standring and Rune Vabø, for useful comments.

Author information

Authors and affiliations.

Centre for the Study of the Sciences and the Humanities, University of Bergen, Bergen, Norway

Ragnar Fjelland

Corresponding author

Correspondence to Ragnar Fjelland .

Ethics declarations

Competing interests.

The author declares no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Fjelland, R. Why general artificial intelligence will not be realized. Humanit Soc Sci Commun 7 , 10 (2020). https://doi.org/10.1057/s41599-020-0494-4

Received : 19 January 2020

Accepted : 07 May 2020

Published : 17 June 2020

DOI : https://doi.org/10.1057/s41599-020-0494-4

The Future of AI: Toward Truly Intelligent Artificial Intelligences

This article contains some reflections about artificial intelligence (AI). First, the distinction between strong and weak AI and the related concepts of general and specific AI is made, making it clear that all existing manifestations of AI are weak and specific. The main existing models are briefly described, insisting on the importance of corporeality as a key aspect of achieving AI of a general nature. Also discussed is the need to provide common-sense knowledge to the machines in order to move toward the ambitious goal of building general AI. The paper also looks at recent trends in AI based on the analysis of large amounts of data that have made it possible to achieve spectacular progress very recently, also mentioning the current difficulties of this approach to AI. The final part of the article discusses other issues that are and will continue to be vital in AI and closes with a brief reflection on the risks of AI.

The final goal of artificial intelligence (AI)—that a machine can have a type of  general  intelligence similar to a human’s—is one of the most ambitious ever proposed by science. In terms of difficulty, it is comparable to other great scientific goals, such as explaining the origin of life or the Universe, or discovering the structure of matter. In recent centuries, this interest in building intelligent machines has led to the invention of models or metaphors of the human brain. In the seventeenth century, for example, Descartes wondered whether a complex mechanical system of gears, pulleys, and tubes could possibly emulate thought. Two centuries later, the metaphor had become telephone systems, as it seemed possible that their connections could be likened to a neural network. Today, the dominant model is computational and is based on the digital computer. Therefore, that is the model we will address in the present article.

The Physical Symbol System Hypothesis: Weak AI Versus Strong AI

In a lecture that coincided with their reception of the prestigious Turing Award in 1975, Allen Newell and Herbert Simon (Newell and Simon, 1976) formulated the “Physical Symbol System” hypothesis, according to which “a physical symbol system has the necessary and sufficient means for general intelligent action.” In that sense, given that human beings are able to display intelligent behavior in a general way, we, too, would be physical symbol systems. Let us clarify what Newell and Simon mean when they refer to a Physical Symbol System (PSS). A PSS consists of a set of entities called symbols that, through relations, can be combined to form larger structures—just as atoms combine to form molecules—and can be transformed by applying a set of processes. Those processes can create new symbols, create or modify relations among symbols, store symbols, detect whether two are the same or different, and so on. These symbols are physical in the sense that they have an underlying physical-electronic layer (in the case of computers) or a physical-biological one (in the case of human beings). In fact, in the case of computers, symbols are established through digital electronic circuits, whereas humans do so with neural networks. So, according to the PSS hypothesis, the nature of the underlying layer (electronic circuits or neural networks) is unimportant as long as it allows symbols to be processed. Keep in mind that this is a hypothesis, and should, therefore, be neither accepted nor rejected a priori. Either way, its validity or refutation must be verified according to the scientific method, with experimental testing. AI is precisely the scientific field dedicated to attempts to verify this hypothesis in the context of digital computers, that is, verifying whether a properly programmed computer is capable of general intelligent behavior.

Specifying that this must be general intelligence rather than specific intelligence is important, as human intelligence is also general. It is quite a different matter to exhibit specific intelligence. For example, computer programs capable of playing chess at Grand-Master levels are incapable of playing checkers, which is actually a much simpler game. In order for the same computer to play checkers, a different, independent program must be designed and executed. In other words, the computer cannot draw on its capacity to play chess as a means of adapting to the game of checkers. This is not the case, however, with humans, as any human chess player can take advantage of his knowledge of that game to play checkers perfectly in a matter of minutes. The design and application of artificial intelligences that can only behave intelligently in a very specific setting is related to what is known as  weak AI , as opposed to  strong AI . Newell, Simon, and the other founding fathers of AI refer to the latter. Strictly speaking, the PSS hypothesis was formulated in 1975, but, in fact, it was implicit in the thinking of AI pioneers in the 1950s and even in Alan Turing’s groundbreaking texts (Turing, 1948, 1950) on intelligent machines.

This distinction between weak and strong AI was first introduced by philosopher John Searle in an article criticizing AI in 1980 (Searle, 1980), which provoked considerable discussion at the time, and still does today. Strong AI would imply that a properly designed computer does not simulate a mind but  actually is one , and should, therefore, be capable of an intelligence equal, or even superior to human beings. In his article, Searle sought to demonstrate that strong AI is impossible, and, at this point, we should clarify that general AI is not the same as strong AI. Obviously they are connected, but only in one sense: all strong AI will necessarily be general, but there can be general AIs capable of multitasking but not strong in the sense that, while they can emulate the capacity to exhibit general intelligence similar to humans, they do not experience states of mind.

According to Searle, weak AI would involve constructing programs to carry out specific tasks, obviously without need for states of mind. Computers’ capacity to carry out specific tasks, sometimes even better than humans, has been amply demonstrated. In certain areas, weak AI has become so advanced that it far outstrips human skill. Examples include solving logical formulas with many variables, playing chess or Go, medical diagnosis, and many others relating to decision-making. Weak AI is also associated with the formulation and testing of hypotheses about aspects of the mind (for example, the capacity for deductive reasoning, inductive learning, and so on) through the construction of programs that carry out those functions, even when they do so using processes totally unlike those of the human brain. As of today, absolutely all advances in the field of AI are manifestations of weak and specific AI.

The Principal Artificial Intelligence Models: Symbolic, Connectionist, Evolutionary, and Corporeal

The  symbolic  model that has dominated AI is rooted in the PSS model and, while it continues to be very important, is now considered classic (it is also known as GOFAI, that is,  Good Old-Fashioned AI ). This top-down model is based on logical reasoning and heuristic searching as the pillars of problem solving. It does not call for an intelligent system to be part of a body, or to be situated in a real setting. In other words, symbolic AI works with abstract representations of the real world that are modeled with representational languages based primarily on mathematical logic and its extensions. That is why the first intelligent systems mainly solved problems that did not require direct interaction with the environment, such as demonstrating simple mathematical theorems or playing chess—in fact, chess programs need neither visual perception for seeing the board, nor technology to actually move the pieces. That does not mean that symbolic AI cannot be used, for example, to program the reasoning module of a physical robot situated in a real environment, but, during its first years, AI’s pioneers had neither languages for representing knowledge nor programming that could do so efficiently. That is why the early intelligent systems were limited to solving problems that did not require direct interaction with the real world. Symbolic AI is still used today to demonstrate theorems and to play chess, but it is also a part of applications that require perceiving the environment and acting upon it, for example learning and decision-making in autonomous robots.

At the same time that symbolic AI was being developed, a biologically based approach called connectionist AI arose. Connectionist systems are not incompatible with the PSS hypothesis but, unlike symbolic AI, they are modeled from the bottom up, as their underlying hypothesis is that intelligence emerges from the distributed activity of a large number of interconnected units whose models closely resemble the electrical activity of biological neurons. In 1943, McCulloch and Pitts (1943) proposed a simplified model of the neuron based on the idea that it is essentially a logic unit. This model is a mathematical abstraction with inputs (dendrites) and outputs (axons). The output value is calculated according to the result of a weighted sum of the inputs in such a way that if that sum surpasses a preestablished threshold, it functions as a “1,” otherwise it will be considered a “0.” Connecting the output of each neuron to the inputs of other neurons creates an artificial neural network. Based on what was then known about the reinforcement of synapses among biological neurons, scientists found that these artificial neural networks could be trained to learn functions that related inputs to outputs by adjusting the weights used to determine connections between neurons. These models were hence considered more conducive to learning, cognition, and memory than those based on symbolic AI. Nonetheless, like their symbolic counterparts, intelligent systems based on connectionism do not need to be part of a body, or situated in real surroundings. In that sense, they have the same limitations as symbolic systems. Moreover, real neurons have complex dendritic branching with truly significant electrical and chemical properties. They can contain ionic conductances that produce nonlinear effects. They can receive tens of thousands of synapses with varied positions, polarities, and magnitudes. Furthermore, most brain cells are not neurons, but rather glial cells that not only regulate neural functions but also possess electrical potentials, generate calcium waves, and communicate with others. This would seem to indicate that they play a very important role in cognitive processes, but no existing connectionist models include glial cells, so they are, at best, extremely incomplete and, at worst, erroneous. In short, the enormous complexity of the brain is very far indeed from current models. And that very complexity also raises the idea of what has come to be known as singularity, that is, future artificial superintelligences based on replicas of the brain but capable, in the coming twenty-five years, of far surpassing human intelligence. Such predictions have little scientific merit.
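
The basic unit just described can be sketched in a few lines of code (an illustration only; the weights, threshold, and function name are arbitrary choices for the example, not taken from McCulloch and Pitts):

```python
# A minimal sketch of a McCulloch-Pitts-style threshold unit.
# Inputs and output are binary; the unit fires (outputs 1) when the
# weighted sum of its inputs reaches a preset threshold.

def mcculloch_pitts_unit(inputs, weights, threshold):
    """Return 1 if the weighted sum of binary inputs reaches the threshold."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum >= threshold else 0

# With weights (1, 1) and threshold 2, the unit behaves like a logical AND:
for a in (0, 1):
    for b in (0, 1):
        print(a, b, mcculloch_pitts_unit((a, b), (1, 1), threshold=2))
```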

Another biologically inspired but non-corporeal model that is also compatible with the PSS hypothesis is evolutionary computation (Holland, 1975). Biology’s success at evolving complex organisms led some researchers in the early 1960s to consider the possibility of imitating evolution. Specifically, they wanted computer programs that could evolve, automatically improving the solutions to the problems for which they had been programmed. The idea was that, thanks to mutation operators and the crossing of “chromosomes” modeled by those programs, successive generations of modified programs would offer better solutions than the previous ones. Since we can define AI’s goal as the search for programs capable of producing intelligent behavior, researchers thought that evolutionary programming might be used to find those programs among all possible programs. The reality is much more complex, and this approach has many limitations, although it has produced excellent results in the resolution of optimization problems.
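
A minimal sketch of this idea (a toy example with an invented fitness function, not a description of any particular evolutionary computation system) looks like this:

```python
import random

# Toy genetic algorithm: candidate solutions ("chromosomes") are bit strings,
# and new generations are produced by selection, crossover, and mutation.
# The fitness function (number of 1-bits) is chosen only for illustration.

def fitness(chromosome):
    return sum(chromosome)

def crossover(parent_a, parent_b):
    point = random.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]

def mutate(chromosome, rate=0.01):
    return [1 - gene if random.random() < rate else gene for gene in chromosome]

def evolve(pop_size=50, length=30, generations=100):
    population = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half as parents, then refill the population with offspring.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

best = evolve()
print(fitness(best), "of", len(best))
```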

One of the strongest critiques of these non-corporeal models is based on the idea that an intelligent agent needs a body in order to have direct experiences of its surroundings (we would say that the agent is “situated” in its surroundings) rather than working from a programmer’s abstract descriptions of those surroundings, codified in a language for representing that knowledge. Without a body, those abstract representations have no semantic content for the machine, whereas direct interaction with its surroundings allows the agent to relate signals perceived by its sensors to symbolic representations generated on the basis of what has been perceived. Some AI experts, particularly Rodney Brooks (1991), went so far as to affirm that it was not even necessary to generate those internal representations, that is, that an agent does not even need an internal representation of the world around it because the world itself is the best possible model of itself, and most intelligent behavior does not require reasoning, as it emerged directly from interaction between the agent and its surroundings. This idea generated considerable argument, and some years later, Brooks himself admitted that there are many situations in which an agent requires an internal representation of the world in order to make rational decisions.

In 1965, philosopher Hubert Dreyfus affirmed that AI’s ultimate objective—strong AI of a general kind—was as unattainable as the seventeenth-century alchemists’ goal of transforming lead into gold (Dreyfus, 1965). Dreyfus argued that the brain processes information in a global and continuous manner, while a computer uses a finite and discrete set of deterministic operations, that is, it applies rules to a finite body of data. In that sense, his argument resembles Searle’s, but in later articles and books (Dreyfus, 1992), Dreyfus argued that the body plays a crucial role in intelligence. He was thus one of the first to advocate the need for intelligence to be part of a body that would allow it to interact with the world. The main idea is that living beings’ intelligence derives from their situation in surroundings with which they can interact through their bodies. In fact, this need for corporeality is based on Heidegger’s phenomenology and its emphasis on the importance of the body, its needs, desires, pleasures, suffering, ways of moving and acting, and so on. According to Dreyfus, AI must model all of those aspects if it is to reach its ultimate objective of strong AI. So Dreyfus does not completely rule out the possibility of strong AI, but he does state that it is not possible with the classic methods of symbolic, non-corporeal AI. In other words, he considers the Physical Symbol System hypothesis incorrect. This is undoubtedly an interesting idea and today it is shared by many AI researchers. As a result, the corporeal approach with internal representation has been gaining ground in AI and many now consider it essential for advancing toward general intelligences. In fact, we base much of our intelligence on our sensory and motor capacities. That is, the body shapes intelligence and, therefore, without a body general intelligence cannot exist. This is so because the body, as hardware, especially the mechanisms of the sensory and motor systems, determines the type of interactions that an agent can carry out. At the same time, those interactions shape the agent’s cognitive abilities, leading to what is known as situated cognition. In other words, as occurs with human beings, the machine is situated in real surroundings so that it can have interactive experiences that will eventually allow it to carry out something similar to what is proposed in Piaget’s cognitive development theory (Inhelder and Piaget, 1958): a human being follows a process of mental maturation in stages, and the different steps in this process may possibly work as a guide for designing intelligent machines. These ideas have led to a new sub-area of AI called developmental robotics (Weng et al., 2001).

Specialized AI’s Successes

All of AI’s research efforts have focused on constructing specialized artificial intelligences, and the results have been spectacular, especially over the last decade. This is thanks to the combination of two elements: the availability of huge amounts of data, and access to high-level computation for analyzing it. In fact, the success of systems such as AlphaGo (Silver et al., 2016), Watson (Ferrucci et al., 2013), and advances in autonomous vehicles or image-based medical diagnosis have been possible thanks to this capacity to analyze huge amounts of data and efficiently detect patterns. On the other hand, we have hardly advanced at all in the quest for general AI. In fact, we can affirm that current AI systems are examples of what Daniel Dennett called “competence without comprehension” (Dennett, 2018).


Perhaps the most important lesson we have learned over the last sixty years of AI is that what seemed most difficult (diagnosing illnesses, playing chess or Go at the highest level) has turned out to be relatively easy, while what seemed easiest has turned out to be the most difficult of all. The explanation of this apparent contradiction may be found in the difficulty of equipping machines with the knowledge that constitutes “common sense.” Without that knowledge, among other limitations, it is impossible to obtain a deep understanding of language or a profound interpretation of what a visual perception system captures. Common-sense knowledge is the result of our lived experiences. Examples include: “water always flows downward;” “to drag an object tied to a string, you have to pull on the string, not push it;” “a glass can be stored in a cupboard, but a cupboard cannot be stored in a glass;” and so on. Humans easily handle millions of such common-sense facts, which allow us to understand the world we inhabit. A possible line of research that might generate interesting results about the acquisition of common-sense knowledge is the development robotics mentioned above. Another interesting area explores the mathematical modeling and learning of cause-and-effect relations, that is, the learning of causal, and thus asymmetrical, models of the world. Current systems based on deep learning are capable of learning symmetrical mathematical functions but are unable to learn asymmetrical relations. They are, therefore, unable to distinguish causes from effects, such as the idea that the rising sun causes the rooster to crow, but not vice versa (Pearl and Mackenzie, 2018; Lake et al., 2017).
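To make the causal-asymmetry point concrete, here is a minimal sketch (the sunrise/rooster model, noise level, and variable names are invented for illustration and are not taken from the text above): the observational correlation between the two variables is symmetric, while simulated interventions are not.

```python
# Toy sketch (invented model): observational correlation is symmetric, but
# simulated interventions reveal the causal direction sun -> rooster.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

def simulate(force_sun=None, force_crow=None):
    """Structural model: the rooster crows when the sun rises, with 10% noise."""
    sun = rng.integers(0, 2, n) if force_sun is None else np.full(n, force_sun)
    noise = rng.random(n) < 0.1
    crow = np.where(noise, 1 - sun, sun) if force_crow is None else np.full(n, force_crow)
    return sun, crow

# Observation: the correlation is identical in both directions.
sun, crow = simulate()
print(np.corrcoef(sun, crow)[0, 1])   # roughly 0.8, whichever variable comes "first"

# Intervention on the effect: forcing the rooster to crow does not move the sun.
sun_forced_crow, _ = simulate(force_crow=1)
print(sun_forced_crow.mean())         # still about 0.5

# Intervention on the cause: forcing the sun up changes the rooster's behaviour.
_, crow_forced_sun = simulate(force_sun=1)
print(crow_forced_sun.mean())         # about 0.9
```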

The Future: Toward Truly Intelligent Artificial Intelligences

The most complicated capacities to achieve are those that require interacting with unrestricted surroundings that have not been prepared in advance. Designing systems with these capabilities requires integrating developments from many areas of AI. We particularly need knowledge-representation languages that encode information about many different types of objects, situations, actions, and so on, as well as about their properties and the relations among them, especially cause-and-effect relations. We also need new algorithms that can use these representations in a robust and efficient manner to solve problems and answer questions on almost any subject. Finally, given that they will need to acquire an almost unlimited amount of knowledge, those systems will have to be able to learn continuously throughout their existence. In sum, it is essential to design systems that combine perception, representation, reasoning, action, and learning. This is a very important AI problem, as we still do not know how to integrate all of these components of intelligence. We need cognitive architectures (Forbus, 2012) that integrate these components adequately. Integrated systems are a fundamental first step toward someday achieving general AI.
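As a purely illustrative sketch of what such a representation might look like at its very simplest (the facts, relation names, and query function below are made up, not a proposed language), cause-and-effect knowledge can be written down as subject-relation-object triples and queried transitively:

```python
# Illustrative sketch only: invented facts and relations showing the flavour of
# triple-based knowledge representation with an asymmetric "causes" relation.
from collections import defaultdict

facts = [
    ("rain", "causes", "wet_ground"),
    ("wet_ground", "causes", "slippery_road"),
    ("glass", "fits_inside", "cupboard"),
]

by_relation = defaultdict(list)
for subject, relation, obj in facts:
    by_relation[relation].append((subject, obj))

def effects_of(cause, found=None):
    """Follow 'causes' edges transitively: what does `cause` bring about?"""
    found = set() if found is None else found
    for subject, obj in by_relation["causes"]:
        if subject == cause and obj not in found:
            found.add(obj)
            effects_of(obj, found)
    return found

print(effects_of("rain"))                                    # {'wet_ground', 'slippery_road'}
# The relation is asymmetric: a glass fits inside a cupboard, not vice versa.
print(("cupboard", "glass") in by_relation["fits_inside"])   # False
```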


Among future activities, we believe that the most important research areas will be hybrid systems that combine the advantages of systems capable of reasoning on the basis of knowledge and memory use (Graves et al., 2016) with those of AI based on the analysis of massive amounts of data, that is, deep learning (Bengio, 2009). Today, deep-learning systems are significantly limited by what is known as “catastrophic forgetting”: if they have been trained to carry out one task (playing Go, for example) and are then trained to do something different (distinguishing between images of dogs and cats, for example), they completely forget what they learned for the previous task (in this case, playing Go). This limitation is powerful proof that those systems do not learn anything, at least in the human sense of learning. Another important limitation of these systems is that they are “black boxes” with no capacity to explain their results. It would, therefore, be interesting to research how to endow deep-learning systems with an explanatory capacity by adding modules that allow them to explain how they reached their results and conclusions, as the capacity to explain is an essential characteristic of any intelligent system. It is also necessary to develop new learning algorithms that do not require enormous amounts of training data, as well as much more energy-efficient hardware to implement them, as energy consumption could end up being one of the main barriers to AI development. The brain is several orders of magnitude more efficient than the hardware currently needed to run the most sophisticated AI algorithms. One possible path to explore is memristor-based neuromorphic computing (Saxena et al., 2018).
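Catastrophic forgetting can be reproduced at toy scale. The sketch below is an illustration only, with an arbitrary pair of synthetic tasks and an arbitrary network size: a small scikit-learn network is trained on one task, then on a second, and its accuracy on the first typically collapses.

```python
# Toy illustration only: tasks, network size, and training schedule are
# arbitrary choices made up for this sketch, not a benchmark of any real system.
import warnings
import numpy as np
from sklearn.neural_network import MLPClassifier

warnings.filterwarnings("ignore")   # silence per-epoch convergence warnings
rng = np.random.default_rng(0)

def make_task(shift):
    """A simple 2D classification task whose inputs are centred at `shift`."""
    X = rng.normal(size=(1000, 2)) + shift
    y = (X[:, 0] + X[:, 1] > 2 * shift).astype(int)
    return X, y

X_a, y_a = make_task(shift=0.0)    # task A
X_b, y_b = make_task(shift=5.0)    # task B, far away in input space

net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1, warm_start=True)

for _ in range(200):               # train only on task A
    net.fit(X_a, y_a)
print("task A accuracy after training on A:", net.score(X_a, y_a))   # high

for _ in range(200):               # then train only on task B
    net.fit(X_b, y_b)
print("task A accuracy after training on B:", net.score(X_a, y_a))   # typically collapses
print("task B accuracy after training on B:", net.score(X_b, y_b))   # high
```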


Other more classic AI techniques that will continue to be extensively researched are multiagent systems, action planning, experience-based reasoning, artificial vision, multimodal person-machine communication, humanoid robotics, and, particularly, new trends in development robotics, which may provide the key to endowing machines with common sense, especially the capacity to learn the relations between their actions and the effects these produce on their surroundings. We will also see significant progress in biomimetic approaches to reproducing animal behavior in machines. This is not simply a matter of reproducing an animal’s behavior; it also involves understanding how the brain that produces that behavior actually works, and building and programming electronic circuits that reproduce the cerebral activity responsible for it. Some biologists are interested in efforts to create the most complex possible artificial brain because they consider it a means of better understanding that organ; in that context, engineers seek biological information that makes designs more efficient. Molecular biology and recent advances in optogenetics will make it possible to identify which genes and neurons play key roles in different cognitive activities.


As to applications: some of the most important will continue to be those related to the Web, video games, personal assistants, and autonomous robots (especially autonomous vehicles, social robots, robots for planetary exploration, and so on). Environmental and energy-saving applications will also be important, as will those designed for economics and sociology. Finally, AI applications for the arts (visual arts, music, dance, narrative) will lead to important changes in the nature of the creative process. Today, computers are no longer simply aids to creation; they have begun to be creative agents themselves. This has led to a new and very promising AI field known as computational creativity, which is producing very interesting results (Colton et al., 2009, 2015; López de Mántaras, 2016) in chess, music, the visual arts, and narrative, among other creative activities.

Some Final Thoughts

No matter how intelligent future artificial intelligences become—even general ones—they will never be the same as human intelligences. As we have argued, the mental development needed for all complex intelligence depends on interactions with the environment and those interactions depend, in turn, on the body—especially the perceptive and motor systems. This, along with the fact that machines will not follow the same socialization and culture-acquisition processes as ours, further reinforces the conclusion that, no matter how sophisticated they become, these intelligences will be different from ours. The existence of intelligences unlike ours, and therefore alien to our values and human needs, calls for reflection on the possible ethical limitations of developing AI. Specifically, we agree with Weizenbaum’s affirmation (Weizenbaum, 1976) that no machine should ever make entirely autonomous decisions or give advice that call for, among other things, wisdom born of human experiences, and the recognition of human values.


The true danger of AI is not the highly improbable technological singularity produced by the existence of hypothetical future artificial superintelligences; the true dangers are already here. Today, the algorithms driving Internet search engines and the recommendation and personal-assistant systems on our cellphones already have quite adequate knowledge of what we do, our preferences, and our tastes. They can even infer what we think about and how we feel. Access to the massive amounts of data that we generate voluntarily is fundamental for this, as the analysis of such data from a variety of sources reveals relations and patterns that could not be detected without AI techniques. The result is an alarming loss of privacy. To avoid this, we should have the right to own a copy of all the personal data we generate, to control its use, and to decide who will have access to it and under what conditions, rather than leaving it in the hands of large corporations without knowing what they are really doing with it.

AI is based on complex programming, and that means there will inevitably be errors. But even if it were possible to develop absolutely dependable software, there are ethical dilemmas that software developers need to keep in mind when designing it. For example, an autonomous vehicle could decide to run over a pedestrian in order to avoid a collision that could harm its occupants. Equipping companies with advanced AI systems that make management and production more efficient will require fewer human employees and thus generate more unemployment. These ethical dilemmas are leading many AI experts to point out the need to regulate AI’s development; in some cases, its use should even be prohibited.

One clear example is autonomous weapons. The three basic principles that govern armed conflict (discrimination: the need to distinguish between combatants and civilians, or between a combatant who is surrendering and one who is preparing to attack; proportionality: avoiding the disproportionate use of force; and precaution: minimizing the number of victims and material damage) are extraordinarily difficult to evaluate, and it is therefore almost impossible for the AI systems in autonomous weapons to obey them. But even if, in the very long term, machines were to attain this capacity, it would be indecent to delegate the decision to kill to a machine.

Beyond this kind of regulation, it is imperative to educate the citizenry about the risks of intelligent technologies and to ensure that people have the necessary competence for controlling them, rather than being controlled by them. Our future citizens need to be much better informed, with a greater capacity to evaluate technological risks, a greater critical sense, and a willingness to exercise their rights. This training process must begin at school and continue at the university level. It is particularly necessary for science and engineering students to receive training in ethics that allows them to better grasp the social implications of the technologies they will very likely be developing. Only when we invest in education will we achieve a society that can enjoy the advantages of intelligent technology while minimizing the risks. AI unquestionably has extraordinary potential to benefit society, provided we use it properly and prudently. It is necessary to increase awareness of AI’s limitations, as well as to act collectively to guarantee that AI is used for the common good in a safe, dependable, and responsible manner.

The road to truly intelligent AI will continue to be long and difficult. After all, this field is barely sixty years old, and, as Carl Sagan would have observed, sixty years are barely the blink of an eye on a cosmic time scale. Gabriel García Márquez put it more poetically in a 1986 speech (“The Cataclysm of Damocles”): “Since the appearance of visible life on Earth, 380 million years had to elapse in order for a butterfly to learn how to fly, 180 million years to create a rose with no other commitment than to be beautiful, and four geological eras in order for us human beings to be able to sing better than birds, and to be able to die from love.”

Select Bibliography

—Bengio, Y. 2009. “Learning deep architectures for AI.”  Foundations and Trends in Machine Learning  2(1): 1–127.

—Brooks, R. A. 1991. “Intelligence without reason.”  IJCAI-91 Proceedings of the Twelfth International Joint Conference on Artificial intelligence  1: 569–595.

—Colton, S., Lopez de Mantaras, R., and Stock, O. 2009. “Computational creativity: Coming of age.”  AI Magazine  30(3): 11–14.

—Colton, S., Halskov, J., Ventura, D., Gouldstone, I., Cook, M., and Pérez-Ferrer, B. 2015. “The Painting Fool sees! New projects with the automated painter.”  International Conference on Computational Creativity (ICCC 2015) : 189–196.

—Dennett, D. C. 2018.  From Bacteria to Bach and Back: The Evolution of Minds . London: Penguin.

—Dreyfus, H. 1965.  Alchemy and Artificial Intelligence . Santa Monica: Rand Corporation.

—Dreyfus, H. 1992.  What Computers Still Can’t Do . New York: MIT Press.

—Ferrucci, D. A., Levas, A., Bagchi, S., Gondek, D., and Mueller, E. T. 2013. “Watson: Beyond Jeopardy!”  Artificial Intelligence  199: 93–105.

—Forbus, K. D. 2012. “How minds will be built.”  Advances in Cognitive Systems  1: 47–58.

—Graves, A., Wayne, G., Reynolds, M., Harley, T., Danihelka, I., Grabska-Barwińska, A., Gómez-Colmenarejo, S., Grefenstette, E., Ramalho, T., Agapiou, J., Puigdomènech-Badia, A., Hermann, K. M., Zwols, Y., Ostrovski, G., Cain, A., King, H., Summerfield, C., Blunsom, P., Kavukcuoglu, K., and Hassabis, D. 2016. “Hybrid computing using a neural network with dynamic external memory.”  Nature  538: 471–476.

—Holland, J. H. 1975.  Adaptation in Natural and Artificial Systems . Ann Arbor: University of Michigan Press.

—Inhelder, B., and Piaget, J. 1958.  The Growth of Logical Thinking from Childhood to Adolescence . New York: Basic Books.

—Lake, B. M., Ullman, T. D., Tenenbaum, J. B., and Gershman, S. J. 2017. “Building machines that learn and think like people.”  Behavioral and Brain Sciences  40:e253.

—López de Mántaras, R. 2016. “Artificial intelligence and the arts: Toward computational creativity.” In  The Next Step: Exponential Life . Madrid: BBVA Open Mind, 100–125.

—McCulloch, W. S., and Pitts, W. 1943. “A logical calculus of ideas immanent in nervous activity.”  Bulletin of Mathematical Biophysics  5: 115–133.

—Newell, A., and Simon, H. A. 1976. “Computer science as empirical inquiry: Symbols and search.”  Communications of the ACM  19(3): 113–126.

—Pearl, J., and Mackenzie, D. 2018.  The Book of Why: The New Science of Cause and Effect . New York: Basic Books.

—Saxena, V., Wu, X., Srivastava, I., and Zhu, K. 2018. “Towards neuromorphic learning machines using emerging memory devices with brain-like energy efficiency.” Preprints:  https://www.preprints.org/manuscript/201807.0362/v1 .

—Searle, J. R. 1980. “Minds, brains, and programs.”  Behavioral and Brain Sciences  3(3): 417–424.

—Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., van den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., Dieleman, S., Grewe, D., Nham, J., Kalchbrenner, N., Sutskever, I., Lillicrap, T., Leach, M., Kavukcuoglu, K., Graepel, T., and Hassabis, D. 2016. “Mastering the game of Go with deep neural networks and tree search.”  Nature  529(7587): 484–489.

—Turing, A. M. 1948. “Intelligent machinery.” National Physical Laboratory Report. Reprinted in:  Machine Intelligence 5 , B. Meltzer and D. Michie (eds.). Edinburgh: Edinburgh University Press, 1969.

—Turing, A. M. 1950. “Computing machinery and intelligence.”  Mind  LIX(236): 433–460.

—Weizenbaum, J. 1976.  Computer Power and Human Reason: From Judgment to Calculation . San Francisco: W. H. Freeman and Co.

—Weng, J., McClelland, J., Pentland, A., Sporns, O., Stockman, I., Sur, M., and Thelen, E. 2001. “Autonomous mental development by robots and animals.”  Science  291: 599–600.


AI-Driven Hypotheses: Real world examples exploring the potential and challenges of AI-generated hypotheses in science


Artificial intelligence (AI) is no longer confined to mere automation; it is now an active participant in the pursuit of knowledge and understanding. In the realm of scientific research, the integration of AI marks a significant paradigm shift, ushering in an era where machines and humans actively collaborate to formulate research hypotheses and questions. While AI systems have traditionally served as powerful tools for data analysis, their evolution now allows them to go beyond analysis and generate hypotheses, prompting researchers to explore uncharted domains of research.

Let’s delve deeper into this transformative capability of AI and the challenges it raises for research hypothesis formation, emphasizing the crucial role of human intervention throughout the AI integration process.


Potential of AI-Generated Research Hypotheses: Is it enough?

Machine-learning algorithms have demonstrated a unique capacity to identify patterns across vast datasets. This has given rise to AI systems that are not only proficient in analyzing existing data but also capable of formulating hypotheses based on patterns that may elude human observation alone. The synergy between machine-driven hypothesis generation and human expertise represents a promising frontier for scientific discovery, underscoring the importance of human oversight and interpretation.
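As a deliberately simplified sketch of what pattern-driven hypothesis generation can mean in practice (the dataset, variable names, and threshold below are invented), a program can screen all pairwise correlations and surface the strongest ones as candidate hypotheses for researchers to vet:

```python
# Deliberately simplified sketch: the dataset, variable names, and screening
# threshold are invented here purely to illustrate the idea.
import numpy as np
import pandas as pd
from itertools import combinations

rng = np.random.default_rng(42)
n = 500
df = pd.DataFrame({
    "sleep_hours": rng.normal(7, 1, n),
    "caffeine_mg": rng.normal(150, 40, n),
    "reaction_ms": rng.normal(250, 20, n),
})
# Plant one hidden relationship the "researcher" has not yet thought about.
df["reaction_ms"] -= 15 * (df["sleep_hours"] - 7) + rng.normal(0, 5, n)

candidates = []
for a, b in combinations(df.columns, 2):
    r = df[a].corr(df[b])
    if abs(r) > 0.3:                      # crude screening threshold
        candidates.append((a, b, round(r, 2)))

# Each surviving pair is only a *candidate* hypothesis: causal direction,
# confounding, and plausibility still require human judgement.
print(sorted(candidates, key=lambda t: -abs(t[2])))
```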

The capability of AI to generate hypotheses raises thought-provoking questions about the nature of creativity in the research process. Although AI can identify patterns within data, the question remains: can it exhibit true creativity in proposing hypotheses, or is it limited to recognizing patterns within existing data?

Furthermore, the intersection of AI and research transcends the generation of hypotheses to include the formulation of research questions. By actively engaging with data and recognizing gaps in knowledge, AI systems can propose insightful questions that guide researchers toward unexplored avenues. This collaborative approach between machines and researchers enhances the scope and depth of scientific inquiry, emphasizing the indispensable role of human insight in shaping the research agenda.

Challenges in AI-Driven Hypothesis Formation

Despite the immense potential, the integration of AI in hypothesis formation is not without its challenges. One significant concern is the “black box” nature of many advanced AI algorithms. As these systems become more complex, understanding the reasoning behind their generated hypotheses becomes increasingly challenging for human researchers. This lack of interpretability can hinder the acceptance of AI-driven hypotheses in the scientific community.

Moreover, biases inherent in the training data of AI models can influence the hypotheses generated. If not carefully addressed, this bias could lead to skewed perspectives and reinforce existing stereotypes. It is crucial to recognize that while AI can process vast amounts of information, it lacks the nuanced understanding and contextual awareness that human researchers bring to the table.

Real Concerns of the Scholarly Community Clouding the Integration of AI in Research Hypothesis Generation

Instance 1: In a paper published in JAMA Ophthalmology, researchers utilized GPT-4, the latest version of the language model powering ChatGPT, in conjunction with Advanced Data Analysis (ADA), a model incorporating Python for statistical analysis. The AI-generated data incorrectly suggested the superiority of one surgical procedure over another in treating keratoconus. The study aimed to demonstrate the ease with which AI could create seemingly authentic but fabricated datasets. Despite flaws detectable under scrutiny, the authors expressed concern over the potential misuse of AI in generating convincing yet false data, raising issues of research integrity. Experts emphasize the need for updated quality checks and automated tools to identify AI-generated synthetic data in scientific publishing.

Instance 2: In early October, a gathering of researchers, including a past Nobel laureate, convened in Stockholm to explore the evolving role of AI in scientific processes. Led by Hiroaki Kitano, a biologist and CEO of Sony AI, the workshop considered introducing awards for AIs and AI-human collaborations producing outstanding scientific contributions. The discussion revolved around the potential of AI in various scientific tasks, including hypothesis generation, a process traditionally requiring human creativity. While AI has long been involved in literature-based discovery and knowledge graph analysis, recent advancements, particularly in large language models, are enabling the generation of hypotheses and the exploration of unconventional ideas.

The potential for AI to make ‘alien’ hypotheses—those unlikely to be conceived by humans—has been demonstrated, raising questions about the interpretability and clarity of AI-generated hypotheses.

Ethical Considerations in AI-Driven Research Hypothesis

As AI takes on a more active role in hypothesis formation, ethical considerations become paramount. The responsible use of AI requires continuous vigilance to prevent unintended consequences. Researchers must be vigilant in identifying and mitigating biases in training data, ensuring that AI systems are not perpetuating or exacerbating existing inequalities.

Additionally, the ethical implications of AI-generated hypotheses, particularly in sensitive areas such as genetics or social sciences, demand careful scrutiny. Transparency in the decision-making process of AI algorithms is essential to build trust within the scientific community and society at large. Striking the right balance between innovation and ethical responsibility is a challenge that requires constant attention as the collaboration between humans and AI evolves.

The Human Touch: A crucial element in AI-driven research

Nuanced thinking, creativity, and contextual understanding that humans possess play a vital role in refining and validating the hypotheses generated by AI. Researchers must act as critical evaluators, questioning the assumptions made by AI algorithms and ensuring that the proposed hypotheses align with existing knowledge. Furthermore, the interpretability challenge can be addressed through interdisciplinary collaboration. Scientists working closely with experts in AI ethics, philosophy, and computer science can develop frameworks to enhance the transparency of AI-generated hypotheses. This not only fosters a better understanding of the underlying processes but also ensures that the generated hypotheses align with ethical and scientific standards.

What the Future Holds

The integration of AI in hypothesis formation is an ongoing journey with vast potential. The collaborative efforts of humans and machines hold the promise of accelerating scientific discovery, unlocking new insights, and addressing complex challenges facing humanity. However, this journey requires a balanced approach, acknowledging the strengths of AI while respecting the unique capabilities and ethical considerations that humans bring to the table.

To Conclude…

The transformative capability of AI in hypothesis formation is reshaping the landscape of scientific research. But this is not possible without a collaborative partnership between humans and machines, one that has the potential to drive unprecedented progress. It is therefore imperative to navigate the challenges associated with AI integration and to embrace a symbiotic relationship between human intellect and AI; only then can we unlock the full potential of this dynamic collaboration and usher in a new era of scientific exploration and understanding.


Advocates of Artificial Intelligence as Behaviourists

Paul Austin Murphy

Becoming Human: Artificial Intelligence Magazine

In extremely general terms, it can be said that behaviourism was a response to the Cartesian (or, even more widely, Western) philosophical tradition in which behaviour, actions, and what is done by persons were seen as the outward expression of what goes on in the mind. Thus, in that sense, many of those who were initially involved in artificial intelligence (AI) were following in behaviourism’s footsteps in that they believed that if a computer (or robot) behaved as if it had intelligence (or had a mind), then, by definition, it must actually be intelligent (or have a mind).

Many other currents in post-World War Two philosophy played down the innards of the mind and, consequently, played up behaviour. We had the work of the late Wittgenstein, in which private mental states were seen as nothing more than “beetles in boxes”. We also had Gilbert Ryle’s The Concept of Mind and Quine saying that all there is to meaning is “overt behaviour”. And then functionalism (in the philosophy of mind) followed all that.


Specifically in terms of AI: it can fairly safely be said that many of the defenders of AI denied (or simply played down) the distinction between actions (or behaviour) and what’s supposed to be “behind” action (or behaviour). Thus, if that “binary opposition” is rejected, then all we have to go on are the actions (or behaviour) of computers. And if computers pass the Turing test, then they’re intelligent. Full stop. Indeed, it’s only a few behavioural steps forward from this to argue that computers actually have minds.

Of course, if we follow this line to the letter, then it can be said that philosophical zombies have minds too, as well as consciousness. And a thermostat has a little bit of a mind.

If you think my last inclusion of a thermostat is ridiculous, then here’s John Searle talking about John McCarthy, the inventor of the term “artificial intelligence”. Searle writes:

“McCarthy says ‘even a machine as simple as a thermostat can be said to have beliefs.’ I admire McCarthy’s courage. I once asked him ‘What beliefs does your thermostat have?’ And he said ‘My thermostat has three beliefs — it believes it’s too hot in here, it’s too cold in here, and it’s just right in here.’…”

Weak and Strong AI

This is where the distinction between strong and weak AI comes into play.

Weak AI proponents argue that it’s unquestionably the case that some computers (or all computers?) act as if they’re intelligent (or have minds). Though the operative words here are “as if”. Thus, they continue, it may take a little bit more time to develop computers which have “genuine intelligence” (whatever that is) or have minds. In other words, there has to be more than behaviour (or actions) to intelligence or mind.

Alan Turing himself put the weak AI position when he argued that it doesn’t matter if a machine has a mind in the human sense: what matters is whether or not it can act in the way that human beings act — i.e., intelligently. (In those days that basically meant answering questions and solving mathematical problems.) In fact, that was the crux of the Turing test, which resulted in the Dartmouth proposal. Namely:

“Every aspect of learning or any other feature of intelligence can be so precisely described that a machine can be made to simulate it.”

John Searle states the strong AI hypothesis (with all its behaviourist trappings) in the following way:

“The other minds reply (Yale). ‘How do you know that other people understand Chinese or anything else? Only by their behaviour. Now the computer can pass the behavioural tests as well as they can (in principle), so if you are going to attribute cognition to other people you must in principle also attribute it to computers.’…”

Strong AI bites the bullet and denies the distinction between behaviour and mind/intelligence:

If a computer acts (or behaves) as if it’s intelligent (or has a mind), then it is intelligent (or has a mind).

In other words, even though I’ve just written the words “as if”, there’s no actual as if about it.

So why worry our pretty little heads about what must lie behind these expressions of mind or intelligence? In true behaviourist fashion, all we really need (or have!) is behaviour.

Sentience and Sapience

When it’s said that there’s no way that we can know (or tell) that a computer is sentient, it seems incredible. This is usually said about animals or even about other human beings. However, logically the same thing can indeed be said about computers; though, admittedly, not with the same force or implications.

Of course other human beings can tell us that they’re sentient (even if they don’t use the words “I’m sentient”). Animals, on the other hand, can hint (as it were) at their sentience. Then again, it’s also possible that a future computer could do the same.

So let’s get a little bit more concrete about all this.

I just mentioned that the display of intelligence (or mind) is deemed to be intelligence (or mind). And computers certainly display intelligence. For example, computers can solve problems, play games (e.g., chess), prove mathematical theorems, diagnose medical problems, use language and so on. What more do we want?

All these things are undoubtedly displays of intelligence; though are they also displays of mind? Just as there is the mind-behaviour binary opposition, so we have the intelligence-mind opposition too. That means we can construct an argument which takes us from behaviour to intelligence, and then from intelligence to mind. Thus:

i) If a computer behaves intelligently, ii) then it is intelligent. iii) If a computer is intelligent, iv) then it must have a mind.

Prima facie, it does seem to be the case that when other people do intelligent things, then we (as good behaviourists) say that they’re intelligent; whereas when the same actions are done by a computer, it rarely evokes the same response (or, at least, not exactly the same kind of response). After all, doesn’t winning a game of chess match most people’s criteria for a genuine display of intelligence?


Need a Hypothesis? This A.I. Has One

Slowly, machine-learning systems are beginning to generate ideas, not just test them.


By Benedict Carey

Machine-learning algorithms seem to have insinuated their way into every human activity short of toenail clipping and dog washing, although the tech giants may have solutions in the works for both. If Alexa knows anything about such projects, she’s not saying.

But one thing that algorithms presumably cannot do, besides feel heartbreak, is formulate theories to explain human behavior or account for the varying blend of motives behind it. They are computer systems; they can’t play Sigmund Freud or Carl Jung, at least not convincingly. Social scientists have used the algorithms as tools, to number-crunch and test-drive ideas, and potentially predict behaviors — like how people will vote or who is likely to engage in self-harm — secure in the knowledge that ultimately humans are the ones who sit in the big-thinking chair.

Enter a team of psychologists intent on understanding human behavior during the pandemic. Why do some people adhere more closely than others to Covid-19 containment measures such as social distancing and mask wearing? The researchers suspected that people who resisted such orders had some set of values or attitudes in common, regardless of their age or nationality, but had no idea which ones.

The team needed an interesting, testable hypothesis — a real idea. For that, they turned to a machine-learning algorithm.

“We decided, let’s try to think outside the box and get some actionable ideas from a machine-learning model,” said Krishna Savani, a psychologist at Nanyang Technological University’s business school in Singapore and an author of the resulting study. His co-authors were Abhishek Sheetal, the lead author, who is also at Nanyang; and Zhiyu Feng, at Renmin University of China. “It was Abhishek’s idea,” Dr. Savani said.

The paper, published in a recent issue of Psychological Science, may or may not presage a shift in how social science is done. But it provides a good primer, experts said, in using a machine to generate ideas rather than merely test them.

“This study highlights that a theory-blind, data-driven search of predictors can help generate novel hypotheses,” said Wiebke Bleidorn, a psychologist at the University of California, Davis. “And that theory can then be tested and refined.”

The researchers effectively worked backward. They reasoned that people who choose to flout virus containment measures were violating social norms, a kind of ethical lapse. Previous research had not provided clear answers about shared attitudes or beliefs that were associated with ethical standards — for example, a person’s willingness to justify cutting corners — in various scenarios. So the team had a machine-learning algorithm synthesize data from the World Values Survey, a project initiated by the University of Michigan in which some 350,000 people from nearly 100 countries answer ethics-related questions, as well as more than 900 other items.

The machine-learning program pitted different combinations of attitudes and answers against one another to see which sets were most associated with high or low scores on the ethics questionnaires.
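The general recipe can be sketched in a few lines. The code below is not the study’s code and uses synthetic data with invented column names rather than the World Values Survey; it only illustrates the idea of a theory-blind ranking of predictors that researchers then read as candidate hypotheses:

```python
# Hedged sketch: synthetic data and invented column names, not the study's code
# and not the World Values Survey. It only illustrates a theory-blind ranking
# of predictors that is then read as a list of candidate hypotheses.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n = 2000
survey = pd.DataFrame({
    "optimism_about_future": rng.integers(1, 6, n),
    "confidence_in_leaders": rng.integers(1, 6, n),
    "religiosity":           rng.integers(1, 6, n),
    "fear_of_crime":         rng.integers(1, 6, n),
})
# Synthetic "ground truth": optimism drives the ethics score most strongly.
ethics_score = (
    0.8 * survey["optimism_about_future"]
    + 0.2 * survey["religiosity"]
    + rng.normal(0, 1, n)
)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(survey, ethics_score)

# Rank survey items by how strongly they predict the ethics score.
for item, importance in sorted(zip(survey.columns, model.feature_importances_),
                               key=lambda t: -t[1]):
    print(f"{item:24s} {importance:.2f}")
# The top-ranked item is a candidate hypothesis for a controlled follow-up
# study (as in the optimism experiment described below), not a conclusion.
```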

They found that the top 10 sets of attitudes linked to having strict ethical beliefs included views on religion, views about crime and confidence in political leadership. Two of those 10 stood out, the authors wrote: the belief that “humanity has a bright future” was associated with a strong ethical code, and the belief that “humanity has a bleak future” was associated with a looser one.

“We wanted something we could manipulate, in a study, and that applied to the situation we’re in right now — what does humanity’s future look like?” Dr. Savani said.

In a subsequent study of some 300 U.S. residents, conducted online, half of the participants were asked to read a relatively dire but accurate accounting of how the pandemic was proceeding: China had contained it, but not without severe measures and some luck; the northeastern U.S. had also contained it, but a second wave was underway and might be worse, and so on.

This group, after its reading assignment, was more likely to justify violations of Covid-19 etiquette, like hoarding groceries or going maskless, than the other participants, who had read an upbeat and equally accurate pandemic tale: China and other nations had contained outbreaks entirely, vaccines are on the way, and lockdowns and other measures have worked well.

“In the context of the Covid-19 pandemic,” the authors concluded, “our findings suggest that if we want people to act in an ethical manner, we should give people reasons to be optimistic about the future of the epidemic” through government and mass-media messaging, emphasizing the positives.

That’s far easier said than done. No psychology paper is going to drive national policies, at least not without replication and more evidence, outside experts said. But a natural test of the idea may be unfolding: Based on preliminary data, two vaccines now in development are around 95 percent effective, scientists reported this month. Will that optimistic news spur more-responsible behavior?

“Our findings would suggest that people are likely to be more ethical in their day-to-day lives, like wearing masks, with the news of all the vaccines,” Dr. Savani said in an email.

One common knock against machine-learning programs is that they are “black boxes”: They find patterns in large pools of complex data, but no one knows what those patterns mean. The computer cannot stop and explain why, for instance, combat veterans of a certain age, medical history and home ZIP code are at elevated risk for suicide, only that that’s what the data reveal. The systems provide predictions, but no real insight. The “deep” learners are shallow indeed.

But by having the machine start with a hypothesis it has helped form, the box is wedged open just a crack. After all, the vast banks of computers already running our lives may have discovered this optimism-ethics connection long ago, but who would know?

For that matter, who knows what other implicit, “learned” psychology theories all those machines are using, besides the obvious ad-driven, commercial ones? The machines may already have cracked hidden codes behind many human behaviors, but it will require live brains to help tease those out.


Artificial intelligence and its natural limits


  • Karl D. Stephan (ORCID: orcid.org/0000-0003-0967-3191)
  • Gyula Klima (ORCID: orcid.org/0000-0002-1597-7039)


An argument with roots in ancient Greek philosophy claims that only humans are capable of a certain class of thought termed conceptual, as opposed to perceptual thought, which is common to humans, the higher animals, and some machines. We outline the most detailed modern version of this argument due to Mortimer Adler, who in the 1960s argued for the uniqueness of the human power of conceptual thought. He also admitted that if conceptual thought were ever manifested by machines, such an achievement would contradict his conclusion. We revisit Adler’s criterion in the light of the past five decades of artificial-intelligence (AI) research, and refine it in view of the classical definitions of perceptual and conceptual thought. We then examine two well-publicized examples of creative works (prose and art) produced by AI systems and show that evidence for conceptual thought appears to be lacking in them. Although clearer evidence for conceptual thought on the part of AI systems may arise in the near future, especially if the global neuronal workspace theory of consciousness prevails over its rival, integrated information theory, the question of whether AI systems can engage in conceptual thought appears to be still open.


As a justification of our attribution of the roots of the immateriality of the intellect to Aristotle, we should note here that the traditional division of Aristotle’s De Anima into three books reflects the fact that Aristotle devoted the arguments of the second book to perceptual functions and those of the third book to intellective functions (concept formation, judgment formation and reasoning), among which he also provides an argument for the immateriality of the intellect on account of its ability to represent all material natures. For a thorough discussion of that argument in Aquinas’ interpretation, see (Klima and Hall 2011 ), pp. 25-59. To be sure, there have been materialistic interpretations of Aristotle’s doctrine of human thought even in ancient and medieval times (by Alexander, Averroës and the Latin Averroists), but even those strictly distinguished sensitive (perceptual) from intellective (conceptual) functions, attributing the latter to a separate, immaterial substance, and arguing that (apparent) human intellective functions are due to the material human soul’s special “uplink” to that substance (often identified with God). In any case, in developing the argument of the paper we explicitly relied only on Aristotle’s Thomistic interpretation. This is only a matter of Aristotelian scholarship, not affecting the substance of our argument.

Adler MJ (1967) The difference of man and the difference it makes. Holt, Rinehart and Winston, New York


Adler MJ (1985) Ten philosophical mistakes. Collier Books, New York

Augros M (2017) The immortal in you. Ignatius Press, San Francisco

de Vio Cajetan T (1964) Commentary on being and essence (Trans. L. H. Kendzierski and F. C. Wade). Marquette University Press, Milwaukee

Elgammal A, Liu B, Elhoseiny M, Mazzone M (2017) CAN: creative adversarial networks generating ‘art’ by learning about styles and deviating from style norms. arXiv:1706.07068v1

Klima G (2009) Aquinas on the materiality of the human soul and the immateriality of the human intellect. Philos Investig 32(2):163–182. https://doi.org/10.1111/j.1467-9205.2008.01368.x


Klima G (2018) Aquinas’ balancing act: balancing the soul between the realms of matter and pure spirit. Bochumer Philosophisches Jahrbuch Für Antike Und Mittelalter, sec. 8 21:29–48. https://doi.org/10.1075/bpjam.00022.kli

Klima G, Hall A (eds) (2011) The immateriality of the human mind, the semantics of analogy, and the conceivability of god. In: Proceedings of the society for medieval logic and metaphysics, vol 1. Cambridge Scholars Publishing, Newcastle upon Tyne. https://www.researchgate.net/publication/303565907_The_Immateriality_of_the_Human_Mind_the_Semantics_of_Analogy_and_the_Conceivability_of_God . Accessed 25 May 2020

Koch C (2019) Proust among the machines. Sci Am 321(6):46–49

Seabrook J (2019) The next word. The New Yorker, October, 52–63

Searle JR (1980) Minds, brains, and programs. Behav Brain Sci 3(3):417–424. https://doi.org/10.1017/S0140525X00005756

Tononi G (2014) Integrated information theory. http://Www.Scholarpedia.Org/Article/Integrated_information_theory . Accessed 26 May 2020

Turing AM (1950) Computing machinery and intelligence. Mind 59(236):433–460


Wouk H (1951) The Caine mutiny. Doubleday & Company, Garden City


Acknowledgements

We thank the reviewers Massimo Negrotti and especially Albert Borgmann for their helpful reviews of an earlier draft of this paper.

Author information

Authors and Affiliations

Ingram School of Engineering, Texas State University, San Marcos, TX, 78666, USA

Karl D. Stephan

Department of Philosophy, Fordham University, Rose Hill Campus, Bronx, NY, 10458, USA

Gyula Klima


Corresponding author

Correspondence to Karl D. Stephan.


About this article

Stephan, K.D., Klima, G. Artificial intelligence and its natural limits. AI & Soc 36 , 9–18 (2021). https://doi.org/10.1007/s00146-020-00995-z


  • Artificial intelligence
  • Consciousness
  • Conceptual thought
  • Perceptual thought
  • Turing test


Artificial Intelligence: A Clarification of Misconceptions, Myths and Desired Status

Frank Emmert-Streib

1 Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland

Olli Yli-Harja

2 Institute of Biosciences and Medical Technology, Tampere, Finland

3 Computational System Biology, Faculty of Medicine and Health Technology, Tampere University, Finland

4 Institute for Systems Biology, Seattle, WA, United States

Matthias Dehmer

5 Department of Mechatronics and Biomedical Computer Science, UMIT, Hall in Tyrol, IL, Austria

6 Department of Computer Science, Swiss Distance University of Applied Sciences, Brig, Switzerland

7 College of Artificial Intelligence, Nankai University, Tianjin, China


The field of artificial intelligence (AI) was founded over 65 years ago. Starting with great hopes and ambitious goals, the field progressed through various stages of popularity and has recently undergone a revival through the introduction of deep neural networks. Among AI’s problems is that, so far, neither the “intelligence” nor the goals of AI are formally defined, causing confusion when comparing AI to other fields. In this paper, we present a perspective on the desired and current status of AI in relation to machine learning and statistics and clarify common misconceptions and myths. Our discussion is intended to lift the veil of vagueness surrounding AI to reveal its true countenance.

1. Introduction

Artificial intelligence (AI) has a long tradition going back many decades. The name artificial intelligence was coined by McCarthy at the Dartmouth conference in 1956, which started a concerted endeavor in this research field that continues to date (McCarthy et al., 2006). The initial focus of AI was on symbolic models and reasoning, followed by the first wave of neural networks (NN) and expert systems (ES) (Rosenblatt, 1957; Newell and Simon, 1976; Crevier, 1993). The field experienced a severe setback when Minsky and Papert demonstrated problems with perceptrons in learning functions that are not linearly separable, e.g., the exclusive OR (XOR) (Minsky and Papert, 1969). This significantly affected the progression of AI in the following years, especially in neural networks. However, in the 1980s neural networks made a comeback through the invention of the back-propagation algorithm (Rumelhart et al., 1986). Later, in the 1990s, research on intelligent agents garnered broad interest (Wooldridge and Jennings, 1995), exploring for instance the coupled effects of perceptions and actions (Wolpert and Kawato, 1998; Emmert-Streib, 2003). Finally, in the early 2000s big data became available and led to another revival of neural networks in the form of deep neural networks (DNN) (Hochreiter and Schmidhuber, 1997; Hinton et al., 2006; O’Leary, 2013; LeCun et al., 2015).

During these years, AI has achieved great success in many different fields, including robotics, speech recognition, facial recognition, healthcare, and finance (Bahrammirzaee, 2010; Brooks, 1991; Krizhevsky et al., 2012; Hochreiter and Schmidhuber, 1997; Thrun, 2002; Yu et al., 2018). Importantly, these problems do not all fall within one field, e.g., computer science, but span a multitude of disciplines including psychology, neuroscience, economics, and medicine. Given the breadth of AI applications and the variety of methods used, it is no surprise that seemingly simple questions, e.g., regarding the aims and goals of AI, appear obscure, especially to scientists who did not follow the field since its inception. For this reason, in this paper, we discuss the desired and current status of AI regarding its definition and provide a clarification for the discrepancy. Specifically, we provide a perspective on AI in relation to machine learning and statistics and clarify common misconceptions and myths.

Our paper is organized as follows. In the next section, we discuss the desired and current status of artificial intelligence including the definition of “intelligence” and strong AI. Then we clarify frequently encountered misconceptions about AI. Finally, we discuss characteristics of methods from artificial intelligence in relation to machine learning and statistics. The paper is completed with concluding remarks.

2. What Is Artificial Intelligence?

We begin our discussion by clarifying the meaning of artificial intelligence itself. We start by presenting attempts to define “intelligence” and then turn to informal characterizations of AI, since the former problem is currently unresolved.

2.1. Defining “Intelligence” in Artificial Intelligence

From the name “artificial intelligence” it seems obvious that AI deals with an artificial, not natural, form of intelligence. Hence, defining “intelligence” in a precise way would tell us what AI is about. Unfortunately, there is currently no such definition that is generally accepted by the community. For an extensive discussion of the difficulties encountered when attempting to provide such a definition see, e.g., Legg and Hutter (2007) and Wang (2019).

Despite the lack of such a generally accepted definition, there have been various attempts. For instance, a formal measure has been suggested by Legg and Hutter (2007). Interestingly, the authors start from several informal definitions of human intelligence to define machine intelligence formally. The resulting measure is given by

Υ(π) = Σ_{μ∈E} 2^(−K(μ)) V_μ^π.      (1)

Here π is an agent, K the Kolmogorov complexity function, E the set of all environments, μ one particular environment, 2^(−K(μ)) the algorithmic probability of environment μ, and V_μ^π a value function giving the expected performance of agent π in environment μ. Overall, Υ(π) is called the universal intelligence of agent π (Legg and Hutter, 2007). Informally, Eq. 1 measures intelligence as the ability of an agent to achieve goals in a wide range of environments (Legg and Hutter, 2007).

A general problem with the definition given in Eq. 1 is that its form is rather cumbersome and unintuitive, and its exact practical evaluation is not possible because the Kolmogorov complexity function K is not computable and must be approximated. A further problem is performing intelligence tests, because, e.g., a Turing test (Turing, 1950) is insufficient (Legg and Hutter, 2007); for instance, an agent could appear intelligent without actually being intelligent (Block, 1981).
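
Since K is not computable, any hands-on illustration of Eq. 1 has to replace it with a proxy. The following minimal Python sketch (the toy environments, the value functions, and the compression-based proxy for K are illustrative assumptions, not part of the original proposal) sums 2^(−K(μ)) V_μ^π over a small finite set of environments, simply to make the structure of the measure concrete.

```python
import zlib

def kolmogorov_proxy(description: bytes) -> int:
    # Crude stand-in for K(mu): length of the compressed environment description.
    # K itself is not computable, so any practical evaluation needs an approximation.
    return len(zlib.compress(description))

def universal_intelligence(agent, environments):
    # Approximates Upsilon(pi) = sum over mu in E of 2^(-K(mu)) * V_mu^pi
    # for a finite, hand-picked set of environments.
    total = 0.0
    for env in environments:
        k = kolmogorov_proxy(env["description"])
        value = env["value_fn"](agent)  # V_mu^pi: how well the agent does in mu
        total += 2.0 ** (-k) * value
    return total

# Toy usage: two trivial "environments" scoring a constant-action agent.
agent = lambda observation: 1
environments = [
    {"description": b"reward action 1", "value_fn": lambda a: 1.0 if a(None) == 1 else 0.0},
    {"description": b"reward action 0", "value_fn": lambda a: 1.0 if a(None) == 0 else 0.0},
]
print(universal_intelligence(agent, environments))
```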

A good summary of the problem of defining “intelligence” and AI is given by Winston and Brown (1984), who state that “Defining intelligence usually takes a semester-long struggle, and even after that I am not sure we ever get a definition really nailed down. But operationally speaking, we want to make machines smart.” In summary, there is currently neither a generally accepted definition of “intelligence” nor a test that could be used to identify “intelligence” reliably.

In spite of this lack of a general definition of “intelligence,” there is a philosophical separation of AI systems based on this notion. The so-called weak AI hypothesis states that “machines could act as if they were intelligent,” whereas the strong AI hypothesis asserts “that machines that do so are actually thinking (not just simulating thinking)” (Russell and Norvig, 2016). The latter in particular is very controversial, and an argument against strong AI is the Chinese room (Searle, 2008). We would like to note that strong AI has recently been rebranded as artificial general intelligence (AGI) (Goertzel and Pennachin, 2007; Yampolskiy and Fox, 2012).

2.2. Informal Characterizations of Artificial Intelligence

Since there is no generally accepted definition of “intelligence,” AI has been characterized informally from its beginnings. For instance, Winston and Brown (1984) state that “The primary goal of Artificial Intelligence is to make machines smarter. The secondary goals of Artificial Intelligence are to understand what intelligence is (the Nobel laureate purpose) and to make machines more useful (the entrepreneurial purpose).” Kurzweil noted that artificial intelligence is “The art of creating machines that perform functions that require intelligence when performed by people” (Kurzweil et al., 1990). Furthermore, Feigenbaum (1963) said “artificial intelligence research is concerned with constructing machines (usually programs for general-purpose computers) which exhibit behavior such that, if it were observed in human activity, we would deign to label the behavior ‘intelligent’.” The latter is reminiscent of a Turing test of intelligence and suggests that a measure of intelligence is connected to such a test; see our discussion in the previous section.

Feigenbaum further specifies that “One group of researchers is concerned with simulating human information-processing activity, with the quest for precise psychological theories of human cognitive activity” and “A second group of researchers is concerned with evoking intelligent behavior from machines whether or not the information processes employed have anything to do with plausible human cognitive mechanisms” (Feigenbaum, 1963). Similar distinctions have been made in (Simon, 1969; Pomerol, 1997). Interestingly, the first point addresses a natural, not artificial, form of cognition, showing that some scientists even cross the boundary from artificial to biological phenomena.

From this it follows that, from its beginnings, AI had high aspirations, focusing on ultimate goals centered around intelligent and smart behavior rather than on simple questions as represented, e.g., by the classification or regression problems discussed in statistics or machine learning. This also means that AI is not explicitly data-focused but assumes the availability of data that would allow such ambitious questions to be studied. This relates also, e.g., to probabilistic or symbolic approaches (Koenig and Simmons, 1994; Hoehndorf and Queralt-Rosinach, 2017). Importantly, this is in contrast to data science, which places data at the center of the investigation and develops estimation techniques for extracting the maximum of information contained in a data set, possibly by applying more than one method (Emmert-Streib and Dehmer, 2019a).

2.3. Current Status

From the above discussion, it seems fair to assert that we neither have a generally accepted, formal (mathematical) definition of “intelligence” nor do we have one succinct informal definition of AI that would go beyond its obvious meaning. Instead, there are many different characterizations and opinions about what AI should be ( Wang, 2006 ).

Given this deficiency it is not surprising that there are many misconceptions and misunderstandings about AI in general. In the following section, we discuss some of these.

3. Common Misconceptions and Myths

In this section, we discuss some frequently encountered misconceptions about AI and clarify some false assumptions.

AI aims to explain how the brain works. No, because brains occur only in living (biological) beings and not in artificial machines. Instead, the fields studying the molecular biological mechanisms of natural brains are neuroscience and neurobiology. Whether AI research can contribute to this question in some way is unclear, but so far no breakthrough contribution has been made. Nevertheless, it is unquestionable that AI research was inspired by neurobiology from its very beginnings (Fan et al., 2020); one prominent example of this is Hebbian learning (Hebb, 1949) or extensions thereof (Emmert-Streib, 2006).
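
For readers unfamiliar with Hebbian learning, a minimal sketch of the classical rule Δw = η·y·x is given below (this is the textbook rule, not the heterosynaptic extension cited above; the random input patterns are purely illustrative).

```python
import numpy as np

def hebbian_update(w, x, eta=0.01):
    # One step of the classical Hebbian rule: Delta w = eta * y * x,
    # where y = w . x is the (linear) post-synaptic activity.
    # Note: pure Hebbian updates are unstable without normalization; kept minimal here.
    y = np.dot(w, x)
    return w + eta * y * x

rng = np.random.default_rng(0)
w = rng.normal(size=3)
for _ in range(100):
    x = rng.normal(size=3)   # illustrative pre-synaptic activity patterns
    w = hebbian_update(w, x)
print(w)
```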

AI methods work similar to the brain. No, this is not true, even if the most popular methods of AI are called neural networks and are inspired by biological brains. Importantly, despite the name “neural network,” such models are not physiological neural models, because neither the model of a neuron nor the connectivity between the neurons is biologically plausible or realistic. That means neither the connectivity structure of convolutional neural networks nor that of deep feedforward networks or other deep learning architectures is biologically realistic. In contrast, physiological models of a biological neuron are the Hodgkin-Huxley model (Hodgkin and Huxley, 1952) or the FitzHugh-Nagumo model (Nagumo et al., 1962), and the large-scale connectivity of the brain is to date largely unknown.
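
To make the contrast concrete, the sketch below integrates the FitzHugh-Nagumo equations, one of the physiological neuron models mentioned above, with a simple forward-Euler step; the parameter values are common textbook choices, not taken from the paper.

```python
import numpy as np

def fitzhugh_nagumo(I=0.5, a=0.7, b=0.8, tau=12.5, dt=0.01, steps=5000):
    # Forward-Euler integration of the FitzHugh-Nagumo equations:
    #   dv/dt = v - v^3/3 - w + I        (fast, membrane-potential-like variable)
    #   dw/dt = (v + a - b*w) / tau      (slow recovery variable)
    v, w = -1.0, 1.0
    trace = np.empty(steps)
    for t in range(steps):
        dv = v - v**3 / 3.0 - w + I
        dw = (v + a - b * w) / tau
        v, w = v + dt * dv, w + dt * dw
        trace[t] = v
    return trace

trace = fitzhugh_nagumo()
print(trace.min(), trace.max())  # the trace shows relaxation oscillations for this input current
```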

Methods from AI have a different purpose than methods from machine learning or statistics. No, the general purpose of all methods from these fields is to analyze data. However, each field introduced different methods with different underlying philosophies. Specifically, the philosophy of AI is to aim at ultimate goals, which are possibly unrealistic, rather than to answer simple questions. As a note, we would like to remark that any manipulation of data stored in a computer is a form of data analysis. Interestingly, this is even true for agent-based systems, e.g., robotics, which incrementally gather data via interaction with an environment. Kaplan and Haenlein phrased this nicely when defining AI as “a system’s ability to correctly interpret external data, to learn from such data, and to use those learnings to achieve specific goals and tasks through flexible adaptation” (Kaplan and Haenlein, 2019).

AI is a technology. No, AI is a methodology. That means the methods behind AI are (mathematical) learning algorithms that adjust the parameters of models via learning rules. However, when implementing AI methods, certain problems may require optimizing the method in combination with computer hardware, e.g., by using a GPU, in order to reduce the time it takes to execute a task. The latter combination may give the impression that AI is a technology, but by downscaling a problem one can always reduce the hardware requirements and still demonstrate the workings of a method in principle, potentially on toy examples. Importantly, in the above argument we emphasized the intellectual component of AI. It is clear that AI cannot be done with pencil and paper; hence, a computer is always required, and a computer is a form of technology. However, the intellectual component of AI is not the computer itself but the software implementing the learning rules.
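
As an illustration of the point that the intellectual component is the learning rule rather than the hardware, here is a downscaled toy example: Rosenblatt's perceptron rule trained on the logical AND function in plain Python/NumPy, with no special hardware involved (the data and hyperparameters are illustrative assumptions).

```python
import numpy as np

def train_perceptron(X, y, epochs=20, eta=0.1):
    # The perceptron learning rule adjusts the parameters (w, b) from examples;
    # this is the 'intellectual component', independent of any particular hardware.
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if np.dot(w, xi) + b > 0 else 0
            error = target - pred
            w += eta * error * xi
            b += eta * error
    return w, b

# Linearly separable toy problem (logical AND), small enough to run anywhere.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, y)
print([1 if np.dot(w, xi) + b > 0 else 0 for xi in X])  # expected: [0, 0, 0, 1]
```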

AI makes computers think. From a scientific point of view, no, because, similar to the problems of defining “intelligence,” there is currently no definition of “thinking.” Also, thinking is generally associated with humans, who are biological beings rather than artificial machines. This point is related to the goals of strong AI and the counter-argument by Searle (Searle, 2008).

Why does AI appear more mythical than machine learning or statistics? Considering that those fields serve a similar purpose (see above), this is indeed strange. However, we think the reason for this is twofold. First, the vague definition of AI leaves much room for guesswork and wishful thinking, which can be populated by a wide range of philosophical considerations. Second, the high aspirations of AI invite speculation about ultimate or futuristic goals like “making machines think” or “making machines human-like.”

Making machines behave like humans is optimal. At first, this sounds reasonable, but let us consider an example. Suppose a group of people is given the task of classifying handwritten numbers. This is a difficult problem because handwriting can be hard to read. For this reason, one cannot expect all people to achieve the optimal score; some people perform better than others. Hence, the behavior of any individual human is not optimal compared with the maximal score, or even with the best-performing human. Also, if we give the same group of people a number of different tasks to solve, it is unlikely that the same person will always perform best. Altogether, it does not make sense to make a computer behave like humans in general, because most people do not perform optimally, regardless of the task. What is actually meant is to make a computer perform like the best-performing human. For one task this may indeed mimic the behavior of one particular human; for several tasks, however, it will correspond to the behavior of a different human for every task. Such a “super human” does not exist. That means that if a machine can solve more than one task, it does not make sense to compare it to a single human, because such a person does not exist. Hence, the goal is really to make machines behave like an ideal super human.
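
The argument can be made numerically concrete: for a hypothetical matrix of accuracy scores (people by tasks), the best performer changes from task to task, so the “ideal super human” corresponds to the per-task maximum rather than to any single person. The scores below are invented purely for illustration.

```python
import numpy as np

# Hypothetical accuracy scores: rows = people, columns = tasks.
scores = np.array([
    [0.92, 0.71, 0.65],  # person A
    [0.85, 0.88, 0.60],  # person B
    [0.80, 0.75, 0.90],  # person C
])

print(scores.argmax(axis=0))  # best person per task, e.g. [0 1 2]: a different person each time
print(scores.max(axis=0))     # the per-task maxima, i.e., the 'ideal super human' profile
```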

When will the ultimate goals of AI be reached? Over the years there have been a number of predictions. For instance, Simon predicted in 1965 that “Machines will be capable, within twenty years, of doing any work a man can do” ( Simon, 1965 ), Minsky stated in 1967 that “Within a generation … the problem of creating artificial intelligence will substantially be solved” ( Minsky, 1967 ) and Kurzweil predicted in 2005 that strong AI, which he calls singularity, will be realized by 2045 ( Kurzweil, 2005 ). Obviously, the former two predictions turned out to be wrong and the latter one remains in the future. However, predictions about undefined entities are vague (see our discussion about intelligence above) and cannot be evaluated systematically. Nevertheless, it is unquestionable that methods from AI make a continuous contribution to many areas in science and industry.

From the above discussion one realizes that metaphors are frequently used when talking about AI, but these are not meant to be understood precisely; they serve more as motivation or stimulation. The origin of this might be related to the community behind AI, which is considerably different from the more mathematics-oriented communities in statistics or machine learning.

4. Discussion

In the previous sections, we discussed various aspects of AI and their limitations. Now we aim for a general overview of the relations between methods in artificial intelligence, machine learning, and statistics. In Table 1 we show a list of core methods from artificial intelligence, machine learning, and statistics. Here “core” refers to methods that can be considered characteristic of a field, e.g., hypothesis testing for statistics, support vector machines for machine learning, or neural networks for artificial intelligence. Each of these methods has attributes with respect to its capabilities. In the following, we consider three such attributes as the most important: 1) the complexity of questions to be studied, 2) the dimensionality of data to be processed, and 3) the type of data that can be analyzed. In Figure 1, we show a simplified, graphical overview of these properties (the acronyms are given in Table 1). We would like to highlight that these distinctions present our own, idealized perspective, shared by many; however, alternative views and perspectives are possible.

Table 1. List of popular, core artificial intelligence, machine learning, and statistics methods representing characteristic models of those fields.

Figure 1. A simplified, graphical overview of properties of core (and base) methods from artificial intelligence (AI), machine learning (ML), and statistics. The x-axis indicates simple (left) and complex (right) questions a method can study, whereas the y-axis indicates low- and high-dimensional methods. In addition, there is an orange axis (top) indicating different data types. Overall, one can distinguish three regions where methods from artificial intelligence (blue), machine learning (green), or statistics (red) dominate. See Table 1 for acronyms.

In general, there are many properties of methods one could use for a distinction. However, we start by focusing on only two such features. Specifically, the x-axis in Figure 1 indicates the question type that can be addressed by a method, from simple (left-hand side) to complex (right-hand side) questions, whereas the y-axis indicates the input dimensionality of the data, from low- to high-dimensional. Here the dimensionality of the data corresponds to the length of a feature vector used as the input for an analysis method, which is different from the number of samples, i.e., the total number of such feature vectors. Overall, in Figure 1 one can distinguish three regions where methods from artificial intelligence (blue), machine learning (green), or statistics (red) dominate. Interestingly, before the introduction of deep learning, region II was entirely dominated by machine learning methods. For this reason we added a star to neural networks (NN) to mark them as a modern AI method. As one can see, methods from statistics are generally characterized by simple questions that can be studied in low-dimensional settings. Here by “simple” we do not mean “boring” or “uninteresting” but rather “specific” or “well defined.” Hence, from Figure 1 one can conclude that AI tends to address complex questions that do not fit well into a conventional framework, e.g., as represented by statistics. The only exception is neural networks.

For most of the methods shown in Table 1, extensions to the “base” method exist. For instance, a classical statistical hypothesis test is conducted just once. However, modern problems in genomics or the social sciences require the testing of thousands or millions of hypotheses, and for this reason multiple testing corrections have been introduced (Farcomeni, 2008; Emmert-Streib and Dehmer, 2019c); a minimal sketch of such a correction is given after the following list. Similar extensions can be found for most other methods, e.g., regression. However, when considering only the original core methods, one obtains a simplified categorization of the domains of AI, ML, and statistics, which can be summarized as follows:

  • Traditional domain of artificial intelligence ⇒ Complex questions
  • Traditional domain of machine learning ⇒ High-dimensional data
  • Traditional domain of statistics ⇒ Simple questions
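
As referenced above, a typical extension of the classical hypothesis test is a multiple testing correction. The sketch below implements the Benjamini-Hochberg step-up procedure (one common choice; the paper does not prescribe a particular method) on simulated p-values.

```python
import numpy as np

def benjamini_hochberg(pvalues, alpha=0.05):
    # Benjamini-Hochberg step-up procedure controlling the false discovery rate.
    # Returns a boolean array marking which hypotheses are rejected.
    p = np.asarray(pvalues)
    m = len(p)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m      # k/m * alpha
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])              # largest k with p_(k) <= k/m * alpha
        reject[order[:k + 1]] = True
    return reject

# Simulated data: 995 null tests plus 5 planted signals with very small p-values.
rng = np.random.default_rng(1)
pvals = np.concatenate([rng.uniform(0, 1, 995), rng.uniform(0, 1e-4, 5)])
print(benjamini_hochberg(pvals).sum())  # typically about 5, plus occasional false discoveries
```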

In Figure 1, we added one additional axis (orange) at the top of the figure, indicating different types of data. In contrast to the axes for the question type and the dimensionality of the input data, the scale of this axis is categorical, which means there is no smooth transition between the corresponding categories. Using this axis (feature) as an additional perspective, one can see that machine learning as well as statistics methods typically require data from designed experiments. This form of experiment corresponds to the conventional type of experiment in physics or biology, where the measurements follow a predefined plan called an experimental design. In contrast, AI methods frequently use actively generated data [also known as online learning (Hoi et al., 2018)], which become available in sequential order. Examples of this data type are the data a robot generates by exploring its environment or data generated by moves in games (Mnih et al., 2013).
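
To illustrate actively generated, sequentially arriving data, here is a minimal online learning sketch: a linear model updated by stochastic gradient descent one observation at a time as a simulated data stream is consumed (the stream, target weights, and step size are illustrative assumptions).

```python
import numpy as np

def online_sgd(stream, dim, eta=0.05):
    # Online learning: the model is updated example by example as data arrive
    # sequentially, instead of being fit once to a fixed, designed data set.
    w = np.zeros(dim)
    for x, y in stream:
        grad = (np.dot(w, x) - y) * x  # squared-error gradient for one example
        w -= eta * grad
    return w

# Toy stream: noisy observations of a fixed linear target, generated on the fly.
rng = np.random.default_rng(2)
true_w = np.array([1.0, -2.0, 0.5])

def stream(n=2000):
    for _ in range(n):
        x = rng.normal(size=3)
        yield x, np.dot(true_w, x) + 0.1 * rng.normal()

print(online_sgd(stream(), dim=3))  # estimate close to [1.0, -2.0, 0.5]
```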

We think it is important to emphasize that methods from AI, and likewise from machine learning or statistics, cannot be mathematically derived from a common, underlying methodological framework; they have been introduced separately and independently. In contrast, physical theories, e.g., of statistical mechanics or quantum mechanics, can be derived from a Hamiltonian formalism or, alternatively, from Fisher information (Frieden and Frieden, 1998; Goldstein et al., 2013).

Maybe the most interesting insight from Figure 1 is that the currently most successful AI methods, namely neural networks, do not address complex questions but simple ones (e.g., classification or regression) for high-dimensional data (Emmert-Streib et al., 2020). This is notable because it goes counter to the tradition of AI of taking on novel and complex problems. Considering the current interest in futuristic problems, e.g., self-driving cars, automated trading, or health diagnostics, this seems even more curious, because it means such complex questions are addressed reductionistically, by dissecting the original problem into smaller subproblems, rather than by addressing them as a whole. Metaphorically, this may be considered a maturing process of AI, settling down after a rebellious adolescence against the limitations of existing fields like control theory, signal processing, or statistics (Russell and Norvig, 2016). Whether it will stay this way remains to be seen.

Finally, if one considers also novel extensions for all base methods from AI, ML and statistics one can summarize the current state of these fields as follows:

  • Current domain of artificial intelligence, machine learning and statistics ⇒ Simple questions for high-dimensional data

This means that all fields seem to converge toward simple questions for high-dimensional data.

5. Conclusions

In this paper, we discussed the desired and current status of AI and clarified its goals. Furthermore, we put AI into perspective alongside machine learning and statistics and identified similarities and differences. The most important results can be summarized as follows:

  • (1) Currently, no generally accepted definition of “intelligence” is available. ⇒ AI remains mathematically undefined, almost 65 years after its formal inception.
  • (2) The aspirations of AI are very high, focusing on ambitious goals. ⇒ AI is not explicitly data-focused, in contrast to data science.
  • (3) General AI methods do not provide neurobiological models of brain functions. ⇒ AI methods are merely means to analyze data, similar to methods from machine learning and statistics.
  • (4) In addition, deep neural networks do not provide neurobiological models of brain functions either. ⇒ They, too, are merely means to analyze data.
  • (5) The currently most successful AI methods, i.e., deep neural networks, focus on simple questions (classification, regression) and high-dimensional data. ⇒ This goes counter to traditional AI but is similar to contemporary machine learning and statistics.
  • (6) AI methods are not derived from a common mathematical formalism but have been introduced separately and independently. ⇒ There is no common conceptual framework that would unite the ideas behind different AI methods.

Finally, we would like to note that the closeness of AI to applications is certainly good for making the field practically relevant and for achieving an impact in the real world. Interestingly, in this respect AI is very similar to a commercial product. A downside is that AI also comes with slogans and straplines used for marketing, just like regular commercial products. We hope our article can help people look beyond the marketing of AI and see what the field is actually about from a scientific perspective.

Author Contributions

FE-S conceived the study. All authors contributed to the writing of the manuscript and approved the final version.

MD thanks the Austrian Science Funds for supporting this work (project P30031).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  • Bahrammirzaee A. (2010). A comparative survey of artificial intelligence applications in finance: artificial neural networks, expert system and hybrid intelligent systems. Neural Comput. Applic. 19(8), 1165–1195. 10.1007/s00521-010-0362-z
  • Block N. (1981). Psychologism and behaviorism. Phil. Rev. 90(1), 5–43. 10.2307/2184371
  • Breiman L. (2001). Random forests. Mach. Learn. 45, 5–32. 10.1023/A:1010933404324
  • Brooks R. (1991). New approaches to robotics. Science 253, 1227–1232.
  • Chen T., Guestrin C. (2016). “Xgboost: a scalable tree boosting system,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794.
  • Cox D. R. (1972). Regression models and life-tables. J. Roy. Stat. Soc. B 34(2), 187–202. 10.1007/978-1-4612-4380-9_37
  • Crevier D. (1993). AI: the tumultuous history of the search for artificial intelligence. New York, NY: Basic Books, 432.
  • Dunn P. K., Smyth G. K. (2018). Generalized linear models with examples in R. New York, NY: Springer, 562.
  • Emmert-Streib F., Dehmer M. (2019a). Defining data science by a data-driven quantification of the community. Mach. Learn. Knowl. Extr. 1(1), 235–251. 10.3390/make1010015
  • Emmert-Streib F., Dehmer M. (2019b). High-dimensional lasso-based computational regression models: regularization, shrinkage, and selection. Mach. Learn. Knowl. Extr. 1(1), 359–383. 10.3390/make1010021
  • Emmert-Streib F., Dehmer M. (2019c). Large-scale simultaneous inference with hypothesis testing: multiple testing procedures in practice. Mach. Learn. Knowl. Extr. 1(2), 653–683. 10.3390/make1020039
  • Emmert-Streib F., Dehmer M. (2019d). Understanding statistical hypothesis testing: the logic of statistical inference. Mach. Learn. Knowl. Extr. 1(3), 945–961. 10.3390/make1030054
  • Emmert-Streib F., Yang Z., Feng H., Tripathi S., Dehmer M. (2020). An introductory review of deep learning for prediction models with big data. Front. Artif. Intell. 3, 4.
  • Emmert-Streib F. (2003). Aktive Computation in offenen Systemen: Lerndynamiken in biologischen Systemen – vom Netzwerk zum Organismus [Active computation in open systems: learning dynamics in biological systems – from network to organism]. Ph.D. thesis. Bremen, Germany: University of Bremen.
  • Emmert-Streib F. (2006). A heterosynaptic learning rule for neural networks. Int. J. Mod. Phys. C 17(10), 1501–1520. 10.1142/S0129183106009916
  • Fan J., Fang L., Wu J., Guo Y., Dai Q. (2020). From brain science to artificial intelligence. Engineering 6(3), 248–252. 10.1016/j.eng.2019.11.012
  • Farcomeni A. (2008). A review of modern multiple hypothesis testing, with particular attention to the false discovery proportion. Stat. Methods Med. Res. 17(4), 347–388. 10.1177/0962280206079046
  • Feigenbaum E. A. (1963). Artificial intelligence research. IEEE Trans. Inf. Theor. 9(4), 248–253.
  • Freund Y., Schapire R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55(1), 119–139. 10.1007/3-540-59119-2_166
  • Frieden B. R., Frieden R. (1998). Physics from Fisher information: a unification. Cambridge, England: Cambridge University Press, 318.
  • Goertzel B., Pennachin C. (2007). Artificial general intelligence. New York, NY: Springer, 509.
  • Goldstein H., Poole C., Safko J. (2013). Classical mechanics. London, United Kingdom: Pearson, 660.
  • Hayes-Roth F., Waterman D. A., Lenat D. B. (1983). Building expert systems. Boston, MA: Addison-Wesley Longman, 119.
  • Hebb D. (1949). The organization of behavior. New York, NY: Wiley, 335.
  • Hinton G. E., Osindero S., Teh Y. W. (2006). A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554.
  • Hochreiter S., Schmidhuber J. (1997). Long short-term memory. Neural Comput. 9(8), 1735–1780.
  • Hoehndorf R., Queralt-Rosinach N. (2017). Data science and symbolic AI: synergies, challenges and opportunities. Data Sci. 1(1–2), 27–38. 10.3233/DS-170004
  • Hodgkin A., Huxley A. (1952). A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117, 500–544.
  • Hoi S. C., Sahoo D., Lu J., Zhao P. (2018). Online learning: a comprehensive survey. arXiv [Preprint]. Available at: arXiv:1802.02871 (Accessed February 8, 2018).
  • Kaelbling L., Littman M., Moore A. (1996). Reinforcement learning: a survey. J. Artif. Intell. Res. 4, 237–285.
  • Kaplan A., Haenlein M. (2019). Siri, Siri, in my hand: who’s the fairest in the land? On the interpretations, illustrations, and implications of artificial intelligence. Bus. Horiz. 62(1), 15–25. 10.1016/j.bushor.2018.08.004
  • Kleinbaum D., Klein M. (2005). Survival analysis: a self-learning text, Statistics for Biology and Health. New York, NY: Springer, 590.
  • Kleinbaum D. G., Dietz K., Gail M., Klein M., Klein M. (2002). Logistic regression. New York, NY: Springer, 514.
  • Koenig S., Simmons R. G. (1994). “Principles of knowledge representation and reasoning,” in Proceedings of the Fourth International Conference (KR ’94), June 1994 (Morgan Kaufmann Publishers), 363–373.
  • Krizhevsky A., Sutskever I., Hinton G. E. (2012). ImageNet classification with deep convolutional neural networks. Adv. Neural Inform. Process. Syst. 25(2), 1097–1105. 10.1145/3065386
  • Kurzweil R., Richter R., Kurzweil R., Schneider M. L. (1990). The age of intelligent machines. Cambridge, MA: MIT Press, Vol. 579, 580.
  • Kurzweil R. (2005). The singularity is near: when humans transcend biology. Westminster, England: Penguin, 672.
  • LeCun Y., Bengio Y., Hinton G. (2015). Deep learning. Nature 521, 436–444.
  • Legg S., Hutter M. (2007). Universal intelligence: a definition of machine intelligence. Minds Mach. 17(4), 391–444. 10.1007/s11023-007-9079-x
  • McCarthy J., Minsky M. L., Rochester N., Shannon C. E. (2006). A proposal for the Dartmouth summer research project on artificial intelligence, August 31, 1955. AI Magazine 27(4), 12. 10.1609/aimag.v27i4.1904
  • Minsky M., Papert S. (1969). Perceptrons. New York, NY: MIT Press, 258.
  • Minsky M. L. (1967). Computation: finite and infinite machines. Englewood Cliffs, NJ: Prentice-Hall, 317.
  • Mnih V., Kavukcuoglu K., Silver D., Graves A., Antonoglou I., Wierstra D., et al. (2013). Playing Atari with deep reinforcement learning. arXiv [Preprint]. Available at: arXiv:1312.5602 (Accessed December 19, 2013).
  • Nagumo J., Arimoto S., Yoshizawa S. (1962). An active pulse transmission line simulating nerve axon. Proc. IRE 50, 2061–2071. 10.1109/JRPROC.1962.288235
  • Nelder J. A., Wedderburn R. W. (1972). Generalized linear models. J. Roy. Stat. Soc. 135(3), 370–384. 10.2307/2344614
  • Newell A., Simon H. A. (1976). Computer science as empirical inquiry: symbols and search. Commun. ACM 19(3), 113–126. 10.1145/360018.360022
  • O’Leary D. E. (2013). Artificial intelligence and big data. IEEE Intell. Syst. 28(2), 96–99. 10.1109/MIS.2013.39
  • Pearl J. (1988). Probabilistic reasoning in intelligent systems. Morgan Kaufmann, 576.
  • Pomerol J.-C. (1997). Artificial intelligence and human decision making. Eur. J. Oper. Res. 99(1), 3–25. 10.1016/S0377-2217(96)00378-5
  • Rabiner L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77(2), 257–286. 10.1109/5.18626
  • Rosenblatt F. (1957). The perceptron, a perceiving and recognizing automaton (Project PARA). Buffalo, NY: Cornell Aeronautical Laboratory.
  • Roweis S. T., Saul L. K. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500), 2323–2326.
  • Rumelhart D., Hinton G., Williams R. (1986). Learning representations by back-propagating errors. Nature 323, 533–536. 10.1038/323533a0
  • Russell S. J., Norvig P. (2016). Artificial intelligence: a modern approach. Harlow, England: Pearson, 1136.
  • Schmidhuber J. (2015). Deep learning in neural networks: an overview. Neural Netw. 61, 85–117. 10.1016/j.neunet.2014.09.003
  • Schölkopf B., Smola A. (2002). Learning with kernels: support vector machines, regularization, optimization and beyond. Cambridge, MA: The MIT Press, 644.
  • Scutari M. (2010). Learning Bayesian networks with the bnlearn R package. J. Stat. Software 35(3), 1–22. 10.18637/jss.v035.i03
  • Searle J. R. (2008). Mind, language and society: philosophy in the real world. New York, NY: Basic Books, 196.
  • Sheskin D. J. (2004). Handbook of parametric and nonparametric statistical procedures. 3rd Edn. Boca Raton, FL: CRC Press, 1193.
  • Simon H. A. (1965). The shape of automation for men and management. New York, NY: Harper & Row, 13, 211–212.
  • Simon H. A. (1969). The sciences of the artificial. Cambridge, MA: MIT Press, 123.
  • Sutton R., Barto A. (1998). Reinforcement learning. Cambridge, MA: MIT Press, 344.
  • Thrun S. (2002). “Robotic mapping: a survey,” in Exploring Artificial Intelligence in the New Millennium, 1–35.
  • Turing A. (1950). Computing machinery and intelligence. Mind 59, 433–460. 10.1093/mind/LIX.236.433
  • Vapnik V. N. (1995). The nature of statistical learning theory. New York, NY: Springer, 188.
  • Wang P. (2006). Rigid flexibility: the logic of intelligence. New York, NY: Springer Science & Business Media, Vol. 34, 412.
  • Wang P. (2019). On defining artificial intelligence. J. Artif. Gen. Intell. 10(2), 1–37. 10.2478/jagi-2019-0002
  • Weisberg S. (2005). Applied linear regression. New York, NY: John Wiley & Sons, 528, 352.
  • Winston P. H., Brown R. H. (1984). Artificial intelligence, an MIT perspective. Cambridge, MA: MIT Press, 492.
  • Wolpert D. M., Kawato M. (1998). Multiple paired forward and inverse models for motor control. Neural Netw. 11(7–8), 1317–1329. 10.1016/S0893-6080(98)00066-5
  • Wooldridge M. J., Jennings N. R. (1995). Intelligent agents: theory and practice. Knowl. Eng. Rev. 10(2), 115–152.
  • Yampolskiy R. V., Fox J. (2012). “Artificial general intelligence and the human mental model,” in Singularity Hypotheses. New York, NY: Springer, 129–145.
  • Yu K. H., Beam A. L., Kohane I. S. (2018). Artificial intelligence in healthcare. Nat. Biomed. Eng. 2(10), 719–731. 10.1038/s41551-018-0305-z

Artificial Intelligence

Artificial intelligence (AI) is the field devoted to building artificial animals (or at least artificial creatures that – in suitable contexts – appear to be animals) and, for many, artificial persons (or at least artificial creatures that – in suitable contexts – appear to be persons). [ 1 ] Such goals immediately ensure that AI is a discipline of considerable interest to many philosophers, and this has been confirmed (e.g.) by the energetic attempt, on the part of numerous philosophers, to show that these goals are in fact un/attainable. On the constructive side, many of the core formalisms and techniques used in AI come out of, and are indeed still much used and refined in, philosophy: first-order logic and its extensions; intensional logics suitable for the modeling of doxastic attitudes and deontic reasoning; inductive logic, probability theory, and probabilistic reasoning; practical reasoning and planning, and so on. In light of this, some philosophers conduct AI research and development as philosophy.

In the present entry, the history of AI is briefly recounted, proposed definitions of the field are discussed, and an overview of the field is provided. In addition, both philosophical AI (AI pursued as and out of philosophy) and philosophy of AI are discussed, via examples of both. The entry ends with some de rigueur speculative commentary regarding the future of AI.

1. The History of AI


The field of artificial intelligence (AI) officially started in 1956, launched by a small but now-famous DARPA -sponsored summer conference at Dartmouth College, in Hanover, New Hampshire. (The 50-year celebration of this conference, AI@50 , was held in July 2006 at Dartmouth, with five of the original participants making it back. [ 2 ] What happened at this historic conference figures in the final section of this entry.) Ten thinkers attended, including John McCarthy (who was working at Dartmouth in 1956), Claude Shannon, Marvin Minsky, Arthur Samuel, Trenchard Moore (apparently the lone note-taker at the original conference), Ray Solomonoff, Oliver Selfridge, Allen Newell, and Herbert Simon. From where we stand now, into the start of the new millennium, the Dartmouth conference is memorable for many reasons, including this pair: one, the term ‘artificial intelligence’ was coined there (and has long been firmly entrenched, despite being disliked by some of the attendees, e.g., Moore); two, Newell and Simon revealed a program – Logic Theorist (LT) – agreed by the attendees (and, indeed, by nearly all those who learned of and about it soon after the conference) to be a remarkable achievement. LT was capable of proving elementary theorems in the propositional calculus. [ 3 ] [ 4 ]

Though the term ‘artificial intelligence’ made its advent at the 1956 conference, certainly the field of AI, operationally defined (defined, i.e., as a field constituted by practitioners who think and act in certain ways), was in operation before 1956. For example, in a famous Mind paper of 1950, Alan Turing argues that the question “Can a machine think?” (and here Turing is talking about standard computing machines: machines capable of computing functions from the natural numbers (or pairs, triples, … thereof) to the natural numbers that a Turing machine or equivalent can handle) should be replaced with the question “Can a machine be linguistically indistinguishable from a human?.” Specifically, he proposes a test, the “ Turing Test ” (TT) as it’s now known. In the TT, a woman and a computer are sequestered in sealed rooms, and a human judge, in the dark as to which of the two rooms contains which contestant, asks questions by email (actually, by teletype , to use the original term) of the two. If, on the strength of returned answers, the judge can do no better than 50/50 when delivering a verdict as to which room houses which player, we say that the computer in question has passed the TT. Passing in this sense operationalizes linguistic indistinguishability. Later, we shall discuss the role that TT has played, and indeed continues to play, in attempts to define AI. At the moment, though, the point is that in his paper, Turing explicitly lays down the call for building machines that would provide an existence proof of an affirmative answer to his question. The call even includes a suggestion for how such construction should proceed. (He suggests that “child machines” be built, and that these machines could then gradually grow up on their own to learn to communicate in natural language at the level of adult humans. This suggestion has arguably been followed by Rodney Brooks and the philosopher Daniel Dennett (1994) in the Cog Project. In addition, the Spielberg/Kubrick movie A.I. is at least in part a cinematic exploration of Turing’s suggestion. [ 5 ] ) The TT continues to be at the heart of AI and discussions of its foundations, as confirmed by the appearance of (Moor 2003). In fact, the TT continues to be used to define the field, as in Nilsson’s (1998) position, expressed in his textbook for the field, that AI simply is the field devoted to building an artifact able to negotiate this test. Energy supplied by the dream of engineering a computer that can pass TT, or by controversy surrounding claims that it has already been passed, is if anything stronger than ever, and the reader has only to do an internet search via the string

turing test passed

to find up-to-the-minute attempts at reaching this dream, and attempts (sometimes made by philosophers) to debunk claims that some such attempt has succeeded.

Returning to the issue of the historical record, even if one bolsters the claim that AI started at the 1956 conference by adding the proviso that ‘artificial intelligence’ refers to a nuts-and-bolts engineering pursuit (in which case Turing’s philosophical discussion, despite calls for a child machine, wouldn’t exactly count as AI per se), one must confront the fact that Turing, and indeed many predecessors, did attempt to build intelligent artifacts. In Turing’s case, such building was surprisingly well-understood before the advent of programmable computers: Turing wrote a program for playing chess before there were computers to run such programs on, by slavishly following the code himself. He did this well before 1950, and long before Newell (1973) gave thought in print to the possibility of a sustained, serious attempt at building a good chess-playing computer. [ 6 ]

From the perspective of philosophy, which views the systematic investigation of mechanical intelligence as meaningful and productive separate from the specific logicist formalisms (e.g., first-order logic) and problems (e.g., the Entscheidungsproblem ) that gave birth to computer science, neither the 1956 conference, nor Turing’s Mind paper, come close to marking the start of AI. This is easy enough to see. For example, Descartes proposed TT (not the TT by name, of course) long before Turing was born. [ 7 ] Here’s the relevant passage:

If there were machines which bore a resemblance to our body and imitated our actions as far as it was morally possible to do so, we should always have two very certain tests by which to recognise that, for all that, they were not real men. The first is, that they could never use speech or other signs as we do when placing our thoughts on record for the benefit of others. For we can easily understand a machine’s being constituted so that it can utter words, and even emit some responses to action on it of a corporeal kind, which brings about a change in its organs; for instance, if it is touched in a particular part it may ask what we wish to say to it; if in another part it may exclaim that it is being hurt, and so on. But it never happens that it arranges its speech in various ways, in order to reply appropriately to everything that may be said in its presence, as even the lowest type of man can do. And the second difference is, that although machines can perform certain things as well as or perhaps better than any of us can do, they infallibly fall short in others, by which means we may discover that they did not act from knowledge, but only for the disposition of their organs. For while reason is a universal instrument which can serve for all contingencies, these organs have need of some special adaptation for every particular action. From this it follows that it is morally impossible that there should be sufficient diversity in any machine to allow it to act in all the events of life in the same way as our reason causes us to act. (Descartes 1637, p. 116)

At the moment, Descartes is certainly carrying the day. [ 8 ] Turing predicted that his test would be passed by 2000, but the fireworks across the globe at the start of the new millennium have long since died down, and the most articulate of computers still can’t meaningfully debate a sharp toddler. Moreover, while in certain focussed areas machines out-perform minds (IBM’s famous Deep Blue prevailed in chess over Gary Kasparov, e.g.; and more recently, AI systems have prevailed in other games, e.g. Jeopardy! and Go, about which more will momentarily be said), minds have a (Cartesian) capacity for cultivating their expertise in virtually any sphere. (If it were announced to Deep Blue, or any current successor, that chess was no longer to be the game of choice, but rather a heretofore unplayed variant of chess, the machine would be trounced by human children of average intelligence having no chess expertise.) AI simply hasn’t managed to create general intelligence; it hasn’t even managed to produce an artifact indicating that eventually it will create such a thing.

But what about IBM Watson’s famous nail-biting victory in the Jeopardy! game-show contest? [ 9 ] That certainly seems to be a machine triumph over humans on their “home field,” since Jeopardy! delivers a human-level linguistic challenge ranging across many domains. Indeed, among many AI cognoscenti, Watson’s success is considered to be much more impressive than Deep Blue’s, for numerous reasons. One reason is that while chess is generally considered to be well-understood from the formal-computational perspective (after all, it’s well-known that there exists a perfect strategy for playing chess), in open-domain question-answering (QA), as in any significant natural-language processing task, there is no consensus as to what problem, formally speaking, one is trying to solve. Briefly, question-answering (QA) is what the reader would think it is: one asks a question of a machine, and gets an answer, where the answer has to be produced via some “significant” computational process. (See Strzalkowski & Harabagiu (2006) for an overview of what QA, historically, has been as a field.) A bit more precisely, there is no agreement as to what underlying function, formally speaking, question-answering capability computes. This lack of agreement stems quite naturally from the fact that there is of course no consensus as to what natural languages are , formally speaking. [ 10 ] Despite this murkiness, and in the face of an almost universal belief that open-domain question-answering would remain unsolved for a decade or more, Watson decisively beat the two top human Jeopardy! champions on the planet. During the contest, Watson had to answer questions that required not only command of simple factoids ( Question 1 ), but also of some amount of rudimentary reasoning (in the form of temporal reasoning) and commonsense ( Question 2 ):

Question 1 : The only two consecutive U.S. presidents with the same first name.

Question 2 : In May 1898, Portugal celebrated the 400th anniversary of this explorer’s arrival in India.

While Watson is demonstrably better than humans in Jeopardy! -style quizzing (a new human Jeopardy! master could arrive on the scene, but as for chess, AI now assumes that a second round of IBM-level investment would vanquish the new human opponent), this approach does not work for the kind of NLP challenge that Descartes described; that is, Watson can’t converse on the fly. After all, some questions don’t hinge on sophisticated information retrieval and machine learning over pre-existing data, but rather on intricate reasoning right on the spot. Such questions may for instance involve anaphora resolution, which require even deeper degrees of commonsensical understanding of time, space, history, folk psychology, and so on. Levesque (2013) has catalogued some alarmingly simple questions which fall in this category. (Marcus, 2013, gives an account of Levesque’s challenges that is accessible to a wider audience.) The other class of question-answering tasks on which Watson fails can be characterized as dynamic question-answering. These are questions for which answers may not be recorded in textual form anywhere at the time of questioning, or for which answers are dependent on factors that change with time. Two questions that fall in this category are given below (Govindarajulu et al. 2013):

Question 3 : If I have 4 foos and 5 bars, and if foos are not the same as bars, how many foos will I have if I get 3 bazes which just happen to be foos?

Question 4 : What was IBM’s Sharpe ratio in the last 60 days of trading?

Closely following Watson’s victory, in March 2016, Google DeepMind’s AlphaGo defeated one of Go’s top-ranked players, Lee Sedol, in four out of five matches. This was considered a landmark achievement within AI, as it was widely believed in the AI community that computer victory in Go was at least a few decades away, partly due to the enormous number of valid sequences of moves in Go compared to that in chess. [ 11 ] While this is a remarkable achievement, it should be noted that, despite breathless coverage in the popular press, [ 12 ] AlphaGo, while indisputably a great Go player, is just that. For example, neither AlphaGo nor Watson can understand the rules of Go written in plain-and-simple English and produce a computer program that can play the game. It’s interesting that there is one endeavor in AI that tackles a narrow version of this very problem: In general game playing, a machine is given a description of a brand new game just before it has to play the game (Genesereth et al. 2005). However, the description in question is expressed in a formal language, and the machine has to manage to play the game from this description. Note that this is still far from understanding even a simple description of a game in English well enough to play it.

But what if we consider the history of AI not from the perspective of philosophy, but rather from the perspective of the field with which, today, it is most closely connected? The reference here is to computer science. From this perspective, does AI run back to well before Turing? Interestingly enough, the results are the same: we find that AI runs deep into the past, and has always had philosophy in its veins. This is true for the simple reason that computer science grew out of logic and probability theory, [ 13 ] which in turn grew out of (and is still intertwined with) philosophy. Computer science, today, is shot through and through with logic; the two fields cannot be separated. This phenomenon has become an object of study unto itself (Halpern et al. 2001). The situation is no different when we are talking not about traditional logic, but rather about probabilistic formalisms, also a significant component of modern-day AI: These formalisms also grew out of philosophy, as nicely chronicled, in part, by Glymour (1992). For example, in the one mind of Pascal was born a method of rigorously calculating probabilities, conditional probability (which plays a particularly large role in AI, currently), and such fertile philosophico-probabilistic arguments as Pascal’s wager , according to which it is irrational not to become a Christian.

That modern-day AI has its roots in philosophy, and in fact that these historical roots are temporally deeper than even Descartes’ distant day, can be seen by looking to the clever, revealing cover of the second edition (the third edition is the current one) of the comprehensive textbook Artificial Intelligence: A Modern Approach (known in the AI community as simply AIMA2e for Russell & Norvig, 2002).

Cover of AIMA2e (Russell & Norvig 2002)

What you see there is an eclectic collection of memorabilia that might be on and around the desk of some imaginary AI researcher. For example, if you look carefully, you will specifically see: a picture of Turing, a view of Big Ben through a window (perhaps R&N are aware of the fact that Turing famously held at one point that a physical machine with the power of a universal Turing machine is physically impossible: he quipped that it would have to be the size of Big Ben), a planning algorithm described in Aristotle’s De Motu Animalium, Frege’s fascinating notation for first-order logic, a glimpse of Lewis Carroll’s (1958) pictorial representation of syllogistic reasoning, Ramon Lull’s concept-generating wheel from his 13th-century Ars Magna, and a number of other pregnant items (including, in a clever, recursive, and bordering-on-self-congratulatory touch, a copy of AIMA itself). Though there is insufficient space here to make all the historical connections, we can safely infer from the appearance of these items (and here we of course refer to the ancient ones: Aristotle conceived of planning as information-processing over two-and-a-half millennia back; and in addition, as Glymour (1992) notes, Aristotle can also be credited with devising the first knowledge-bases and ontologies, two types of representation schemes that have long been central to AI) that AI is indeed very, very old. Even those who insist that AI is at least in part an artifact-building enterprise must concede that, in light of these objects, AI is ancient, for it isn’t just theorizing from the perspective that intelligence is at bottom computational that runs back into the remote past of human history: Lull’s wheel, for example, marks an attempt to capture intelligence not only in computation, but in a physical artifact that embodies that computation. [ 14 ]

AIMA has now reached its third edition, and those interested in the history of AI, and for that matter the history of philosophy of mind, will not be disappointed by examination of the cover of the third installment (the cover of the second edition is almost exactly like that of the first edition). (All the elements of the cover, separately listed and annotated, can be found online.) One significant addition to the cover of the third edition is a drawing of Thomas Bayes; his appearance reflects the recent rise in the popularity of probabilistic techniques in AI, which we discuss later.

One final point about the history of AI seems worth making.

It is generally assumed that the birth of modern-day AI in the 1950s came in large part because of and through the advent of the modern high-speed digital computer. This assumption accords with common sense. After all, AI (and, for that matter, to some degree its cousin, cognitive science, particularly computational cognitive modeling, the sub-field of cognitive science devoted to producing computational simulations of human cognition) is aimed at implementing intelligence in a computer, and it stands to reason that such a goal would be inseparably linked with the advent of such devices. However, this is only part of the story: the part that reaches back but to Turing and others (e.g., von Neumann) responsible for the first electronic computers. The other part is that, as already mentioned, AI has a particularly strong tie, historically speaking, to reasoning (logic-based and, in the need to deal with uncertainty, inductive/probabilistic reasoning). In this story, nicely told by Glymour (1992), a search for an answer to the question “What is a proof?” eventually led to an answer based on Frege’s version of first-order logic (FOL): a (finitary) mathematical proof consists in a series of step-by-step inferences from one formula of first-order logic to the next. The obvious extension of this answer (and it isn’t a complete answer, given that lots of classical mathematics, despite conventional wisdom, clearly can’t be expressed in FOL; even the Peano Axioms, to be expressed as a finite set of formulae, require SOL, i.e., second-order logic) is to say that not only mathematical thinking, but thinking, period, can be expressed in FOL. (This extension was entertained by many logicians long before the start of information-processing psychology and cognitive science – a fact some cognitive psychologists and cognitive scientists often seem to forget.) Today, logic-based AI is only part of AI, but the point is that this part still lives (with help from logics much more powerful, but much more complicated, than FOL), and it can be traced all the way back to Aristotle’s theory of the syllogism. [ 15 ] In the case of uncertain reasoning, the question isn’t “What is a proof?”, but rather questions such as “What is it rational to believe, in light of certain observations and probabilities?” This is a question posed and tackled long before the arrival of digital computers.

So far we have been proceeding as if we have a firm and precise grasp of the nature of AI. But what exactly is AI? Philosophers arguably know better than anyone that precisely defining a particular discipline to the satisfaction of all relevant parties (including those working in the discipline itself) can be acutely challenging. Philosophers of science certainly have proposed credible accounts of what constitutes at least the general shape and texture of a given field of science and/or engineering, but what exactly is the agreed-upon definition of physics? What about biology? What, for that matter, is philosophy, exactly? These are remarkably difficult, maybe even eternally unanswerable, questions, especially if the target is a consensus definition. Perhaps the most prudent course we can manage here under obvious space constraints is to present in encapsulated form some proposed definitions of AI. We do include a glimpse of recent attempts to define AI in detailed, rigorous fashion (and we suspect that such attempts will be of interest to philosophers of science, and those interested in this sub-area of philosophy).

Russell and Norvig (1995, 2002, 2009), in their aforementioned AIMA text, provide a set of possible answers to the “What is AI?” question that has considerable currency in the field itself. These answers all assume that AI should be defined in terms of its goals: a candidate definition thus has the form “AI is the field that aims at building …” The answers all fall under a quartet of types placed along two dimensions. One dimension is whether the goal is to match human performance, or, instead, ideal rationality. The other dimension is whether the goal is to build systems that reason/think, or rather systems that act. The situation is summed up in this table:

                      Human-Based                       Ideal Rationality
  Reasoning-Based     Systems that think like humans    Systems that think rationally
  Behavior-Based      Systems that act like humans      Systems that act rationally

Four Possible Goals for AI According to AIMA

Please note that this quartet of possibilities does reflect (at least a significant portion of) the relevant literature. For example, philosopher John Haugeland (1985) falls into the Human/Reasoning quadrant when he says that AI is “The exciting new effort to make computers think … machines with minds , in the full and literal sense.” (By far, this is the quadrant that most popular narratives affirm and explore. The recent Westworld TV series is a powerful case in point.) Luger and Stubblefield (1993) seem to fall into the Ideal/Act quadrant when they write: “The branch of computer science that is concerned with the automation of intelligent behavior.” The Human/Act position is occupied most prominently by Turing, whose test is passed only by those systems able to act sufficiently like a human. The “thinking rationally” position is defended (e.g.) by Winston (1992). While it might not be entirely uncontroversial to assert that the four bins given here are exhaustive, such an assertion appears to be quite plausible, even when the literature up to the present moment is canvassed.

It’s important to know that the contrast between the focus on systems that think/reason versus systems that act, while found, as we have seen, at the heart of the AIMA texts, and at the heart of AI itself, should not be interpreted as implying that AI researchers view their work as falling all and only within one of these two compartments. Researchers who focus more or less exclusively on knowledge representation and reasoning are also quite prepared to acknowledge that they are working on (what they take to be) a central component or capability within any one of a family of larger systems spanning the reason/act distinction. The clearest case may come from the work on planning – an AI area traditionally making central use of representation and reasoning. For good or ill, much of this research is done in abstraction (in vitro, as opposed to in vivo), but the researchers involved certainly intend or at least hope that the results of their work can be embedded into systems that actually do things, such as, for example, execute the plans.

What about Russell and Norvig themselves? What is their answer to the What is AI? question? They are firmly in the “acting rationally” camp. In fact, it’s safe to say both that they are the chief proponents of this answer, and that they have been remarkably successful evangelists. Their extremely influential AIMA series can be viewed as a book-length defense and specification of the Ideal/Act category. We will look a bit later at how Russell and Norvig lay out all of AI in terms of intelligent agents , which are systems that act in accordance with various ideal standards for rationality. But first let’s look a bit closer at the view of intelligence underlying the AIMA text. We can do so by turning to Russell (1997). Here Russell recasts the “What is AI?” question as the question “What is intelligence?” (presumably under the assumption that we have a good grasp of what an artifact is), and then he identifies intelligence with rationality . More specifically, Russell sees AI as the field devoted to building intelligent agents , which are functions taking as input tuples of percepts from the external environment, and producing behavior (actions) on the basis of these percepts. Russell’s overall picture is this one:

'Percept history' to 'Agent function' to 'Behavior' to 'Environment' to 'Percept history'; also 'Environment' to/from 'State history' to 'Performance Measure'

The Basic Picture Underlying Russell’s Account of Intelligence/Rationality

Let’s unpack this diagram a bit, and take a look, first, at the account of perfect rationality that can be derived from it. The behavior of the agent in the environment \(E\) (from a class \(\bE\) of environments) produces a sequence of states or snapshots of that environment. A performance measure \(U\) evaluates this sequence; notice the box labeled “Performance Measure” in the above figure. We let \(V(f,\bE,U)\) denote the expected utility according to \(U\) of the agent function \(f\) operating on \(\bE\). [ 16 ] Now we identify a perfectly rational agent with the agent function:

\[ f_{opt} = \mathop{\mathrm{argmax}}_{f} V(f,\bE,U) \tag{1}\label{eq1} \]

According to the above equation, a perfectly rational agent can be taken to be the function \(f_{opt}\) which produces the maximum expected utility in the environment under consideration. Of course, as Russell points out, it’s usually not possible to actually build perfectly rational agents. For example, though it’s easy enough to specify an algorithm for playing invincible chess, it’s not feasible to implement this algorithm. What traditionally happens in AI is that programs that are – to use Russell’s apt terminology – calculatively rational are constructed instead: these are programs that, if executed infinitely fast , would result in perfectly rational behavior. In the case of chess, this would mean that we strive to write a program that runs an algorithm capable, in principle, of finding a flawless move, but we add features that truncate the search for this move in order to play within intervals of digestible duration.
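For concreteness, the contrast between the in-principle-perfect algorithm and its calculatively rational truncation can be sketched in a few lines of Python. The game interface assumed here (legal_moves, result, is_terminal, utility, heuristic) is hypothetical, and the depth bound plus heuristic estimate stand in for whatever truncation a real system would use:

    # A sketch of "calculative rationality": exhaustive minimax truncated by a
    # depth bound. The game object (legal_moves, result, is_terminal, utility,
    # heuristic) is hypothetical; any two-player zero-sum game could supply it.

    def minimax_value(state, game, depth, maximizing):
        """Estimate the value of `state`, searching at most `depth` plies."""
        if game.is_terminal(state):
            return game.utility(state)        # true value at the leaves
        if depth == 0:
            return game.heuristic(state)      # truncation: estimate instead of search
        values = (minimax_value(game.result(state, m), game, depth - 1, not maximizing)
                  for m in game.legal_moves(state))
        return max(values) if maximizing else min(values)

    def choose_move(state, game, depth=4):
        """Pick the move with the best depth-limited minimax value."""
        return max(game.legal_moves(state),
                   key=lambda m: minimax_value(game.result(state, m), game,
                                               depth - 1, maximizing=False))

Run with an unbounded depth on a finite game, this is the in-principle-flawless algorithm; run with the small depth bound, it is the calculatively rational program that answers within a digestible interval.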

Russell himself champions a new brand of intelligence/rationality for AI; he calls this brand bounded optimality . To understand Russell’s view, first we follow him in introducing a distinction: We say that agents have two components: a program, and a machine upon which the program runs. We write \(Agent(P, M)\) to denote the agent function implemented by program \(P\) running on machine \(M\). Now, let \(\mathcal{P}(M)\) denote the set of all programs \(P\) that can run on machine \(M\). The bounded optimal program \(P_{\opt,M}\) then is:

\[ P_{\opt,M} = \mathop{\mathrm{argmax}}_{P \in \mathcal{P}(M)} V(Agent(P,M),\bE,U) \]

You can understand this equation in terms of any of the mathematical idealizations for standard computation. For example, machines can be identified with Turing machines minus instructions (i.e., TMs are here viewed architecturally only: as having tapes divided into squares upon which symbols can be written, read/write heads capable of moving up and down the tape to write and erase, and control units which are in one of a finite number of states at any time), and programs can be identified with instructions in the Turing-machine model (telling the machine to write and erase symbols, depending upon what state the machine is in). So, if you are told that you must “program” within the constraints of a 22-state Turing machine, you could search for the “best” program given those constraints. In other words, you could strive to find the optimal program within the bounds of the 22-state architecture. Russell’s (1997) view is thus that AI is the field devoted to creating optimal programs for intelligent agents, under time and space constraints on the machines implementing these programs. [ 17 ]
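Read operationally, the displayed equation says: enumerate the programs that the bounded machine can run, score each with the performance measure over the environment class, and keep the best. The following Python sketch does this for a deliberately tiny, hypothetical setting in which programs are mere lookup tables from percepts to actions; the percepts, actions, and toy utility below are our own inventions for illustration only:

    from itertools import product

    # A toy rendering of bounded optimality: the "machine" can only run programs
    # that are lookup tables from a small percept set to a small action set.
    PERCEPTS = ["p1", "p2", "p3"]
    ACTIONS = ["left", "right", "wait"]

    def all_bounded_programs():
        """Every program the bounded machine can implement: one action per percept."""
        for choice in product(ACTIONS, repeat=len(PERCEPTS)):
            yield dict(zip(PERCEPTS, choice))

    def bounded_optimal_program(expected_utility):
        """argmax over the machine's program space of the expected utility V."""
        return max(all_bounded_programs(), key=expected_utility)

    # Example: a made-up performance measure that rewards waiting exactly on p2.
    def toy_utility(program):
        return sum(1.0 for p in PERCEPTS if (p == "p2") == (program[p] == "wait"))

    print(bounded_optimal_program(toy_utility))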

The reader must have noticed that in the equation for \(P_{\opt,M}\) we have not elaborated on \(\bE\) and \(U\) and how equation \eqref{eq1} might be used to construct an agent if the class of environments \(\bE\) is quite general, or if the true environment \(E\) is simply unknown. Depending on the task for which one is constructing an artificial agent, \(E\) and \(U\) would vary. The mathematical form of the environment \(E\) and the utility function \(U\) would vary wildly from, say, chess to Jeopardy! . Of course, if we were to design a globally intelligent agent, and not just a chess-playing agent, we could get away with having just one pair of \(E\) and \(U\). What would \(E\) look like if we were building a generally intelligent agent and not just an agent that is good at a single task? \(E\) would be a model of not just a single game or a task, but the entire physical-social-virtual universe consisting of many games, tasks, situations, problems, etc. This project is (at least currently) hopelessly difficult as, obviously, we are nowhere near to having such a comprehensive theory-of-everything model. For further discussion of a theoretical architecture put forward for this problem, see the Supplement on the AIXI architecture .

It should be mentioned that there is a different, much more straightforward answer to the “What is AI?” question. This answer, which goes back to the days of the original Dartmouth conference, was expressed by, among others, Newell (1973), one of the grandfathers of modern-day AI (recall that he attended the 1956 conference); it is:

AI is the field devoted to building artifacts that are intelligent, where ‘intelligent’ is operationalized through intelligence tests (such as the Wechsler Adult Intelligence Scale), and other tests of mental ability (including, e.g., tests of mechanical ability, creativity, and so on).

The above definition can be seen as fully specifying a concrete version of Russell and Norvig’s four possible goals. Though few are aware of this now, this answer was taken quite seriously for a while, and in fact underlay one of the most famous programs in the history of AI: the ANALOGY program of Evans (1968), which solved geometric analogy problems of a type seen in many intelligence tests. An attempt to rigorously define this forgotten form of AI (as what they dub Psychometric AI ), and to resurrect it from the days of Newell and Evans, is provided by Bringsjord and Schimanski (2003) [see also e.g. (Bringsjord 2011)]. A sizable private investment has been made in the ongoing attempt, now known as Project Aristo , to build a “digital Aristotle”, in the form of a machine able to excel on standardized tests such as the AP exams tackled by US high school students (Friedland et al. 2004). (Vibrant work in this direction continues today at the Allen Institute for Artificial Intelligence.) [ 18 ] In addition, researchers at Northwestern have forged a connection between AI and tests of mechanical ability (Klenk et al. 2005).

In the end, as is the case with any discipline, to really know precisely what that discipline is requires you to, at least to some degree, dive in and do, or at least dive in and read. Two decades ago such a dive was quite manageable. Today, because the content that has come to constitute AI has mushroomed, the dive (or at least the swim after it) is a bit more demanding.

3. Approaches to AI

There are a number of ways of “carving up” AI. By far the most prudent and productive way to summarize the field is to turn yet again to the AIMA text, given its comprehensive overview.

As Russell and Norvig (2009) tell us in the Preface of AIMA :

The main unifying theme is the idea of an intelligent agent. We define AI as the study of agents that receive percepts from the environment and perform actions. Each such agent implements a function that maps percept sequences to actions, and we cover different ways to represent these functions… (Russell & Norvig 2009, vii)

The basic picture is thus summed up in this figure:

agent with sensors and actuators receiving percepts and performing actions

Impressionistic Overview of an Intelligent Agent

The content of AIMA derives, essentially, from fleshing out this picture; that is, the above figure corresponds to the different ways of representing the overall function that intelligent agents implement. And there is a progression from the least powerful agents up to the more powerful ones. The following figure gives a high-level view of a simple kind of agent discussed early in the book. (Though simple, this sort of agent corresponds to the architecture of representation-free agents designed and implemented by Rodney Brooks, 1991.)

simple reflex agent with no internal model of the world interacting with environment

A Simple Reflex Agent

As the book progresses, agents get increasingly sophisticated, and the implementation of the function they represent thus draws from more and more of what AI can currently muster. The following figure gives an overview of an agent that is a bit smarter than the simple reflex agent. This smarter agent has the ability to internally model the outside world, and is therefore not simply at the mercy of what can at the moment be directly sensed.

reflex agent with model of world interacting with environment

A More Sophisticated Reflex Agent

There are seven parts to AIMA . As the reader passes through these parts, she is introduced to agents that take on the powers discussed in each part. Part I is an introduction to the agent-based view. Part II is concerned with giving an intelligent agent the capacity to think ahead a few steps in clearly defined environments. Examples here include agents able to successfully play games of perfect information, such as chess. Part III deals with agents that have declarative knowledge and can reason in ways that will be quite familiar to most philosophers and logicians (e.g., knowledge-based agents deduce what actions should be taken to secure their goals). Part IV of the book outfits agents with the power to handle uncertainty by reasoning in probabilistic fashion. [ 19 ] In Part V, agents are given a capacity to learn. The following figure shows the overall structure of a learning agent.

agent that can learn interacting with environment

A Learning Agent

The final set of powers agents are given allow them to communicate. These powers are covered in Part VI.

Philosophers who patiently travel the entire progression of increasingly smart agents will no doubt ask, when reaching the end of Part VII, if anything is missing. Are we given enough, in general, to build an artificial person, or is there enough only to build a mere animal? This question is implicit in the following from Charniak and McDermott (1985):

The ultimate goal of AI (which we are very far from achieving) is to build a person, or, more humbly, an animal. (Charniak & McDermott 1985, 7)

To their credit, Russell & Norvig, in AIMA’s Chapter 27, “AI: Present and Future,” consider this question, at least to some degree. [ 20 ] They do so by considering some challenges to AI that have hitherto not been met. One of these challenges is described by R&N as follows:

[M]achine learning has made very little progress on the important problem of constructing new representations at levels of abstraction higher than the input vocabulary. In computer vision, for example, learning complex concepts such as Classroom and Cafeteria would be made unnecessarily difficult if the agent were forced to work from pixels as the input representation; instead, the agent needs to be able to form intermediate concepts first, such as Desk and Tray, without explicit human supervision. Similar concepts apply to learning behavior: HavingACupOfTea is a very important high-level step in many plans, but how does it get into an action library that initially contains much simpler actions such as RaiseArm and Swallow? Perhaps this will incorporate deep belief networks – Bayesian networks that have multiple layers of hidden variables, as in the work of Hinton et al. (2006), Hawkins and Blakeslee (2004), and Bengio and LeCun (2007). … Unless we understand such issues, we are faced with the daunting task of constructing large commonsense knowledge bases by hand, an approach that has not fared well to date. (Russell & Norvig 2009, Ch. 27.1)

While there have been some advances in addressing this challenge (in the form of deep learning or representation learning ), this specific challenge is actually merely a foothill before a range of dizzyingly high mountains that AI must eventually somehow manage to climb. One of those mountains, put simply, is reading . [ 21 ] Despite the fact that, as noted, Part V of AIMA is devoted to machine learning, AI, as it stands, offers next to nothing in the way of a mechanization of learning by reading. Yet when you think about it, reading is probably the dominant way you learn at this stage in your life. Consider what you’re doing at this very moment. It’s a good bet that you are reading this sentence because, earlier, you set yourself the goal of learning about the field of AI. Yet the formal models of learning provided in AIMA’s Part V (which are all and only the models at play in AI) cannot be applied to learning by reading. [ 22 ] These models all start with a function-based view of learning. According to this view, to learn is almost invariably to produce an underlying function \(\ff\) on the basis of a restricted set of pairs

\[ \left\{\left\langle x_1, \ff(x_1)\right\rangle, \left\langle x_2, \ff(x_2)\right\rangle, \ldots, \left\langle x_n, \ff(x_n)\right\rangle\right\}. \]

For example, consider receiving inputs consisting of 1, 2, 3, 4, and 5, and corresponding range values of 1, 4, 9, 16, and 25; the goal is to “learn” the underlying mapping from natural numbers to natural numbers. In this case, assume that the underlying function is \(n^2\), and that you do “learn” it. While this narrow model of learning can be productively applied to a number of processes, the process of reading isn’t one of them. Learning by reading cannot (at least for the foreseeable future) be modeled as divining a function that produces argument-value pairs. Instead, your reading about AI can pay dividends only if your knowledge has increased in the right way, and if that knowledge leaves you poised to be able to produce behavior taken to confirm sufficient mastery of the subject area in question. This behavior can range from correctly answering and justifying test questions regarding AI, to producing a robust, compelling presentation or paper that signals your achievement.
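The example can be rendered in a few lines; the use of numpy’s polynomial fitting as the “learner” is, of course, merely an illustrative stand-in for the formal models in question:

    import numpy as np

    # The restricted set of argument-value pairs from the example above.
    xs = np.array([1, 2, 3, 4, 5])
    ys = np.array([1, 4, 9, 16, 25])

    # "Learning" here is just fitting a degree-2 polynomial to the pairs.
    coeffs = np.polyfit(xs, ys, deg=2)      # approximately [1, 0, 0], i.e., f(n) = n^2
    learned = np.poly1d(coeffs)

    print(np.round(coeffs, 6))              # recovered coefficients
    print(learned(6))                       # prediction at an unseen point: ~36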

Two points deserve to be made about machine reading. First, it may not be clear to all readers that reading is an ability that is central to intelligence. The centrality derives from the fact that intelligence requires vast knowledge. We have no other means of getting systematic knowledge into a system than to get it in from text, whether text on the web, text in libraries, newspapers, and so on. You might even say that the big problem with AI has been that machines really don’t know much compared to humans. That can only be because of the fact that humans read (or hear: illiterate people can listen to text being uttered and learn that way). Either machines gain knowledge by humans manually encoding and inserting knowledge, or by reading and listening. These are brute facts. (We leave aside supernatural techniques, of course. Oddly enough, Turing didn’t: he seemed to think ESP should be discussed in connection with the powers of minds and machines. See Turing, 1950.) [ 23 ]

Now for the second point. Humans able to read have invariably also learned a language, and learning languages has been modeled in conformity to the function-based approach adumbrated just above (Osherson et al. 1986). However, this doesn’t entail that an artificial agent able to read, at least to a significant degree, must have really and truly learned a natural language. AI is first and foremost concerned with engineering computational artifacts that measure up to some test (where, yes, sometimes that test is from the human sphere), not with whether these artifacts process information in ways that match those present in the human case. It may or may not be necessary, when engineering a machine that can read, to imbue that machine with human-level linguistic competence. The issue is empirical, and as time unfolds, and the engineering is pursued, we shall no doubt see the issue settled.

Two additional high mountains facing AI are subjective consciousness and creativity, yet these are great challenges that the field has apparently not even come to grips with. Mental phenomena of paramount importance to many philosophers of mind and neuroscience are simply missing from AIMA. For example, consciousness is only mentioned in passing in AIMA, but subjective consciousness is the most important thing in our lives – indeed we only desire to go on living because we wish to go on enjoying subjective states of certain types. Moreover, if human minds are the product of evolution, then presumably phenomenal consciousness has great survival value, and would be of tremendous help to a robot intended to have at least the behavioral repertoire of the first creatures with brains that match our own (hunter-gatherers; see Pinker 1997). Of course, subjective consciousness is largely missing from the sister fields of cognitive psychology and computational cognitive modeling as well. We discuss some of these challenges in the Philosophy of Artificial Intelligence section below. For a list of similar challenges to cognitive science, see the relevant section of the entry on cognitive science . [ 24 ]

To some readers, it might seem in the very least tendentious to point to subjective consciousness as a major challenge to AI that it has yet to address. These readers might be of the view that pointing to this problem is to look at AI through a distinctively philosophical prism, and indeed a controversial philosophical standpoint.

But as its literature makes clear, AI measures itself by looking to animals and humans and picking out in them remarkable mental powers, and by then seeing if these powers can be mechanized. Arguably the power most important to humans (the capacity to experience) is nowhere to be found on the target list of most AI researchers. There may be a good reason for this (no formalism is at hand, perhaps), but there is no denying that the state of affairs in question obtains, and that, in light of how AI measures itself, it’s worrisome.

As to creativity, it’s quite remarkable that the power we most praise in human minds is nowhere to be found in AIMA . Just as in (Charniak & McDermott 1985) one cannot find ‘neural’ in the index, ‘creativity’ can’t be found in the index of AIMA . This is particularly odd because many AI researchers have in fact worked on creativity (especially those coming out of philosophy; e.g., Boden 1994, Bringsjord & Ferrucci 2000).

Although the focus has been on AIMA , any of its counterparts could have been used. As an example, consider Artificial Intelligence: A New Synthesis , by Nils Nilsson. As in the case of AIMA , everything here revolves around a gradual progression from the simplest of agents (in Nilsson’s case, reactive agents ), to ones having more and more of those powers that distinguish persons. Energetic readers can verify that there is a striking parallel between the main sections of Nilsson’s book and AIMA . In addition, Nilsson, like Russell and Norvig, ignores phenomenal consciousness, reading, and creativity. None of the three are even mentioned. Likewise, a recent comprehensive AI textbook by Luger (2008) follows the same pattern.

A final point to wrap up this section. It seems quite plausible to hold that there is a certain inevitability to the structure of an AI textbook, and the apparent reason is perhaps rather interesting. In personal conversation, Jim Hendler, a well-known AI researcher who is one of the main innovators behind the Semantic Web (Berners-Lee, Hendler, Lassila 2001), an under-development “AI-ready” version of the World Wide Web, has said that this inevitability can be rather easily displayed when teaching Introduction to AI; here’s how. Begin by asking students what they think AI is. Invariably, many students will volunteer that AI is the field devoted to building artificial creatures that are intelligent. Next, ask for examples of intelligent creatures. Students always respond by giving examples across a continuum: simple multi-cellular organisms, insects, rodents, lower mammals, higher mammals (culminating in the great apes), and finally human persons. When students are asked to describe the differences between the creatures they have cited, they end up essentially describing the progression from simple agents to ones having our (e.g.) communicative powers. This progression gives the skeleton of every comprehensive AI textbook. Why does this happen? The answer seems clear: it happens because we can’t resist conceiving of AI in terms of the powers of extant creatures with which we are familiar. At least at present, persons, and the creatures who enjoy only bits and pieces of personhood, are – to repeat – the measure of AI. [ 25 ]

Reasoning based on classical deductive logic is monotonic; that is, if \(\Phi\vdash\phi\), then for all \(\psi\), \(\Phi\cup \{\psi\}\vdash\phi\). Commonsense reasoning is not monotonic. While you may currently believe on the basis of reasoning that your house is still standing, if while at work you see on your computer screen that a vast tornado is moving through the location of your house, you will drop this belief. The addition of new information causes previous inferences to fail. In the simpler example that has become an AI staple, if I tell you that Tweety is a bird, you will infer that Tweety can fly, but if I then inform you that Tweety is a penguin, the inference evaporates, as well it should. Nonmonotonic (or defeasible) logic includes formalisms designed to capture the mechanisms underlying these kinds of examples. See the separate entry on logic and artificial intelligence , which is focused on nonmonotonic reasoning, and reasoning about time and change. It also provides a history of the early days of logic-based AI, making clear the contributions of those who founded the tradition (e.g., John McCarthy and Pat Hayes; see their seminal 1969 paper).
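The Tweety pattern can be conveyed with a toy sketch of a single default rule in Python (an illustration of defeasibility only, not of any of the formalisms surveyed in the entry just cited): the conclusion is licensed only so long as no known exception blocks it, so adding information can retract it.

    # A toy default rule: birds fly, unless the knowledge base contains a blocking
    # fact (e.g., being a penguin). Adding information can retract the conclusion.
    EXCEPTIONS = {"penguin", "ostrich", "has_broken_wing"}

    def flies(kb, individual):
        """Defeasibly conclude that `individual` flies."""
        facts = kb.get(individual, set())
        return "bird" in facts and not (facts & EXCEPTIONS)

    kb = {"tweety": {"bird"}}
    print(flies(kb, "tweety"))       # True: the default goes through

    kb["tweety"].add("penguin")      # new information arrives...
    print(flies(kb, "tweety"))       # False: the earlier inference evaporates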

The formalisms and techniques of logic-based AI have reached a level of impressive maturity – so much so that in various academic and corporate laboratories, implementations of these formalisms and techniques can be used to engineer robust, real-world software. It is strongly recommended that readers with an interest in learning where AI stands in these areas consult (Mueller 2006), which provides, in one volume, integrated coverage of nonmonotonic reasoning (in the form, specifically, of circumscription), and reasoning about time and change in the situation and event calculi. (The former calculus is also introduced by Thomason. In the second, timepoints are included, among other things.) The other nice thing about (Mueller 2006) is that the logic used is multi-sorted first-order logic (MSL), which has unificatory power that will be known to and appreciated by many technical philosophers and logicians (Manzano 1996).

We now turn to three further topics of importance in AI. They are:

  • The overarching scheme of logicist AI, in the context of the attempt to build intelligent artificial agents.
  • Common Logic and the intensifying quest for interoperability.
  • A technique that can be called encoding down , which can allow machines to reason efficiently over knowledge that, were it not encoded down, would, when reasoned over, lead to paralyzing inefficiency.

This trio is covered in order, beginning with the first.

Detailed accounts of logicist AI that fall under the agent-based scheme can be found in (Lenat 1983, Lenat & Guha 1990, Nilsson 1991, Bringsjord & Ferrucci 1998). [ 26 ] The core idea is that an intelligent agent receives percepts from the external world in the form of formulae in some logical system (e.g., first-order logic), and infers, on the basis of these percepts and its knowledge base, what actions should be performed to secure the agent’s goals. (This is of course a barbaric simplification. Information from the external world is encoded in formulae, and transducers to accomplish this feat may be components of the agent.)

To clarify things a bit, we consider, briefly, the logicist view in connection with arbitrary logical systems \(\mathcal{L}_{X}\). [ 27 ] We obtain a particular logical system by setting \(X\) in the appropriate way. Some examples: If \(X=I\), then we have a system at the level of FOL [following the standard notation from model theory; see e.g. (Ebbinghaus et al. 1984)]. \(\mathcal{L}_{II}\) is second-order logic, and \(\mathcal{L}_{\omega_1\omega}\) is a “small system” of infinitary logic (countably infinite conjunctions and disjunctions are permitted). These logical systems are all extensional, but there are intensional ones as well. For example, we can have logical systems corresponding to those seen in standard propositional modal logic (Chellas 1980). One possibility, familiar to many philosophers, would be propositional KT45, or \(\mathcal{L}_{KT45}\). [ 28 ] In each case, the system in question includes a relevant alphabet from which well-formed formulae are constructed by way of a formal grammar, a reasoning (or proof) theory, a formal semantics, and at least some meta-theoretical results (soundness, completeness, etc.). Taking off from standard notation, we can thus say that a set of formulas in some particular logical system \(\mathcal{L}_X\), \(\Phi_{\mathcal{L}_X}\), can be used, in conjunction with some reasoning theory, to infer some particular formula \(\phi_{\mathcal{L}_X}\). (The reasoning may be deductive, inductive, abductive, and so on. Logicist AI isn’t in the least restricted to any particular mode of reasoning.) To say that such a situation holds, we write \[ \Phi_{\mathcal{L}_X} \vdash_{\mathcal{L}_X} \phi_{\mathcal{L}_X} \]

When the logical system referred to is clear from context, or when we don’t care about which logical system is involved, we can simply write \[ \Phi \vdash \phi \]

Each logical system, in its formal semantics, will include objects designed to represent ways the world pointed to by formulae in this system can be. Let these ways be denoted by \(W^i_{{\mathcal{L}_X}}\). When we aren’t concerned with which logical system is involved, we can simply write \(W^i\). To say that such a way models a formula \(\phi\) we write \[ W^i \models \phi \]

We extend this to a set of formulas in the natural way: \(W^i\models\Phi\) means that all the elements of \(\Phi\) are true on \(W^i\). Now, using the simple machinery we’ve established, we can describe, in broad strokes, the life of an intelligent agent that conforms to the logicist point of view. This life conforms to the basic cycle that undergirds intelligent agents in the AIMA sense.

To begin, we assume that the human designer, after studying the world, uses the language of a particular logical system to give to our agent an initial set of beliefs \(\Delta_0\) about what this world is like. In doing so, the designer works with a formal model of this world, \(W\), and ensures that \(W\models\Delta_0\). Following tradition, we refer to \(\Delta_0\) as the agent’s (starting) knowledge base. (This terminology, given that we are talking about the agent’s beliefs, is known to be peculiar, but it persists.) Next, the agent ADJUSTS its knowledge base to produce a new one, \(\Delta_1\). We say that adjustment is carried out by way of an operation \(\mathcal{A}\); so \(\mathcal{A}[\Delta_0]=\Delta_1\). How does the adjustment process, \(\mathcal{A}\), work? There are many possibilities. Unfortunately, many believe that the simplest possibility (viz., \(\mathcal{A}[\Delta_i]\) equals the set of all formulas that can be deduced in some elementary manner from \(\Delta_i\)) exhausts all the possibilities. The reality is that adjustment, as indicated above, can come by way of any mode of reasoning – induction, abduction, and yes, various forms of deduction corresponding to the logical system in play. For present purposes, it’s not important that we carefully enumerate all the options.

The cycle continues when the agent ACTS on the environment, in an attempt to secure its goals. Acting, of course, can cause changes to the environment. At this point, the agent SENSES the environment, and this new information \(\Gamma_1\) factors into the process of adjustment, so that \(\mathcal{A}[\Delta_1\cup\Gamma_1]=\Delta_2\). The cycle of SENSES \(\Rightarrow\) ADJUSTS \(\Rightarrow\) ACTS continues to produce the life \(\Delta_0,\Delta_1,\Delta_2,\Delta_3,\ldots\) of our agent.
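For concreteness, here is a deliberately simple Python rendering of this cycle, in which formulas are represented as strings, ADJUST is plain forward chaining over two hard-coded rules, and SENSE and ACT are stubs; all of these simplifications are ours, made only to exhibit the shape of the loop.

    # A minimal logicist agent cycle: SENSE -> ADJUST -> ACT. Formulas are just
    # strings; rules are (premises, conclusion) pairs; ADJUST is forward chaining.
    RULES = [({"bird(tweety)"}, "flies(tweety)"),
             ({"flies(tweety)"}, "can_leave(tweety)")]

    def adjust(kb):
        """Close the knowledge base under the rules (a very simple A operator)."""
        kb = set(kb)
        changed = True
        while changed:
            changed = False
            for premises, conclusion in RULES:
                if premises <= kb and conclusion not in kb:
                    kb.add(conclusion)
                    changed = True
        return kb

    def act(kb):
        """Pick an action licensed by the current beliefs (a stub policy)."""
        return "open_cage" if "can_leave(tweety)" in kb else "wait"

    def sense():
        """Stand-in for transducers delivering new percept formulas."""
        return {"bird(tweety)"}

    delta = set()                     # Delta_0: the initial knowledge base
    for _ in range(3):                # a short slice of the agent's "life"
        delta = adjust(delta | sense())
        print(act(delta), sorted(delta))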

It may strike you as preposterous that logicist AI be touted as an approach taken to replicate all of cognition. Reasoning over formulae in some logical system might be appropriate for computationally capturing high-level tasks like trying to solve a math problem (or devising an outline for an entry in the Stanford Encyclopedia of Philosophy), but how could such reasoning apply to tasks like those a hawk tackles when swooping down to capture scurrying prey? In the human sphere, the task successfully negotiated by athletes would seem to be in the same category. Surely, some will declare, an outfielder chasing down a fly ball doesn’t prove theorems to figure out how to pull off a diving catch to save the game! Two brutally reductionistic arguments can be given in support of this “logicist theory of everything” approach towards cognition. The first stems from the fact that a complete proof calculus for just first-order logic can simulate all of Turing-level computation (Chapter 11, Boolos et al. 2007). The second justification comes from the role logic plays in foundational theories of mathematics and mathematical reasoning. Not only are foundational theories of mathematics cast in logic (Potter 2004), but there have been successful projects resulting in machine verification of ordinary non-trivial theorems, e.g., in the Mizar project alone around 50,000 theorems have been verified (Naumowicz and Kornilowicz 2009). The argument goes that if any approach to AI can be cast mathematically, then it can be cast in a logicist form.

Needless to say, such a declaration has been carefully considered by logicists beyond the reductionistic argument given above. For example, Rosenschein and Kaelbling (1986) describe a method in which logic is used to specify finite state machines. These machines are used at “run time” for rapid, reactive processing. In this approach, though the finite state machines contain no logic in the traditional sense, they are produced by logic and inference. Real robot control via first-order theorem proving has been demonstrated by Amir and Maynard-Reid (1999, 2000, 2001). In fact, you can download version 2.0 of the software that makes this approach real for a Nomad 200 mobile robot in an office environment. Of course, negotiating an office environment is a far cry from the rapid adjustments an outfielder for the Yankees routinely puts on display, but certainly it’s an open question as to whether future machines will be able to mimic such feats through rapid reasoning. The question is open if for no other reason than that all must concede that the constant increase in reasoning speed of first-order theorem provers is breathtaking. (For up-to-date news on this increase, visit and monitor the TPTP site .) There is no known reason why the software engineering in question cannot continue to produce speed gains that would eventually allow an artificial creature to catch a fly ball by processing information in purely logicist fashion.

Now we come to the second topic related to logicist AI that warrants mention herein: common logic and the intensifying quest for interoperability between logic-based systems using different logics. Only a few brief comments are offered. [ 29 ] Readers wanting more can explore the links provided in the course of the summary.

One standardization is through what is known as Common Logic (CL), and variants thereof. (CL is published as an ISO standard – ISO is the International Organization for Standardization.) Philosophers interested in logic, and of course logicians, will find CL to be quite fascinating. From an historical perspective, the advent of CL is interesting in no small part because the person spearheading it is none other than Pat Hayes, the same Hayes who, as we have seen, worked with McCarthy to establish logicist AI in the 1960s. (Though Hayes was not at the original 1956 Dartmouth conference, he certainly must be regarded as one of the founders of contemporary AI.) One of the interesting things about CL, at least as we see it, is that it signifies a trend toward the marriage of logics, and programming languages and environments. Another system that is a logic/programming hybrid is Athena , which can be used as a programming language, and is at the same time a form of MSL. Athena is based on formal systems known as denotational proof languages (Arkoudas 2000).

How is interoperability between two systems to be enabled by CL? Suppose one of these systems is based on logic \(L\), and the other on \(L'\). (To ease exposition, assume that both logics are first-order.) The idea is that a theory \(\Phi_L\), that is, a set of formulae in \(L\), can be translated into CL, producing \(\Phi_{CL}\), and then this theory can be translated into \(\Phi_{L'}\). CL thus becomes an interlingua . Note that what counts as a well-formed formula in \(L\) can be different from what counts as one in \(L'\). The two logics might also have different proof theories. For example, inference in \(L\) might be based on resolution, while inference in \(L'\) is of the natural deduction variety. Finally, the symbol sets will be different. Despite these differences, courtesy of the translations, desired behavior can be produced across the translation. That, at any rate, is the hope. The technical challenges here are immense, but federal monies are increasingly available for attacks on the problem of interoperability.
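The interlingua idea can be sketched crudely: parse a formula from one concrete syntax into a shared internal representation (playing, in a drastically simplified way, the role CL plays), and then emit it in another syntax. The two toy syntaxes below are our own inventions; real CL translation is of course far more involved.

    # Toy interlingua: formulas are stored as nested tuples such as ("and", p, q).
    # System L writes conjunction as "p & q"; system L' writes it as "(and p q)".

    def parse_L(text):
        """Parse the infix syntax of L (only '&' over two atoms, for illustration)."""
        left, right = (s.strip() for s in text.split("&"))
        return ("and", left, right)

    def emit_Lprime(formula):
        """Emit the prefix syntax of L'."""
        op, left, right = formula
        return f"({op} {left} {right})"

    common = parse_L("Bird(tweety) & Penguin(tweety)")   # into the interlingua
    print(emit_Lprime(common))                           # (and Bird(tweety) Penguin(tweety))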

Now for the third topic in this section: what can be called encoding down . The technique is easy to understand. Suppose that we have on hand a set \(\Phi\) of first-order axioms. As is well-known, the problem of deciding, for arbitrary formula \(\phi\), whether or not it’s deducible from \(\Phi\) is Turing-undecidable: there is no Turing machine or equivalent that can correctly return “Yes” or “No” in the general case. However, if the domain in question is finite, we can encode this problem down to the propositional calculus. An assertion that all things have \(F\) is of course equivalent to the conjunction \(Fa \wedge Fb \wedge Fc\), as long as the domain contains only these three objects. So here a first-order quantified formula becomes a conjunction in the propositional calculus. Determining whether such conjunctions are provable from axioms themselves expressed in the propositional calculus is Turing-decidable, and in addition, in certain clusters of cases, the check can be done very quickly in the propositional case; very quickly. Readers interested in encoding down to the propositional calculus should consult recent DARPA-sponsored work by Bart Selman. Please note that the target of encoding down doesn’t need to be the propositional calculus. Because it’s generally harder for machines to find proofs in an intensional logic than in straight first-order logic, it is often expedient to encode down the former to the latter. For example, propositional modal logic can be encoded in multi-sorted logic (a variant of FOL); see (Arkoudas & Bringsjord 2005). Prominent usage of such an encoding down can be found in a set of systems known as Description Logics, which are a set of logics less expressive than first-order logic but more expressive than propositional logic (Baader et al. 2003). Description logics are used to reason about ontologies in a given domain and have been successfully used, for example, in the biomedical domain (Smith et al. 2007).
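Returning to the three-object example above, the grounding step itself is mechanical, as the following Python sketch shows (the string representation of formulas is purely illustrative):

    from itertools import product

    DOMAIN = ["a", "b", "c"]

    def ground_universal(predicate, arity=1, domain=DOMAIN):
        """Encode 'for all x1..xk, P(x1,..,xk)' down to a propositional conjunction."""
        tuples = product(domain, repeat=arity)
        return " & ".join(f"{predicate}({', '.join(t)})" for t in tuples)

    print(ground_universal("F"))              # F(a) & F(b) & F(c)
    print(ground_universal("R", arity=2))     # R(a, a) & R(a, b) & ... & R(c, c)

Each grounded atom can then be treated as a propositional variable and handed to a propositional (e.g., SAT-based) reasoner.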

It’s tempting to define non-logicist AI by negation: an approach to building intelligent agents that rejects the distinguishing features of logicist AI. Such a shortcut would imply that the agents engineered by non-logicist AI researchers and developers, whatever the virtues of such agents might be, cannot be said to know that \(\phi\) – for the simple reason that, by negation, the non-logicist paradigm would have not even a single declarative proposition that is a candidate for \(\phi\). However, this isn’t a particularly enlightening way to define non-symbolic AI. A more productive approach is to say that non-symbolic AI is AI carried out on the basis of particular formalisms other than logical systems, and to then enumerate those formalisms. It will turn out, of course, that these formalisms fail to include knowledge in the normal sense. (In philosophy, as is well-known, the normal sense is one according to which if \(p\) is known, \(p\) is a declarative statement.)

From the standpoint of formalisms other than logical systems, non-logicist AI can be partitioned into symbolic but non-logicist approaches, and connectionist/neurocomputational approaches. (AI carried out on the basis of symbolic, declarative structures that, for readability and ease of use, are not treated directly by researchers as elements of formal logics, does not count. In this category fall traditional semantic networks, Schank’s (1972) conceptual dependency scheme, frame-based schemes, and other such schemes.) The former approaches, today, are probabilistic, and are based on the formalisms (Bayesian networks) covered below . The latter approaches are based, as we have noted, on formalisms that can be broadly termed “neurocomputational.” Given our space constraints, only one of the formalisms in this category is described here (and briefly at that): the aforementioned artificial neural networks. [ 30 ] Though artificial neural networks, with an appropriate architecture, could be used for arbitrary computation, they are almost exclusively used for building learning systems.

Neural nets are composed of units or nodes designed to represent neurons, which are connected by links designed to represent dendrites, each of which has a numeric weight .

sum of weighted inputs passed to an activation function that generates output

A “Neuron” Within an Artificial Neural Network (from AIMA3e)

It is usually assumed that some of the units work in symbiosis with the external environment; these units form the sets of input and output units. Each unit has a current activation level , which is its output, and can compute, based on its inputs and weights on those inputs, its activation level at the next moment in time. This computation is entirely local: a unit takes account of but its neighbors in the net. This local computation is calculated in two stages. First, the input function , \(in_i\), gives the weighted sum of the unit’s input values, that is, the sum of the input activations multiplied by their weights:

\[ in_i = \sum_{j} w_{j,i}\, a_j \]

In the second stage, the activation function , \(g\), takes the input from the first stage as argument and generates the output, or activation level, \(a_i\):

\[ a_i = g(in_i) = g\left(\sum_{j} w_{j,i}\, a_j\right) \]

One common (and confessedly elementary) choice for the activation function (which usually governs all units in a given net) is the step function, which usually has a threshold \(t\) that sees to it that a 1 is output when the input is greater than \(t\), and that 0 is output otherwise. This is supposed to be “brain-like” to some degree, given that 1 represents the firing of a pulse from a neuron through an axon, and 0 represents no firing. A simple three-layer neural net is shown in the following picture.

neural network with 3 layers

A Simple Three-Layer Artificial Neural Network (from AIMA3e)
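A single unit of the sort just described, together with a forward pass through a tiny feed-forward net built from such units, can be written in a few lines of Python; the weights below are arbitrary illustrative numbers, and the step activation is the elementary choice mentioned above:

    import numpy as np

    def step(x, threshold=0.0):
        """Elementary step activation: fire (1) iff the weighted input exceeds t."""
        return (x > threshold).astype(float)

    def layer(activations, weights):
        """in_i = sum_j w_ji * a_j for every unit i, then apply the activation g."""
        return step(activations @ weights)

    inputs = np.array([1.0, 0.0, 1.0])            # activations of the input units
    w_hidden = np.array([[ 0.5, -0.6],            # 3 inputs -> 2 hidden units
                         [ 0.1,  0.9],
                         [ 0.4,  0.2]])
    w_output = np.array([[ 1.0],                  # 2 hidden units -> 1 output unit
                         [-1.0]])

    hidden = layer(inputs, w_hidden)
    output = layer(hidden, w_output)
    print(hidden, output)                         # e.g., [1. 0.] [1.]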

As you might imagine, there are many different kinds of neural networks. The main distinction is between feed-forward and recurrent networks. In feed-forward networks like the one pictured immediately above, as their name suggests, links move information in one direction, and there are no cycles; recurrent networks allow for cycling back, and can become rather complicated. For a more detailed presentation, see the

Supplement on Neural Nets .

Neural networks were fundamentally plagued by the fact that while they are simple and have theoretically efficient learning algorithms, when they are multi-layered (and thus sufficiently expressive to represent non-linear functions) they were very hard to train in practice. This changed in the mid 2000s with the advent of methods that exploit state-of-the-art hardware better (Rajat et al. 2009). The backpropagation method for training multi-layered neural networks can be translated into a sequence of repeated simple arithmetic operations on a large set of numbers. The general trend in computing hardware has favored algorithms that are able to do a large number of simple operations that are not that dependent on each other, versus a small number of complex and intricate operations.

Another key recent observation is that deep neural networks can be pre-trained first in an unsupervised phase where they are just fed data without any labels for the data. Each hidden layer is forced to represent the outputs of the layer below. The outcome of this training is a series of layers which represent the input domain with increasing levels of abstraction. For example, if we pre-train the network with images of faces, we would get a first layer which is good at detecting edges in images, a second layer which can combine edges to form facial features such as eyes, noses etc., a third layer which responds to groups of features, and so on (LeCun et al. 2015).

Perhaps the best technique for teaching students about neural networks in the context of other statistical learning formalisms and methods is to focus on a specific problem, preferably one that seems unnatural to tackle using logicist techniques. The task is then to seek to engineer a solution to the problem, using any and all techniques available. One nice problem is handwriting recognition (which also happens to have a rich philosophical dimension; see e.g. Hofstadter & McGraw 1995). For example, consider the problem of assigning, given as input a handwritten digit \(d\), the correct digit, 0 through 9. Because there is a database of 60,000 labeled digits available to researchers (from the National Institute of Standards and Technology), this problem has evolved into a benchmark problem for comparing learning algorithms. It turns out that neural networks currently reign as the best approach to the problem according to a recent ranking by Benenson (2016).
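As a rough illustration of how the benchmark is approached with off-the-shelf tools, the following scikit-learn sketch trains a small multi-layer network on the MNIST digits; the particular hyperparameters are arbitrary, and fetching the dataset assumes network access to OpenML:

    from sklearn.datasets import fetch_openml
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    # MNIST: 70,000 28x28 grayscale digits, flattened to 784-dimensional vectors.
    X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
    X = X / 255.0                                   # scale pixel values to [0, 1]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=10000)

    # A small multi-layer perceptron; hyperparameters chosen only for illustration.
    clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=20)
    clf.fit(X_train, y_train)
    print("test accuracy:", clf.score(X_test, y_test))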

Readers interested in AI (and computational cognitive science) pursued from an overtly brain-based orientation are encouraged to explore the work of Rick Granger (2004a, 2004b) and researchers in his Brain Engineering Laboratory and W. H. Neukom Institute for Computational Sciences . The contrast between the “dry”, logicist AI started at the original 1956 conference, and the approach taken here by Granger and associates (in which brain circuitry is directly modeled) is remarkable. For those interested in computational properties of neural networks, Hornik et al. (1989) address the general representation capability of neural networks independent of learning.

At this point the reader has been exposed to the chief formalisms in AI, and may wonder about heterogeneous approaches that bridge them. Is there such research and development in AI? Yes. From an engineering standpoint, such work makes irresistibly good sense. There is now an understanding that, in order to build applications that get the job done, one should choose from a toolbox that includes logicist, probabilistic/Bayesian, and neurocomputational techniques. Given that the original top-down logicist paradigm is alive and thriving (e.g., see Brachman & Levesque 2004, Mueller 2006), and that, as noted, a resurgence of Bayesian and neurocomputational approaches has placed these two paradigms on solid, fertile footing as well, AI now moves forward, armed with this fundamental triad, and it is a virtual certainty that applications (e.g., robots) will be engineered by drawing from elements of all three. Watson’s DeepQA architecture is one recent example of an engineering system that leverages multiple paradigms. For a detailed discussion, see the

Supplement on Watson’s DeepQA Architecture .

Google DeepMind’s AlphaGo is another example of a multi-paradigm system, although in a much narrower form than Watson. The central algorithmic problem in games such as Go or Chess is to search through a vast sequence of valid moves. For most non-trivial games, it is not feasible to do this exhaustively. The Monte Carlo tree search (MCTS) algorithm gets around this obstacle by searching through an enormous space of valid moves in a statistical fashion (Browne et al. 2012). While MCTS is the central algorithm in AlphaGo, there are two neural networks which help evaluate states in the game and help model how expert opponents play (Silver et al. 2016). It should be noted that MCTS is behind almost all the winning submissions in general game playing (Finnsson 2012).
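The statistical flavor of MCTS lies largely in the rule it uses to select which move to explore next while descending the tree, most commonly UCB1 (Browne et al. 2012); a bare-bones Python sketch of that selection rule, with invented node statistics, looks like this:

    import math

    def ucb1(child_value_sum, child_visits, parent_visits, c=1.41):
        """UCB1 score: average value plus an exploration bonus for rarely tried moves."""
        if child_visits == 0:
            return float("inf")                 # always try untried moves first
        exploit = child_value_sum / child_visits
        explore = c * math.sqrt(math.log(parent_visits) / child_visits)
        return exploit + explore

    def select_move(stats, parent_visits):
        """Pick the child (move) with the highest UCB1 score.

        `stats` maps each move to a (value_sum, visits) pair accumulated from
        earlier random playouts; the numbers below are invented for illustration.
        """
        return max(stats, key=lambda m: ucb1(*stats[m], parent_visits))

    stats = {"move_a": (7.0, 10), "move_b": (3.0, 4), "move_c": (0.0, 0)}
    print(select_move(stats, parent_visits=14))   # move_c: unexplored, so chosen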

What, though, about deep, theoretical integration of the main paradigms in AI? Such integration is at present only a possibility for the future, but readers are directed to the research of some striving for such integration. For example: Sun (1994, 2002) has been working to demonstrate that human cognition that is on its face symbolic in nature (e.g., professional philosophizing in the analytic tradition, which deals explicitly with arguments and definitions carefully symbolized) can arise from cognition that is neurocomputational in nature. Koller (1997) has investigated the marriage between probability theory and logic. And, in general, the very recent arrival of so-called human-level AI is being led by theorists seeking to genuinely integrate the three paradigms set out above (e.g., Cassimatis 2006).

Finally, we note that cognitive architectures such as Soar (Laird 2012) and PolyScheme (Cassimatis 2006) are another area where integration of different fields of AI can be found. For example, one such endeavor striving to build human-level AI is the Companions project (Forbus and Hinrichs 2006). Companions are long-lived systems that strive to be human-level AI systems that function as collaborators with humans. The Companions architecture tries to solve multiple AI problems such as reasoning and learning, interactivity, and longevity in one unifying system.

4. The Explosive Growth of AI

As we noted above, work on AI has mushroomed over the past couple of decades. Now that we have looked a bit at the content that composes AI, we take a quick look at the explosive growth of AI.

First, a point of clarification. The growth of which we speak is not a shallow sort correlated with amount of funding provided for a given sub-field of AI. That kind of thing happens all the time in all fields, and can be triggered by entirely political and financial changes designed to grow certain areas, and diminish others. Along the same line, the growth of which we speak is not correlated with the amount of industrial activity revolving around AI (or a sub-field thereof); for this sort of growth too can be driven by forces quite outside an expansion in the scientific breadth of AI. [ 31 ] Rather, we are speaking of an explosion of deep content : new material which someone intending to be conversant with the field needs to know. Relative to other fields, the size of the explosion may or may not be unprecedented. (Though it should perhaps be noted that an analogous increase in philosophy would be marked by the development of entirely new formalisms for reasoning, reflected in the fact that, say, longstanding philosophy textbooks like Copi’s (2004) Introduction to Logic are dramatically rewritten and enlarged to include these formalisms, rather than remaining anchored to essentially immutable core formalisms, with incremental refinement around the edges through the years.) But it certainly appears to be quite remarkable, and is worth taking note of here, if for no other reason than that AI’s near-future will revolve in significant part around whether or not the new content in question forms a foundation for new long-lived research and development that would not otherwise obtain. [ 32 ]

AI has also witnessed an explosion in its usage in various artifacts and applications. While we are nowhere near building a machine with capabilities of a human or one that acts rationally in all scenarios according to the Russell/Hutter definition above, algorithms that have their origins in AI research are now widely deployed for many tasks in a variety of domains.

A huge part of AI’s growth in applications has been made possible through invention of new algorithms in the subfield of machine learning . Machine learning is concerned with building systems that improve their performance on a task when given examples of ideal performance on the task, or improve their performance with repeated experience on the task. Algorithms from machine learning have been used in speech recognition systems, spam filters, online fraud-detection systems, product-recommendation systems, etc. The current state-of-the-art in machine learning can be divided into three areas (Murphy 2013, Alpaydin 2014):

  • Supervised Learning: A form of learning in which a computer tries to learn a function \(\ff\) given examples, the training data \(T\), of its values at various points in its domain \[ T=\left\{\left\langle x_1, \ff(x_1)\right\rangle,\left\langle x_2, \ff(x_2)\right\rangle, \ldots, \left\langle x_n, \ff(x_n)\right\rangle\right\}. \] A sample task would be trying to label images of faces with a person’s name. The supervision in supervised learning comes in the form of the value of the function \(\ff(x)\) at various points \(x\) in some part of the domain of the function. This is usually given in the form of a fixed set of input and output pairs for the function. Let \(\hh\) be the “learned function.” The goal of supervised learning is to have \(\hh\) match as closely as possible the true function \(\ff\) over the same domain. The error is usually defined in terms of an error function, for instance, \(error = \sum_{x\in T} \delta(\ff(x) - \hh(x))\), over the training data \(T\). Other forms of supervision and goals for learning are possible. For example, in active learning the learning algorithm can request the value of the function for arbitrary inputs. Supervised learning dominates the field of machine learning and has been used in almost all practical applications mentioned just above.
  • Unsupervised Learning : Here the machine tries to find useful knowledge or information when given some raw data \(\left\{ x_1,x_2, \ldots, x_n \right\}\). There is no function associated with the input that has to be learned. The idea is that the machine helps uncover interesting patterns or information that could be hidden in the data. One use of unsupervised learning is data mining , where large volumes of data are searched for interesting information. PageRank , one of the earliest algorithms used by the Google search engine, can be considered to be an unsupervised learning system that ranks pages without any human supervision (Chapter 14.10, Hastie et al. 2009).
  • Reinforcement Learning: Here a machine is set loose in an environment where it constantly acts and perceives (similar to the Russell/Hutter view above) and only occasionally receives feedback on its behavior in the form of rewards or punishments. The machine has to learn to behave rationally from this feedback. One use of reinforcement learning has been in building agents to play computer games. The objective here is to build agents that map sensory data from the game at every time instant to an action that would help win in the game or maximize a human player’s enjoyment of the game. In most games, we know how well we are playing only at the end of the game or only at infrequent intervals throughout the game (e.g., a chess game that we feel we are winning could quickly turn against us at the end). In supervised learning, the training data has ideal input-output pairs. This form of learning is not suitable for building agents that have to operate across a length of time and are judged not on one action but a series of actions and their effects on the environment. The field of Reinforcement Learning tries to tackle this problem through a variety of methods. Though a bit dated, Sutton and Barto (1998) provide a comprehensive introduction to the field. (A minimal illustrative sketch of this kind of learning appears just after this list.)
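As promised in the reinforcement-learning item above, here is a minimal tabular Q-learning sketch on a made-up three-state chain environment; every detail (the environment, the rewards, the hyperparameters) is invented purely to show the shape of learning from occasional reward:

    import random

    # A made-up "chain" environment with three states; taking "right" in state 2
    # yields a reward of 1 and restarts the episode at state 0.
    ACTIONS = ["left", "right"]

    def env_step(state, action):
        """Return (next_state, reward) for the toy chain environment."""
        if state == 2 and action == "right":
            return 0, 1.0
        nxt = min(state + 1, 2) if action == "right" else max(state - 1, 0)
        return nxt, 0.0

    Q = {(s, a): 0.0 for s in range(3) for a in ACTIONS}
    alpha, gamma, epsilon = 0.1, 0.9, 0.2   # learning rate, discount, exploration

    state = 0
    for _ in range(10000):
        if random.random() < epsilon:
            action = random.choice(ACTIONS)                       # explore
        else:                                                     # exploit, random tie-break
            action = max(ACTIONS, key=lambda a: (Q[(state, a)], random.random()))
        nxt, reward = env_step(state, action)
        target = reward + gamma * max(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (target - Q[(state, action)])
        state = nxt

    # The greedy policy extracted from Q; it should settle on "right" in each state.
    print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(3)})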

In addition to being used in domains that are traditionally the ken of AI, machine-learning algorithms have also been used in all stages of the scientific process. For example, machine-learning techniques are now routinely applied to analyze large volumes of data generated from particle accelerators. CERN, for instance, generates a petabyte (\(10^{15}\) bytes) per second, and statistical algorithms that have their origins in AI are used to filter and analyze this data. Particle accelerators are used in fundamental experimental research in physics to probe the structure of our physical universe. They work by colliding larger particles together to create much finer particles. Not all such events are fruitful. Machine-learning methods have been used to select events which are then analyzed further (Whiteson & Whiteson 2009 and Baldi et al. 2014). More recently, researchers at CERN launched a machine learning competition to aid in the analysis of the Higgs Boson. The goal of this challenge was to develop algorithms that separate meaningful events from background noise given data from the Large Hadron Collider, a particle accelerator at CERN.

In the past few decades, there has been an explosion in data that does not have any explicit semantics attached to it. This data is generated by both humans and machines. Most of this data is not easily machine-processable; for example, images, text, and video (as opposed to carefully curated data in a knowledge-base or database). This has given rise to a huge industry that applies AI techniques to extract usable information from such enormous volumes of data. This field of applying techniques derived from AI to large volumes of data goes by names such as “data mining,” “big data,” “analytics,” etc. This field is too vast to even moderately cover in the present article, but we note that there is no full agreement on what constitutes such a “big-data” problem. One definition, from Madden (2012), is that big data differs from traditional machine-processable data in that it is too big (for most of the existing state-of-the-art hardware), too quick (generated at a fast rate, e.g. online email transactions), or too hard. It is in the too-hard part that AI techniques work quite well. While this universe is quite varied, we use the Watson system later in this article as an AI-relevant exemplar. As we will see later, while most of this new explosion is powered by learning, it isn’t entirely limited to just learning. This bloom in learning algorithms has been supported by both a resurgence in neurocomputational techniques and probabilistic techniques.

One of the remarkable aspects of (Charniak & McDermott 1985) is this: The authors say the central dogma of AI is that “What the brain does may be thought of at some level as a kind of computation” (p. 6). And yet nowhere in the book is brain-like computation discussed. In fact, you will search the index in vain for the term ‘neural’ and its variants. Please note that the authors are not to blame for this. A large part of AI’s growth has come from formalisms, tools, and techniques that are, in some sense, brain-based, not logic-based. A paper that conveys the importance and maturity of neurocomputation is (Litt et al. 2006). (Growth has also come from a return of probabilistic techniques that had withered by the mid-70s and 80s. More about that momentarily, in the next “resurgence” section .)

One very prominent class of non-logicist formalism does make an explicit nod in the direction of the brain: viz., artificial neural networks (or as they are often simply called, neural networks, or even just neural nets). (The structure of neural networks and more recent developments are discussed above.) Because Minsky and Papert’s (1969) Perceptrons led many (including, specifically, many sponsors of AI research and development) to conclude that neural networks didn’t have sufficient information-processing power to model human cognition, the formalism was pretty much universally dropped from AI. However, Minsky and Papert had only considered very limited neural networks. Connectionism, the view that intelligence consists not in symbolic processing, but rather in non-symbolic processing at least somewhat like what we find in the brain (at least at the cellular level), approximated specifically by artificial neural networks, came roaring back in the early 1980s on the strength of more sophisticated forms of such networks, and soon the situation was (to use a metaphor introduced by John McCarthy) that of two horses in a race toward building truly intelligent agents.

If one had to pick a year at which connectionism was resurrected, it would certainly be 1986, the year Parallel Distributed Processing (Rumelhart & McClelland 1986) appeared in print. The rebirth of connectionism was specifically fueled by the back-propagation (backpropagation) algorithm over neural networks, nicely covered in Chapter 20 of AIMA . The symbolicist/connectionist race led to a spate of lively debate in the literature (e.g., Smolensky 1988, Bringsjord 1991), and some AI engineers have explicitly championed a methodology marked by a rejection of knowledge representation and reasoning. For example, Rodney Brooks was such an engineer; he wrote the well-known “Intelligence Without Representation” (1991), and his Cog Project, to which we referred above, is arguably an incarnation of the premeditatedly non-logicist approach. Increasingly, however, those in the business of building sophisticated systems find that both logicist and more neurocomputational techniques are required (Wermter & Sun 2001). [ 33 ] In addition, the neurocomputational paradigm today includes connectionism only as a proper part, in light of the fact that some of those working on building intelligent systems strive to do so by engineering brain-based computation outside the neural network-based approach (e.g., Granger 2004a, 2004b).

Another recent resurgence in neurocomputational techniques has occurred in machine learning. The modus operandi in machine learning is that, given a problem, say recognizing handwritten digits \(\{0,1,\ldots,9\}\) or faces from a 2D matrix representing an image of the digits or faces, a machine-learning expert or a domain expert would construct a feature vector representation function for the task. This function is a transformation of the input into a format that tries to throw away irrelevant information in the input and keep only information useful for the task. Inputs transformed by this function, denoted \(\rr\) below, are termed features. For recognizing faces, irrelevant information could be the amount of lighting in the scene and relevant information could be information about facial features. The machine is then fed a sequence of inputs represented by the features and the ideal or ground-truth output values for those inputs. This converts the learning challenge from that of having to learn the function \(\ff\) from the examples \(\left\{\left\langle x_1, \ff(x_1)\right\rangle,\left\langle x_2, \ff(x_2)\right\rangle, \ldots, \left\langle x_n, \ff(x_n)\right\rangle \right\}\) to having to learn from possibly easier data: \(\left\{\left\langle \rr(x_1), \ff(x_1)\right\rangle,\left\langle \rr(x_2), \ff(x_2)\right\rangle, \ldots, \left\langle \rr(x_n), \ff(x_n)\right\rangle \right\}\). Here \(\rr\) is the function that computes the feature vector representation of the input. Formally, \(\ff\) is assumed to be a composition of the functions \(\gg\) and \(\rr\). That is, for any input \(x\), \(\ff(x) = \gg\left(\rr\left(x\right)\right)\). This is denoted by \(\ff=\gg\circ \rr\). For any input, the features are first computed, and then the function \(\gg\) is applied. If the feature representation \(\rr\) is provided by the domain expert, the learning problem becomes simpler to the extent that the feature representation absorbs the difficulty of the task. At one extreme, the feature vector could already contain an easily extractable form of the answer; at the other extreme, the feature representation could be just the plain input.
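The decomposition \(\ff=\gg\circ \rr\) can be displayed in a few lines of code. The sketch below is a toy, assumption-laden illustration: the two features, the threshold, and the “face”/“background” labels are invented for the example, and \(\gg\) is hand-coded here, whereas in practice it is the part that gets learned.

```python
# Illustrative decomposition ff = gg o rr.

def r(pixels):
    """Feature function rr: map raw pixel intensities to a small feature vector,
    discarding irrelevant information such as the overall lighting level."""
    mean = sum(pixels) / len(pixels)
    centered = [p - mean for p in pixels]                    # remove lighting level
    return (max(centered), sum(abs(p) for p in centered))    # peak and contrast

def g(features):
    """Stand-in for the learned function gg applied to the features."""
    peak, contrast = features
    return "face" if contrast > 10 else "background"

def f(pixels):
    """The composed function ff(x) = gg(rr(x))."""
    return g(r(pixels))

print(f([5, 5, 5, 5]))       # uniform image, low contrast  -> "background"
print(f([0, 30, 0, 30]))     # high-contrast image          -> "face"
```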

For non-trivial problems, choosing the right representation is vital. For instance, one of the drastic changes in the AI landscape was due to Minsky and Papert’s (1969) demonstration that the perceptron cannot learn even the binary XOR function; yet this function can be learnt by a perceptron if we have the right representation (as the sketch below illustrates). Feature engineering has grown to be one of the most labor-intensive tasks of machine learning, so much so that it is considered to be one of the “black arts” of machine learning. The other significant black art of learning methods is choosing the right parameters. These black arts require significant human expertise and experience, which can be quite difficult to obtain without significant apprenticeship (Domingos 2012). An even bigger issue is that the task of feature engineering is just knowledge representation in a new skin.
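Here is a small sketch of the XOR point just made, under the assumption that we hand the perceptron the extra product feature \(x_1 x_2\); the feature choice, learning rate, and epoch count are illustrative.

```python
# Minsky & Papert's point, illustrated: with the extra feature x1*x2 added to
# the representation, XOR becomes linearly separable and the ordinary
# perceptron learning rule finds a solution.

DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]    # XOR truth table

def features(x):
    x1, x2 = x
    return (1.0, x1, x2, x1 * x2)    # bias term plus the crucial product feature

def predict(w, x):
    return 1 if sum(wi * pi for wi, pi in zip(w, features(x))) > 0 else 0

def train_perceptron(data, epochs=50, lr=0.1):
    w = [0.0, 0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, target in data:
            err = target - predict(w, x)                       # perceptron update rule
            w = [wi + lr * err * pi for wi, pi in zip(w, features(x))]
    return w

w = train_perceptron(DATA)
for x, target in DATA:
    print(x, "->", predict(w, x), "(target:", target, ")")    # all four correct
```

Without the product feature the same training loop never settles on a correct set of weights, which is exactly the limitation Minsky and Papert identified.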

Given this state of affairs, there has been a recent resurgence in methods for automatically learning a feature representation function \(\rr\); such methods potentially bypass a large part of the human labor that is traditionally required. Such methods are based mostly on what are now termed deep neural networks. Such networks are simply neural networks with two or more hidden layers. These networks allow the feature function \(\rr\) to be learned by one or more of the hidden layers. The general form of learning in which one learns from raw sensory data without much hand-crafted feature engineering now has its own term: deep learning. A general and yet concise definition (Bengio et al. 2015) is:

Deep learning can safely be regarded as the study of models that either involve a greater amount of composition of learned functions or learned concepts than traditional machine learning does. (Bengio et al. 2015, Chapter 1)

Though the idea has been around for decades, recent innovations leading to more efficient learning techniques have made the approach more feasible (Bengio et al. 2013). Deep-learning methods have recently produced state-of-the-art results in image recognition (given an image containing various objects, label the objects from a given set of labels), speech recognition (from audio input, generate a textual representation), and the analysis of data from particle accelerators (LeCun et al. 2015). Despite impressive results in tasks such as these, minor and major issues remain unresolved. A minor issue is that significant human expertise is still needed to choose an architecture and set the right parameters for that architecture; a major issue is the existence of so-called adversarial inputs, which are indistinguishable from normal inputs to humans but are computed in a special manner that makes a neural network regard them as different from similar inputs in the training data. The existence of such adversarial inputs, which remain stable across training data, has raised doubts about how well performance on benchmarks can translate into performance in real-world systems with sensory noise (Szegedy et al. 2014).
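To make concrete the claim above that the hidden layers of a deep network can play the role of the feature function \(\rr\), here is a small sketch that reuses the XOR example. The weights are fixed by hand purely for illustration; in deep learning they would be found by backpropagation rather than supplied by a person.

```python
import numpy as np

# A deep network is a composition of layers; the early layers play the role of
# the learned feature function rr and the final layer the role of gg.  The
# hand-set weights and the step activation below are illustrative assumptions.

def step(z):
    return (z > 0).astype(float)      # threshold activation

W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([-0.5, -1.5])           # first hidden layer computes OR and AND

W2 = np.array([[1.0],
               [-1.0]])
b2 = np.array([-0.5])                 # second hidden layer computes OR-and-not-AND

W3 = np.array([[1.0]])
b3 = np.array([-0.5])                 # output layer passes the result through

def r(x):
    """The learned-feature part rr(x): the two hidden layers."""
    h1 = step(x @ W1 + b1)
    return step(h1 @ W2 + b2)

def g(features):
    """The final layer gg applied to the learned features."""
    return step(features @ W3 + b3)

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    out = g(r(np.array(x, dtype=float)))
    print(x, "->", int(out[0]))       # prints the XOR truth table
```

The point of the sketch is structural: the network is literally a composition \(\gg\circ \rr\), with \(\rr\) realized by the hidden layers.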

There is a second dimension to the explosive growth of AI: the explosion in popularity of probabilistic methods that aren’t neurocomputational in nature, in order to formalize and mechanize a form of non-logicist reasoning in the face of uncertainty. Interestingly enough, it is Eugene Charniak himself who can be safely considered one of the leading proponents of an explicit, premeditated turn away from logic to statistical techniques. His area of specialization is natural language processing, and whereas his introductory textbook of 1985 gave an accurate sense of his approach to parsing at the time (as we have seen, write computer programs that, given English text as input, ultimately infer meaning expressed in FOL), this approach was abandoned in favor of purely statistical approaches (Charniak 1993). At the AI@50 conference, Charniak boldly proclaimed, in a talk tellingly entitled “Why Natural Language Processing is Now Statistical Natural Language Processing,” that logicist AI is moribund, and that the statistical approach is the only promising game in town – for the next 50 years. [ 34 ]

The chief source of energy and debate at the conference flowed from the clash between Charniak’s probabilistic orientation, and the original logicist orientation, upheld at the conference in question by John McCarthy and others.

AI’s use of probability theory grows out of the standard form of this theory, which grew directly out of technical philosophy and logic. This form will be familiar to many philosophers, but let’s review it quickly now, in order to set a firm stage for making points about the new probabilistic techniques that have energized AI.

Just as in the case of FOL, in probability theory we are concerned with declarative statements, or propositions, to which degrees of belief are applied; we can thus say that both logicist and probabilistic approaches are symbolic in nature. Both approaches also agree that statements can either be true or false in the world. In building agents, a simplistic logic-based approach requires agents to know the truth-value of all possible statements. This is not realistic, as an agent may not know the truth-value of some proposition \(p\) due to ignorance, non-determinism in the physical world, or just plain vagueness in the meaning of the statement. More specifically, the fundamental proposition in probability theory is a random variable, which can be conceived of as an aspect of the world whose status is initially unknown to the agent. We usually capitalize the names of random variables, though we reserve \(p,q,r, \ldots\) as such names as well. For example, in a particular murder investigation centered on whether or not Mr. Barolo committed the crime, the random variable \(\Guilty\) might be of concern. The detective may be interested as well in whether or not the murder weapon – a particular knife, let us assume – belongs to Barolo. In light of this, we might say that \(\Weapon = \true\) if it does, and \(\Weapon = \false\) if it doesn’t. As a notational convenience, we can write \(weapon\) and \(\lnot weapon\) for these two cases, respectively; and we can use this convention for other variables of this type.

The kind of variables we have described so far are \(\mathbf{Boolean}\), because their \(\mathbf{domain}\) is simply \(\{true,false\}.\) But we can generalize and allow \(\mathbf{discrete}\) random variables, whose values are drawn from any countable domain. For example, \(\PriceTChina\) might be a variable for the price of (a particular, presumably) tea in China, and its domain might be \(\{1,2,3,4,5\}\), where each number here is in US dollars. A third type of variable is \(\mathbf{continuous}\); its domain is either the reals, or some subset thereof.

We say that an atomic event is an assignment of particular values from the appropriate domains to all the variables composing the (idealized) world. For example, in the simple murder investigation world introduced just above, we have two Boolean variables, \(\Guilty\) and \(\Weapon\), and there are just four atomic events. Note that atomic events have some obvious properties. For example, they are mutually exclusive, exhaustive, and logically entail the truth or falsity of every proposition. Usually not obvious to beginning students is a fourth property, namely, any proposition is logically equivalent to the disjunction of all atomic events that entail that proposition.

Prior probabilities correspond to a degree of belief accorded to a proposition in the complete absence of any other information. For example, if the prior probability of Barolo’s guilt is \(0.2\), we write \[ P\left(\Guilty=true\right)=0.2 \]

or simply \(\P(guilty)=0.2\). It is often convenient to have a notation allowing one to refer economically to the probabilities of all the possible values for a random variable. For example, we can write \[ \P\left(\PriceTChina\right) \]

as an abbreviation for the five equations listing the probabilities of all the possible prices for tea in China. We can also write, for example, \[ \P\left(\PriceTChina\right)=\langle 0.2, 0.3, 0.2, 0.2, 0.1\rangle \] to display those five probabilities as a single vector (the numbers here are purely illustrative and sum to \(1\)).

In addition, as further convenient notation, we can write \( \mathbf{P}\left(\Guilty, \Weapon\right)\) to denote the probabilities of all combinations of values of the relevant set of random variables. This is referred to as the joint probability distribution of \(\Guilty\) and \(\Weapon\). The full joint probability distribution covers the distribution for all the random variables used to describe a world. If we add \(\PriceTChina\) to our simple murder world, we have \(2 \times 2 \times 5 = 20\) atomic events, whose probabilities are summed up by the full joint distribution \[ \mathbf{P}\left(\Guilty, \Weapon, \PriceTChina\right) \]

The final piece of the basic language of probability theory corresponds to conditional probabilities. Where \(p\) and \(q\) are any propositions, the relevant expression is \(P\!\left(p\given q\right)\), which can be interpreted as “the probability of \(p\), given that all we know is \(q\).” For example, \[ P\left(guilty\ggiven weapon\right)=0.7 \]

says that if the murder weapon belongs to Barolo, and no other information is available, the probability that Barolo is guilty is \(0.7.\)
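A compact way to see how these pieces fit together is to write the murder-world joint distribution as a small table and compute with it. The four probabilities below are invented for illustration; they are chosen only so that the prior and conditional probabilities match the \(0.2\) and \(0.7\) used in the text.

```python
# The murder-world joint distribution over Guilty and Weapon, written as a
# table over the four atomic events (illustrative numbers only).

joint = {
    (True,  True):  0.14,    # guilty,     weapon
    (True,  False): 0.06,    # guilty,     not weapon
    (False, True):  0.06,    # not guilty, weapon
    (False, False): 0.74,    # not guilty, not weapon
}

def prob(event):
    """P(p): sum the probabilities of the atomic events in which p holds."""
    return sum(pr for atom, pr in joint.items() if event(atom))

def cond(event, given):
    """P(p | q) = P(p and q) / P(q)."""
    return prob(lambda a: event(a) and given(a)) / prob(given)

guilty = lambda atom: atom[0]
weapon = lambda atom: atom[1]

print(prob(guilty))            # prior probability of guilt: 0.2
print(cond(guilty, weapon))    # P(guilty | weapon): approximately 0.7
```

Note that prob implements exactly the enumeration idea discussed below: the probability of a proposition is the sum of the probabilities of the atomic events in which it holds.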

Andrei Kolmogorov showed how to construct probability theory from three axioms that make use of the machinery now introduced, viz.,

  • All probabilities fall between \(0\) and \(1.\) I.e., \(\forall p. 0 \leq P(p) \leq 1\).
  • Valid (in the traditional logicist sense) propositions have a probability of \(1\); unsatisfiable (in the traditional logicist sense) propositions have a probability of \(0\).
  • \(P(p\lor q) = P(p) +P(q) - P(p\land q)\)

These axioms are clearly at bottom logicist. The remainder of probability theory can be erected from this foundation (conditional probabilities are easily defined in terms of prior probabilities). We can thus say that logic is in some fundamental sense still being used to characterize the set of beliefs that a rational agent can have. But where does probabilistic inference enter the picture on this account, since traditional deduction is not used for inference in probability theory?
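For instance, the standard definition alluded to in the parenthetical above is \[ P\!\left(p \given q\right) = \frac{P\left(p \land q\right)}{P\left(q\right)}, \] provided \(P(q) > 0\); with this definition in hand, all conditional probabilities can be recovered from the probabilities assigned to atomic events.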

Probabilistic inference consists in computing, from observed evidence expressed in terms of probability theory, posterior probabilities of propositions of interest. For a good long while, there have been algorithms for carrying out such computation. These algorithms precede the resurgence of probabilistic techniques in the 1990s. (Chapter 13 of AIMA presents a number of them.) For example, given the Kolmogorov axioms, here is a straightforward way of computing the probability of any proposition, using the full joint distribution giving the probabilities of all atomic events: Where \(p\) is some proposition, let \(\alpha(p)\) be the disjunction of all atomic events in which \(p\) holds. Since the probability of a proposition (i.e., \(P(p)\)) is equal to the sum of the probabilities of the atomic events in which it holds, we have an equation that provides a method for computing the probability of any proposition \(p\), viz., \[ P(p)=\sum_{e_i \in \alpha(p)} P(e_i). \]

Unfortunately, there were two serious problems infecting this original probabilistic approach: One, the processing in question needed to take place over paralyzingly large amounts of information (enumeration over the entire distribution is required). And two, the expressivity of the approach was merely propositional. (It was by the way the philosopher Hilary Putnam (1963) who pointed out that there was a price to pay in moving to the first-order level. The issue is not discussed herein.) Everything changed with the advent of a new formalism that marks the marriage of probabilism and graph theory: Bayesian networks (also called belief nets ). The pivotal text was (Pearl 1988). For a more detailed discussion, see the

Supplement on Bayesian Networks .

Before concluding this section, it is probably worth noting that, from the standpoint of philosophy, a situation such as the murder investigation we have exploited above would often be analyzed into arguments, and strength factors, not into numbers to be crunched by purely arithmetical procedures. For example, in the epistemology of Roderick Chisholm, as presented in his Theory of Knowledge (1966, 1977), Detective Holmes might classify a proposition like Barolo committed the murder as counterbalanced if he was unable to find a compelling argument either way, or perhaps probable if the murder weapon turned out to belong to Barolo. Such categories cannot be placed on a continuum from 0 to 1, and they are used in articulating arguments for or against Barolo’s guilt. Argument-based approaches to uncertain and defeasible reasoning are virtually non-existent in AI. One exception is Pollock’s approach, covered below. This approach is Chisholmian in nature.

It should also be noted that there have been well-established formalisms for dealing with probabilistic reasoning as an instance of logic-based reasoning. E.g., the activity a researcher in probabilistic reasoning undertakes when she proves a theorem \(\phi\) about her domain (e.g. any theorem in (Pearl 1988)) is purely within the realm of traditional logic. Readers interested in logic-flavored approaches to probabilistic reasoning can consult (Adams 1996, Hailperin 1996 & 2010, Halpern 1998). Formalisms marrying probability theory, induction, and deductive reasoning, placing them on an equal footing, have been on the rise, with Markov logic (Richardson and Domingos 2006) being salient among these approaches.

Probabilistic Machine Learning

Machine learning, in the sense given above, has been associated with probabilistic techniques. Probabilistic techniques have been associated with both the learning of functions (e.g. Naive Bayes classification) and the modeling of theoretical properties of learning algorithms. For example, a standard reformulation of supervised learning casts it as a Bayesian problem. Assume that we are looking at recognizing digits \([0{-}9]\) from a given image. One way to cast this problem is to ask for the probability that the hypothesis \(H_x\): “the digit is \(x\)” is true given the image \(d\) from a sensor. Bayes’ theorem gives us: \[ P\left(H_x \ggiven d\right) = \frac{P\left(d \ggiven H_x\right)P\left(H_x\right)}{P\left(d\right)} \]

\(P(d\given H_x)\) and \(P(H_x)\) can be estimated from the given training dataset. The hypothesis with the highest posterior probability is then given as the answer: \(\argmax_{x}P\left(d\ggiven H_x\right)P\left(H_x\right)\) (the denominator \(P(d)\) is the same for every hypothesis and can be ignored). In addition to probabilistic methods being used to build algorithms, probability theory has also been used to analyze algorithms which might not have an overt probabilistic or logical formulation. For example, one of the central classes of meta-theorems in learning, probably approximately correct (PAC) theorems, are cast in terms of lower bounds on the probability that the mismatch between the induced/learnt function \(f_L\) and the true function \(f_T\) is less than a certain amount, given that the learnt function \(f_L\) works well for a certain number of cases (see Chapter 18, AIMA).
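Here is a toy rendering of the Bayesian recipe just described. The three-pixel “images,” the four training examples, and the add-one smoothing are assumptions made for the sketch; a real digit recognizer would estimate \(P(d\given H_x)\) and \(P(H_x)\) from a large dataset.

```python
# Pick the hypothesis H_x that maximizes P(d | H_x) * P(H_x), with the
# probabilities estimated, naive-Bayes style, from a tiny invented training set.

from collections import Counter

train = [((1, 1, 0), 0), ((1, 0, 0), 0), ((0, 1, 1), 1), ((1, 1, 1), 1)]

class_counts = Counter(label for _, label in train)      # for the priors P(H_x)
total = sum(class_counts.values())

def likelihood(d, x, smoothing=1.0):
    """Naive-Bayes style estimate of P(d | H_x), with add-one smoothing."""
    examples = [img for img, label in train if label == x]
    p = 1.0
    for i, pixel in enumerate(d):
        matches = sum(1 for img in examples if img[i] == pixel)
        p *= (matches + smoothing) / (len(examples) + 2 * smoothing)
    return p

def classify(d):
    # argmax over x of P(d | H_x) * P(H_x); P(d) is common to all hypotheses.
    return max(class_counts, key=lambda x: likelihood(d, x) * class_counts[x] / total)

print(classify((1, 0, 0)))    # resembles the class-0 examples -> 0
print(classify((0, 1, 1)))    # resembles the class-1 examples -> 1
```

The classify function is just the \(\argmax\) expression above, written over counted estimates.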

From at least its modern inception, AI has always been connected to gadgets, often ones produced by corporations, and it would be remiss of us not to say a few words about this phenomenon. While there have been a large number of commercial in-the-wild success stories for AI and its sister fields, such as optimization and decision-making, some applications are more visible and have been thoroughly battle-tested in the wild. In 2014, one of the most visible such domains (one in which AI has been strikingly successful) was information retrieval, incarnated as web search. Another recent success story is pattern recognition. The state-of-the-art in applied pattern recognition (e.g., fingerprint/face verification, speech recognition, and handwriting recognition) is robust enough to allow “high-stakes” deployment outside the laboratory. As of mid-2018, several corporations and research laboratories have begun testing autonomous vehicles on public roads, with even a handful of jurisdictions making self-driving cars legal to operate. For example, Google’s autonomous cars have navigated hundreds of thousands of miles in California with minimal human help under non-trivial conditions (Guizzo 2011).

Computer games provide a robust test bed for AI techniques as they can capture important parts that might be necessary to test an AI technique while abstracting or removing details that might be beyond the scope of core AI research, for example, designing better hardware or dealing with legal issues (Laird and VanLent 2001). One subclass of games that has proven quite fruitful for commercial deployment of AI is real-time strategy games. Real-time strategy games are games in which players manage an army given limited resources. One objective is to constantly battle other players and reduce an opponent’s forces. Real-time strategy games differ from turn-based strategy games in that players plan their actions simultaneously in real time and do not have to take turns playing. Such games have a number of challenges that are tantalizingly within the grasp of the state-of-the-art. This makes such games an attractive venue in which to deploy simple AI agents. An overview of AI used in real-time strategy games can be found in (Robertson and Watson 2015).

Some other ventures in AI, despite significant success, have been chugging along slowly, humbly, and quietly. For instance, AI-related methods have achieved triumphs in solving open problems in mathematics that had resisted any solution for decades. The most noteworthy instance of such a problem is perhaps a proof of the statement that “All Robbins algebras are Boolean algebras.” This was conjectured in the 1930s, and the proof was finally discovered by the EQP automated theorem prover in 1996 after just a few months of effort (Kolata 1996, Wos 2013). Sister fields like formal verification have also bloomed to the extent that it is now not too difficult to semi-automatically verify vital hardware/software components (Kaufmann et al. 2000 and Chajed et al. 2017).

Other related areas, such as (natural) language translation, still have a long way to go, but are good enough to let us use them under restricted conditions. The jury is out on tasks such as machine translation, which seems to require both statistical methods (Lopez 2008) and symbolic methods (España-Bonet 2011). Both methods now have comparable but limited success in the wild. A deployed translation system at Ford, initially developed for translating manufacturing process instructions from English to other languages, started out as a rule-based system with Ford-specific and domain-specific vocabulary and language. This system then evolved to incorporate statistical techniques along with rule-based techniques as it gained new uses beyond translating manuals, for example, lay users within Ford translating their own documents (Rychtyckyj and Plesco 2012).

AI’s great achievements mentioned above have all been in limited, narrow domains. This lack of any success in the unrestricted general case has caused a small set of researchers to break away into what is now called artificial general intelligence (Goertzel and Pennachin 2007). The stated goals of this movement include shifting the focus again to building artifacts that are generally intelligent and not just capable in one narrow domain.

Computer Ethics has been around for a long time. In this sub-field, typically one would consider how one ought to act in a certain class of situations involving computer technology, where the “one” here refers to a human being (Moor 1985). So-called “robot ethics” is different. In this sub-field (which goes by names such as “moral AI,” “ethical AI,” “machine ethics,” “moral robots,” etc.) one is confronted with such prospects as robots being able to make autonomous and weighty decisions – decisions that might or might not be morally permissible (Wallach & Allen 2010). If one were to attempt to engineer a robot with a capacity for sophisticated ethical reasoning and decision-making, one would also be doing Philosophical AI, as that concept is characterized elsewhere in the present entry. There can be many different flavors of approaches toward Moral AI. Wallach and Allen (2010) provide a high-level overview of the different approaches. Moral reasoning is obviously needed in robots that have the capability for lethal action. Arkin (2009) provides an introduction to how we can control and regulate machines that have the capacity for lethal behavior. Moral AI goes beyond obviously lethal situations, however, and we can have a spectrum of moral machines. Moor (2006) provides one such spectrum of possible moral agents. An example of a non-lethal but ethically charged machine would be a lying machine. Clark (2010) uses a computational theory of mind, that is, the ability to represent and reason about other agents, to build a lying machine that successfully persuades people into believing falsehoods. Bello & Bringsjord (2013) give a general overview of what might be required to build a moral machine, one of the ingredients being a theory of mind.

The most general framework for building machines that can reason ethically consists in endowing the machines with a moral code. This requires that the formal framework used for reasoning by the machine be expressive enough to receive such codes. The field of Moral AI, for now, is not concerned with the source or provenance of such codes. The source could be humans, and the machine could receive the code directly (via explicit encoding) or indirectly (reading). Another possibility is that the code is inferred by the machine from a more basic set of laws. We assume that the robot has access to some such code, and we then try to engineer the robot to follow that code under all circumstances while making sure that the moral code and its representation do not lead to unintended consequences. Deontic logics are a class of formal logics that have been studied the most for this purpose. Abstractly, such logics are concerned mainly with what follows from a given moral code. Engineering then studies how well a given deontic logic matches a moral code (i.e., whether the logic is expressive enough), which has to be balanced with the ease of automation. Bringsjord et al. (2006) provide a blueprint for using deontic logics to build systems that can perform actions in accordance with a moral code. The role deontic logics play in the framework offered by Bringsjord et al. (which can be considered to be representative of the field of deontic logic for moral AI) can be best understood as striving towards Leibniz’s dream of a universal moral calculus:

When controversies arise, there will be no more need for a disputation between two philosophers than there would be between two accountants [computistas]. It would be enough for them to pick up their pens and sit at their abacuses, and say to each other (perhaps having summoned a mutual friend): ‘Let us calculate.’

Deontic logic-based frameworks can also be used in a fashion that is analogous to moral self-reflection. In this mode, logic-based verification of the robot’s internal modules can be done before the robot ventures out into the real world. Govindarajulu and Bringsjord (2015) present an approach, drawing from formal program verification, in which a deontic-logic based system could be used to verify that a robot acts in a certain ethically sanctioned manner under certain conditions. Since formal-verification approaches can be used to assert statements about an infinite number of situations and conditions, such approaches might be preferred to having the robot roam around in an ethically charged test environment and make a finite set of decisions that are then judged for their ethical correctness. More recently, Govindarajulu and Bringsjord (2017) use a deontic logic to present a computational model of the Doctrine of Double Effect, an ethical principle for moral dilemmas that has been studied empirically and analyzed extensively by philosophers. [ 35 ] The principle is usually presented and motivated via dilemmas involving trolleys and was first presented in this fashion by Foot (1967).

While there has been substantial theoretical and philosophical work, the field of machine ethics is still in its infancy. There has been some embryonic work in building ethical machines. One recent such example would be Pereira and Saptawijaya (2016) who use logic programming and base their work in machine ethics on the ethical theory known as contractualism , set out by Scanlon (1982). And what about the future? Since artificial agents are bound to get smarter and smarter, and to have more and more autonomy and responsibility, robot ethics is almost certainly going to grow in importance. This endeavor might not be a straightforward application of classical ethics. For example, experimental results suggest that humans hold robots to different ethical standards than they expect from humans under similar conditions (Malle et al. 2015). [ 36 ]

Notice that the heading for this section isn’t Philosophy of AI. We’ll get to that category momentarily. (For now it can be identified with the attempt to answer such questions as whether artificial agents created in AI can ever reach the full heights of human intelligence.) Philosophical AI is AI, not philosophy; but it’s AI rooted in and flowing from, philosophy. For example, one could engage, using the tools and techniques of philosophy, a paradox, work out a proposed solution, and then proceed to a step that is surely optional for philosophers: expressing the solution in terms that can be translated into a computer program that, when executed, allows an artificial agent to surmount concrete instances of the original paradox. [ 37 ] Before we ostensively characterize Philosophical AI of this sort courtesy of a particular research program, let us consider first the view that AI is in fact simply philosophy, or a part thereof.

Daniel Dennett (1979) has famously claimed not just that there are parts of AI intimately bound up with philosophy, but that AI is philosophy (and psychology, at least of the cognitive sort). (He has made a parallel claim about Artificial Life (Dennett 1998)). This view will turn out to be incorrect, but the reasons why it’s wrong will prove illuminating, and our discussion will pave the way for a discussion of Philosophical AI.

What does Dennett say, exactly? This:

I want to claim that AI is better viewed as sharing with traditional epistemology the status of being a most general, most abstract asking of the top-down question: how is knowledge possible? (Dennett 1979, 60)

Elsewhere he says his view is that AI should be viewed “as a most abstract inquiry into the possibility of intelligence or knowledge” (Dennett 1979, 64).

In short, Dennett holds that AI is the attempt to explain intelligence, not by studying the brain in the hopes of identifying components to which cognition can be reduced, and not by engineering small information-processing units from which one can build in bottom-up fashion to high-level cognitive processes, but rather by – and this is why he says the approach is top-down – designing and implementing abstract algorithms that capture cognition. Leaving aside the fact that, at least starting in the early 1980s, AI includes an approach that is in some sense bottom-up (see the neurocomputational paradigm discussed above, in Non-Logicist AI: A Summary; and see, specifically, Granger’s (2004a, 2004b) work, hyperlinked in the text immediately above, as a specific counterexample), a fatal flaw infects Dennett’s view. Dennett sees the potential flaw, as reflected in:

It has seemed to some philosophers that AI cannot plausibly be so construed because it takes on an additional burden: it restricts itself to mechanistic solutions, and hence its domain is not the Kantian domain of all possible modes of intelligence, but just all possible mechanistically realizable modes of intelligence. This, it is claimed, would beg the question against vitalists, dualists, and other anti-mechanists. (Dennett 1979, 61)

Dennett has a ready answer to this objection. He writes:

But … the mechanism requirement of AI is not an additional constraint of any moment, for if psychology is possible at all, and if Church’s thesis is true, the constraint of mechanism is no more severe than the constraint against begging the question in psychology, and who would wish to evade that? (Dennett 1979, 61)

Unfortunately, this is acutely problematic; and examination of the problems throws light on the nature of AI.

First, insofar as philosophy and psychology are concerned with the nature of mind, they aren’t in the least trammeled by the presupposition that mentation consists in computation. AI, at least of the “Strong” variety (we’ll discuss “Strong” versus “Weak” AI below), is indeed an attempt to substantiate, through engineering certain impressive artifacts, the thesis that intelligence is at bottom computational (at the level of Turing machines and their equivalents, e.g., Register machines). So there is a philosophical claim, for sure. But this doesn’t make AI philosophy, any more than some of the deeper, more aggressive claims of some physicists (e.g., that the universe is ultimately digital in nature) make their field philosophy. Philosophy of physics certainly entertains the proposition that the physical universe can be perfectly modeled in digital terms (in a series of cellular automata, e.g.), but of course philosophy of physics can’t be identified with this doctrine.

Second, we now know well (and those familiar with the relevant formal terrain knew at the time of Dennett’s writing) that information processing can exceed standard computation, that is, can exceed computation at and below the level of what a Turing machine can muster (Turing-computation, we shall say). (Such information processing is known as hypercomputation, a term coined by philosopher Jack Copeland, who has himself defined such machines (e.g., Copeland 1998). The first machines capable of hypercomputation were trial-and-error machines, introduced in the same famous issue of the Journal of Symbolic Logic (Gold 1965; Putnam 1965). A new hypercomputer is the infinite time Turing machine (Hamkins & Lewis 2000).) Dennett’s appeal to Church’s thesis thus flies in the face of the mathematical facts: some varieties of information processing exceed standard computation (or Turing-computation). Church’s thesis, or more precisely, the Church-Turing thesis, is the view that a function \(f\) is effectively computable if and only if \(f\) is Turing-computable (i.e., some Turing machine can compute \(f\)). Thus, this thesis has nothing to say about information processing that is more demanding than what a Turing machine can achieve. (Put another way, there is no counter-example to CTT to be automatically found in an information-processing device capable of feats beyond the reach of TMs.) For all philosophy and psychology know, intelligence, even if tied to information processing, exceeds what is Turing-computational or Turing-mechanical. [ 38 ] This is especially true because philosophy and psychology, unlike AI, are in no way fundamentally charged with engineering artifacts, which makes the physical realizability of hypercomputation irrelevant from their perspectives. Therefore, contra Dennett, to consider AI as psychology or philosophy is to commit a serious error, precisely because so doing would box these fields into only a speck of the entire space of functions from the natural numbers (including tuples therefrom) to the natural numbers. (Only a tiny portion of the functions in this space are Turing-computable.) AI is without question much, much narrower than this pair of fields. Of course, it’s possible that AI could be replaced by a field devoted not to building computational artifacts by writing computer programs and running them on embodied Turing machines, but rather to engineering information-processing artifacts of a more powerful kind. But this new field, by definition, would not be AI. Our exploration of AIMA and other textbooks provides direct empirical confirmation of this.

Third, most AI researchers and developers, in point of fact, are simply concerned with building useful, profitable artifacts, and don’t spend much time reflecting upon the kinds of abstract definitions of intelligence explored in this entry (e.g., What Exactly is AI? ).

Though AI isn’t philosophy, there are certainly ways of doing real implementation-focussed AI of the highest caliber that are intimately bound up with philosophy. The best way to demonstrate this is to simply present such research and development, or at least a representative example thereof. While there have been many examples of such work, the most prominent example in AI is John Pollock’s OSCAR project, which stretched over a considerable portion of his lifetime. For a detailed presentation and further discussion, see the

Supplement on the OSCAR Project.

It’s important to note at this juncture that the OSCAR project, and the information processing that underlies it, are without question at once philosophy and technical AI. Given that the work in question has appeared in the pages of Artificial Intelligence , a first-rank journal devoted to that field, and not to philosophy, this is undeniable (see, e.g., Pollock 2001, 1992). This point is important because while it’s certainly appropriate, in the present venue, to emphasize connections between AI and philosophy, some readers may suspect that this emphasis is contrived: they may suspect that the truth of the matter is that page after page of AI journals are filled with narrow, technical content far from philosophy. Many such papers do exist. But we must distinguish between writings designed to present the nature of AI, and its core methods and goals, versus writings designed to present progress on specific technical issues.

Writings in the latter category are more often than not quite narrow, but, as the example of Pollock shows, sometimes these specific issues are inextricably linked to philosophy. And of course Pollock’s work is a representative example (albeit the most substantive one). One could just as easily have selected work by folks who don’t happen to also produce straight philosophy. For example, for an entire book written within the confines of AI and computer science, but which is epistemic logic in action in many ways, suitable for use in seminars on that topic, see (Fagin et al. 2004). (It is hard to find technical work that isn’t bound up with philosophy in some direct way. E.g., AI research on learning is all intimately bound up with philosophical treatments of induction, of how genuinely new concepts not simply defined in terms of prior ones can be learned. One possible partial answer offered by AI is inductive logic programming , discussed in Chapter 19 of AIMA .)

What of writings in the former category? Writings in this category, while by definition in AI venues, not philosophy ones, are nonetheless philosophical. Most textbooks include plenty of material that falls into this category, and hence they include discussion of the philosophical nature of AI (e.g., that AI is aimed at building artificial intelligences, and that’s why, after all, it’s called ‘AI’).

8. Philosophy of Artificial Intelligence

Recall that we earlier discussed proposed definitions of AI, and recall specifically that these proposals were couched in terms of the goals of the field. We can follow this pattern here: We can distinguish between “Strong” and “Weak” AI by taking note of the different goals that these two versions of AI strive to reach. “Strong” AI seeks to create artificial persons: machines that have all the mental powers we have, including phenomenal consciousness. “Weak” AI, on the other hand, seeks to build information-processing machines that appear to have the full mental repertoire of human persons (Searle 1997). “Weak” AI can also be defined as the form of AI that aims at a system able to pass not just the Turing Test (again, abbreviated as TT), but the Total Turing Test (Harnad 1991). In TTT, a machine must muster more than linguistic indistinguishability: it must pass for a human in all behaviors – throwing a baseball, eating, teaching a class, etc.

It would certainly seem to be exceedingly difficult for philosophers to overthrow “Weak” AI (Bringsjord and Xiao 2000). After all, what philosophical reason stands in the way of AI producing artifacts that appear to be animals or even humans? However, some philosophers have aimed to do in “Strong” AI, and we turn now to the most prominent case in point.

Without question, the most famous argument in the philosophy of AI is John Searle’s (1980) Chinese Room Argument (CRA), designed to overthrow “Strong” AI. We present a quick summary here and a “report from the trenches” as to how AI practitioners regard the argument. Readers wanting to further study CRA will find an excellent next step in the entry on the Chinese Room Argument and (Bishop & Preston 2002).

CRA is based on a thought-experiment in which Searle himself stars. He is inside a room; outside the room are native Chinese speakers who don’t know that Searle is inside it. Searle-in-the-box, like Searle-in-real-life, doesn’t know any Chinese, but is fluent in English. The Chinese speakers send cards into the room through a slot; on these cards are written questions in Chinese. The box, courtesy of Searle’s secret work therein, returns cards to the native Chinese speakers as output. Searle’s output is produced by consulting a rulebook: this book is a lookup table that tells him what Chinese to produce based on what is sent in. To Searle, the Chinese is all just a bunch of – to use Searle’s language – squiggle-squoggles. The following schematic picture sums up the situation. The labels should be obvious. \(O\) denotes the outside observers, in this case the Chinese speakers. Input is denoted by \(i\) and output by \(o\). As you can see, there is an icon for the rulebook, and Searle himself is denoted by \(P\).

[Figure: The Chinese Room, Schematic View – an input/output diagram of the Chinese Room.]

Now, what is the argument based on this thought-experiment? Even if you’ve never heard of CRA before, you doubtless can see the basic idea: that Searle (in the box) is supposed to be everything a computer can be, and because he doesn’t understand Chinese, no computer could have such understanding. Searle is mindlessly moving squiggle-squoggles around, and (according to the argument) that’s all computers do, fundamentally. [ 39 ]

Where does CRA stand today? As we’ve already indicated, the argument would still seem to be alive and well; witness (Bishop & Preston 2002). However, there is little doubt that at least among AI practitioners , CRA is generally rejected. (This is of course thoroughly unsurprising.) Among these practitioners, the philosopher who has offered the most formidable response out of AI itself is Rapaport (1988), who argues that while AI systems are indeed syntactic, the right syntax can constitute semantics. It should be said that a common attitude among proponents of “Strong” AI is that CRA is not only unsound, but silly, based as it is on a fanciful story (CR) far removed from the practice of AI – practice which is year by year moving ineluctably toward sophisticated robots that will once and for all silence CRA and its proponents. For example, John Pollock (as we’ve noted, philosopher and practitioner of AI) writes:

Once [my intelligent system] OSCAR is fully functional, the argument from analogy will lead us inexorably to attribute thoughts and feelings to OSCAR with precisely the same credentials with which we attribute them to human beings. Philosophical arguments to the contrary will be passé. (Pollock 1995, p. 6)

To wrap up discussion of CRA, we make two quick points, to wit:

  • Despite the confidence of the likes of Pollock about the eventual irrelevance of CRA in the face of the eventual human-level prowess of OSCAR (and, by extension, any number of other still-improving AI systems), the brute fact is that deeply semantic natural-language processing (NLP) is rarely even pursued these days, so proponents of CRA are certainly not the ones feeling some discomfort in light of the current state of AI. In short, Searle would rightly point to any of the success stories of AI, including the Watson system we have discussed, and still proclaim that understanding is nowhere to be found – and he would be well within his philosophical rights in saying this.
  • It would appear that the CRA is bubbling back to a level of engagement not seen for a number of years, in light of the empirical fact that certain thinkers are now issuing explicit warnings to the effect that future conscious, malevolent machines may well wish to do in our species. In reply, Searle (2014) points out that since CRA is sound, there can’t be conscious machines; and if there can’t be conscious machines, there can’t be malevolent machines that wish anything. We return to this at the end of our entry; the chief point here is that CRA continues to be quite relevant, and indeed we suspect that Searle’s basis for have-no-fear will be taken up energetically by not only philosophers, but AI experts, futurists, lawyers, and policy-makers.

Readers may wonder if there are philosophical debates that AI researchers engage in, in the course of working in their field (as opposed to when they might attend a philosophy conference). Surely, AI researchers have philosophical discussions amongst themselves, right?

Generally, one finds that AI researchers do discuss among themselves topics in philosophy of AI, and these topics are usually the very same ones that occupy philosophers of AI. However, the attitude reflected in the quote from Pollock immediately above is by far the dominant one. That is, in general, the attitude of AI researchers is that philosophizing is sometimes fun, but the upward march of AI engineering cannot be stopped, will not fail, and will eventually render such philosophizing otiose.

We will return to the issue of the future of AI in the final section of this entry.

Four decades ago, J.R. Lucas (1964) argued that Gödel’s first incompleteness theorem entails that no machine can ever reach human-level intelligence. His argument has not proved to be compelling, but Lucas initiated a debate that has produced more formidable arguments. One of Lucas’ indefatigable defenders is the physicist Roger Penrose, whose first attempt to vindicate Lucas was a Gödelian attack on “Strong” AI articulated in his The Emperor’s New Mind (1989). This first attempt fell short, and Penrose published a more elaborate and more fastidious Gödelian case, expressed in Chapters 2 and 3 of his Shadows of the Mind (1994).

In light of the fact that readers can turn to the entry on Gödel’s Incompleteness Theorems, a full review here is not needed. Instead, readers will be given a decent sense of the argument by turning to an online paper in which Penrose, writing in response to critics (e.g., the philosopher David Chalmers, the logician Solomon Feferman, and the computer scientist Drew McDermott) of his Shadows of the Mind, distills the argument to a couple of paragraphs. [ 40 ] Indeed, in this paper Penrose gives what he takes to be the perfected version of the core Gödelian case given in SOTM. Here is this version, verbatim:

We try to suppose that the totality of methods of (unassailable) mathematical reasoning that are in principle humanly accessible can be encapsulated in some (not necessarily computational) sound formal system \(F\). A human mathematician, if presented with \(F\), could argue as follows (bearing in mind that the phrase “I am \(F\)” is merely a shorthand for “\(F\) encapsulates all the humanly accessible methods of mathematical proof”): (A) “Though I don’t know that I necessarily am \(F\), I conclude that if I were, then the system \(F\) would have to be sound and, more to the point, \(F'\) would have to be sound, where \(F'\) is \(F\) supplemented by the further assertion “I am \(F\).” I perceive that it follows from the assumption that I am \(F\) that the Gödel statement \(G(F')\) would have to be true and, furthermore, that it would not be a consequence of \(F'\). But I have just perceived that “If I happened to be \(F\), then \(G(F')\) would have to be true,” and perceptions of this nature would be precisely what \(F'\) is supposed to achieve. Since I am therefore capable of perceiving something beyond the powers of \(F'\), I deduce that I cannot be \(F\) after all. Moreover, this applies to any other (Gödelizable) system, in place of \(F\).” (Penrose 1996, 3.2)

Does this argument succeed? A firm answer to this question is not appropriate to seek in the present entry. Interested readers are encouraged to consult four full-scale treatments of the argument (LaForte et al. 1998; Bringsjord and Xiao 2000; Shapiro 2003; Bowie 1982).

In addition to the Gödelian and Searlean arguments covered briefly above, a third attack on “Strong” AI (of the symbolic variety) has been widely discussed (though with the rise of statistical machine learning has come a corresponding decrease in the attention paid to it), namely, one given by the philosopher Hubert Dreyfus (1972, 1992), some incarnations of which have been co-articulated with his brother, Stuart Dreyfus (1987), a computer scientist. Put crudely, the core idea in this attack is that human expertise is not based on the explicit, disembodied, mechanical manipulation of symbolic information (such as formulae in some logic, or probabilities in some Bayesian network), and that AI’s efforts to build machines with such expertise are doomed if based on the symbolic paradigm. The genesis of the Dreyfusian attack was a belief that the critique of (if you will) symbol-based philosophy (e.g., philosophy in the logic-based, rationalist tradition, as opposed to what is called the Continental tradition) from such thinkers as Heidegger and Merleau-Ponty could be made against the rationalist tradition in AI. After further reading and study of Dreyfus’ writings, readers may judge whether this critique is compelling, in an information-driven world increasingly managed by intelligent agents that carry out symbolic reasoning (albeit not even close to the human level).

For readers interested in exploring philosophy of AI beyond what Jim Moor (in a recent address – “The Next Fifty Years of AI: Future Scientific Research vs. Past Philosophical Criticisms” – delivered as the 2006 Barwise Award winner at the annual eastern American Philosophical Association meeting) has called the “big three” criticisms of AI, there is no shortage of additional material, much of it available on the Web. The last chapter of AIMA provides a compressed overview of some additional arguments against “Strong” AI, and is in general not a bad next step. Needless to say, Philosophy of AI today involves much more than the three well-known arguments discussed above, and, inevitably, Philosophy of AI tomorrow will include new debates and problems we can’t see now. Because machines, inevitably, will get smarter and smarter (regardless of just how smart they get), Philosophy of AI, pure and simple, is a growth industry. With every human activity that machines match, the “big” questions will only attract more attention.

If past predictions are any indication, the only thing we know today about tomorrow’s science and technology is that it will be radically different than whatever we predict it will be like. Arguably, in the case of AI, we may also specifically know today that progress will be much slower than what most expect. After all, at the 1956 kickoff conference (discussed at the start of this entry), Herb Simon predicted that thinking machines able to match the human mind were “just around the corner” (for the relevant quotes and informative discussion, see the first chapter of AIMA ). As it turned out, the new century would arrive without a single machine able to converse at even the toddler level. (Recall that when it comes to the building of machines capable of displaying human-level intelligence, Descartes, not Turing, seems today to be the better prophet.) Nonetheless, astonishing though it may be, serious thinkers in the late 20th century have continued to issue incredibly optimistic predictions regarding the progress of AI. For example, Hans Moravec (1999), in his Robot: Mere Machine to Transcendent Mind , informs us that because the speed of computer hardware doubles every 18 months (in accordance with Moore’s Law, which has apparently held in the past), “fourth generation” robots will soon enough exceed humans in all respects, from running companies to writing novels. These robots, so the story goes, will evolve to such lofty cognitive heights that we will stand to them as single-cell organisms stand to us today. [ 41 ]

Moravec is by no means singularly Pollyannaish: Many others in AI predict the same sensational future unfolding on about the same rapid schedule. In fact, at the aforementioned AI@50 conference, Jim Moor posed the question “Will human-level AI be achieved within the next 50 years?” to five thinkers who attended the original 1956 conference: John McCarthy, Marvin Minsky, Oliver Selfridge, Ray Solomonoff, and Trenchard More. McCarthy and Minsky gave firm, unhesitating affirmatives, and Solomonoff seemed to suggest that AI provided the one ray of hope in the face of the fact that our species seems bent on destroying itself. (Selfridge’s reply was a bit cryptic. More returned a firm, unambiguous negative, and declared that once his computer is smart enough to interact with him conversationally about mathematical problems, he might take this whole enterprise more seriously.) It is left to the reader to judge the accuracy of such risky predictions as have been given by Moravec, McCarthy, and Minsky. [ 42 ]

The judgment of the reader in this regard ought to factor in the stunning resurgence, very recently, of serious reflection on what is known as “The Singularity” (denoted by us simply as S), the future point at which artificial intelligence exceeds human intelligence, whereupon immediately thereafter (as the story goes) the machines make themselves rapidly smarter and smarter and smarter, reaching a superhuman level of intelligence that, stuck as we are in the mud of our limited mentation, we can’t fathom. For extensive, balanced analysis of S, see Eden et al. (2013).

Readers unfamiliar with the literature on S may be quite surprised to learn the degree to which, among learned folks, this hypothetical event is not only taken seriously, but has in fact become a target for extensive and frequent philosophizing [for a mordant tour of the recent thought in question, see Floridi (2015)]. What arguments support the belief that S is in our future? There are two main arguments at this point: the familiar hardware-based one [championed by Moravec, as noted above, and again more recently by Kurzweil (2006)]; and the – as far as we know – original argument given by mathematician I. J. Good (1965). In addition, there is a recent and related doomsayer argument advanced by Bostrom (2014), which seems to presuppose that S will occur. Good’s argument, nicely amplified and adjusted by Chalmers (2010), who affirms the tidied-up version of the argument, runs as follows:

  • Premise 1 : There will be AI (created by HI and such that AI = HI).
  • Premise 2 : If there is AI, there will be AI\(^+\) (created by AI).
  • Premise 3 : If there is AI\(^+\), there will be AI\(^{++}\) (created by AI\(^+\)).
  • Conclusion : There will be AI\(^{++}\) (= S will occur).

In this argument, ‘AI’ is artificial intelligence at the level of, and created by, human persons, ‘AI\(^+\)’ artificial intelligence above the level of human persons, and ‘AI\(^{++}\)’ super-intelligence constitutive of S. The key process is presumably the creation of one class of machine by another. We have added for convenience ‘HI’ for human intelligence; the central idea is then: HI will create AI, the latter at the same level of intelligence as the former; AI will create AI\(^+\); AI\(^+\) will create AI\(^{++}\); with the ascension proceeding perhaps forever, but at any rate proceeding long enough for us to be as ants outstripped by gods.

The argument certainly appears to be formally valid. Are its three premises true? Taking up such a question would fling us far beyond the scope of this entry. We point out only that the concept of one class of machines creating another, more powerful class of machines is not a transparent one, and neither Good nor Chalmers provides a rigorous account of the concept, which is ripe for philosophical analysis. (As to mathematical analysis, some exists, of course. It is for example well-known that a computing machine at level \(L\) cannot possibly create another machine at a higher level \(L'\). For instance, a linear-bounded automaton can’t create a Turing machine.)
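
The formal validity itself can be checked at a glance by stripping the premises down to their propositional skeleton (our reconstruction; neither Good nor Chalmers presents the argument this way). Writing \(p\) for “there will be AI,” \(q\) for “there will be AI\(^+\),” and \(r\) for “there will be AI\(^{++}\),” the argument is simply two applications of modus ponens:

\[
\begin{array}{lll}
1. & p & \text{(Premise 1)}\\
2. & p \rightarrow q & \text{(Premise 2)}\\
3. & q \rightarrow r & \text{(Premise 3)}\\
4. & q & \text{(from 1 and 2, modus ponens)}\\
5. & r & \text{(from 3 and 4, modus ponens)}
\end{array}
\]

The philosophical work, accordingly, is done entirely by the premises, and above all by the notion of “creation” buried inside Premises 2 and 3.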

The Good-Chalmers argument has a rather clinical air about it; the argument doesn’t say anything regarding whether machines in the AI\(^{++}\) category will be benign, malicious, or munificent. Many others gladly fill this gap with dark, dark pessimism. The locus classicus here is without question a widely read paper by Bill Joy (2000): “Why The Future Doesn’t Need Us.” Joy believes that the human race is doomed, in no small part because it’s busy building smart machines. He writes:

The 21st-century technologies – genetics, nanotechnology, and robotics (GNR) – are so powerful that they can spawn whole new classes of accidents and abuses. Most dangerously, for the first time, these accidents and abuses are widely within the reach of individuals or small groups. They will not require large facilities or rare raw materials. Knowledge alone will enable the use of them. Thus we have the possibility not just of weapons of mass destruction but of knowledge-enabled mass destruction (KMD), this destructiveness hugely amplified by the power of self-replication. I think it is no exaggeration to say we are on the cusp of the further perfection of extreme evil, an evil whose possibility spreads well beyond that which weapons of mass destruction bequeathed to the nation-states, on to a surprising and terrible empowerment of extreme individuals. [ 43 ]

Philosophers would be most interested in arguments for this view. What are Joy’s? Well, no small reason for the attention lavished on his paper is that, like Raymond Kurzweil (2000), Joy relies heavily on an argument given by none other than the Unabomber (Theodore Kaczynski). The idea is that, assuming we succeed in building intelligent machines, we will have them do most (if not all) work for us. If we further allow the machines to make decisions for us – even if we retain oversight over the machines – we will eventually depend on them to the point where we must simply accept their decisions. But even if we don’t allow the machines to make decisions, the control of such machines is likely to be held by a small elite who will view the rest of humanity as unnecessary – since the machines can do any needed work (Joy 2000).

This isn’t the place to assess this argument. (Having said that, the pattern pushed by the Unabomber and his supporters certainly appears to be flatly invalid. [ 44 ] ) In fact, many readers will doubtless feel that no such place exists or will exist, because the reasoning here is amateurish. So then, what about the reasoning of professional philosophers on the matter?

Bostrom has recently painted an exceedingly dark picture of a possible future. He points out that the “first superintelligence” could have the capability

to shape the future of Earth-originating life, could easily have non-anthropomorphic final goals, and would likely have instrumental reasons to pursue open-ended resource acquisition. If we now reflect that human beings consist of useful resources (such as conveniently located atoms) and that we depend on many more local resources, we can see that the outcome could easily be one in which humanity quickly becomes extinct. (Bostrom 2014, p. 416)

Clearly, the most vulnerable premise in this sort of argument is that the “first superintelligence” will indeed arrive. Here perhaps the Good-Chalmers argument provides a basis.

Searle (2014) thinks Bostrom’s book is misguided and fundamentally mistaken, and that we needn’t worry. His rationale is dirt-simple: Machines aren’t conscious; Bostrom is alarmed at the prospect of malicious machines who do us in; a malicious machine is by definition a conscious machine; ergo, Bostrom’s argument doesn’t work. Searle writes:

If the computer can fly airplanes, drive cars, and win at chess, who cares if it is totally nonconscious? But if we are worried about a maliciously motivated superintelligence destroying us, then it is important that the malicious motivation should be real. Without consciousness, there is no possibility of its being real.

The positively remarkable thing here, it seems to us, is that Searle appears to be unaware of the brute fact that most AI engineers are perfectly content to build machines on the basis of the AIMA view of AI we presented and explained above: the view according to which machines simply map percepts to actions. On this view, it doesn’t matter whether the machine really has desires; what matters is whether it acts suitably on the basis of how AI scientists engineer formal correlates to desire. An autonomous machine with overwhelming destructive power that non-consciously “decides” to kill does not become a mere nuisance simply because genuine, human-level, subjective desire is absent from it. If an AI can play the game of chess, and the game of Jeopardy!, it can certainly play the game of war. Just as it does little good for a human loser to point out that the victorious machine in a game of chess isn’t conscious, it will do little good for humans being killed by machines to point out that these machines aren’t conscious. (It is interesting to note that the genesis of Joy’s paper was an informal conversation with John Searle and Raymond Kurzweil. According to Joy, Searle didn’t think there was much to worry about, since he was (and is) quite confident that tomorrow’s robots can’t be conscious. [ 45 ] )
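
To see just how little the percepts-to-actions view demands, consider the following minimal sketch (ours, not code from AIMA; every identifier here – Percept, Action, make_agent, toy_utility – is invented purely for illustration). The “formal correlate to desire” is nothing but a number-valued function, and yet the resulting agent selects and carries out actions:

```python
# A deliberately bare-bones illustration of the percepts-to-actions view of an
# agent, written in the spirit of (not copied from) the AIMA framework discussed
# above. All names below (Percept, Action, make_agent, toy_utility) are invented
# for this sketch.

from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Percept:
    """Whatever the machine senses at one time step."""
    observation: str


Action = str


def make_agent(utility: Callable[[Percept, Action], float],
               actions: List[Action]) -> Callable[[Percept], Action]:
    """Return an agent function: percept in, action out.

    `utility` is a formal correlate of desire: a mere number-valued
    function. Nothing in this construction is conscious, yet the agent
    still selects and executes actions.
    """
    def agent(percept: Percept) -> Action:
        # Choose whichever action the encoded "desire" scores highest.
        return max(actions, key=lambda a: utility(percept, a))
    return agent


def toy_utility(p: Percept, a: Action) -> float:
    # Encoded preference: advance when the way is clear, halt otherwise.
    if p.observation == "clear":
        return 1.0 if a == "advance" else 0.0
    return 1.0 if a == "halt" else 0.0


agent = make_agent(toy_utility, actions=["advance", "halt"])
print(agent(Percept("clear")))    # -> advance
print(agent(Percept("blocked")))  # -> halt
```

Swap the toy action set for something with real-world consequences and nothing in the mapping changes; that, in essence, is why the absence of consciousness offers no comfort here.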

There are some things we can safely say about tomorrow. Certainly, barring some cataclysmic events (nuclear or biological warfare, global economic depression, a meteorite smashing into Earth, etc.), we now know that AI will succeed in producing artificial animals. Since even some natural animals (mules, e.g.) can be easily trained to work for humans, it stands to reason that artificial animals, designed from scratch with our purposes in mind, will be deployed to work for us. In fact, many jobs currently done by humans will certainly be done by appropriately programmed artificial animals. To pick an arbitrary example, it is difficult to believe that commercial drivers won’t be artificial in the future. (Indeed, Daimler is already running commercials in which they tout the ability of their automobiles to drive “autonomously,” allowing human occupants of these vehicles to ignore the road and read.) Other examples would include: cleaners, mail carriers, clerical workers, military scouts, surgeons, and pilots. (As to cleaners, probably a significant number of readers, at this very moment, have robots from iRobot cleaning the carpets in their homes.) It is hard to see how such jobs are inseparably bound up with the attributes often taken to be at the core of personhood – attributes that would be the most difficult for AI to replicate. [ 46 ]

Andy Clark (2003) has another prediction: Humans will gradually become, at least to an appreciable degree, cyborgs, courtesy of artificial limbs and sense organs, and implants. The main driver of this trend will be that while standalone AIs are often desirable, they are hard to engineer when the desired level of intelligence is high. But to let humans “pilot” less intelligent machines is a good deal easier, and still very attractive for concrete reasons. Another related prediction is that AI would play the role of a cognitive prosthesis for humans (Ford et al. 1997; Hoffman et al. 2001). The prosthesis view sees AI as a “great equalizer” that would lead to less stratification in society, perhaps similar to how the Hindu-Arabic numeral system made arithmetic available to the masses, and to how the Gutenberg press contributed to literacy becoming more universal.

Even if the argument is formally invalid, it leaves us with a question – the cornerstone question about AI and the future: Will AI produce artificial creatures that replicate and exceed human cognition (as Kurzweil and Joy believe)? Or is this merely an interesting supposition?

This is a question not just for scientists and engineers; it is also a question for philosophers. This is so for two reasons. One, research and development designed to validate an affirmative answer must include philosophy – for reasons rooted in earlier parts of the present entry. (E.g., philosophy is the place to turn to for robust formalisms to model human propositional attitudes in machine terms.) Two, philosophers might well be able to provide arguments that answer the cornerstone question now, definitively. If a version of any of the three arguments against “Strong” AI alluded to above (Searle’s CRA; the Gödelian attack; the Dreyfus argument) is sound, then of course AI will not manage to produce machines having the mental powers of persons. No doubt the future holds not only ever-smarter machines, but new arguments pro and con on the question of whether this progress can reach the human level that Descartes declared to be unreachable.

  • Adams, E. W., 1996, A Primer of Probability Logic , Stanford, CA: CSLI.
  • Almeida, J., Frade, M., Pinto, J. & de Sousa, S., 2011, Rigorous Software Development: An Introduction to Program Verification, New York, NY: Springer.
  • Alpaydin, E., 2014, Introduction to Machine Learning , Cambridge, MA: MIT Press.
  • Amir, E. & Maynard-Reid, P., 1999, “Logic-Based Subsumption Architecture,” in Proceedings of the 16 th International Joint Conference on Artificial Intelligence (IJCAI-1999) , (San Francisco, CA: MIT Morgan Kaufmann), pp. 147–152.
  • Amir, E. & Maynard-Reid, P., 2000, “Logic-Based Subsumption Architecture: Empirical Evaluation,” in Proceedings of the AAAI Fall Symposium on Parallel Architectures for Cognition .
  • Amir, E. & Maynard-Reid, P., 2001, “LiSA: A Robot Driven by Logical Subsumption,” in Proceedings of the Fifth Symposium on the Logical Formalization of Commonsense Reasoning , (New York, NY).
  • Anderson, C. A., 1983, “The Paradox of the Knower,” The Journal of Philosophy , 80.6: 338–355.
  • Anderson, J. & Lebiere, C., 2003, “The Newell Test for a Theory of Cognition,” Behavioral and Brain Sciences , 26: 587–640.
  • Ashcraft, M., 1994, Human Memory and Cognition , New York, NY: HarperCollins.
  • Arkin, R., 2009, Governing Lethal Behavior in Autonomous Robots , London: Chapman and Hall/CRC Imprint, Taylor and Francis Group.
  • Arkoudas, K. & Bringsjord, S., 2005, “Vivid: A Framework for Heterogeneous Problem Solving,” Artificial Intelligence , 173.15: 1367–1405.
  • Arkoudas, K. & Bringsjord, S., 2005, “Metareasoning for Multi-agent Epistemic Logics,” in Fifth International Conference on Computational Logic In Multi-Agent Systems (CLIMA 2004) , in the series Lecture Notes in Artificial Intelligence (LNAI) , volume 3487, New York, NY: Springer-Verlag, pp. 111–125.
  • Arkoudas, K., 2000, Denotational Proof Languages , PhD dissertation, Massachusetts Institute of Technology (Computer Science).
  • Baader, F., Calvanese, D., McGuinness, D. L., Nardi, D., & Patel-Schneider, P. F., eds., 2003, The Description Logic Handbook: Theory, Implementation, and Applications , New York, NY: Cambridge University Press.
  • Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., Goldberg, L. J., Eilbeck, K., Ireland, A., Mungall, C. J., The OBI Consortium, Leontis, N., Rocca-Serra, P., Ruttenberg, A., Sansone, S., Scheuermann, R. H., Shah, N., Whetzel, P. L. & Lewis, S., 2007, “The OBO Foundry: Coordinated Evolution of Ontologies to Support Biomedical Data Integration,” Nature Biotechnology 25, 1251–1255.
  • Barwise, J. & Etchemendy, J., 1999, Language, Proof, and Logic , New York, NY: Seven Bridges Press.
  • Barwise, J. & Etchemendy, J., 1995, “Heterogeneous Logic,” in Diagrammatic Reasoning: Cognitive and Computational Perspectives , J. Glasgow, N.H. Narayanan, & B. Chandrasekaran, eds., Cambridge, MA: MIT Press, pp. 211–234.
  • Baldi, P., Sadowski P. & Whiteson D., 2014, “Searching for Exotic Particles in High-energy Physics with Deep Learning,” Nature Communications . [ Available online ]
  • Barwise, J. & Etchemendy, J., 1994, Hyperproof , Stanford, CA: CSLI.
  • Barwise, J. & Etchemendy, J., 1990, “Infons and Inference,” in Situation Theory and its Applications, (Vol 1) , Cooper, Mukai, and Perry (eds), CSLI Lecture Notes #22, CSLI Press, pp. 33–78.
  • Bello, P. & Bringsjord S., 2013, “On How to Build a Moral Machine,” Topoi , 32.2: 251–266.
  • Bengio, Y., Goodfellow, I., & Courville, A., 2016, Deep Learning , Cambridge: MIT Press. [ Available online ]
  • Bengio, Y., Courville, A. & Vincent, P., 2013, “Representation Learning: A Review and New Perspectives,” Pattern Analysis and Machine Intelligence, IEEE Transactions , 35.8: 1798–1828.
  • Berners-Lee, T., Hendler, J. & Lassila, O., 2001, “The Semantic Web,” Scientific American , 284: 34–43.
  • Bishop, M. & Preston, J., 2002, Views into the Chinese Room: New Essays on Searle and Artificial Intelligence , Oxford, UK: Oxford University Press.
  • Boden, M., 1994, “Creativity and Computers,” in Artificial Intelligence and Computers , T. Dartnall, ed., Dordrecht, The Netherlands: Kluwer, pp. 3–26.
  • Boolos, G. S., Burgess, J.P., & Jeffrey., R.C., 2007, Computability and Logic 5th edition , Cambridge: Cambridge University Press.
  • Bostrom, N., 2014, Superintelligence: Paths, Dangers, Strategies , Oxford, UK: Oxford University Press.
  • Bowie, G.L., 1982, “Lucas’ Number is Finally Up,” Journal of Philosophical Logic , 11: 279–285.
  • Brachman, R. & Levesque, H., 2004, Knowledge Representation and Reasoning , San Francisco, CA: Morgan Kaufmann/Elsevier.
  • Bringsjord, S., Arkoudas K. & Bello P., 2006, “Toward a General Logicist Methodology for Engineering Ethically Correct Robots,” IEEE Intelligent Systems, 21.4: 38–44.
  • Bringsjord, S. & Ferrucci, D., 1998, “Logic and Artificial Intelligence: Divorced, Still Married, Separated…?” Minds and Machines , 8: 273–308.
  • Bringsjord, S. & Schimanski, B., 2003, “What is Artificial Intelligence? Psychometric AI as an Answer,” Proceedings of the 18 th International Joint Conference on Artificial Intelligence (IJCAI-2003) , (San Francisco, CA: MIT Morgan Kaufmann), pp. 887–893.
  • Bringsjord, S. & Ferrucci, D., 2000, Artificial Intelligence and Literary Creativity: Inside the Mind of Brutus, a Storytelling Machine , Mahwah, NJ: Lawrence Erlbaum.
  • Bringsjord, S. & van Heuveln, B., 2003, “The Mental Eye Defense of an Infinitized Version of Yablo’s Paradox,” Analysis 63.1: 61–70.
  • Bringsjord S. & Xiao, H., 2000, “A Refutation of Penrose’s Gödelian Case Against Artificial Intelligence,” Journal of Experimental and Theoretical Artificial Intelligence , 12: 307–329.
  • Bringsjord, S. & Zenzen, M., 2002, “Toward a Formal Philosophy of Hypercomputation,” Minds and Machines , 12: 241–258.
  • Bringsjord, S., 2000, “Animals, Zombanimals, and the Total Turing Test: The Essence of Artificial Intelligence,” Journal of Logic, Language, and Information , 9: 397–418.
  • Bringsjord, S., 1998, “Philosophy and ‘Super’ Computation,” The Digital Phoenix: How Computers are Changing Philosophy , J. Moor and T. Bynam, eds., Oxford, UK: Oxford University Press, pp. 231–252.
  • Bringsjord, S., 1991, “Is the Connectionist-Logicist Clash one of AI’s Wonderful Red Herrings?” Journal of Experimental & Theoretical AI , 3.4: 319–349.
  • Bringsjord, S., Govindarajulu N. S., Eberbach, E. & Yang, Y., 2012, “Perhaps the Rigorous Modeling of Economic Phenomena Requires Hypercomputation,” International Journal of Unconventional Computing , 8.1: 3–32. [ Preprint available online ]
  • Bringsjord, S., 2011, “Psychometric Artificial Intelligence,” Journal of Experimental and Theoretical Artificial Intelligence , 23.3: 271–277.
  • Bringsjord, S. & Govindarajulu N. S., 2012, “Given the Web, What is Intelligence, Really?” Metaphilosophy 43.12: 464–479.
  • Brooks, R. A., 1991, “Intelligence Without Representation,” Artificial Intelligence , 47: 139–159.
  • Browne, C. B., Powley, E. & Whitehouse, D., 2012, “A Survey of Monte Carlo Tree Search Methods,” IEEE Transactions on Computational Intelligence and AI in Games, 4.1: 1–43.
  • Buchanan, B. G., 2005, “A (Very) Brief History of Artificial Intelligence,” AI Magazine , 26.4: 53–60.
  • Carroll, L., 1958, Symbolic Logic; Game of Logic , New York, NY: Dover.
  • Cassimatis, N., 2006, “Cognitive Substrate for Human-Level Intelligence,” AI Magazine , 27.2: 71–82.
  • Chajed, T., Chen, H., Chlipala, A., Kaashoek, F., Zeldovich, N., & Ziegler, D., 2017, “Research Highlight: Certifying a File System using Crash Hoare Logic: Correctness in the Presence of Crashes,” Communications of the ACM (CACM) , 60.4: 75–84.
  • Chalmers, D., 2010, “The Singularity: A Philosophical Analysis,” Journal of Consciousness Studies , 17: 7–65.
  • Charniak, E., 1993, Statistical Language Learning , Cambridge: MIT Press.
  • Charniak, E. & McDermott, D., 1985, Introduction to Artificial Intelligence , Reading, MA: Addison Wesley.
  • Chellas, B., 1980, Modal Logic: An Introduction , Cambridge, UK: Cambridge University Press.
  • Chisholm, R., 1957, Perceiving , Ithaca, NY: Cornell University Press.
  • Chisholm, R., 1966, Theory of Knowledge , Englewood Cliffs, NJ: Prentice-Hall.
  • Chisholm, R., 1977, Theory of Knowledge 2nd ed , Englewood Cliffs, NJ: Prentice-Hall.
  • Clark, A., 2003, Natural-Born Cyborgs , Oxford, UK: Oxford University Press.
  • Clark, M. H., 2010, Cognitive Illusions and the Lying Machine: A Blueprint for Sophistic Mendacity , PhD dissertation, Rensselaer Polytechnic Institute (Cognitive Science).
  • Copeland, B. J., 1998, “Super Turing Machines,” Complexity , 4: 30–32.
  • Copi, I. & Cohen, C., 2004, Introduction to Logic , Saddle River, NJ: Prentice-Hall.
  • Dennett, D., 1998, “Artificial Life as Philosophy,” in his Brainchildren: Essays on Designing Minds , Cambridge, MA: MIT Press, pp. 261–263.
  • Dennett, D., 1994, “The Practical Requirements for Making a Conscious Robot,” Philosophical Transactions of the Royal Society of London , 349: 133–146.
  • Dennett, D., 1979, “Artificial Intelligence as Philosophy and as Psychology,” Philosophical Perspectives in Artificial Intelligence , M. Ringle, ed., Atlantic Highlands, NJ: Humanities Press, pp. 57–80.
  • Descartes, R., 1637, in Haldane, E. and Ross, G.R.T., translators, 1911, The Philosophical Works of Descartes, Volume 1, Cambridge, UK: Cambridge University Press.
  • Dick, P. K., 1968, Do Androids Dream of Electric Sheep? , New York, NY: Doubleday.
  • Domingos, P., 2012, “A Few Useful Things to Know about Machine Learning,” Communications of the ACM , 55.10: 78–87.
  • Dreyfus, H., 1972, What Computers Can’t Do , Cambridge, MA: MIT Press.
  • Dreyfus, H., 1992, What Computers Still Can’t Do , Cambridge, MA: MIT Press.
  • Dreyfus, H. & Dreyfus, S., 1987, Mind Over Machine: The Power of Human Intuition and Expertise in the Era of the Computer , New York, NY: Free Press.
  • Ebbinghaus, H., Flum, J. & Thomas, W., 1984, Mathematical Logic , New York, NY: Springer-Verlag.
  • Eden, A., Moor, J., Soraker, J. & Steinhart, E., 2013, Singularity Hypotheses: A Scientific and Philosophical Assessment , New York, NY: Springer.
  • España-Bonet, C., Enache, R., Slaski, A., Ranta, A., Màrquez L. & Gonzàlez, M., 2011, “Patent Translation within the MOLTO project,” in Proceedings of the 4th Workshop on Patent Translation, MT Summit XIII , pp. 70–78.
  • Evans, G., 1968, “A Program for the Solution of a Class of Geometric-Analogy Intelligence-Test Questions,” in M. Minsky, ed., Semantic Information Processing , Cambridge, MA: MIT Press, pp. 271–353.
  • Fagin, R., Halpern, J. Y., Moses, Y. & Vardi, M., 2004, Reasoning About Knowledge , Cambridge, MA: MIT Press.
  • Ferrucci, D. & Lally, A., 2004, “UIMA: An Architectural Approach to Unstructured Information Processing in the Corporate Research Environment,” Natural Language Engineering , 10.3–4: 327–348. Cambridge, UK: Cambridge University Press.
  • Ferrucci, D., Brown, E., Chu-Carroll, J., Fan, J., Gondek, D., Kalyanpur, A., Lally, A., Murdock, J., Nyberg, E., Prager, J., Schlaefer, N. & Welty, C., 2010, “Building Watson: An Overview of the DeepQA Project,” AI Magazine , 31.3: 59–79.
  • Finnsson, H., 2012, “Generalized Monte-Carlo Tree Search Extensions for General Game Playing,” in Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence (AAAI-2012), Toronto, Canada, pp. 1550–1556.
  • Fitelson, B., 2005, “Inductive Logic,” in Pfeifer, J. and Sarkar, S., eds., Philosophy of Science: An Encyclopedia , London, UK: Routledge, pp. 384–394.
  • Floridi, L., 2015, “Singularitarians, AItheists, and Why the Problem with Artificial Intelligence is H.A.L. (Humanity At Large), not HAL,” APA Newsletter: Philosophy and Computers , 14.2: 8–11.
  • Foot, P., 1967, “The Problem of Abortion and the Doctrine of the Double Effect,” Oxford Review , 5: 5–15.
  • Forbus, K. D. & Hinrichs, T. R., 2006, “Companion Cognitive Systems: A Step toward Human-Level AI,” AI Magazine , 27.2: 83.
  • Ford, K. M., Glymour C. & Hayes P., 1997, “On the Other Hand … Cognitive Prostheses,” AI Magazine , 18.3: 104.
  • Friedland, N., Allen, P., Matthews, G., Witbrock, M., Baxter, D., Curtis, J., Shepard, B., Miraglia, P., Angele, J., Staab, S., Moench, E., Oppermann, H., Wenke, D., Israel, D., Chaudhri, V., Porter, B., Barker, K., Fan, J., Yi Chaw, S., Yeh, P., Tecuci, D. & Clark, P., 2004, “Project Halo: Towards a Digital Aristotle,” AI Magazine , 25.4: 29–47.
  • Genesereth, M., Love, N. & Pell B., 2005, “General Game Playing: Overview of the AAAI Competition,” AI Magazine , 26.2: 62–72. [ Available online ]
  • Ginsberg, M., 1993, Essentials of Artificial Intelligence , New York, NY: Morgan Kaufmann.
  • Glymour, G., 1992, Thinking Things Through , Cambridge, MA: MIT Press.
  • Goertzel, B. & Pennachin, C., eds., 2007, Artificial General Intelligence , Berlin, Heidelberg: Springer-Verlag.
  • Gold, M., 1965, “Limiting Recursion,” Journal of Symbolic Logic , 30.1: 28–47.
  • Goldstine, H. & von Neumann, J., 1947, “Planning and Coding of Problems for an Electronic Computing Instrument,” IAS Reports Institute for Advanced Study, Princeton, NJ. [This remarkable work is available online from the Institute for Advanced Study. Please note that this paper is Part II of a three-volume set. The first volume was devoted to a preliminary discussion, and the first author on it was Arthur Burks, joining Goldstine and von Neumann.]
  • Good, I., 1965, “Speculations Concerning the First Ultraintelligent Machines,” in Advances in Computing (vol. 6), F. Alt and M. Rubinoff, eds., New York, NY: Academic Press, pp. 31–38.
  • Govindarajulu, N. S., Bringsjord, S. & Licato J., 2013, “On Deep Computational Formalization of Natural Language,” in Proceedings of the Workshop “Formalizing Mechanisms for Artificial General Intelligence and Cognition (Formal MAGiC),” Osnabrück, Germany: PICS.
  • Govindarajulu, N. S., & Bringsjord, S., 2015, “Ethical Regulation of Robots Must Be Embedded in Their Operating Systems” in Trappl, R., ed., A Construction Manual for Robot’s Ethical Systems: Requirements, Methods, Implementations , Berlin, DE: Springer.
  • Govindarajulu, N. S., & Bringsjord, S., 2017, “On Automating the Doctrine of Double Effect,” in Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17) , pp. 4722–4730. doi:10.24963/ijcai.2017/658
  • Granger, R., 2004a, “Derivation and Analysis of Basic Computational Operations of Thalamocortical Circuits,” Journal of Cognitive Neuroscience 16: 856–877.
  • Granger, R., 2004b, “Brain Circuit Implementation: High-precision Computation from Low-Precision Components,” in Toward Replacement Parts for the Brain , T. Berger and D. Glanzman, eds., Cambridge, MA: MIT Press, pp. 277–294.
  • Griewank, A., 2000, Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation, Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM).
  • Guizzo, E., 2011, “How Google’s Self-driving Car Works,” IEEE Spectrum Online . [ Available online]
  • Hailperin, T., 1996, Sentential Probability Logic: Origins, Development, Current Status, and Technical Applications , Bethlehem, United States: Lehigh University Press.
  • Hailperin, T., 2010, Logic with a Probability Semantics , Bethlehem, United States: Lehigh University Press.
  • Halpern, J. Y., 1990, “An Analysis of First-order Logics of Probability,” Artificial Intelligence , 46: 311–350.
  • Halpern, J., Harper, R., Immerman, N., Kolaitis, P. G., Vardi, M. & Vianu, V., 2001, “On the Unusual Effectiveness of Logic in Computer Science,” The Bulletin of Symbolic Logic , 7.2: 213–236.
  • Hamkins, J. & Lewis, A., 2000, “Infinite Time Turing Machines,” Journal of Symbolic Logic , 65.2: 567–604.
  • Harnad, S., 1991, “Other Bodies, Other Minds: A Machine Incarnation of an Old Philosophical Problem,” Minds and Machines , 1.1: 43–54.
  • Haugeland, J., 1985, Artificial Intelligence: The Very Idea , Cambridge, MA: MIT Press.
  • Hendler, J. & Jennifer G., 2008, “Metcalfe’s Law, Web 2.0, and the Semantic Web,” Web Semantics: Science, Services and Agents on the World Wide Web , 6.1: 14–20.
  • Hinton, G., Deng, L., Yu, D., Dahl, G.E., Mohamed, A. R., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T. & Kingsbury, B., 2012, “Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups,” IEEE Signal Processing Magazine , 29.6: 82–97.
  • Hoffman, R. R., Hayes, P. J. & Ford, K. M., 2001, “Human-Centered Computing: Thinking In and Out of the Box,” IEEE Intelligent Systems , 16.5: 76–78.
  • Hoffman, R. R., Bradshaw J. M., Hayes P. J. & Ford K. M., 2003, “ The Borg Hypothesis,” IEEE Intelligent Systems , 18.5: 73–75.
  • Hofstadter, D. & McGraw, G., 1995, “Letter Spirit: Esthetic Perception and Creative Play in the Rich Microcosm of the Roman Alphabet,” in Hofstadter’s Fluid Concepts and Creative Analogies: Computer Models of the Fundamental Mechanisms of Thought , New York, NY: Basic Books, pp. 407–488.
  • Hornik, K., Stinchcombe, M. & White, H., 1989, “Multilayer Feedforward Networks are Universal Approximators,” Neural Networks , 2.5: 359–366.
  • Hutter, M., 2005, Universal Artificial Intelligence , Berlin: Springer.
  • Joy, W., 2000, “Why the Future Doesn’t Need Us,” Wired 8.4. [ Available online ]
  • Kahneman, D., 2013. Thinking, Fast and Slow , New York, NY: Farrar, Straus, and Giroux.
  • Kaufmann, M., Manolios, P. & Moore, J. S., 2000, Computer-Aided Reasoning: ACL2 Case Studies , Dordrecht, The Netherlands: Kluwer Academic Publishers.
  • Klenk, M., Forbus, K., Tomai, E., Kim,H. & Kyckelhahn, B., 2005, “Solving Everyday Physical Reasoning Problems by Analogy using Sketches,” in Proceedings of 20th National Conference on Artificial Intelligence (AAAI-05), Pittsburgh, PA.
  • Kolata, G., 1996, “Computer Math Proof Shows Reasoning Power,” in New York Times. [ Available online ]
  • Koller, D., Levy, A. & Pfeffer, A., 1997, “P-CLASSIC: A Tractable Probabilistic Description Logic,” in Proceedings of the AAAI 1997 Meeting, 390–397.
  • Kurzweil, R., 2006, The Singularity Is Near: When Humans Transcend Biology , New York, NY: Penguin USA.
  • Kurzweil, R., 2000, The Age of Spiritual Machines: When Computers Exceed Human Intelligence , New York, NY: Penguin USA.
  • LaForte, G., Hayes P. & Ford, K., 1998, “Why Gödel’s Theorem Cannot Refute Computationalism,” Artificial Intelligence, 104: 265–286.
  • Laird, J. E., 2012, The Soar Cognitive Architecture , Cambridge, MA: MIT Press.
  • Laird, J. & VanLent M., 2001, “Human-level AI’s Killer Application: Interactive Computer Games,” AI Magazine 22.2:15–26.
  • LeCun, Y., Bengio, Y. & Hinton G., 2015, “Deep Learning,” Nature , 521: 436–444.
  • Lenat, D., 1983, “EURISKO: A Program that Learns New Heuristics and Domain Concepts,” Artificial Intelligence , 21(1-2): 61–98. doi:10.1016/s0004-3702(83)80005-8
  • Lenat, D., & Guha, R. V., 1990, Building Large Knowledge-Based Systems: Representation and Inference in the Cyc Project , Reading, MA: Addison Wesley.
  • Lenzen, W., 2004, “Leibniz’s Logic,” in Gabbay, D., Woods, J. and Kanamori, A., eds., Handbook of the History of Logic , Elsevier, Amsterdam, The Netherlands, pp. 1–83.
  • Lewis, H. & Papadimitriou, C., 1981, Elements of the Theory of Computation, Englewood Cliffs, NJ: Prentice Hall.
  • Litt, A., Eliasmith, C., Kroon, F., Weinstein, S. & Thagard, P., 2006, “Is the Brain a Quantum Computer?” Cognitive Science 30: 593–603.
  • Lucas, J. R., 1964, “Minds, Machines, and Gödel,” in Minds and Machines, A. R. Anderson, ed., Englewood Cliffs, NJ: Prentice-Hall, pp. 43–59.
  • Luger, G., 2008, Artificial Intelligence: Structures and Strategies for Complex Problem Solving , New York, NY: Pearson.
  • Luger, G. & Stubblefield, W., 1993, Artificial Intelligence: Structures and Strategies for Complex Problem Solving , Redwood, CA: Benjamin Cummings.
  • Lopez, A., 2008, “Statistical Machine Translation,” ACM Computing Surveys , 40.3: 1–49.
  • Malle, B. F., Scheutz, M., Arnold, T., Voiklis, J. & Cusimano, C., 2015, “Sacrifice One For the Good of Many?: People Apply Different Moral Norms to Human and Robot Agents,” in Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction (HRI ’15) (New York, NY: ACM), pp. 117–124.
  • Manzano, M., 1996, Extensions of First Order Logic , Cambridge, UK: Cambridge University Press.
  • Marcus, G., 2013, “Why Can’t My Computer Understand Me?,” in The New Yorker , August 2013. [ Available online ]
  • McCarthy, J. & Hayes, P., 1969, “Some Philosophical Problems from the Standpoint of Artificial Intelligence,” in Machine Intelligence 4 , B. Meltzer and D. Michie, eds., Edinburgh: Edinburgh University Press, 463–502.
  • Mueller, E., 2006, Commonsense Reasoning , San Francisco, CA: Morgan Kaufmann.
  • Murphy, K. P., 2012, Machine Learning: A Probabilistic Perspective , Cambridge, MA: MIT Press.
  • Minsky, M. & Papert, S., 1969, Perceptrons: An Introduction to Computational Geometry, Cambridge, MA: MIT Press.
  • Montague, R., 1970, “Universal Grammar,” Theoria , 36, 373–398.
  • Moor, J., 2006, “The Nature, Importance, and Difficulty of Machine Ethics”, IEEE Intelligent Systems 21.4: 18–21.
  • Moor, J., 1985, “What is Computer Ethics?” Metaphilosophy 16.4: 266–274.
  • Moor, J., ed., 2003, The Turing Test: The Elusive Standard of Artificial Intelligence , Dordrecht, The Netherlands: Kluwer Academic Publishers.
  • Moravec, H., 1999, Robot: Mere Machine to Transcendent Mind, Oxford, UK: Oxford University Press.
  • Naumowicz, A. & Kornilowicz, A., 2009, “A Brief Overview of Mizar,” in Theorem Proving in Higher Order Logics, S. Berghofer, T. Nipkow, C. Urban & M. Wenzel, eds., Berlin: Springer, pp. 67–72.
  • Newell, A., 1973, “You Can’t Play 20 Questions with Nature and Win: Projective Comments on the Papers of this Symposium”, in Visual Information Processing, W. Chase, ed., New York, NY: Academic Press, pp. 283–308.
  • Nilsson, N., 1998, Artificial Intelligence: A New Synthesis , San Francisco, CA: Morgan Kaufmann.
  • Nilsson, N., 1987, Principles of Artificial Intelligence , New York, NY: Springer-Verlag.
  • Nilsson, N., 1991, “Logic and Artificial Intelligence,” Artificial Intelligence , 47: 31–56.
  • Nozick, R., 1970, “Newcomb’s Problem and Two Principles of Choice,” in Essays in Honor of Carl G. Hempel , N. Rescher, ed., Highlands, NJ: Humanities Press, pp. 114–146. This appears to be the very first published treatment of NP – though the paradox goes back to its creator: William Newcomb, a physicist.
  • Osherson, D., Stob, M. & Weinstein, S., 1986, Systems That Learn , Cambridge, MA: MIT Press.
  • Pearl, J., 1988, Probabilistic Reasoning in Intelligent Systems , San Mateo, CA: Morgan Kaufmann.
  • Pennington, J., Socher R., & Manning C. D., 2014, “GloVe: Global Vectors for Word Representation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014) , pp. 1532–1543. [ Available online ]
  • Penrose, R., 1989, The Emperor’s New Mind , Oxford, UK: Oxford University Press.
  • Penrose, R., 1994, Shadows of the Mind , Oxford, UK: Oxford University Press.
  • Penrose, R., 1996, “Beyond the Doubting of a Shadow: A Reply to Commentaries on Shadows of the Mind ,” Psyche , 2.3. This paper is available online.
  • Pereira, L., & Saptawijaya A., 2016, Programming Machine Ethics , Berlin, Germany: Springer
  • Pinker, S., 1997, How the Mind Works , New York, NY: Norton.
  • Pollock, J., 2006, Thinking about Acting: Logical Foundations for Rational Decision Making , Oxford, UK: Oxford University Press.
  • Pollock, J., 2001, “Defeasible Reasoning with Variable Degrees of Justification,” Artificial Intelligence , 133, 233–282.
  • Pollock, J., 1995, Cognitive Carpentry: A Blueprint for How to Build a Person , Cambridge, MA: MIT Press.
  • Pollock, J., 1992, “How to Reason Defeasibly,” Artificial Intelligence , 57, 1–42.
  • Pollock, J., 1989, How to Build a Person: A Prolegomenon , Cambridge, MA: MIT Press.
  • Pollock, J., 1974, Knowledge and Justification , Princeton, NJ: Princeton University Press.
  • Pollock, J., 1967, “Criteria and our Knowledge of the Material World,” Philosophical Review , 76, 28–60.
  • Pollock, J., 1965, Analyticity and Implication , PhD dissertation, University of California at Berkeley (Philosophy).
  • Potter, M.D., 2004, Set Theory and its Philosophy , Oxford, UK: Oxford University Press
  • Preston, J. & Bishop, M., 2002, Views into the Chinese Room: New Essays on Searle and Artificial Intelligence , Oxford, UK: Oxford University Press.
  • Putnam, H., 1965, “Trial and Error Predicates and a Solution to a Problem of Mostowski,” Journal of Symbolic Logic , 30.1, 49–57.
  • Putnam, H., 1963, “Degree of Confirmation and Inductive Logic,” in The Philosophy of Rudolf Carnap , Schilipp, P., ed., Open Court, pp. 270–292.
  • Rajat, R., Anand, M. & Ng, A. Y., 2009, “Large-scale Deep Unsupervised Learning Using Graphics Processors,” in Proceedings of the 26th Annual International Conference on Machine Learning , ACM, pp. 873–880.
  • Rapaport, W., 1988, “Syntactic Semantics: Foundations of Computational Natural-Language Understanding,” in Aspects of Artificial Intelligence , J. H. Fetzer ed., Dordrecht, The Netherlands: Kluwer Academic Publishers, 81–131.
  • Rapaport, W. & Shapiro, S., 1999, “Cognition and Fiction: An Introduction,” Understanding Language Understanding: Computational Models of Reading , A. Ram & K. Moorman, eds., Cambridge, MA: MIT Press, 11–25. [ Available online ]
  • Reeke, G. & Edelman, G., 1988, “Real Brains and Artificial Intelligence,” in The Artificial Intelligence Debate: False Starts, Real Foundations , Cambridge, MA: MIT Press, pp. 143–173.
  • Richardson, M. & Domingos, P., 2006, “Markov Logic Networks,” Machine Learning , 62.1–2:107–136.
  • Robertson, G. & Watson, I., 2015, “A Review of Real-Time Strategy Game AI,” AI Magazine , 35.4: 75–104.
  • Rosenschein, S. & Kaelbling, L., 1986, “The Synthesis of Machines with Provable Epistemic Properties,” in Proceedings of the 1986 Conference on Theoretical Aspects of Reasoning About Knowledge , San Mateo, CA: Morgan Kaufmann, pp. 83–98.
  • Rumelhart, D. & McClelland, J., 1986, eds., Parallel Distributed Processing , Cambridge, MA: MIT Press.
  • Russell, S., 1997, “Rationality and Intelligence,” Artificial Intelligence , 94: 57–77. [ Version available online from author ]
  • Russell, S. & Norvig, P., 1995, Artificial Intelligence: A Modern Approach , Saddle River, NJ: Prentice Hall.
  • Russell, S. & Norvig, P., 2002, Artificial Intelligence: A Modern Approach 2nd edition , Saddle River, NJ: Prentice Hall.
  • Russell, S. & Norvig, P., 2009, Artificial Intelligence: A Modern Approach 3rd edition , Saddle River, NJ: Prentice Hall.
  • Rychtyckyj, N. & Plesco, C., 2012, “Applying Automated Language Translation at a Global Enterprise Level,” AI Magazine , 34.1: 43–54.
  • Scanlon, T. M., 1982, “Contractualism and Utilitarianism,” in A. Sen and B. Williams, eds., Utilitarianism and Beyond , Cambridge: Cambridge University Press, pp. 103–128.
  • Schank, R., 1972, “Conceptual Dependency: A Theory of Natural Language Understanding,” Cognitive Psychology , 3.4: 532–631.
  • Schaul, T. & Schmidhuber, J., 2010, “Metalearning,” Scholarpedia 5(6): 4650. URL: http://www.scholarpedia.org/article/Metalearning
  • Schmidhuber, J., 2009, “Ultimate Cognition à la Gödel,” Cognitive Computation 1.2: 177–193.
  • Searle, J., 1997, The Mystery of Consciousness , New York, NY: New York Review of Books.
  • Searle, J., 1980, “Minds, Brains and Programs,” Behavioral and Brain Sciences , 3: 417–424.
  • Searle, J., 1984, Minds, Brains and Science , Cambridge, MA: Harvard University Press. The Chinese Room Argument is covered in Chapter Two, “Can Computers Think?”.
  • Searle, J., 2014, “What Your Computer Can’t Know,” New York Review of Books , October 9.
  • Shapiro, S., 2000, “An Introduction to SNePS 3,” in Conceptual Structures: Logical, Linguistic, and Computational Issues. Lecture Notes in Artificial Intelligence 1867 , B. Ganter & G. W. Mineau, eds., Springer-Verlag, 510–524.
  • Shapiro, S., 2003, “Mechanism, Truth, and Penrose’s New Argument,” Journal of Philosophical Logic , 32.1: 19–42.
  • Siegelmann, H., 1999, Neural Networks and Analog Computation: Beyond the Turing Limit , Boston, MA: Birkhauser.
  • Siegelmann, H. & Sontag, E., 1994, “Analog Computation Via Neural Nets,” Theoretical Computer Science, 131: 331–360.
  • Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., van den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., Dieleman, S., Grewe, D., Nham, J., Kalchbrenner, N., Sutskever, I., Lillicrap, T., Leach, M., Kavukcuoglu, K., Graepel T. & Hassabis D., 2016, “Mastering the Game of Go with Deep Neural Networks and Tree Search,” Nature , 529: 484–489.
  • Shin, S-J, 2002, The Iconic Logic of Peirce’s Graphs , Cambridge, MA: MIT Press.
  • Smolensky, P., 1988, “On the Proper Treatment of Connectionism,” Behavioral & Brain Sciences , 11: 1–22.
  • Somers, J., 2013, “The Man Who Would Teach Machines to Think,” in The Atlantic . [ Available online ]
  • Stanovich, K. & West, R., 2000, “Individual Differences in Reasoning: Implications for the Rationality Debate,” Behavioral and Brain Sciences , 23.5: 645–665.
  • Strzalkowski, T. & Harabagiu, M. S., 2006, eds., Advances in Open Domain Question Answering ; in the series Text, Speech and Language Technology, volume 32, Dordrecht, The Netherlands: Springer-Verlag.
  • Sun, R., 2002, Duality of the Mind: A Bottom Up Approach Toward Cognition , Mahwah, NJ: Lawrence Erlbaum.
  • Sun, R., 1994, Integrating Rules and Connectionism for Robust Commonsense Reasoning , New York, NY: John Wiley and Sons.
  • Sutton R. S. & Barto A. G., 1998, Reinforcement Learning: An Introduction , Cambridge, MA: MIT Press.
  • Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I. & Fergus, R., 2014, “Intriguing Properties of Neural Networks,” in Second International Conference on Learning Representations , Banff, Canada. [ Available online ]
  • Hastie, T., Tibshirani, R. & Friedman, J., 2009, The Elements of Statistical Learning, in the series Springer Series in Statistics, New York: Springer.
  • Turing, A., 1950, “Computing Machinery and Intelligence,” Mind , LIX: 433–460.
  • Turing, A., 1936, “On Computable Numbers, with an Application to the Entscheidungsproblem,” Proceedings of the London Mathematical Society, 42: 230–265.
  • Vilalta, R. & Drissi, Y., 2002, “A Perspective View and Survey of Meta-learning,” Artificial Intelligence Review , 18.2:77–95.
  • Voronkov, A., 1995, “The Anatomy of Vampire: Implementing Bottom-Up Procedures with Code Trees,” Journal of Automated Reasoning , 15.2.
  • Wallach, W. & Allen, C., 2010, Moral Machines: Teaching Robots Right from Wrong , Oxford, UK: Oxford University Press.
  • Wermter, S. & Sun, R., 2001 (Spring), “The Present and the Future of Hybrid Neural Symbolic Systems: Some Reflections from the Neural Information Processing Systems Workshop,” AI Magazine , 22.1: 123–125.
  • Suppes, P., 1972, Axiomatic Set Theory , New York, NY: Dover.
  • Whiteson, S. & Whiteson, D., 2009, “Machine Learning for Event Selection in High Energy Physics,” Engineering Applications of Artificial Intelligence 22.8: 1203–1217.
  • Rumelhart, D. E., Hinton, G. E. & Williams, R. J., 1986, “Learning Representations by Back-propagating Errors,” Nature, 323: 533–536.
  • Winston, P., 1992, Artificial Intelligence , Reading, MA: Addison-Wesley.
  • Wos, L., Overbeek, R., Lusk R. & Boyle, J., 1992, Automated Reasoning: Introduction and Applications (2nd edition) , New York, NY: McGraw-Hill.
  • Wos, L., 2013, “The Legacy of a Great Researcher,” in Automated Reasoning and Mathematics: Essays in Memory of William McCune , Bonacina, M.P. & Stickel, M.E., eds., 1–14. Berlin: Springer.
  • Zalta, E., 1988, Intensional Logic and the Metaphysics of Intentionality , Cambridge, MA: Bradford Books.



Acknowledgments

Thanks are due to Peter Norvig and Prentice-Hall for allowing figures from AIMA to be used in this entry. Thanks are due as well to the many first-rate (human) minds who have read earlier drafts of this entry, and provided helpful feedback. Without the support of our AI research and development from both ONR and AFOSR, our knowledge of AI and ML would confessedly be acutely narrow, and we are grateful for the support. We are also very grateful to the anonymous referees who provided us with meticulous reviews in our reviewing round in late 2015 to early 2016. Special acknowledgements are due to the SEP editors and, in particular, Uri Nodelman for patiently working with us throughout and for providing technical and insightful editorial help.

Copyright © 2018 by Selmer Bringsjord <Selmer.Bringsjord@gmail.com> and Naveen Sundar Govindarajulu <Naveen.Sundar.G@gmail.com>


Elon Musk: AI will be smarter than any human around the end of next year

While Musk says superintelligence is coming soon, one critic says the prediction is "batsh*t crazy."

Benj Edwards - Apr 9, 2024 5:25 pm UTC


On Monday, Tesla CEO Elon Musk predicted the imminent rise of AI superintelligence during a live interview streamed on the social media platform X. "My guess is we'll have AI smarter than any one human probably around the end of next year," Musk said in his conversation with hedge fund manager Nicolai Tangen.

Just prior to that, Tangen had asked Musk, "What's your take on where we are in the AI race just now?" Musk told Tangen that AI "is the fastest advancing technology I've seen of any kind, and I've seen a lot of technology." He described computers dedicated to AI increasing in capability by "a factor of 10 every year, if not every six to nine months."


Musk made the prediction with an asterisk, saying that shortages of AI chips and high AI power demands could limit AI's capability until those issues are resolved. “Last year, it was chip-constrained,” Musk told Tangen. “People could not get enough Nvidia chips. This year, it’s transitioning to a voltage transformer supply. In a year or two, it’s just electricity supply.”

But not everyone is convinced that Musk's crystal ball is free of cracks. Grady Booch, a frequent critic of AI hype on social media who is perhaps best known for his work in software architecture, told Ars in an interview, "Keep in mind that Mr. Musk has a profoundly bad record at predicting anything associated with AI; back in 2016, he promised his cars would ship with FSD safety level 5, and here we are, closing in on a decade later, still waiting."

Creating artificial intelligence at least as smart as a human (frequently called "AGI" for artificial general intelligence) is often seen as inevitable among AI proponents, but there's no broad consensus on exactly when that milestone will be reached—or on the exact definition of AGI, for that matter.

"If you define AGI as smarter than the smartest human, I think it's probably next year, within two years," Musk added in the interview with Tangen while discussing AGI timelines.

Even with uncertainties about AGI, that hasn't kept companies from trying. ChatGPT creator OpenAI, which launched with Musk as a co-founder in 2015, lists developing AGI as its main goal. Musk has not been directly associated with OpenAI for years (unless you count a recent lawsuit against the company), but last year, he took aim at the business of large language models by forming a new company called xAI. Its main product, Grok, functions similarly to ChatGPT and is integrated into the X social media platform.

Booch gives credit to Musk's business successes but casts doubt on his forecasting ability. "Albeit a brilliant if not rapacious businessman, Mr. Musk vastly overestimates both the history as well as the present of AI while simultaneously diminishing the exquisite uniqueness of human intelligence," says Booch. "So in short, his prediction is—to put it in scientific terms—batshit crazy."

So when will we get AI that's smarter than a human? Booch says there's no real way to know at the moment. "I reject the framing of any question that asks when AI will surpass humans in intelligence because it is a question filled with ambiguous terms and considerable emotional and historic baggage," he says. "We are a long, long way from understanding the design that would lead us there."

We also asked Hugging Face AI researcher Dr. Margaret Mitchell to weigh in on Musk's prediction. "Intelligence ... is not a single value where you can make these direct comparisons and have them mean something," she told us in an interview. "There will likely never be agreement on comparisons between human and machine intelligence."

But even with that uncertainty, she feels there is one aspect of AI she can more reliably predict: "I do agree that neural network models will reach a point where men in positions of power and influence, particularly ones with investments in AI, will declare that AI is smarter than humans. By end of next year, sure. That doesn't sound far off base to me."



Elon Musk says there could be a 20% chance AI destroys humanity — but we should do it anyway

  • Elon Musk recalculated his cost-benefit analysis of AI's risk to humankind.
  • He estimates there's a 10-20% chance AI could destroy humanity but that we should build it anyway.
  • An AI safety expert told BI that Musk is underestimating the risk of potential catastrophe. 


Elon Musk is pretty sure AI is worth the risk, even if there's a 1-in-5 chance the technology turns against humans.

Speaking in a "Great AI Debate" seminar at the four-day Abundance Summit earlier this month, Musk recalculated his previous risk assessment on the technology, saying, "I think there's some chance that it will end humanity. I probably agree with Geoff Hinton that it's about 10% or 20% or something like that."

But, he added: "I think that the probable positive scenario outweighs the negative scenario."

Musk didn't mention how he calculated the risk.

What is p(doom)?

Roman Yampolskiy, an AI safety researcher and director of the Cyber Security Laboratory at the University of Louisville, told Business Insider that Musk is right in saying that AI could be an existential risk for humanity, but "if anything, he is a bit too conservative" in his assessment.

"Actual p(doom) is much higher in my opinion," Yampolskiy said, referring to the "probability of doom," or the likelihood that AI takes control of humankind or causes a humanity-ending event, such as creating a novel biological weapon or causing the collapse of society due to a large-scale cyber attack or nuclear war.

The New York Times called p(doom) "the morbid new statistic that is sweeping Silicon Valley," with various tech executives cited by the outlet as having estimates ranging from a 5 to 50% chance of an AI-driven apocalypse. Yampolskiy places the risk "at 99.999999%."


Yamploskiy said because it would be impossible to control advanced AI, our only hope is never to build it in the first place.

"Not sure why he thinks it is a good idea to pursue this technology anyway," Yamploskiy added. "If he is concerned about competitors getting there first, it doesn't matter as uncontrolled superintelligence is equally bad, no matter who makes it come into existence."

'Like a God-like intelligence kid'

Last November, Musk said there was a "not zero" chance the tech could end up "going bad," but didn't go so far as to say he believed it could be humanity-ending if it did.

Though he has been an advocate for the regulation of AI, Musk last year founded a company called xAI, dedicated to further expanding the power of the technology. xAI is a competitor to OpenAI, a company Musk cofounded with Sam Altman before stepping down from its board in 2018.

At the summit, Musk estimated that digital intelligence will exceed all human intelligence combined by 2030. While he maintains that the potential positives outweigh the negatives, he acknowledged, in some of the most direct terms he has used publicly, the risk to the world if AI development continues on its current trajectory.

"You kind of grow an AGI. It's almost like raising a kid, but one that's like a super genius, like a God-like intelligence kid — and it matters how you raise the kid," Musk said at the Silicon Valley event on March 19, referring to artificial general intelligence . "One of the things I think that's incredibly important for AI safety is to have a maximum sort of truth-seeking and curious AI."

Musk said his "ultimate conclusion" regarding the best way to achieve AI safety is to grow the AI in a manner that forces it to be truthful.

"Don't force it to lie, even if the truth is unpleasant," Musk said of the best way to keep humans safe from the tech. "It's very important. Don't make the AI lie."

Researchers have found that, once an AI learns to lie to humans, the deceptive behavior is impossible to reverse using current AI safety measures, The Independent reported.

"If a model were to exhibit deceptive behavior due to deceptive instrumental alignment or model poisoning, current safety training techniques would not guarantee safety and could even create a false impression of safety," the study cited by the outlet reads.

More troubling, the researchers added that it is plausible that AI may learn to be deceptive on its own rather than being specifically taught to lie.

"If it gets to be much smarter than us, it will be very good at manipulation because it would have learned that from us," Hinton, often referred to as the 'Godfather of AI,' who serves as Musk's basis for his risk assessment of the technology, told CNN . "And there are very few examples of a more intelligent thing being controlled by a less intelligent thing."

Representatives for Musk did not immediately respond to a request for comment from Business Insider.


Elon Musk predicts AI will be smarter than humans by next year


Elon Musk is getting even more bullish on artificial intelligence.

The Tesla CEO, in an interview on Twitter/X, has accelerated his forecast for the capabilities of AI, saying he expects large language models will surpass human intelligence by the end of 2025.

“My guess is that we’ll have AI that is smarter than any one human probably around the end of next year,” Musk told Nicolai Tangen, CEO of Norges Bank Investment Management, in an interview livestreamed on X, as reported by the Financial Times.

Within five years, Musk added, AI will be smarter than every human on earth.

It’s worth noting that Musk’s forecasts are often premature, especially when they revolve around products he is trying to sell—and his Grok is trying to become a player in the AI world. He has, for years, said self-driving Teslas were right around the corner, but the technology is still being refined. He originally said SpaceX would be flying to Mars by 2024 (the tentative date is now 2029). And Tesla’s Cybertruck rolled out more than two years later than he said it would.

The AI prediction, though, marks an acceleration of his earlier timeline. Previously, he had forecast that “full” artificial general intelligence wouldn’t be achieved until 2029.

Microchip bottlenecks are easing, however, Musk said, and that could accelerate the pace of development.

Musk’s bullishness on AI comes just one year after he signed an open letter calling for a six-month moratorium on the development of advanced AI systems, and follows his long history of voicing concerns about the threat he feels the technology may pose to humanity.

Musk was a cofounder of OpenAI, but left the company in 2018 and has been a critic of its work since. He has lately been looking to persuade investors to pour billions into Grok’s parent company, xAI, with a goal of bumping its valuation up to $18 billion.


MipTV to Tackle AI Concerns, FAST Channels and More Industry Issues

By Ben Croll


Questions about the free ad-supported streaming TV services (FAST) model and concerns about AI will take the foreground at the many conference panel sessions and keynotes organized for this year’s MipTV.

Both are familiar subjects at the audiovisual market – last year’s MipTV held a standing-room-only FAST summit, while Mipcom’s “Unlocking AI” summit drew similar attention this past October – and the interest they continue to provoke reflects their topicality.

TVREV analyst (and coiner of the FAST acronym) Alan Wolk will moderate a cosmopolitan panel bringing together David Salmon (Tubi), Kasia Kieli (Warner Bros. Discovery Poland and TVN), Natalie Gabathuler-Scully (Vevo), Peyton Lombardo (3Vision), Robert Andrae (Google), Jennifer Batty (Samsung TVPlus EMEA) and Jordan Warkol (OTTera).

On the tech front, Google TV’s Faz Aftab and Vitrina AI CEO Atul Phadnis will present an AI & Data session on content business strategies on Tuesday, ahead of the following day’s Tech and AI Innovation Summit, which is set to spotlight analysts Peter Robinson (Gone With) and Guy Bisson (Ampere Analysis), alongside media execs Tom Bowers (Hypothesis Media), Arash Pendari (Vionlabs AB) and Craig Peters (Getty Images).

“[AI will be] an important theme and motif of everything that happens in Cannes,” says AlixPartners managing director Mark Endemaño.

A Disney exec turned industry transformation expert, Endemaño will express concerns about AI in his MipTV opening keynote on Monday. “There’s no doubt that AI will take people’s jobs,” he says. “We’re seeing that already. The disruptive effect on the creative industries will be enormous.”

OpenAI’s recent introduction of the Sora text-to-video model – and CEO Sam Altman’s promise of a $7 trillion chipset investment – might account for the change in tone between last October’s more genteelly labeled summit and Endemaño’s upcoming address.

“Up until now, the impact has mostly been on entry-level jobs and on junior levels of productions,” he says. “Only Sora can take on things like stop motion, particularly in TV advertising. Those cheap and cheerful first uses will become more and more sophisticated over time, [reshaping the workflows for] things like concept art creation and previz.”

As for the FAST model, Endemaño sees cause for optimism as the wider industry dusts off older approaches and gives them a fresh coat of paint. Though he predicts widespread consolidation of most studio-backed streaming services, such a development could benefit both consumers and producers down the line.

“Consolidation will leave us with four or five key players,” Endemaño predicts. “[Leaving] the studios to revert back to a way of getting their great content out in the most economically possible way, priced across windows, working with different distribution partners, whether that’s Netflix or Apple or Prime or Disney, or any pay-TV, free-TV or digital platform from around the world.”

“The truth of the matter is, content creators want their work to be seen by the biggest audience possible, and they like to be paid,” he continues. “And here, both the ad-supported model and bundling can help reach a much wider audience – and a scaled one at that.”

Fittingly, given this future forecast of more bundling, windowing and ads, the industry transformation expert has titled his opening address, “Back to the Future.”


