On May 8, 2018, Google I/O was held at Shoreline Amphitheatre in Mountain View, California. If you are wondering what Google I/O is, don’t worry, I’ve got your back.
“Google I/O brings together developers from around the globe annually for talks, hands-on learning with Google experts, and the first look at Google’s latest developer products.”
In the keynote, Sundar Pichai, the CEO of Alphabet Inc. (Google’s parent company), shared the then-latest developments that Google had been working on. One of the projects he spoke about was something that perhaps no one saw coming: an application of Artificial Intelligence (AI), soon to be on our own smartphones, that left the world in awe. The project was called ‘Google Duplex’. This initiative enables AI to place a phone call to a hair salon, converse just like us humans, and book a haircut appointment – and the part where your jaw drops is that all of this takes place in the background on your phone, without any intervention from you! All you have to do is say:
“Ok Google, make me a haircut appointment on Tuesday morning anytime between 10 and 12.”
Don’t believe me? Catch the clipping at the end of the article!
The reason this is a big deal is that, at least within this narrow task, the AI came across as though it had passed the Turing Test! If you are unaware of what a Turing Test is, imagine that you ask a question to two entities, one a human and the other a machine, without knowing which is which. If the machine answers in a way indistinguishable from how a human would answer, and you cannot tell that it’s a machine, then, my friend, the machine has passed the Turing Test. This is the basic idea behind the Turing Test: to find out whether the machine, or rather the AI, can pass for a human in conversation. To know more, feel free to check out this link.
So it might have crossed your mind: how does AI actually do that? What more can AI do? Would the same AI model be used to converse with you and drive your car? To answer these questions, let’s dig deeper into the branches of AI. These branches will help us understand which type of solution to apply to the task at hand. Without further delay, let’s get to it.
The branches of AI are:
1. Expert systems
When you are facing a problem, say cavities in your teeth, the first thing you do is approach a dentist. You surely wouldn’t think of visiting a general physician. Why did you choose to go to a dentist rather than a general physician? Simply because a dentist is an expert in the treatment of the teeth. When it comes to AI, we can adopt a similar approach, that is, use an AI system which is an expert for the problem we intend to solve.
Thus, an Expert System, as the name suggests, is a program that specializes in one particular task, like a human expert. Such systems are designed primarily to solve intricate problems and to provide a human-like decision-making ability. An expert system makes use of a set of rules (called inference rules) that are defined by humans and a storehouse of knowledge (called the knowledge base) to solve the task at hand. The knowledge base is populated with data added by human experts of a particular domain, and a non-expert human then uses the program to obtain some information. Thus, the primary purpose of these systems was not to replace humans for the task, but rather to assist them in obtaining better results and a higher quality of work.
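To make the idea of rules plus a knowledge base concrete, here is a minimal sketch in Python. The symptom names and rules are made up purely for illustration (they are not medical advice and not taken from any real expert system), and the `infer` function is a simple form of forward chaining, which is one common way such systems apply their rules.

```python
# Hypothetical rules a dental expert might have entered.
# Each rule says: if all of these facts are known, conclude this new fact.
RULES = [
    ({"toothache", "sensitivity_to_sweets"}, "possible_cavity"),
    ({"bleeding_gums", "swollen_gums"}, "possible_gum_disease"),
    ({"possible_cavity"}, "recommend_dentist_visit"),
    ({"possible_gum_disease"}, "recommend_dentist_visit"),
]


def infer(facts):
    """Forward chaining: keep applying rules until no new fact is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            # A rule fires when all of its conditions are already known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# A non-expert user reports two symptoms and asks the system what follows.
symptoms = {"toothache", "sensitivity_to_sweets"}
conclusions = infer(symptoms) - symptoms
print(sorted(conclusions))
```

Notice that the program never "learns" anything here: all of its knowledge was written down by a human beforehand, which is exactly what separates this branch from Machine Learning.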
Expert Systems first emerged around the late 1960s and 1970s and are often regarded as one of the first successful approaches to AI. A few examples of this branch of AI are:
- DENDRAL: A chemical analysis expert system used in organic chemistry to identify previously unknown organic molecules.
- CaDeT: A diagnostic support system that can detect cancer at early stages.
2. Machine learning (ML)
The second term that comes to our mind when we talk about AI is ‘Machine Learning’. Human Learning is the process of increasing our knowledge and our ability to apply that knowledge when a relevant situation calls for it. Upon learning something, we can apply the knowledge not only to a situation already explored but also to one that we might never have faced. In the latter situation, on the basis of our previous experiences, we make a calculated guess in an attempt to solve the problem.
Similarly, Machine Learning is the ability of a machine to learn without being explicitly programmed, that is, to develop the ability to make calculated guesses about previously unseen observations. To enable machines to learn, we feed them historical data as their training experience. It is based on the idea that machines can learn from past data, identify patterns, and make decisions using algorithms. These algorithms are designed in such a way that they can learn and improve their performance automatically (even we humans do that!).
Depending on the learning approach we choose (or whichever is feasible), there are 3 categories:
A. Supervised learning
Consider a situation where a student is first taught a subject by a teacher. The student learns it and is then tested on the concepts taught. The questions in the test can be the same as those taught by the teacher, or different but based on the concepts previously taught. When a similar approach is used for machines, it is called ‘Supervised Learning’: the machine is trained on examples that come with the correct answers, and our AI system is then evaluated based on its performance in the test. An example of this can be ‘Stock Price Prediction.’
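As a rough sketch of the teach-then-test idea, the snippet below fits a straight line to a handful of made-up prices and then predicts a value it was never shown. The data, the `fit_line` helper, and the choice of a straight line are all assumptions for illustration; real stock prices are far noisier and are not predicted this simply.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Teaching": made-up (day, closing price) pairs with the answers provided.
days = [1, 2, 3, 4, 5]
prices = [10.0, 12.0, 14.0, 16.0, 18.0]
slope, intercept = fit_line(days, prices)

# "Testing": predict the price for a day the model has never seen.
predicted = slope * 6 + intercept
print(predicted)  # 20.0 for this perfectly linear toy data
```

The important part is the shape of the workflow: known questions with known answers go in first, and only afterwards is the model asked something new.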
B. Unsupervised learning
Consider a parallel situation where a student is not taught at all but is told to appear for the test directly. The test is to group together the questions in the paper that are similar to one another. When this approach is used for machines, it is called ‘Unsupervised Learning’. The test seems strange, right? This approach is suited for tasks like data clustering, where each cluster of data shows a similar trend (for example, the weather conditions on rainy days would be similar), and anomaly detection, where anomalies stand out because they do not fit any of those trends. An example of this can be ‘Identifying Bowlers and Batsmen.’
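Here is a small sketch of that grouping idea, using a stripped-down, one-dimensional version of the k-means clustering algorithm with two groups. The batting averages are invented, and `two_means` is a hypothetical helper written for this article, not a library function. Note that nobody tells the program which player is a bowler or a batsman; it only sees the numbers.

```python
def two_means(values, iterations=10):
    """Split numbers into two groups around two moving centers (1-D k-means, k=2)."""
    centers = [min(values), max(values)]  # simple deterministic starting guess
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for value in values:
            # Assign each value to whichever center is closer.
            nearest = 0 if abs(value - centers[0]) <= abs(value - centers[1]) else 1
            groups[nearest].append(value)
        # Move each center to the average of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups


# Made-up batting averages for six players, with no labels attached.
batting_averages = [8.5, 11.0, 9.2, 45.3, 52.1, 38.7]
low_group, high_group = two_means(batting_averages)
print(sorted(low_group))   # likely the bowlers
print(sorted(high_group))  # likely the batsmen
```

The algorithm only discovers that there are two natural groups. Deciding that the low-average group means "bowlers" is an interpretation we humans add afterwards.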
C. Reinforcement learning
Consider a third situation. Just like in unsupervised learning, the student is not taught and is made to appear for the test directly. The difference is that the student needs to reach a certain minimum score in order to pass and, until then, must reappear for the test again and again. The student is told only the marks he/she scored. The student has no choice but to start answering by trial and error. Over time, however, the student will learn about various trends and, after every test, will improve his/her answering strategy until an optimal strategy for passing the test emerges. When this approach is applied to machines, it’s called ‘Reinforcement Learning.’ An example of this can be ‘AI playing ping-pong.’
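The trial-and-error loop can be sketched with one of the simplest reinforcement learning setups, often called a multi-armed bandit with an epsilon-greedy strategy. Everything below is a simplifying assumption: the three "answering strategies", their hidden scores, and the fact that a strategy always yields the same score. Real problems such as ping-pong involve many steps and noisy feedback.

```python
import random

# Hypothetical scores each answering strategy earns. The learner cannot see these;
# it only finds out the score after taking the test with a chosen strategy.
HIDDEN_SCORES = [40, 75, 55]


def take_test(strategy):
    """The environment: returns only the marks scored, nothing else."""
    return HIDDEN_SCORES[strategy]


def learn(trials=50, epsilon=0.1, seed=0):
    """Epsilon-greedy learning: mostly repeat what worked, sometimes explore."""
    n = len(HIDDEN_SCORES)
    if trials < n:
        raise ValueError("need at least one trial per strategy")
    rng = random.Random(seed)
    totals = [0.0] * n
    counts = [0] * n
    for trial in range(trials):
        if trial < n:
            choice = trial                 # try every strategy once
        elif rng.random() < epsilon:
            choice = rng.randrange(n)      # explore: pick a random strategy
        else:
            averages = [totals[i] / counts[i] for i in range(n)]
            choice = averages.index(max(averages))  # exploit the best so far
        totals[choice] += take_test(choice)
        counts[choice] += 1
    averages = [totals[i] / counts[i] for i in range(n)]
    return averages.index(max(averages)), counts


best_strategy, attempts = learn()
print(best_strategy, attempts)
```

The learner is never told which strategy is correct. It works that out purely from the marks it receives, which is the defining trait of this category.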
3. Robotics
Before understanding Robotics and its association with AI, let’s first look at what a robot actually is. A robot is simply a machine that is capable of carrying out a complex series of actions automatically. It can be programmed to do so using a computer. It can be controlled using an external device like a remote, or its control can be embedded within itself. When we think of robots, we tend to picture something that looks like a human, whereas in reality a robot can look just like any other toy car.
When a piece of code allows this robot to think and intelligently choose its actions autonomously, it is an AI-powered Robot. These intelligent robots can perform tasks with their own intelligence. Generally, as the tasks that a robot must perform get more complex, it becomes necessary to make use of AI algorithms. Nowadays, AI and machine learning are being applied to build intelligent robots that can also interact socially like humans. One of the best-known examples of AI in robotics is the ‘Sophia Robot’.
4. Computer vision (CV)
A large part of our learning process involves our vision. We mostly learn about our environment and make our own interpretations based on what we see. We are able to see because the cornea and lens focus the light entering the eye to form an image on the retina. This image is then converted to electrical impulses and sent to the brain, which processes the image and perceives what we see.
Can we apply this technique to machines? The answer is – Yes, we can, with AI being the brain and a camera as its eyes. This is Computer Vision.
“Computer Vision, often abbreviated as CV, is defined as a field of study that seeks to develop techniques to help computers “see” and understand the content of digital images such as photographs and videos.” – Jason Brownlee, Founding Researcher at Machine Learning Mastery
Image Recognition is a subset of Computer Vision that helps us identify objects, people, places, features, and anything that an image or a video can provide us with. While we humans can effortlessly process and recognize what we see, machines face great difficulty doing this. We will explore how this can be done in the articles to come. Some of the examples where image recognition can be used are:
- Self-driving cars (the cameras on the car constantly observe and learn about the surroundings and, depending on what they observe, the car is controlled to turn, move, or stop)
- Face unlock (the phone searches for our face in the front-facing camera feed and, if it is found, prompts the phone to unlock)
- Image captioning (the AI is fed an image and tells us what it sees in it, say a busy street, a concert, or a person approaching you)
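To get a feel for why this is hard for machines, it helps to remember that a computer does not see a street or a face; it sees a grid of numbers. The toy sketch below treats a tiny made-up grayscale "image" as a list of brightness values and marks where the brightness changes sharply, which is a very basic form of edge detection. The image, the threshold, and the `horizontal_edges` helper are illustrative assumptions; real systems use far richer techniques.

```python
# A tiny made-up grayscale image: 0 is dark, 9 is bright.
# To us it is "a dark area next to a bright area"; to the machine it is just numbers.
IMAGE = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]


def horizontal_edges(image, threshold=5):
    """Mark 1 wherever two neighbouring pixels in a row differ by more than threshold."""
    edges = []
    for row in image:
        edges.append([
            1 if abs(row[col + 1] - row[col]) > threshold else 0
            for col in range(len(row) - 1)
        ])
    return edges


for row in horizontal_edges(IMAGE):
    print(row)  # each row is [0, 0, 1, 0, 0]: the edge sits between columns 2 and 3
```

Finding an edge like this is only a first step. Recognizing that a collection of edges forms a face or a traffic sign is where the real difficulty begins.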
5. Planning
Imagine that you want to buy some household items from the supermarket near your house. From your house, you can reach the supermarket by taking 3 different routes. Say you are working on a tight schedule and have a limited time in which you must return to your house. Undoubtedly, you will choose the shortest route from your house to the supermarket and back. This is Planning.
In AI Planning, an end goal needs to be achieved under some given constraints and contexts. In our supermarket example, the context is the knowledge of the routes and their respective travel times, the constraint is the limited time available, and the end goal is to make the purchase and return home without violating the constraint. The aim could also be to opt for the cheapest path. Whatever the constraints, the task remains one of optimizing towards the best, or at least an acceptable, solution. This can be done by exploring the unknown, by exploiting existing knowledge, or by both.
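The supermarket example translates almost directly into code. In the sketch below, the route names, travel times, and costs are invented, and `plan` is a hypothetical helper: it first throws away every route that violates the time constraint and then picks the best remaining one according to whatever we want to optimize.

```python
# The context: what we know about each route (made-up numbers, one-way).
ROUTES = {
    "main_road": {"minutes": 25, "cost": 2.0},
    "highway": {"minutes": 15, "cost": 5.0},
    "back_streets": {"minutes": 35, "cost": 1.0},
}


def plan(routes, max_round_trip_minutes, objective="minutes"):
    """Return the best route that satisfies the time constraint, or None."""
    # Constraint: going there and back must fit in the time available.
    feasible = {
        name: info
        for name, info in routes.items()
        if 2 * info["minutes"] <= max_round_trip_minutes
    }
    if not feasible:
        return None
    # Goal: among the feasible routes, minimize the chosen objective.
    return min(feasible, key=lambda name: feasible[name][objective])


print(plan(ROUTES, max_round_trip_minutes=60))                    # highway
print(plan(ROUTES, max_round_trip_minutes=60, objective="cost"))  # main_road
print(plan(ROUTES, max_round_trip_minutes=20))                    # None
```

With only three routes we can simply check them all. Real planning problems have far too many options for that, which is why smarter search strategies are needed.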
6. Natural language processing (NLP)
Another great way in which we humans learn is by reading and by hearing others speak. Both of these require us to know the language and its rules. This is what we call ‘natural language.’ But machines can’t understand our natural language on their own. This is where NLP comes in.
NLP is a field of artificial intelligence in which computers analyze, understand, and attempt to derive meaning from natural language in a smart and useful way. The event from Google I/O 2018 that I discussed at the beginning of this article falls into this branch of AI. NLP enables users to communicate with the machine directly in natural language. Woah! Cool, isn’t it?
If the problem you wish to solve falls into one (or even several) of the following categories, NLP is the solution.
A. Machine translation
Ever used Google Translate? Yes, that’s an example of Machine Translation. Machine Translation is the task of automatically converting one natural language into another, preserving the meaning of the input text, and producing fluent text in the output language.
B. Content extraction
Content Extraction is the task of obtaining structured data from unstructured or semi-structured data. What this basically means is picking out the relevant information from all that is available. Around 85% of the data in this world is unstructured, and a large portion of it is textual data. Thus, we can apply NLP techniques to this data and use it to gather relevant information or gain business insights.
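As a very small taste of turning free text into structured data, the sketch below pulls email addresses and dates out of a made-up sentence using Python's built-in `re` module. The sample text and the two patterns are assumptions for illustration, and the patterns are deliberately simple; production systems usually rely on trained models rather than hand-written patterns.

```python
import re

# Made-up unstructured text, the kind you might find in an email or a support note.
TEXT = (
    "Order 1042 was placed by asha@example.com on 2018-05-08. "
    "A second order, 1043, came from ravi@example.org on 2018-05-09."
)


def extract(text):
    """Return a structured record of the emails and ISO-style dates found in text."""
    return {
        # Simplified email pattern: name, @, then dot-separated domain parts.
        "emails": re.findall(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", text),
        # Dates written as YYYY-MM-DD.
        "dates": re.findall(r"\d{4}-\d{2}-\d{2}", text),
    }


record = extract(TEXT)
print(record["emails"])  # ['asha@example.com', 'ravi@example.org']
print(record["dates"])   # ['2018-05-08', '2018-05-09']
```

The input was a plain sentence, and the output is a dictionary with named fields that a program can work with directly.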
C. Question answering
Question Answering, as the name suggests, is concerned with building systems that automatically answer questions posed by humans in a natural language. When we ask Siri, Google Assistant, or Cortana a question, they produce an answer using Q&A NLP techniques.
D. Text classification
If we want to classify our text into certain categories based on what the text is about, we can use Text Classification NLP techniques. Examples of this include classifying an email as spam or not and finding out whether a movie review was positive or negative, and the list is endless!
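To show how a machine might tell a positive movie review from a negative one, here is a minimal sketch of a Naive Bayes classifier, a classic technique for this task, written from scratch. The four training reviews are invented and far too few for real use, and the `train` and `classify` helpers are hypothetical names for this article. The idea is simply to count which words appear with which label and then pick the label that makes a new review most likely.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocabulary = set()
    for text, label in examples:
        words = text.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocabulary.update(words)
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Pick the label with the highest (log) probability for the given text."""
    word_counts, label_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up labelled reviews for illustration.
REVIEWS = [
    ("great movie loved it", "positive"),
    ("wonderful acting great story", "positive"),
    ("terrible movie hated it", "negative"),
    ("boring story awful acting", "negative"),
]
model = train(REVIEWS)
print(classify("great story", model))  # positive
print(classify("awful movie", model))  # negative
```

This is also a nice reminder that the branches overlap: text classification is an NLP task, but the way it is solved here is Supervised Learning.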
E. Text generation
We can enable AI to form sentences for us. This falls under the category of text generation. It is also sometimes called Natural Language Generation (NLG). Example: given a topic, the machine writes an entire essay about it.
F. Speech
No need to elaborate on this! Google Assistant, Siri, and Bixby are all applications of Speech. There are 2 parts to it: Speech-to-Text and Text-to-Speech. When we say:
“Hey Siri, do you watch Game of Thrones?”
what we speak gets converted to text for our iPhone to understand. This is Speech-to-Text. Then, when Siri prepares the answer to our question, it is still plain text. Siri uses Text-to-Speech techniques to turn it into audio, and we hear Siri saying:
“Yes. I’d ask Jon Snow for some hints, but he knows nothing.”
Image by Jash Rathod. This is Apple’s Voice Assistant — Siri
That’s right! Siri does actually say that.
It’s amazing what AI can actually do. But why do we need to do all this? What is the use of this beyond fancy lab projects or Iron Man toys? Well, these developments can make our lives easier with personal assistants; can help businesses function, gain better insights, and automate some of their processes with tools like chatbots; and can enable us to find what we are looking for on the internet with the help of AI-powered search engines like Google.
This was just a glimpse of what AI can do. In the articles to come, we will explore how these systems can be developed and how businesses can leverage these tools to reach greater heights.