What Is Machine Learning (The Dawn of Artificial Intelligence)
This article is brought to you
thanks to Brilliant, a problem-solving website
that teaches you skills essential to have
in this age of automation.
In the past videos in this AI series,
we have delved quite deep
into the field of machine learning,
discussing both supervised and unsupervised learning.
The focus of this video, then,
is to consolidate many of the topics
we've discussed in the past videos
and answer the question posed
at the start of this machine learning series:
what is the difference between artificial intelligence
and machine learning?
As a quick recap, over the past two videos in this series,
we have discussed both supervised and unsupervised learning,
with them both being subsets
of the field of machine learning.
Supervised learning is when we have labeled,
structured data, and the algorithms we are using
determine the output based on the input data.
Unsupervised learning, on the other hand,
is for unlabeled, unstructured data,
where our algorithms of choice are tasked
with deriving structure from unstructured data
to be able to predict output data
based on input data.
Additionally, both supervised and unsupervised learning
are further subsectioned.
One, regression, a supervised learning approach
where the output is the value of a feature predicted
from its correlation with another feature,
along a continuous line of best fit that our algorithm determines.
Two, classification, a supervised learning approach
where the output is the label of a data point,
based on which of a number of discrete categories the point falls into,
with the decision boundaries between those categories
determined by the algorithm we choose.
Three, clustering, an unsupervised learning approach
where we must discover the categories
that various data points lie in,
based on the relationships between their features.
Four, association, an unsupervised learning approach
where we must discover the correlations
of features in a dataset.
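For readers who want to see two of these approaches in code, here is a minimal sketch, assuming scikit-learn and NumPy are available (any comparable library would do, and the tiny datasets are made up purely for illustration): a regression fit to labeled points, and clustering used to discover groups in unlabeled points.

```python
# Minimal sketch of regression (supervised) and clustering (unsupervised),
# assuming scikit-learn is installed; the toy data below is invented.
import numpy as np
from sklearn.linear_model import LinearRegression  # regression: continuous output
from sklearn.cluster import KMeans                 # clustering: discover groups

# Regression: labeled data, learn a line of best fit, then predict a new value.
X = np.array([[1.0], [2.0], [3.0], [4.0]])   # input feature
y = np.array([2.1, 3.9, 6.2, 8.1])           # continuous labels, roughly y = 2x
reg = LinearRegression().fit(X, y)
print("predicted y for x = 5:", reg.predict([[5.0]])[0])

# Clustering: unlabeled data, discover two groups from the features alone.
points = np.array([[0.1, 0.2], [0.2, 0.1], [5.0, 5.1], [5.2, 4.9]])
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)
print("cluster assignments:", clusters)
```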
As stated in the past, while it is nice
to view these topics in their own little bubbles,
often, there's a lot
of crossover between various techniques,
for instance, in the case of semi-supervised learning.
This wasn't discussed previously,
but it is essentially when our dataset contains
both labeled and unlabeled data,
so in this instance, we may first cluster the data
and then run classification algorithms on it,
or use a multitude of other combinations of techniques.
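As a hypothetical illustration of that cluster-then-classify idea (just one of the many possible combinations), the sketch below clusters a mixed dataset, uses the few labeled points to name each cluster, and then propagates those names to the unlabeled points. The data and labels are invented for illustration, and scikit-learn is assumed.

```python
# Hypothetical semi-supervised sketch: cluster first, then reuse the few labels we have.
import numpy as np
from sklearn.cluster import KMeans

# Ten 1-D points; only the first and last carry labels ("low" and "high").
data = np.array([[0.1], [0.3], [0.2], [0.4], [0.5],
                 [9.5], [9.7], [9.6], [9.8], [9.9]])
labels = {0: "low", 9: "high"}   # index -> known label; the rest are unlabeled

clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(data)

# Name each cluster after the labeled point that fell into it,
# then propagate that name to every unlabeled point in the same cluster.
cluster_names = {clusters[i]: name for i, name in labels.items()}
predicted = [cluster_names[c] for c in clusters]
print(predicted)
```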
So, with the recap out of the way,
a general understanding of the types
of machine learning, and the knowledge
of all the terminology we have covered
in the past videos, we can now begin
to decipher what the term machine learning really means,
and how it relates
to artificial intelligence and other fields.
As stated in the first video in this series,
the term machine learning was coined
by computing pioneer Arthur Samuel,
and is a field of study that gives computers the ability
to learn without being explicitly programmed.
With such a broad definition,
one can argue, and would be correct in stating,
that all useful programs learn something.
However, the level of true learning varies.
This level of learning is dependent
on the algorithms the programs incorporate.
Now, going back a few steps,
an algorithm is a concept that has existed
since the dawn of human civilization.
It is a term referring to a process
or set of rules to be followed
in calculations or other problem solving operations.
While anything can be referred to as an algorithm,
such as a recipe for a food dish
or the steps needed to start a fire,
it is a term most commonly used
to describe our understanding of mathematics,
and how it relates to the world around us,
the informational fabric of reality.
Progressing forward, the rise of computing,
essentially a field built on the premise
of speeding up mathematical calculations,
gave way to the birth of computer science,
in which algorithms now define the processing,
storage, and communication of digital information.
The ability to iterate through algorithms
at the lightning fast speed computers operate at
over the past century has led
to the implementation and discovery of various algorithms.
To list a few, we have sorting algorithms
like bubble sort and quick sort,
shortest path algorithms like Dijkstra and A*,
and this list can go on and on for a variety of problems.
These algorithms, while able to perform tasks
in which they appear to be learning,
are really just iteratively performing
pre-programmed steps to achieve their results,
in stark contrast to the definition of machine learning,
to learn without explicit programming.
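To make that contrast concrete, here is bubble sort, one of the algorithms listed above, written out as a small sketch: every step is spelled out in advance, and nothing about the procedure changes no matter how much data it processes.

```python
def bubble_sort(items):
    """Sort a list in place by repeatedly swapping adjacent out-of-order pairs.
    Every step is fixed ahead of time: the program follows the same rules
    on every run, so no learning takes place."""
    n = len(items)
    for i in range(n):
        for j in range(n - i - 1):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
    return items

print(bubble_sort([5, 1, 4, 2, 8]))   # [1, 2, 4, 5, 8]
```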
Reflecting back on the past few videos
in this series in which we've discussed
the types of machine learning, both supervised
and unsupervised, there's one common thread
that runs through them both,
to utilize a variety of techniques,
approaches, and algorithms to form decision boundaries
over a dataset's decision space.
This divided up decision space is referred to
as the machine learning model,
and the process of forming the model,
that being the decision boundaries in the dataset,
is referred to as training.
This training of the model draws parallels
to the first primary type of knowledge
we as humans display, declarative knowledge.
In other words, memorization,
the accumulation of individual facts.
Once we have a trained model
and it is exhibiting good accuracy on training data,
then we can use that model for the next step, inference.
This is the ability to predict the outputs,
whether that be a value or a category, of new data.
Machine learning inference draws parallels
to the second primary type of knowledge we exhibit,
imperative knowledge, in other words, generalization,
the ability to deduce new facts from old facts.
Additionally, as the model encounters new data,
it can use it to train further,
refining its decision boundaries
to become better at inferring future data.
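Here is a minimal sketch of that train-then-infer loop, again assuming scikit-learn and using an invented toy dataset: fit a classifier (training), check accuracy on the training data, predict a label for an unseen point (inference), and then continue training as new labeled data arrives.

```python
# Training, inference, and incremental refinement, sketched with scikit-learn.
import numpy as np
from sklearn.linear_model import SGDClassifier

# Training: labeled 2-D points form the decision space the model must divide.
X_train = np.array([[0, 0], [1, 1], [0, 1], [8, 8], [9, 9], [8, 9]])
y_train = np.array([0, 0, 0, 1, 1, 1])
model = SGDClassifier(random_state=0)
model.partial_fit(X_train, y_train, classes=[0, 1])   # fit the decision boundary
print("training accuracy:", model.score(X_train, y_train))

# Inference: predict the category of a point the model has never seen.
print("prediction for [7, 8]:", model.predict([[7, 8]])[0])

# Refinement: as new labeled data arrives, continue training to sharpen the boundary.
X_new = np.array([[4, 5], [5, 4]])
y_new = np.array([1, 0])
model.partial_fit(X_new, y_new)
```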
Now, this whole process we just discussed defines
the second most widely used definition of machine learning,
stated by Dr. Tom Mitchell of Carnegie Mellon University:
"A computer program is said to learn from experience E
with respect to some class of tasks T
and performance measure P if its performance
at tasks in T, as measured by P, improves with experience E."
So, while it is correct to state
that all useful programs learn something from data,
I hope the distinction between the level of learning
of machine learning models
and that of typical algorithms is now clearer.
The rise of machine learning,
or domain-specific weak artificial intelligence,
as it is often referred to, has been decades in the making.
But first, what is artificial intelligence?
As I hope you've learned
from past videos in this series,
AI refers to any model that can mimic,
develop, or demonstrate human thinking,
perception, or actions.
In our case, this refers to computing-based AI.
In our first two videos in this AI series,
the history and birth of AI, we saw the development
of the field of artificial intelligence
from trying to develop a more general AI,
also called a strong AI, to focusing
on acquiring domain-specific expertise in various fields.
This turning point in the field of AI was due
to expert systems in the '80s,
essentially complex conditional logic,
that being if-then-else statements
that were tailored for a respective field of knowledge
by experts in that field.
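For a sense of what those expert systems looked like, here is a toy, made-up rule base in the if-then-else spirit described above; real systems of the era encoded thousands of such rules gathered from domain experts, so this is a sketch rather than a faithful reproduction.

```python
def diagnose(symptoms):
    """Toy expert system: hand-written rules, no learning involved.
    The rules here are invented purely for illustration."""
    if "fever" in symptoms and "cough" in symptoms:
        return "possible flu - consult a doctor"
    elif "sneezing" in symptoms and "runny nose" in symptoms:
        return "likely a common cold"
    elif "fever" in symptoms:
        return "unspecified infection - more information needed"
    else:
        return "no rule matched"

print(diagnose({"fever", "cough"}))
```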
At the end of that birth of AI video,
the time period we left off on was the AI bust,
which was at the start of the '90s,
a low point in the AI hype cycle
due to over-promises made
on what expert systems could really do.
After this point, the development
of intelligent systems went into the background
due to the lack of funding and mainstream interest
in the field, and the rapid technological progress made
in so many other fields, from the invention of the internet,
commercialization of computers, mobile phones.
The list can go on and on.
During this time period in the '90s,
expert systems and algorithms originally developed
by AI researchers began to appear
as parts of larger systems.
These algorithms had solved
a lot of very difficult problems,
and their solutions proved to be useful
throughout the technology industry,
such as data mining, industrial robotics,
logistics, speech recognition, banking software,
medical diagnosis, and Google's search engine,
to list a few.
However, the field of AI received little
or no credit for these successes
in the 1990s and early 2000s.
Many of the field of AI's greatest innovations
had been reduced to the status
of just another item
in the tool chest of computer science.
As Nick Bostrom, author of "Superintelligence,"
stated in 2006, "A lot of cutting-edge AI has filtered
"into general applications, often without being called AI
"because, once something becomes useful enough
"and common enough, it is not labeled AI anymore."
This is similar to what John McCarthy,
the father of AI, also stated back in the '80s.
So then, what started changing in the late 2000s
and at the start of this decade
that propelled the field of AI once again to the forefront?
Well, first off, we can thank the increase
of computing power and storage,
infinite computing, big data,
and various other topics we've covered in videos past.
These advances allowed for larger amounts
of data to train on, and the computing power
and storage needed to be able to do so.
Now, one can say that finding structure
in data is a human condition.
It's how we've come so far,
and these advances gave computers what they require
to do so as well.
Now, as you can see here,
the gap between various AI breakthroughs
and the dates the underlying algorithms were initially proposed
is nearly two decades.
However, on average, the breakthrough happens
just three years after the dataset
for a given problem becomes available,
meaning that data was a huge bottleneck
in the advancement of the field of AI.
The next reason for the rise
of machine learning is due to the rise
of a particular tribe of machine learning, connectionism,
or, as many commonly know of it, deep learning.
Before we delve into deep learning,
let's first discuss the other tribes of AI.
There are five primary tribes of machine learning,
with tribes referring to groups of people
who have different philosophies
on how to tackle AI-based problems.
We have discussed many of these tribes in past videos,
but this list below should make them more concrete.
The first tribe is the symbolists.
They focus on the premise of inverse deduction.
They don't start with a premise to work towards conclusions,
but rather use a set of premises and conclusions,
and work backwards to fill in the gaps.
We discussed this in the history of AI video,
and will focus on it more heavily
in a future video on artificial human intelligence.
The second tribe is the connectionists.
They mostly try to digitally re-engineer the brain
and all of its connections in a neural network.
The most famous example of the connectionist approach
is what is commonly known as deep learning.
We discuss parts of the rise of connectionism
in the birth of AI video.
The third tribe is the evolutionaries.
Their focus lies on applying the idea
of genomes in DNA and the evolutionary process
to data processing.
Their algorithms will constantly evolve
and adapt to unknown conditions and processes.
You have probably seen this style of approach used
in beating games such as Mario,
and we will discuss it much more
in an upcoming video on reinforcement learning.
The fourth tribe is the Bayesians.
Bayesian models will take a hypothesis
and apply a type of a priori thinking,
believing that there will be some outcomes
that are more probable.
They then update their hypothesis as they see more data;
a small worked example of this appears just after this list of tribes.
We discussed a bit more about this line
of thinking in our video on quantum computing.
The fifth and final tribe is the analogizers.
This machine learning tribe focuses
on techniques to match bits of data to each other.
We have been discussing this approach quite a bit
in the past few videos,
with many core concepts of supervised
and unsupervised learning tied to it.
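As referenced above in the description of the Bayesians, here is a minimal, made-up example of their style of reasoning: start with a prior belief about a coin, then update that belief as data (coin flips) comes in, using Bayes' rule. The numbers are invented for illustration.

```python
# Bayesian updating on a made-up example: is this coin fair, or biased towards heads?
# Prior: we think "fair" is a bit more likely than "biased" before seeing any data.
prior = {"fair": 0.6, "biased": 0.4}
likelihood_heads = {"fair": 0.5, "biased": 0.8}   # P(heads | hypothesis)

flips = ["H", "H", "T", "H", "H"]                 # observed data
posterior = dict(prior)
for flip in flips:
    for hyp in posterior:
        p = likelihood_heads[hyp] if flip == "H" else 1 - likelihood_heads[hyp]
        posterior[hyp] *= p                        # multiply in the likelihood of this flip
    total = sum(posterior.values())
    posterior = {h: v / total for h, v in posterior.items()}   # renormalize (Bayes' rule)

print(posterior)   # belief shifts toward "biased" as more heads are observed
```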
I think the best way
to represent these tribes of artificial intelligence
and machine learning is in a bubble diagram format.
To start with, we have our primary AI bubble
and machine learning bubble.
We showed this relationship in the first video
in our machine learning series.
Now, after this, we can add the tribe bubbles.
They are constantly moving and overlapping with each other
to produce novel ideas,
and shrinking and growing in popularity.
Once a tribe gets mainstream popularity,
such as connectionism, it pops, so to speak,
producing a new field in its wake.
In the case of connectionism, it was deep learning.
Keep in mind that, just because connectionism grew
into deep learning doesn't mean that the entire tribe
of connectionism is centered around deep learning.
The connectionism bubble and many connectionists
will continue researching new approaches
utilizing connectionist theory.
Also, deep learning isn't all connectionism.
There are many symbolist
and analogizer philosophies incorporated within it as well.
You can learn more about the five tribes
of machine learning in Pedro Domingos' book
"The Master Algorithm," which goes very in depth
into the topics we just talked about,
and also goes over topics we will cover
in future videos in this series.
Coming back on topic,
so then, what is the difference
between machine learning and artificial intelligence?
Nothing and everything.
While machine learning is classified
as a type of AI since it exhibits the ability
to match and even exceed human-level perception
and action in various tasks,
it, as stated earlier, is a weak AI
since these tasks are often isolated from one another,
in other words, domain-specific.
As we've seen, machine learning can mean many things,
from millions of lines of code
with complex rules and decision trees
to statistical models, symbolist theories,
connectionism and evolution-based approaches,
and much more, all with the goal
to model the complexities of life,
just as how our brains try to do.
The advent of big data,
the increases in computing power and storage,
and the other factors we discussed earlier
and in videos past took these models
from simpler iterative algorithms
to those involving many complex domains
of mathematics and science working together in unison,
such as knot theory, game theory,
linear algebra, and statistics, to list a few.
One important note to touch on with these models,
no matter how advanced the algorithms used,
is best said through a quote
by famous statistician George Box:
"All models are wrong, but some are useful."
By this, it is meant that, in every model,
abstractions and simplifications are made
such that they will never 100% model reality.
However, simplifications of reality
can often be quite useful in solving many complex problems.
Relating to machine learning,
this means that, for most real-world problems,
we will never have a model
that predicts outputs with 100% accuracy,
especially for more ambiguous problems.
Two of the major assumptions made
in the field of machine learning
that cause this are, one,
that the past,
that being the patterns of the past, predicts the future,
and two, that mathematics
can truly model the entire universe.
Regardless of these assumptions,
these models can still be very useful
in a broad array of applications.
We will cover these grander societal impacts
of weak intelligence in an upcoming video
on the evolution of AI.
Additionally, a method that has been attributed
to a major rise in the accuracy of models,
and something we mentioned earlier, is deep learning,
which we will cover in the next set
of videos in this AI series.
Now, before concluding,
one important fact that I want to reiterate,
and as stated in the disclaimer
at the start of all my AI videos,
is that my goal here is to try
and simplify what are, in reality, very complex topics.
I urge you to seek out additional resources
on this platform and various others
if you wish to learn more on a much deeper level.
One such resource I use and highly recommend is Brilliant.
If you want to learn more about machine learning,
and I mean really learn how these algorithms work,
from supervised methodologies such as regression
and classification to unsupervised learning and more,
then brilliant.org is the place for you to go.
Now, what we love about how the topics
in these courses are presented is that,
first, an intuitive explanation is given,
and then you are taken through related problems.
If you get a problem wrong,
you see an explanation for where you went wrong,
and how to rectify that flaw.
In a world where automation
through algorithms will increasingly replace more jobs,
it is up to us as individuals
to keep our brains sharp and think
of creative solutions to multi-disciplinary problems,
and Brilliant truly is a platform
that allows you to do so.
For instance, beyond the courses Brilliant offers,
every day, there's a daily challenge
that can cover a wide variety
of topics in the STEM domain.
These challenges are crafted
in such a way in which they draw you in
and then allow you to learn a new concept
through their intuitive explanations.
To support Futurology and learn more about Brilliant,
go to brilliant.org/futurology
and sign up for free.
Additionally, the first 200 people
that go to the link will get 20% off
their annual premium subscription.
At this point, the video has concluded.
We'd like to thank you for taking the time to watch it.
If you enjoyed it, consider supporting us
on Patreon or YouTube membership to keep this brand growing,
and if you have any topic suggestions,
please leave them in the comments below.
Consider subscribing for more content,
and check out our website
and our parent company EarthOne for more information.
This has been Encore, you've been watching Futurology,
and we'll see you again soon.
