Machines cannot teach themselves

  • Written By Gaurav Bhardwaj
  • 17/06/2019

There has been a lot of talk about self-taught machines. In this blog we want to dispel the myth that machines can teach themselves.

Machines cannot teach themselves; a machine can only learn from the information provided by humans, for example engineers and programmers, using algorithms (also developed by humans). Machine ‘learning’ refers to the application of statistical models to find a solution to a particular problem; that solution takes the form of a set of numbers called weights. Today, there are two predominant approaches to machine learning: supervised and unsupervised.

Supervised machine learning uses input variables together with a known output variable. An algorithm learns the mappings and correlations between the inputs and the output, which can then be used to predict the output for new inputs.
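To make the idea of learned ‘weights’ concrete, here is a minimal sketch of supervised learning: a straight line fitted to input/output pairs by least squares. The function name `fit_line` and the spend/sales figures are made up for illustration; a real project would normally use an established library rather than hand-written code.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = w * x + b; returns the learned weights (w, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how x and y vary together, divided by how much x varies.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    # Intercept: makes the line pass through the mean point.
    b = mean_y - w * mean_x
    return w, b


# Hypothetical training data supplied by a human:
# advertising spend (input) and resulting sales (known output).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

w, b = fit_line(spend, sales)
prediction = w * 6.0 + b  # predicted sales for an unseen spend of 6.0
print(w, b, prediction)   # 2.0 1.0 13.0
```

Everything the machine ‘knows’ here is the pair of numbers `w` and `b`, and both come entirely from the data and the algorithm that humans provided.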

Unsupervised machine learning refers to a class of machine learning techniques where a machine is shown only inputs and has to come up with an answer (outputs) without being given any examples of correct outputs. The closest we are able to get to machines teaching themselves is through unsupervised machine learning.

Clustering is one example of unsupervised learning. Let’s say we want to group similar customers, but we don’t know what the similarities are. Unsupervised learning will analyse all the information we have for the customers and group them by the similarities it finds, from which we can then select the groupings most relevant to our purposes. A similarity can be in spending habits, products purchased, age group, location, etc.
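As an illustration of clustering, the sketch below uses k-means, one common clustering algorithm (the text above does not name a specific one). The customer figures, the helper names and the choice of two groups are all assumptions made for this example. In practice the attributes would usually be rescaled first so that no single one dominates the distance.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal length."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain k-means. Starts from the first k points; returns (centroids, labels)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels


# Hypothetical customers: (age, average monthly spend). No group labels are given.
customers = [(22, 40), (58, 310), (25, 55), (61, 290), (24, 48), (55, 330)]

centroids, labels = kmeans(customers, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

The algorithm only reports that two groups exist and which customer falls into which; deciding what the groups mean (here, roughly younger low spenders and older high spenders) is still left to a human.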

Another popular unsupervised learning technique is Principal Component Analysis (PCA). In simple terms, PCA can be used to find the predominant attributes within large data sets. It preserves the directions in the data that show the most variation and discards the non-essential ones with less variation. For example, a product might have hundreds of attributes such as shape, colour, size, weight, power, price, etc. PCA can help us find out which of these attributes, or combinations of them, carry the most information.
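The following is a rough sketch of the idea behind PCA, assuming a tiny made-up product table with three numeric attributes. It finds only the first principal component, using power iteration on the covariance matrix; this is one simple way to compute it, not necessarily how a production library does. The function name and the data are hypothetical.

```python
def first_principal_component(rows, steps=200):
    """Return (direction, share): the top principal component and the share
    of the total variance that it explains."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centred data.
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly multiplying by the covariance matrix
    # converges to the eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(steps):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance


# Hypothetical products: (size, weight, power). Weight grows with size,
# while power barely changes from one product to the next.
products = [
    (1.0, 2.0, 5.0),
    (2.0, 4.0, 5.1),
    (3.0, 6.0, 4.9),
    (4.0, 8.0, 5.0),
    (5.0, 10.0, 5.0),
]

direction, share = first_principal_component(products)
print(direction, share)
```

For this data the component is dominated by size and weight and accounts for almost all of the variation, while power contributes very little. Note that the result is a weighted combination of the original attributes rather than a single attribute picked from the list.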

Drawbacks of unsupervised machine learning

A common problem with unsupervised learning is interpretability and verifiability. It’s not easy to interpret the results because the algorithm transforms the original input into a new representation (for example, in the previous example of a product’s predominant attributes, the output could be two new features, f1 and f2, rather than any of the original attributes). This is why, in most cases, the output of the unsupervised learning analysis is fed to another machine learning algorithm that produces the final result: an actual answer to our question, such as whether the product should go into mass production or not. Similarly, the results of unsupervised learning cannot be readily verified, as there are no known correct outputs to check them against.

If you would like to find out more about supervised and unsupervised machine learning, then give us a call on +44 (0)203 475 7980 or email us at marketing@whishworks.com.

Other useful links:

The Business Sense of Artificial Intelligence

Data Analytics Explained

7 steps to Predictive Analytics
