Predictive AI Layers for Databases

ODSC - Open Data Science
4 min read · Nov 17, 2020

Anyone who has worked with machine learning (ML) understands that data is a fundamental ingredient. Given that a great deal of the world’s organized data already lives in databases, doesn’t it make sense to bring machine learning capabilities straight to the database itself via predictive AI layers?

Database users are well placed to handle the most important aspect of applied machine learning: understanding which predictive questions matter and which data is relevant to answering them.

Bringing machine learning to those who know their data best can significantly augment the capacity to solve important problems.

To do so, we have developed a concept called AI-Tables.

AI Tables

AI-Tables differ from normal tables in that they generate predictions when queried and return those predictions as if they were data stored in the table. Simply put, an AI-Table lets you use a machine learning model as if it were a normal database table, with a query that in plain SQL looks like this:

SELECT <predicted_variable> FROM <ML_model> WHERE <conditions>
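
To make the idea concrete, here is a minimal Python sketch of a table-like object that answers a query with a prediction rather than a stored row. It is an illustration only, not how MindsDB implements AI-Tables: the class name `ToyAITable`, its `select` method, the averaging "model", and the sample rows are all assumptions made for this example.

```python
from statistics import mean


class ToyAITable:
    """Toy stand-in for an AI-Table (hypothetical, not the MindsDB implementation)."""

    def __init__(self, rows, target):
        self.rows = rows      # historical records as dicts
        self.target = target  # name of the column to predict

    def select(self, **where):
        """Mimic SELECT <target> FROM <model> WHERE <conditions>."""
        # "Model": average the target over rows that match every condition.
        matches = [r for r in self.rows
                   if all(r.get(k) == v for k, v in where.items())]
        # Fall back to the global average when no historical row matches.
        pool = matches or self.rows
        prediction = mean(r[self.target] for r in pool)
        # Shape the answer like a row, as if it had been stored in the table.
        return {**where, self.target: prediction}


# Made-up sample data, for illustration only.
history = [
    {"fueltype": "diesel", "transmission": "automatic", "price": 20000},
    {"fueltype": "diesel", "transmission": "automatic", "price": 22000},
    {"fueltype": "petrol", "transmission": "manual", "price": 12000},
]
table = ToyAITable(history, target="price")
row = table.select(fueltype="diesel", transmission="automatic")
print(row)
```

The caller never sees the model; it only supplies conditions and receives a row containing the predicted column, which is the behaviour the SQL template above describes.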

Automated Machine Learning and AI Tables

Automated machine learning (AutoML) simplifies the complex machine learning process, from data acquisition to making a prediction. All the steps in between are abstracted away by the AutoML platform.

AI-Tables also use the power of AutoML, allowing users to train and test neural-network-based machine learning models using only their existing knowledge of SQL.

The AutoML engine behind AI-Tables is MindsDB, an open-source platform built on PyTorch.

On top of that, it has explainability capabilities that give users insight into a prediction’s accuracy score and let them evaluate its dependencies. For example, users can estimate how adding or removing certain data would affect the effectiveness of the prediction. This can be done through metadata queries against the database or through a graphical user interface.

Users who want control over feature engineering will also be able to bring their own models to MindsDB AI-Tables.
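
One generic way to estimate how much a prediction depends on a column is to retrain without that column and compare the error. The Python sketch below does this with a deliberately simple group-average model. It is not MindsDB's explainability method; the function names and the toy data are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def fit_group_means(rows, features, target):
    """'Train' by averaging the target per combination of feature values."""
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[f] for f in features)].append(r[target])
    overall = mean(r[target] for r in rows)
    return {key: mean(vals) for key, vals in groups.items()}, overall


def mean_abs_error(rows, features, target):
    """Mean absolute error of the group-average model.

    Evaluated on the training rows for brevity; a real check would use
    held-out data.
    """
    means, overall = fit_group_means(rows, features, target)
    errors = [
        abs(r[target] - means.get(tuple(r[f] for f in features), overall))
        for r in rows
    ]
    return mean(errors)


def column_impact(rows, features, target):
    """Error increase caused by dropping each column (larger = more important)."""
    baseline = mean_abs_error(rows, features, target)
    return {
        f: mean_abs_error(rows, [c for c in features if c != f], target) - baseline
        for f in features
    }


# Made-up sample data: price depends on fuel type but not on transmission.
cars = [
    {"fueltype": "diesel", "transmission": "automatic", "price": 20000},
    {"fueltype": "diesel", "transmission": "manual", "price": 20000},
    {"fueltype": "petrol", "transmission": "automatic", "price": 10000},
    {"fueltype": "petrol", "transmission": "manual", "price": 10000},
]
impact = column_impact(cars, ["fueltype", "transmission"], "price")
print(impact)
```

In this toy data, removing `fueltype` makes the error jump while removing `transmission` changes nothing, which is the kind of dependency information the paragraph above refers to.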

How predictive AI layers work

The whole solution consists of two important parts:

  1. The machine learning models are exposed as database tables (AI-Tables) that can be queried with SELECT statements.
  2. Model generation and training are triggered by a simple INSERT statement.

The following diagram illustrates this process:

The resource-intensive Machine Learning tasks like model training are executed on a separate MindsDB server instance so that the Database performance is not affected.

To help this idea sink in, let us walk through an example.

The Example of Predictive AI Layers

Imagine that you run a website that has been selling used cars for the past two years, and you want to estimate the right price for a car.

The data is persisted in your database in a table called used_cars_data, where you keep a record of every car you have sold so far, storing information such as model, year, price, transmission, mileage, fueltype, tax (road tax), mpg (miles per gallon), and enginesize.

Since you have historical data, you know that you could use Machine Learning to solve this problem. Wouldn’t it be nice if you could simply tell your database server to do and manage the Machine Learning parts for you?

At MindsDB we think so too! And AI-Tables baked directly into your database are here to do exactly that.

For instance, with a single INSERT statement you can create a machine learning model (predictor) trained to predict ‘price’ using the data in the table used_cars_data, and publish it as an AI-Table called ‘used_cars_model’.

INSERT INTO
mindsdb.predictors(name, predict, select_data_query)
VALUES
('used_cars_model', 'price', 'SELECT * FROM used_cars_data');
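
Because the training query travels as a string literal inside the INSERT statement, quoting is an easy place to slip. The Python sketch below assembles the statement and doubles any single quotes inside the values, which is the standard SQL way to escape them. The helper names `sql_literal` and `build_predictor_insert` are hypothetical; only the `mindsdb.predictors` columns come from this article.

```python
def sql_literal(value):
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_predictor_insert(name, predict, select_data_query):
    """Build the INSERT that registers a predictor (hypothetical helper)."""
    values = ", ".join(
        sql_literal(v) for v in (name, predict, select_data_query)
    )
    return (
        "INSERT INTO mindsdb.predictors(name, predict, select_data_query)\n"
        f"VALUES ({values});"
    )


statement = build_predictor_insert(
    "used_cars_model", "price", "SELECT * FROM used_cars_data"
)
print(statement)
```

For anything beyond a sketch, your database driver's own parameter binding is usually a safer choice than manual escaping.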

After that you can get price predictions by querying the generated ‘used_cars_model’ AI-Table, as follows:

SELECT price,
       confidence
FROM mindsdb.used_cars_model
WHERE model = 'a6'
  AND mileage = 36203
  AND transmission = 'automatic'
  AND fueltype = 'diesel'
  AND mpg = '64.2'
  AND enginesize = 2
  AND year = 2016
  AND tax = 20;
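
From application code, the same lookup can be generated from a dictionary of known attributes. The following Python sketch builds a parameterized query and leaves value quoting to the database driver. The function name is hypothetical, and the `%s` placeholder style is an assumption that matches common Python MySQL drivers such as PyMySQL; adjust it for your client.

```python
def build_prediction_query(model_table, target, conditions):
    """Return (sql, params) for querying an AI-Table (hypothetical helper)."""
    if not conditions:
        raise ValueError("at least one condition is required")
    # Column names cannot be bound as parameters, so validate them strictly.
    for column in conditions:
        if not column.isidentifier():
            raise ValueError(f"unsafe column name: {column!r}")
    where = " AND ".join(f"{column} = %s" for column in conditions)
    sql = f"SELECT {target}, confidence FROM {model_table} WHERE {where}"
    return sql, tuple(conditions.values())


sql, params = build_prediction_query(
    "mindsdb.used_cars_model",
    "price",
    {
        "model": "a6",
        "mileage": 36203,
        "transmission": "automatic",
        "fueltype": "diesel",
        "year": 2016,
    },
)
print(sql)
print(params)
```

With a DB-API style driver, the pair would typically be passed along as `cursor.execute(sql, params)`.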

As you can see, with AI-Tables we aim to reduce machine learning mechanics to simple SQL queries so that you can focus on the important part: deciding what predictions you need and what data your model should learn from to make them.

How to explore AI Tables

AI-Tables currently work with the following databases, and the list is constantly growing:

MySQL

PostgreSQL

MariaDB

ClickHouse

Additionally, you can visit the MindsDB GitHub page, read the documentation, and ask questions on the community forum to learn more about predictive AI layers.

Original post here.

