At MindsDB, we add new product features and improve existing ones on a daily basis, whether in MindsDB Scout (our graphical user interface), the server, or our AutoML framework. Take a look below at what's new and available in the releases since January.
When it comes to the user experience, we've gone through a few iterations to improve the MindsDB Scout design. The main focus was a minimalistic design approach to create a simple, interactive and highly usable dashboard.
The MindsDB Scout dashboard provides essential information on the data quality, data type and value distribution of your model presented visually.
We’ve included a new pie chart visualization that shows the overall data quality of each variable. Together with the bar charts that present value occurrences in the dataset, the new pie chart gives the user a clear picture of their data quality.
The first question that comes to a user’s mind after getting the output probabilities from their model is: “How can I measure the model’s effectiveness?” That’s where the Confusion Matrix (also called an error matrix) comes in, helping describe the performance of a model.
A new feature displays the Confusion Matrix in a table layout and helps the user visualize the performance of the classification model.
The picture above shows how MindsDB Scout displays the prediction percentages for the different values along the table's diagonal. The percentages of correctly predicted values are highlighted in green so they are easy to spot. At the bottom of the confusion matrix grid, pagination lets you show from 5 to 50 different values.
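The diagonal percentages are simply row-normalised prediction counts. The sketch below (plain Python, not MindsDB's implementation) shows the idea: for each actual value, the diagonal entry is the share of predictions that were correct.

```python
# Illustrative sketch, not MindsDB code: build a row-normalised confusion
# matrix where the diagonal holds the percentage of correct predictions.
from collections import Counter

def confusion_percentages(actual, predicted, labels):
    counts = {a: Counter() for a in labels}
    for a, p in zip(actual, predicted):
        counts[a][p] += 1
    matrix = {}
    for a in labels:
        total = sum(counts[a].values()) or 1
        matrix[a] = {p: round(100 * counts[a][p] / total, 1) for p in labels}
    return matrix

actual    = ["cat", "cat", "dog", "dog", "dog"]
predicted = ["cat", "dog", "dog", "dog", "cat"]
m = confusion_percentages(actual, predicted, ["cat", "dog"])
# m["cat"]["cat"] is the diagonal entry for "cat": 1 of 2 correct -> 50.0
```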
To improve and secure MindsDB Scout’s connections, we’ve included an access token option, so the application can successfully authenticate and authorize access to remote MindsDB Server APIs.
If you’d like to try MindsDB Scout and have some interesting data to which you want to apply predictive analytics, you can download MindsDB Scout and get answers to any questions you may have.
There is a new argument available in the predict interface called run_confidence_variation_analysis. When run_confidence_variation_analysis is included in the `.predict` call, the prediction runs an additional analysis that determines confidence variations: how much the confidence would decrease or increase based on the columns present in the prediction.
This feature only works when a user makes a single-value prediction and provides the `when` parameter values. Here is a simple predict call and response example with the run_confidence_variation_analysis parameter provided:
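To make the mechanism concrete, here is a self-contained sketch of the idea behind such an analysis — not MindsDB's implementation, and the toy confidence function is entirely made up: re-score the confidence with each `when` column withheld and report the change on a -100 to 100 scale.

```python
# Conceptual sketch only -- not MindsDB's code. It mimics the idea behind
# run_confidence_variation_analysis: withhold each `when` column in turn
# and measure how the model's confidence shifts.
def confidence_variation(confidence_fn, when):
    base = confidence_fn(when)
    scores = {}
    for column in when:
        reduced = {k: v for k, v in when.items() if k != column}
        # Positive score: the column raises confidence; negative: it lowers it.
        scores[column] = round((base - confidence_fn(reduced)) * 100)
    return scores

# Toy confidence function standing in for a trained model (made up numbers).
def toy_confidence(when):
    return 0.5 + 0.3 * ("number_of_rooms" in when) + 0.1 * ("sqft" in when)

scores = confidence_variation(toy_confidence, {"number_of_rooms": 2, "sqft": 700})
# scores -> {"number_of_rooms": 30, "sqft": 10}
```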
So, the result object returned by the predict call will contain an additional key called confidence_influence_scores, which provides a confidence_variation_score per column with values between -100 and 100.
The above feature is also available for visualization inside MindsDB Scout. When a user runs a query for a specific condition, the visualizations for the predicted value will contain a Confidence Influence bar chart. If a bar is green, as in the example below, the model is more certain about the predicted value because of that column; if a bar is red, the model is less certain.
A new key, model_result, has been added to the TransactionOutputRow explain method. It gives the raw value and confidence produced by the underlying ML model. We usually recommend using the prediction that MindsDB gives instead of the raw prediction, but in some situations the plain ML model might be more accurate than MindsDB. Let's look at a simple example where we try to predict daily temperature:
This should return an additional model_result key containing the value and confidence:
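As an illustration, the explanation might look roughly like the structure below. Only the `model_result` key with its `value` and `confidence` fields is taken from the text above; the surrounding field names and all numbers are hypothetical.

```python
# Hypothetical shape of an explained temperature prediction. Only
# model_result / value / confidence come from the release notes; the
# other keys and every number are illustrative placeholders.
explanation = {
    "predicted_value": 21.5,   # MindsDB's adjusted prediction
    "confidence": 0.83,
    "model_result": {          # raw output of the underlying ML model
        "value": 22.1,
        "confidence": 0.79,
    },
}
```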
Starting with MindsDB version 1.11.2, Ludwig has been made an optional dependency, since we currently see better benchmark results with Lightwood. This makes the installation quicker and less prone to failure. You can still install mindsdb together with ludwig by running:
You can use that instead of just pip install mindsdb.
Lightwood should now make better predictions on inputs with missing values, thanks to a newly added dropout training mechanism that randomly removes some of the input features, teaching the neural network to work with incomplete information.
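The core of that mechanism can be sketched in a few lines. This is an illustrative toy, not Lightwood's actual code: during training, each input feature is independently blanked out with some probability.

```python
# Sketch of input-feature dropout (illustrative, not Lightwood's code):
# randomly blank out features so the model learns to handle missing values.
import random

def dropout_features(features, p=0.2, rng=random):
    # Each feature is replaced with None with probability p.
    return [None if rng.random() < p else f for f in features]

rng = random.Random(0)          # seeded for reproducibility
row = [3.2, 7.1, 0.5, 9.9]
masked = dropout_features(row, p=0.5, rng=rng)
```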
Also, we now use k-fold cross validation for training in order to get models that are more generalizable.
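For readers unfamiliar with it, k-fold cross validation partitions the data into k folds so every sample is held out for validation exactly once. A minimal index-splitting sketch (illustrative, not Lightwood's code):

```python
# Minimal k-fold split sketch: yields (train, validation) index lists so
# that each sample appears in exactly one validation fold.
def k_fold_splits(n_samples, k):
    indices = list(range(n_samples))
    folds = [indices[i::k] for i in range(k)]   # round-robin fold assignment
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

splits = list(k_fold_splits(6, 3))
# 3 splits; indices 0..5 each appear in exactly one validation fold.
```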
We now train multiple machine learning models (e.g. a neural network and a gradient boosting regressor or classifier) and give the user results from the model we judge to be most accurate and confident for a given prediction.
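The selection step amounts to comparing candidate models on a validation score and keeping the best one. A toy sketch of that idea, with made-up model names and scores:

```python
# Illustrative model selection (not Lightwood's code): pick the candidate
# with the highest validation accuracy. Names and scores are made up.
candidates = {
    "neural_network": 0.87,      # validation accuracy of each candidate
    "gradient_boosting": 0.91,
}
best_model = max(candidates, key=candidates.get)
# best_model -> "gradient_boosting"
```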
In the latest version, Lightwood improves the way it handles text. It adopts various strategies, from basic TF-IDF vectorization to feature extraction using transformer architectures such as BERT and GPT2.
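TF-IDF, the simplest of the strategies mentioned, weights a word by how often it occurs in a document and down-weights words that occur in many documents. A tiny self-contained sketch (not Lightwood's implementation):

```python
# Tiny TF-IDF sketch (illustrative only): term frequency times the log of
# inverse document frequency, computed over whitespace-tokenised documents.
import math
from collections import Counter

def tfidf(docs):
    n = len(docs)
    df = Counter(word for doc in docs for word in set(doc.split()))
    vectors = []
    for doc in docs:
        tf = Counter(doc.split())
        total = sum(tf.values())
        vectors.append({w: (c / total) * math.log(n / df[w]) for w, c in tf.items()})
    return vectors

vectors = tfidf(["the cat sat", "the dog ran"])
# "the" appears in every document, so its tf-idf weight is 0.
```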
Also, we've started working on a new functionality that should enable MindsDB to take audio files as an input.
MindsDB, alongside Lightwood and MindsDB Scout, remains platform-independent. There were issues installing Lightwood on Windows, but they are fixed in the latest release. Now Lightwood and MindsDB can be installed using pip from the latest master branch: