What you want to understand in terms of trustworthiness is that these models aren’t perfect. When you think about a trustworthy person, you don’t think that person is perfect. You consider them trustworthy because they tell you when they’re not good at something, when they’re unlikely to perform well, when they’re likely to be late, and so on. You consider this person trustworthy because they’re open. Trustworthiness is about being as open as you can.
In that sense, what we want to ensure through soft XAI is that the user (the person consuming these predictive capabilities) is aware of when the system can and should be trusted. When the system tells you that it can be trusted, it should also explain that it has come to this conclusion based on the assumption that you’re feeding it the right data. When you’re feeding it data, it should tell you about any quality issues that it sees and thinks you should be aware of. Some of these quality issues may have to do with biases.
For example, let’s say you want to analyze whether or not someone is going to be successful in an MBA program at a given university. You have data on all the students who have applied and their GPAs during the program. When you have new applicants, you want to understand how they will perform relative to students who have gone through the program, so you ask this question of the system. If the system predicts that a given individual will likely not perform well in the MBA program, you want to go in and see why the system has come to this prediction.
A trustworthy system will tell you, “You have this data feature (say, gender), and in the data that you fed in there is an uneven distribution (most of your population is, say, male). That most likely means that the system is highly biased toward a male population, and you should be aware of it.” The system can try to correct this bias, tell you why doing so is important, and explain why you should try to continually improve this model. Trustworthiness in the observer sense means that the system gives you as many tools as possible to confirm that you can trust it: it tells you its flaws and, when it makes a good prediction, the confidence that it has in that specific prediction.
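To make the data-quality warning concrete, here is a minimal sketch of how a system might flag a skewed feature before training. It is an illustration only, not MindsDB's implementation: the function name `imbalance_warnings`, the 70% threshold, and the sample applicant rows are all hypothetical.

```python
from collections import Counter


def imbalance_warnings(rows, feature, threshold=0.7):
    """Return a warning for each value of `feature` whose share of the
    rows exceeds `threshold` (hypothetical cutoff, chosen for illustration)."""
    counts = Counter(row[feature] for row in rows)
    total = sum(counts.values())
    warnings = []
    for value, count in counts.most_common():
        share = count / total
        if share > threshold:
            warnings.append(
                f"{feature}={value!r} makes up {share:.0%} of the training data; "
                "predictions may be biased toward this group."
            )
    return warnings


# Made-up training data: 8 male and 2 female past students.
applicants = (
    [{"gender": "male", "gpa": 3.4}] * 8
    + [{"gender": "female", "gpa": 3.6}] * 2
)

warnings = imbalance_warnings(applicants, "gender")
for message in warnings:
    print(message)
```

A check like this only surfaces the imbalance. Deciding whether and how to correct it is still up to the person using the system.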
The approach we take right now at MindsDB in terms of confidence is that, even on the prediction end, we don’t provide a single value. This is to help you understand that these systems aren’t deterministic. They don’t have the ability to predict the future with certainty. They can only give you an estimate and a distribution for that prediction, and derive from that distribution what makes it change (that is, what is important for that particular prediction and whether the quality of that prediction is high or low). Trustworthiness comes with transparency, and that is essentially what we want to aim for: systems that give you predictions that are as transparent as possible, and, if a prediction isn’t one you can trust, you should know immediately so that you can take control at once.
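One way to picture a prediction that is a range rather than a single value is a bootstrap interval. The sketch below is an assumption-laden illustration, not how MindsDB computes its confidences: the function `predict_with_interval`, the 90% confidence level, and the GPA history are hypothetical, and the "model" is just the mean of past outcomes.

```python
import random
import statistics


def predict_with_interval(history, n_resamples=1000, confidence=0.9, seed=0):
    """Estimate the mean of `history` and a bootstrap percentile interval.

    The history is resampled with replacement many times; the spread of
    the resampled means shows how much the estimate could move.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    estimates = sorted(
        statistics.mean(rng.choices(history, k=len(history)))
        for _ in range(n_resamples)
    )
    tail = (1 - confidence) / 2
    lower = estimates[round(tail * n_resamples)]
    upper = estimates[round((1 - tail) * n_resamples) - 1]
    return {
        "estimate": statistics.mean(history),
        "lower": lower,
        "upper": upper,
    }


# Made-up GPAs of past students who resemble the new applicant.
similar_student_gpas = [2.9, 3.1, 3.4, 3.0, 3.8, 2.7, 3.5, 3.2, 3.6, 3.0]
prediction = predict_with_interval(similar_student_gpas)
print(prediction)
```

A wide interval signals that the prediction deserves less trust, while a narrow one suggests the past data agrees with itself. Either way, the user sees the uncertainty instead of a bare number.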
Jorge Torres is the Co-founder & CTO of MindsDB. He is also a visiting scholar at UC Berkeley researching machine learning automation and explainability. Prior to founding MindsDB, he worked for a number of data-intensive start-ups, most recently working with Aneesh Chopra (the first CTO in the US government) building data systems that analyze billions of patient records and lead to the highest savings for millions of patients. He started his work on scaling solutions using machine learning in early 2008 while working as the first full-time engineer at Couchsurfing, where he helped grow the company from a few thousand users to a few million. Jorge holds degrees in electrical engineering & computer science, including a master’s degree in computer systems (with a focus on applied machine learning) from the Australian National University.