Recall & Precision Problem UTA7 Statistical Analysis

Welcome to the GitHub repository for our paper “BreastScreening-AI: Evaluating Medical Intelligent Agents for Human-AI Interactions” (doi.org/10.1016/j.artmed.2022.102285), published in Artificial Intelligence in Medicine (AIIM), a top-tier Elsevier journal (Q1 in 2022). This repository assembles a set of methods/ for our statistical analysis of the Recall & Precision problem from UTA7. For this purpose, we compute several heatmaps representing our Confusion Matrix (or Error Matrix). In a nutshell, a Confusion Matrix is a summary of prediction results on a classification problem. In our publications, we measure the frequency of these predictions for our datasets. The datasets represent our Machine Learning prediction results of the BIRADS values on breast cancer. From here, we address the Recall & Precision problem. But first, let's briefly explain each. Recall is the ratio of the number of correctly classified positive BIRADS samples to the total number of actual positive BIRADS samples. Precision, on the other hand, is the number of correctly classified positive BIRADS samples divided by the total number of predicted positive BIRADS samples.

Often, we think that precision and recall both indicate accuracy of the model. While that is somewhat true, there is a deeper, distinct meaning of each of these terms. Precision means the percentage of your results which are relevant. On the other hand, recall refers to the percentage of total relevant results correctly classified by your algorithm. Undoubtedly, this is a hard concept to grasp in the first go. So, let me try to explain it with Jack’s example.

In “Precision vs Recall”, May 11, 2018, on Medium at the Towards Data Science channel.

— Shruti Saxena
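To make the two measures concrete, here is a minimal sketch in plain Python. It is not the code from src/methods/; the function names and the label values are illustrative assumptions, and one BIRADS value is treated as the "positive" class in a one-vs-rest fashion. Rows of the matrix are actual labels and columns are predicted labels, so precision divides by a column sum and recall divides by a row sum.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Count (actual, predicted) pairs; rows are actual, columns are predicted."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def precision_recall(matrix, labels, positive):
    """Precision and recall for one label treated as the positive class."""
    i = labels.index(positive)
    true_positives = matrix[i][i]
    predicted_positive = sum(row[i] for row in matrix)  # column sum
    actual_positive = sum(matrix[i])                    # row sum
    precision = true_positives / predicted_positive if predicted_positive else 0.0
    recall = true_positives / actual_positive if actual_positive else 0.0
    return precision, recall


# Illustrative labels only (hypothetical, not data from the study).
labels = [1, 2, 3]
actual = [1, 1, 1, 2, 2, 2, 2, 3]
predicted = [1, 1, 2, 2, 2, 2, 3, 2]

matrix = confusion_matrix(actual, predicted, labels)
precision, recall = precision_recall(matrix, labels, positive=2)
print(matrix)  # [[2, 1, 0], [0, 3, 1], [0, 1, 0]]
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.60 recall=0.75
```

In this toy example, five samples were predicted as BIRADS 2 but only three of them are correct (precision 3/5), while four samples are actually BIRADS 2 and three of them were found (recall 3/4).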

This repository depends on the sheet-reader repository, so please clone that to your machine first. The work is submitted to top Human-Computer Interaction (HCI) conferences in Computer Science (CS). For more information regarding our results, please follow our set of notebooks/. You can start with the confusions.ipynb file, for example. In the repository wiki, we further address several other questions and results.

Citing

We kindly ask that scientific works and studies that make use of the repository cite it in their associated publications. Similarly, we ask that open-source and closed-source works that make use of the repository notify us of this use.

You can cite our work using the following BibTeX entry:

@article{CALISTO2022102285,
title = {BreastScreening-AI: Evaluating Medical Intelligent Agents for Human-AI Interactions},
journal = {Artificial Intelligence in Medicine},
volume = {127},
pages = {102285},
year = {2022},
issn = {0933-3657},
doi = {https://doi.org/10.1016/j.artmed.2022.102285},
url = {https://www.sciencedirect.com/science/article/pii/S0933365722000501},
author = {Francisco Maria Calisto and Carlos Santiago and Nuno Nunes and Jacinto C. Nascimento},
keywords = {Human-Computer Interaction, Artificial Intelligence, Healthcare, Medical Imaging, Breast Cancer},
abstract = {In this paper, we developed BreastScreening-AI within two scenarios for the classification of multimodal breast images: (1) Clinician-Only; and (2) Clinician-AI. The novelty relies on the introduction of a deep learning method into a real clinical workflow for medical imaging diagnosis. We attempt to address three high-level goals in the two above scenarios. Concretely, how clinicians: i) accept and interact with these systems, revealing whether are explanations and functionalities required; ii) are receptive to the introduction of AI-assisted systems, by providing benefits from mitigating the clinical error; and iii) are affected by the AI assistance. We conduct an extensive evaluation embracing the following experimental stages: (a) patient selection with different severities, (b) qualitative and quantitative analysis for the chosen patients under the two different scenarios. We address the high-level goals through a real-world case study of 45 clinicians from nine institutions. We compare the diagnostic and observe the superiority of the Clinician-AI scenario, as we obtained a decrease of 27% for False-Positives and 4% for False-Negatives. Through an extensive experimental study, we conclude that the proposed design techniques positively impact the expectations and perceptive satisfaction of 91% clinicians, while decreasing the time-to-diagnose by 3 min per patient.}
}

Pre-Requisites

To run the various methods available in the src/methods/ directory, you need:

Python 2.6 or later;
The pip package management tool;
The sheet-reader repository (v1.2.1 or later);

Instructions

The instructions are as follows. We assume that you already have a working knowledge of Git and GitHub. If not, please follow this support information. If you need support, just open a New issue.

Clone

To clone this repository, follow the guidelines below. It is as easy as that.

1.1. Please clone the repository by typing the command:

git clone https://github.com/mida-project/sa-uta7-recall-precision.git

1.2. Get inside of the repository directory:

cd sa-uta7-recall-precision/

1.3. To install and run the source code, follow the next steps;

Install

The installation guidelines are as follows. Please be sure to follow them correctly.

2.1. Run the following command to install the library using pip:

pip install --upgrade google-api-python-client

2.2. Follow the next step;

Run

The running guidelines are as follows. Please be sure to follow them correctly.

3.1. Run the sample using the following command:

python3 src/core/main.py

3.2. Enjoy our source code!

Notebooks

You can also run a notebook to view some of the chart plots produced by our methods. For this, we use the well-known Jupyter Notebook web application. To run Jupyter Notebook, just follow the steps below.

4.1. Get inside our project directory:

cd sa-uta7-recall-precision/

4.2. Run Jupyter Notebook application by typing:

jupyter notebook

If you have any questions regarding Jupyter Notebook, just follow their Documentation. You can also ask the Community for help.

Information

Since we have to run several statistical analyses over our users, we need to address their results by calculating a set of measures. These measures will give us a better understanding of how users interact with our systems. It is therefore of chief importance to scale this solution into a spreadsheet template that we can duplicate, applying this source code to consume our data each time we need it.
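As a rough illustration of consuming spreadsheet data, the sketch below converts rows in the shape returned by the Google Sheets API (a list of rows, each a list of cell values, with the header first) into (actual, predicted) label pairs. The function name and the column names birads_real and birads_predicted are hypothetical; the actual sheet layout and the sheet-reader interface may differ.

```python
def rows_to_pairs(rows, actual_col, predicted_col):
    """Turn sheet rows (header first) into (actual, predicted) integer pairs.

    Rows that are shorter than the header or hold non-numeric text in either
    column are skipped rather than raising an error.
    """
    header, *body = rows
    a = header.index(actual_col)
    p = header.index(predicted_col)
    pairs = []
    for row in body:
        try:
            pairs.append((int(row[a]), int(row[p])))
        except (IndexError, ValueError):
            continue  # incomplete or non-numeric row
    return pairs


# Hypothetical rows for illustration; not real study data.
rows = [
    ["patient", "birads_real", "birads_predicted"],
    ["p01", "2", "2"],
    ["p02", "3", "2"],
    ["p03", "1"],          # trailing cell missing
    ["p04", "4", "n/a"],   # non-numeric prediction
]
pairs = rows_to_pairs(rows, "birads_real", "birads_predicted")
print(pairs)  # [(2, 2), (3, 2)]
```

Because the function only relies on the header names, the same code can be pointed at any duplicate of the template as long as those two columns keep their names.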

The following list represents the set of repositories related to this one:

Dataset Resources

For the User Test Analysis 7 (UTA7) of this project, we generated a combination of interesting datasets. To publish our datasets, we used a well-known platform called Kaggle. To access our project’s Profile Page, just follow the link. We are also working on several other User Research studies, for which this repository is an important asset.

Acknowledgements

We would like to thank Google for their Google Sheets API Documentation. The source code of this repository is based on Google’s Python Quickstart guide.

Authors

Francisco Maria Calisto [ResearchGate GitHub Twitter LinkedIn]

Supporting

Our organization is a non-profit organization. However, we have many needs across our activities. From infrastructure to services, we need time and contributions, as well as help, to support our team and projects.