A diabetes prediction system is a machine learning project that uses the AdaBoost algorithm to train a model capable of predicting whether a patient is diabetic. The project exposes two APIs:
1: train
2: predict
So what do these APIs do?
As their names suggest, the first API accepts a comma-separated values (CSV) file containing diabetes-related numerical data, which is used to train the model.
Once the model is trained, the second API comes into play. It takes patient data that was not part of the training CSV, supplied as a JSON object, and the model predicts whether that patient has diabetes.
This project uses the AdaBoost classifier to fit the data provided to it. It is a meta-estimator that begins by fitting a classifier on the original dataset and then fits additional copies of the classifier on the same dataset, with the weights of incorrectly classified instances adjusted so that subsequent classifiers focus more on difficult cases. The AdaBoostClassifier in Python's scikit-learn library implements the AdaBoost-SAMME algorithm.
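To make the reweighting idea concrete, here is a minimal pure-Python sketch of a single boosting round for a two-class problem, assuming a learning rate of 1. The function name reweight and the toy labels are made up for illustration; this is not the library's internal code, just the same idea in a few lines.

```python
import math

def reweight(weights, y_true, y_pred):
    """Run one SAMME-style round for two classes: return (alpha, new_weights)."""
    total = sum(weights)
    # Weighted error rate of the current weak classifier.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p) / total
    if err <= 0 or err >= 0.5:
        raise ValueError("weak learner must be better than chance but not perfect")
    # Say of this classifier in the final vote (binary case, learning rate 1).
    alpha = math.log((1 - err) / err)
    # Increase the weight of every misclassified sample, then renormalise.
    boosted = [w * math.exp(alpha) if t != p else w
               for w, t, p in zip(weights, y_true, y_pred)]
    norm = sum(boosted)
    return alpha, [w / norm for w in boosted]

# Four samples with equal weights; the third one is misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
y_true = [1, 0, 1, 0]
y_pred = [1, 0, 0, 0]

alpha, new_weights = reweight(weights, y_true, y_pred)
print(alpha, new_weights)
```

After this round the misclassified sample carries half of the total weight, so the next weak classifier is pushed to get it right.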
AdaBoostClassifier takes several arguments. They are optional, but setting them can help the model fit better.
base_estimator: The base estimator from which the boosted ensemble is built. It must support sample weighting and expose proper classes_ and n_classes_ attributes. If None, the base estimator is DecisionTreeClassifier(max_depth=1). Note that scikit-learn 1.2 renamed this argument to estimator, and base_estimator was removed in 1.4.
n_estimators: The maximum number of estimators at which boosting is terminated. In the case of a perfect fit, the learning procedure is stopped early.
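As a rough illustration of how these two arguments might be set, the sketch below fits an AdaBoostClassifier on synthetic data. The synthetic dataset (8 numeric features, binary label) is only a stand-in for the diabetes CSV, and the values chosen for n_estimators and random_state are assumptions, not the project's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the diabetes data: 8 numeric features, binary label.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# The weak learner is passed positionally because its keyword name differs
# between scikit-learn versions (base_estimator in older ones, estimator in newer ones).
stump = DecisionTreeClassifier(max_depth=1)
model = AdaBoostClassifier(stump, n_estimators=50, random_state=0)
model.fit(X, y)

# estimators_ holds the weak learners that were actually fitted;
# there can be fewer than n_estimators if boosting stops early.
predictions = model.predict(X[:5])
print(len(model.estimators_), predictions)
```

Passing the stump explicitly is equivalent to leaving the argument as None, but it makes the choice of weak learner visible in the code.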
The train API takes JSON input in which a key named "filepath" holds the URL of the raw CSV data file in a GitHub repository. For example:
URL: [POST] localhost:8000/train
input format: JSON
{ "filepath":"https://raw.githubusercontent.com/<userid>/<path_to_csv>/diabetes.csv" }
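For readers who want to call the train endpoint from a script, here is a small sketch using only the Python standard library. It assumes the service is reachable over plain HTTP at the host and port shown above; BASE_URL and the helper build_post are hypothetical names, and the placeholder path in the URL is left exactly as in the example.

```python
import json
import urllib.request

# Assumption: plain HTTP on the host and port shown in the example.
BASE_URL = "http://localhost:8000"

def build_post(endpoint, payload):
    """Build (but do not send) a JSON POST request for the given endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/{endpoint}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

train_request = build_post(
    "train",
    {"filepath": "https://raw.githubusercontent.com/<userid>/<path_to_csv>/diabetes.csv"},
)
print(train_request.get_method(), train_request.full_url)

# Uncomment to send the request once the service is running:
# with urllib.request.urlopen(train_request) as response:
#     print(response.read().decode("utf-8"))
```

The same helper could be reused for the predict endpoint by changing the endpoint name and the payload.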
After training, the API returns a "Success" response. The predict API can then be used.
The predict API takes its input in the following format:
input format: JSON
{ "Pregnancies": 9, "PlasmaGlucose": 103, "DiastolicBloodPressure": 78, "TricepsThickness": 25, "SerumInsulin": 304, "BMI": 29.58219193, "DiabetesPedigree": 1.282869847, "Age": 43 }
The output looks something like this:
{ "DiabetesPrediction": 1, "PatientID": "", "Physician": "Not found" }
Here the DiabetesPrediction outcome means:
1: The patient is diabetic
0: The patient is not diabetic
So the response above indicates that this patient has diabetes.
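The request and response shapes above can be handled with a few lines of Python. The sketch below turns a predict payload into an ordered feature row and translates the response code into words. It assumes the model expects the eight fields in the order they appear in the sample request; FEATURES, to_feature_row and describe are hypothetical names, not part of the project.

```python
import json

# Assumption: the model was trained on these columns, in this order.
FEATURES = [
    "Pregnancies", "PlasmaGlucose", "DiastolicBloodPressure",
    "TricepsThickness", "SerumInsulin", "BMI", "DiabetesPedigree", "Age",
]
LABELS = {1: "diabetic", 0: "not diabetic"}

def to_feature_row(payload):
    """Validate a predict payload and return its values in model order."""
    missing = [name for name in FEATURES if name not in payload]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return [float(payload[name]) for name in FEATURES]

def describe(response):
    """Translate the DiabetesPrediction code into a readable label."""
    return LABELS[response["DiabetesPrediction"]]

request_body = (
    '{"Pregnancies": 9, "PlasmaGlucose": 103, "DiastolicBloodPressure": 78, '
    '"TricepsThickness": 25, "SerumInsulin": 304, "BMI": 29.58219193, '
    '"DiabetesPedigree": 1.282869847, "Age": 43}'
)
row = to_feature_row(json.loads(request_body))

# Sample response copied from the example output above.
response = {"DiabetesPrediction": 1, "PatientID": "", "Physician": "Not found"}
print(row)
print(describe(response))
```

Checking for missing fields before calling the model keeps a malformed request from turning into a confusing error deeper in the pipeline.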
That's all for this post, folks. See you in the next one.