Simple data visualisation in Python

Some easy Python

Using some sample data from a course I'm enrolled in, I decided to try my hand at creating a visualisation with Python. I have some prior experience from a machine learning class, but since I'm still barely a beginner at this stuff, I decided to practise with this data set.

The file I used for the data can be found on my GitHub.

The data is in a CSV file and contains the following columns:

[Image: column_values, the list of column names in the file]

Each column contains numerical or true/false values about the patient. I chose to combine the insulin, age and diabetes columns to create a visual representation of the ages and insulin levels of patients who have diabetes.

[Image: insulin_age_true, the resulting scatter plot]

The code I wrote has some unnecessary lines in it, but I hope to turn it into a predictive model some day, so that once you input values for all of the columns in the data, the machine can estimate whether or not you have diabetes.

# Opens up csv file and reads values. Prints out to a scatterplot for visualisation

from pathlib import Path
import matplotlib.pyplot as plt
import numpy as np
import csv

path_as_string = "path/to/your/file"
file_path = Path(path_as_string)
#print(file_path)
file_to_read = Path("pima-data1.csv")
full_path = file_path / file_to_read

insulin = []
ages = []
number_of_reads = 0

with open(full_path, "r") as open_file:
    as_csv = csv.reader(open_file)
    header_row = next(as_csv)

    for index,column_header in enumerate(header_row):
        print(index, column_header)
    #user_selection = input("Which column would you like visualized: ")
    #selection_to_integer = int(user_selection)
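    for row in as_csv:
        # Skip rows where insulin or age is recorded as "0" (treated as missing)
        if row[4] == "0" or row[7] == "0":
            continue
        insulin_as_integer = int(row[4])
        ages_as_integer = int(row[7])
        # Keep only patients whose diabetes column is "TRUE"
        if row[9] == "TRUE":
            insulin.append(insulin_as_integer)
            ages.append(ages_as_integer)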
    print(insulin)

# PLOTTING AGE AND INSULIN (done after the file has been closed)
x = np.array(ages)
y = np.array(insulin)
plt.figure(dpi=128, figsize=(15, 6))
plt.scatter(x, y)
plt.title("User insulin levels and ages if diabetes = true")
plt.xlabel("Age")
plt.ylabel("Insulin")
plt.show()
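
The filtering step in the script can also be pulled out into a small function, which makes it easier to try on a few rows without opening the real file. This is only a rough sketch: the function name diabetic_age_insulin and the sample rows are made up for illustration, and it assumes the same column positions the script uses (4 for insulin, 7 for age, 9 for the diabetes flag).

```python
import csv
import io


def diabetic_age_insulin(rows, insulin_col=4, age_col=7, diabetes_col=9):
    """Return (ages, insulin) lists for rows flagged "TRUE" with non-zero readings.

    Hypothetical helper; the default column positions are assumed to match
    the layout used in the script.
    """
    ages, insulin = [], []
    for row in rows:
        # A reading of "0" is treated as missing and the row is skipped.
        if row[insulin_col] == "0" or row[age_col] == "0":
            continue
        if row[diabetes_col] == "TRUE":
            ages.append(int(row[age_col]))
            insulin.append(int(row[insulin_col]))
    return ages, insulin


# Invented rows in a 10-column layout, not taken from the real data set.
sample = io.StringIO(
    "c0,c1,c2,c3,insulin,c5,c6,age,c8,diabetes\n"
    "1,100,70,20,0,30.0,0.5,50,0.8,TRUE\n"
    "2,90,60,25,94,28.0,0.2,21,0.9,FALSE\n"
    "3,130,40,35,168,43.0,2.3,33,1.4,TRUE\n"
)
reader = csv.reader(sample)
next(reader)  # skip the header row
ages, insulin = diabetic_age_insulin(reader)
print(ages, insulin)
```

With the three invented rows, only the last one survives: the first has an insulin reading of "0" and the second is not flagged "TRUE". The two returned lists can be passed to plt.scatter in the same way as in the script.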