Why does DecisionTreeClassifier split the data incorrectly with the specified criterion?

Asked by Kousha Zhiyani on 27 January 2024 at 08:24 (29 views)

When we fit a DecisionTreeClassifier, the root split produces two subtrees with 192 and 346 samples. When we run a separate counter script and apply the same split condition that the tree shows, we get 171 and 367 samples instead. What causes this difference?

DecisionTreeClassifier code block:

import pandas as pd
import matplotlib.pyplot as plt
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier

# Load the data and separate the features from the target column.
data = pd.read_csv(r"PCOS.csv")
X = data.drop("PCOS (Y/N)", axis=1)
y = data["PCOS (Y/N)"]

# Fit a depth-2 tree using Gini impurity.
model = DecisionTreeClassifier(max_depth=2, criterion="gini")
model.fit(X, y)

# Note: data.columns still includes the target column "PCOS (Y/N)",
# so fn has one more entry than X has features.
fn = data.columns
label = ["0", "1"]

# Plot the fitted tree and save it to disk.
fig, axes = plt.subplots()
tree.plot_tree(model, feature_names=fn, class_names=label, filled=True)
fig.savefig('imagenae.png')
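
One thing worth checking here, offered as an assumption rather than a confirmed diagnosis: `fn = data.columns` still contains the target column, while the model was fitted on `X`, which does not. `plot_tree` looks each name up by feature index, so if the target is not the last column in the CSV, every feature name after it is shifted by one position in the plot. The sketch below uses a small made-up DataFrame (the columns `a`, `target`, `b`, `c` are hypothetical, not taken from PCOS.csv) to show how the label read from `data.columns` can differ from the feature the tree actually split on.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical data: the target column sits in the middle of the frame,
# and only column "c" separates the two classes perfectly.
data = pd.DataFrame({
    "a": [0, 1, 0, 1, 0, 1, 0, 1],
    "target": [0, 0, 0, 0, 1, 1, 1, 1],
    "b": [5, 5, 6, 6, 5, 5, 6, 6],
    "c": [1, 2, 3, 4, 10, 11, 12, 13],
})
X = data.drop("target", axis=1)
y = data["target"]

model = DecisionTreeClassifier(max_depth=1, random_state=0).fit(X, y)

# Index of the feature used at the root node, relative to X's columns.
root_idx = int(model.tree_.feature[0])

# Looking the index up in data.columns (target included) gives a shifted name.
name_from_data = data.columns[root_idx]
# Looking it up in X.columns gives the feature the tree really used.
name_from_X = X.columns[root_idx]

print("root feature index:", root_idx)
print("name via data.columns:", name_from_data)
print("name via X.columns:", name_from_X)
```

If the two names differ on the real data as well, passing `feature_names=X.columns` to `plot_tree` would label the nodes with the columns the model was trained on.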

counter code block:

import pandas as pd


def subtree(data, col):
    # Split the rows on the threshold shown in the plotted tree (<= 7.5).
    first_list = []
    sec_list = []
    for i in range(len(data)):
        if data[col][i] <= 7.5:
            first_list.append(data.loc[i, :].values)
        else:
            sec_list.append(data.loc[i, :].values)
    gini(first_list)
    gini(sec_list)


def gini(data):
    # Despite its name, this only counts the class labels in a subset.
    # The label is the last value in each row ('PCOS (Y/N)').
    a, b = 0, 0
    for i in data:
        if i[-1] == 0:
            a += 1
        else:
            b += 1
    print("label 0 :", a)
    print("label 1 :", b)


col = ['Skin darkening (Y/N)', 'hair growth(Y/N)', 'Weight gain(Y/N)', 'Cycle(R/I)', 'Follicle No. (R)',
       'Fast food (Y/N)', 'Follicle No. (L)', 'PCOS (Y/N)']

# Keep only the selected columns; the target column must stay last.
data = pd.read_csv("PCOS.csv")[col]

subtree(data, 'Follicle No. (L)')
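
The same count can be done without an explicit loop by using a boolean mask. This is only a sketch: the helper `split_counts` and the sample values are hypothetical and are not taken from PCOS.csv. It is meant to show a shorter way to count labels on each side of a threshold, which makes it easy to try the same threshold on a different column.

```python
import pandas as pd


def split_counts(df, feature, threshold, target):
    """Count target labels on each side of `feature <= threshold`."""
    mask = df[feature] <= threshold
    left = df.loc[mask, target].value_counts().to_dict()
    right = df.loc[~mask, target].value_counts().to_dict()
    return left, right


# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "Follicle No. (L)": [3, 8, 7, 12, 5, 9, 6],
    "PCOS (Y/N)": [0, 1, 1, 1, 0, 0, 0],
})

left, right = split_counts(df, "Follicle No. (L)", 7.5, "PCOS (Y/N)")
print("left  (<= 7.5):", left, "total:", sum(left.values()))
print("right (>  7.5):", right, "total:", sum(right.values()))
```

Note that rows with a missing value in the feature fail the `<=` comparison and therefore land on the right-hand side, as they do in the loop version.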

Result from DecisionTreeClassifier: 192 and 346

Result from the counter: 171 and 367
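
To narrow down where two such counts diverge, it can help to read the root split directly from the fitted model instead of from the plot. The sketch below assumes a small synthetic dataset (columns `x0` and `x1` are hypothetical) and uses the `tree_` attributes `feature`, `threshold`, `children_left`, `children_right`, and `n_node_samples` to compare the tree's own child sizes with a manual count on the same column and threshold.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical data: "x1" separates the classes, "x0" does not.
X = pd.DataFrame({
    "x0": [0, 1, 0, 1, 0, 1, 0, 1],
    "x1": [1, 2, 3, 10, 11, 12, 13, 14],
})
y = pd.Series([0, 0, 0, 1, 1, 1, 1, 1])

model = DecisionTreeClassifier(max_depth=1, random_state=0).fit(X, y)
t = model.tree_

# Node 0 is the root; feature indices refer to the columns of X.
root_feature = X.columns[t.feature[0]]
root_threshold = float(t.threshold[0])

# Sample counts the tree recorded for the two children of the root.
left_n = int(t.n_node_samples[t.children_left[0]])
right_n = int(t.n_node_samples[t.children_right[0]])

# Manual count using the same column and threshold.
manual_left = int((X[root_feature] <= root_threshold).sum())
manual_right = len(X) - manual_left

print(root_feature, root_threshold)
print("tree:", left_n, right_n, "manual:", manual_left, manual_right)
```

When the manual count uses the column and threshold reported by `tree_`, the two pairs of numbers should agree; a mismatch like 192/346 versus 171/367 suggests the manual count was run on a different column or threshold than the one the tree used.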

[Image: database]

[Image: Visualize Decision Tree]
