I have a data set of animal types with IDs, and I want to split it into Train and Test sets. I also want all rows for a given animal/ID pair to end up in the same set, either Train or Test. An example of the data is below, with a random 80/20 Train/Test split.
Animal  ID  Test/Train
CAT 1   TRAIN
CAT 1   TRAIN
CAT 2   TRAIN
CAT 2   TRAIN
CAT 3   TRAIN
CAT 3   TEST
CAT 4   TRAIN
CAT 4   TRAIN
CAT 5   TEST
CAT 5   TRAIN
DOG 1   TRAIN
DOG 1   TRAIN
DOG 2   TRAIN
DOG 2   TRAIN
DOG 3   TRAIN
DOG 3   TRAIN
DOG 4   TEST
DOG 4   TEST
DOG 5   TRAIN
DOG 5   TRAIN
Note how CAT with ID 3 and CAT with ID 5 each appear in both the Train and Test sets. Is there an option in scikit-learn's train_test_split that keeps all rows sharing the same value in a column within the same set, while still maintaining the test ratio? For example, if one row for CAT with ID 3 is flagged as Train, then every other row with CAT and ID 3 should also be flagged as Train.
                        
Did you set the stratify parameter? If so, remove it and check again.
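
As far as I know, train_test_split itself has no argument for grouping rows, so one possible approach is GroupShuffleSplit from sklearn.model_selection, which assigns whole groups to one side of the split. Below is a minimal sketch, assuming the data is in a pandas DataFrame with the Animal and ID columns from the question; the combined "Animal_ID" group key and the random_state value are my own choices for illustration.

```python
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

# Rebuild the example data: 5 IDs per animal, 2 rows per ID.
df = pd.DataFrame({
    "Animal": ["CAT"] * 10 + ["DOG"] * 10,
    "ID": [i for i in range(1, 6) for _ in range(2)] * 2,
})

# IDs repeat across animals, so the group key combines both columns
# (assumption: a group is one Animal/ID pair).
groups = df["Animal"] + "_" + df["ID"].astype(str)

# test_size is the fraction of GROUPS (not rows) sent to the test set.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_idx, test_idx = next(splitter.split(df, groups=groups))

# Flag each row; every row of a group gets the same label.
df["Test/Train"] = "TRAIN"
df.loc[df.index[test_idx], "Test/Train"] = "TEST"

print(df)
```

Because the ratio is applied to groups, the row-level split is only approximately 80/20 when groups have different sizes. In this example every group has two rows, so the row ratio matches the group ratio.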