I have attempted to build and solve a linear programming problem using Python, but I receive the following error when it is run:
KeyError: 0
The following is the code I am using:
bfast = 1
lunchanddinner = 2
mealstypes = list(set(meals_dataset.Type))
constraints = {'breakfast':bfast,'lunch_dinner':lunchanddinner}
meals = meals_dataset.Meal.tolist()
calories = dict( zip( meals, np.array(meals_dataset.Calories.tolist())))
x = pulp.LpVariable.dicts( "x", indexs = meals, lowBound=0, upBound=1, cat='Integer', indexStart=[])
proteins = meals_dataset.Protein.tolist()
def model(minprotein):
    prob = pulp.LpProblem("Meal Plan", LpMinimize)
    prob += pulp.lpSum( [ x[i]*calories[i] for i in meals])
    for i in range(len(mealstypes)):
        type_data = meals_dataset[meals_dataset.Type==mealstypes[i]]
        boundary_condition = np.arange(type_data.index.min(),type_data.index.max())
        const = constraints[mealstypes[i]]
        prob += pulp.lpSum( [ x[j] for j in boundary_condition ] )==const
    prob += pulp.lpSum( [ x[i]*proteins[i] for i in range(len(x))])<= minprotein
    return prob
def find_best_plan(minprotein):
    prob = model(minprotein)
    prob.solve()
    variables = []
    values = []
    for v in prob.variables():
        variable = v.name
        value = v.varValue
        variables.append(variable)
        values.append(value)
    values = np.array(values).astype(int)
    meal_list = pd.DataFrame(np.array([variables,values]).T,columns = ['Variable','Optimal Value'])
    meal_list['Optimal Value'] = meal_list['Optimal Value'].astype(int)
    squad = meal_list[meal_list['Optimal Value']!=0]
    squad_meals = meals_dataset.Meal.loc[np.array(squad.Variable.str.split('_').tolist())[:,1].astype(int)]
    squad_type = meals_dataset.Type.loc[np.array(squad.Variable.str.split('_').tolist())[:,1].astype(int)]
    return pd.DataFrame([squad_meals,squad_type]).T
I then run the following
find_best_plan_80 = find_best_plan(80)
This results in the error
KeyError Traceback (most recent call last)
<ipython-input-38-43f957c4ddba> in <cell line: 1>()
----> 1 find_best_plan_80 = find_best_plan(80)
2 frames
<ipython-input-36-07827a9fc028> in <listcomp>(.0)
10 boundary_condition = np.arange(type_data.index.min(),type_data.index.max())
11 const = constraints[mealstypes[i]]
---> 12 prob += pulp.lpSum( [ x[j] for j in boundary_condition ] )==const
13 prob += pulp.lpSum( [ x[i]*proteins[i] for i in range(len(x))])<= minprotein
14 return prob
KeyError: 0
I am stuck here and cannot work out what my issue is. For reference, this is the database I am using:
    Meal                      Calories  Protein   Type
0   weetabix_milk_applejuice  298.125   6.47025   breakfast
1   eggs_bacon_toast          635.500   32.41750  breakfast
2   proteinyog_raspberries    238.600   43.71750  breakfast
3   pancakes_syrup            418.500   9.61200   breakfast
4   turkeybacon_omelette      374.000   26.75750  breakfast
5   chicken_rice_broccoli     519.200   51.88300  lunch or dinner
6   prawn_stir_fry            559.200   37.99650  lunch or dinner
7   steak_fries_asparagus     867.100   67.16000  lunch or dinner
8   chicken_caesar_salad      242.900   35.78700  lunch or dinner
9   chicken_fajitas           804.560   58.12480  lunch or dinner
10  beefpie_mash_broccoli     736.200   23.33600  lunch or dinner
11  tuna_jp                   425.600   32.28300  lunch or dinner
12  beef_pasta                652.250   55.38000  lunch or dinner
13  blt                       513.000   15.46350  lunch or dinner
14  chicken_sandwich          397.200   29.76750  lunch or dinner
Any help would be greatly appreciated.
You are making some fundamental mistakes with indexing a variable. If you look at the error, it is telling you that x[0] doesn't exist (a KeyError). You are defining x to be indexed by the set of meal names, like {tuna_jp, blt, ...}, but then you index it with integers from boundary_condition, and again with range(len(x)) in the next line. Neither of those will work. Pick either the names or the integers and use that choice everywhere.

My strong suggestion, which will alleviate a ton of confusion as you get started, is to ditch pandas and numpy and just use basic Python dictionaries. It is much cleaner, and you will be able to focus on the modeling. Then, if the problem gets bigger or you have a large data file, maybe bring pandas back in.
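
Here is a rough sketch of what the dictionary-only version could look like, with every index being a meal name. The data is copied from your table; the names MEALS, REQUIRED and meals_by_type are just ones I made up for illustration. Two things here are assumptions on my part: I keyed the per-type counts by the labels that actually appear in your data ("lunch or dinner", whereas your constraints dictionary uses 'lunch_dinner'), and I wrote the protein constraint as >= because the parameter is called minprotein, while your code has <=. Also note that np.arange(a, b) stops before b, so your boundary_condition would have skipped the last meal of each type even with integer keys.

```python
# Meal data as a plain dictionary: name -> (calories, protein, type).
MEALS = {
    "weetabix_milk_applejuice": (298.125, 6.47025, "breakfast"),
    "eggs_bacon_toast": (635.500, 32.41750, "breakfast"),
    "proteinyog_raspberries": (238.600, 43.71750, "breakfast"),
    "pancakes_syrup": (418.500, 9.61200, "breakfast"),
    "turkeybacon_omelette": (374.000, 26.75750, "breakfast"),
    "chicken_rice_broccoli": (519.200, 51.88300, "lunch or dinner"),
    "prawn_stir_fry": (559.200, 37.99650, "lunch or dinner"),
    "steak_fries_asparagus": (867.100, 67.16000, "lunch or dinner"),
    "chicken_caesar_salad": (242.900, 35.78700, "lunch or dinner"),
    "chicken_fajitas": (804.560, 58.12480, "lunch or dinner"),
    "beefpie_mash_broccoli": (736.200, 23.33600, "lunch or dinner"),
    "tuna_jp": (425.600, 32.28300, "lunch or dinner"),
    "beef_pasta": (652.250, 55.38000, "lunch or dinner"),
    "blt": (513.000, 15.46350, "lunch or dinner"),
    "chicken_sandwich": (397.200, 29.76750, "lunch or dinner"),
}

# Meals to pick per type. Assumption: keys match the type labels in the data.
REQUIRED = {"breakfast": 1, "lunch or dinner": 2}

calories = {meal: row[0] for meal, row in MEALS.items()}
protein = {meal: row[1] for meal, row in MEALS.items()}

# Group meal names by type so each constraint is indexed by name, not position.
meals_by_type = {}
for meal, (_, _, kind) in MEALS.items():
    meals_by_type.setdefault(kind, []).append(meal)


def find_best_plan(min_protein):
    """Return the names of the chosen meals, or [] if no optimal plan is found."""
    import pulp  # imported here so the data above can be inspected without PuLP

    # One 0/1 variable per meal, keyed by the meal name.
    x = pulp.LpVariable.dicts("x", list(MEALS), cat="Binary")

    prob = pulp.LpProblem("Meal_Plan", pulp.LpMinimize)
    prob += pulp.lpSum(x[m] * calories[m] for m in MEALS)  # objective: calories

    for kind, count in REQUIRED.items():
        prob += pulp.lpSum(x[m] for m in meals_by_type[kind]) == count

    # Assumption: protein is a lower bound (the question wrote <=).
    prob += pulp.lpSum(x[m] * protein[m] for m in MEALS) >= min_protein

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    if pulp.LpStatus[prob.status] != "Optimal":
        return []

    # Read the results straight from x instead of parsing variable names.
    return [m for m in MEALS if x[m].varValue > 0.5]
```

Calling find_best_plan(80) should then give you a list of three meal names. Because x is looked up by name everywhere, there is no need to split variable names on '_' to get back to the rows, which would have gone wrong anyway since the meal names themselves contain underscores.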