Here is an example of my training data (I have 1000 films for training). I need to predict the 'budget' of each film:
film_1 = {
'title': 'The Hobbit: An Unexpected Journey',
'article_size': 25000,
'producer': ['Peter Jackson', 'Fran Walsh', 'Zane Weiner'],
'release_date': some_date(2013, 11, 28),
'running_time': 169,
'country': ['New Zealand', 'UK', 'USA'],
'budget': dec('200000000')
}
The keys such as 'title', 'producer', and 'country' can be viewed as features in machine learning, while values such as 'The Hobbit: An Unexpected Journey', 25000, etc. are the values used in the learning process. However, training input is usually expected to be real numbers rather than strings. Do I need to convert string fields like 'title', 'producer', and 'country' to integers (some kind of classification or serialization), or apply some other transformation, so that I can use this data as a training set for my network?
I was wondering whether this is what you need:
Or you can use your dictionary directly.
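As a rough sketch of what working from the dictionary directly could look like: flatten each film into a `{feature_name: number}` mapping, then line the mappings up into fixed-width rows. The functions `flatten` and `vectorize` are hypothetical names, and the choice to skip 'title' and treat 'budget' as the target is an assumption. scikit-learn's `DictVectorizer` performs a similar `key=value` expansion for string values, if you prefer a library.

```python
def flatten(film, target="budget", skip=("title",)):
    """Map one film dict to {feature_name: float}, leaving out the target."""
    features = {}
    for key, value in film.items():
        if key == target or key in skip:
            continue
        if isinstance(value, (list, tuple)):
            # One indicator feature per list element, e.g. "country=UK".
            for item in value:
                features[f"{key}={item}"] = 1.0
        elif isinstance(value, str):
            features[f"{key}={value}"] = 1.0
        elif isinstance(value, (int, float)):
            features[key] = float(value)
        # Other types (dates, decimals, ...) are skipped in this sketch;
        # convert them to numbers first if you want them as features.
    return features

def vectorize(feature_dicts):
    """Return (sorted feature names, one row of floats per dict)."""
    names = sorted({name for d in feature_dicts for name in d})
    matrix = [[d.get(name, 0.0) for name in names] for d in feature_dicts]
    return names, matrix

film_1 = {
    "title": "The Hobbit: An Unexpected Journey",
    "article_size": 25000,
    "producer": ["Peter Jackson", "Fran Walsh", "Zane Weiner"],
    "running_time": 169,
    "country": ["New Zealand", "UK", "USA"],
    "budget": 200000000,
}

names, matrix = vectorize([flatten(film_1)])
targets = [float(film_1["budget"])]  # what the network should predict
print(names)
print(matrix[0])
```

With all 1000 films passed to `vectorize` at once, every row gets the same columns, and a film that lacks a given producer or country simply has 0.0 in that column.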