IMDBpy - Director name is coming with the characters all divided


I'm trying to get some details about movies from IMDB.

For that I'm using IMDbPY with the following code:

import imdb
ia = imdb.IMDb()
top250 = ia.get_top250_movies()
i = 0
for topmovie in top250:
    # First, retrieve the movie object using its ID
    movie = ia.get_movie(topmovie.movieID)
    cast = movie.get('cast')
    topActors = 3
    i = i + 1
    actor_names = [actor['name'] for actor in cast[:topActors]]
    #director_name = [director['director'] for director in cast[:topActors]]
    if i <= 10:
        print(movie, ';', ' | '.join(movie['genres']),
              ';', ' | '.join(actor_names),
              ';', ' | '.join(str(movie['director'])))
    else:
        break

However, when I run my code, the results come out in this format:

The Shawshank Redemption ; Drama ; Tim Robbins | Morgan Freeman | Bob Gunton ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 1 | 1 | 0 | 4 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | D | a | r | a | b | o | n | t | , |   | F | r | a | n | k | _ | > | ]
The Godfather ; Crime | Drama ; Marlon Brando | Al Pacino | James Caan ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 0 | 3 | 3 | 8 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | C | o | p | p | o | l | a | , |   | F | r | a | n | c | i | s |   | F | o | r | d | _ | > | ]
The Godfather: Part II ; Crime | Drama ; Al Pacino | Robert Duvall | Diane Keaton ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 0 | 3 | 3 | 8 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | C | o | p | p | o | l | a | , |   | F | r | a | n | c | i | s |   | F | o | r | d | _ | > | ]
The Dark Knight ; Action | Crime | Drama | Thriller ; Christian Bale | Heath Ledger | Aaron Eckhart ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 6 | 3 | 4 | 2 | 4 | 0 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | N | o | l | a | n | , |   | C | h | r | i | s | t | o | p | h | e | r | _ | > | ]
12 Angry Men ; Crime | Drama ; Martin Balsam | John Fiedler | Lee J. Cobb ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 1 | 4 | 8 | 6 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | L | u | m | e | t | , |   | S | i | d | n | e | y | _ | > | ]
Schindler's List ; Biography | Drama | History ; Liam Neeson | Ben Kingsley | Ralph Fiennes ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 0 | 2 | 2 | 9 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | S | p | i | e | l | b | e | r | g | , |   | S | t | e | v | e | n | _ | > | ]
The Lord of the Rings: The Return of the King ; Action | Adventure | Drama | Fantasy ; Noel Appleby | Ali Astin | Sean Astin ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 1 | 3 | 9 | 2 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | J | a | c | k | s | o | n | , |   | P | e | t | e | r | _ | > | ]
Pulp Fiction ; Crime | Drama ; Tim Roth | Amanda Plummer | Laura Lovelace ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 0 | 2 | 3 | 3 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | T | a | r | a | n | t | i | n | o | , |   | Q | u | e | n | t | i | n | _ | > | ]
The Good, the Bad and the Ugly ; Western ; Eli Wallach | Clint Eastwood | Lee Van Cleef ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 1 | 4 | 6 | 6 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | L | e | o | n | e | , |   | S | e | r | g | i | o | _ | > | ]
Fight Club ; Drama ; Edward Norton | Brad Pitt | Meat Loaf ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 0 | 3 | 9 | 9 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | F | i | n | c | h | e | r | , |   | D | a | v | i | d | _ | > | ]

As you can see, the Director column comes back split into individual characters.

How can I solve this?

I have already solved this issue:

import imdb
ia = imdb.IMDb()
top250 = ia.get_top250_movies()
i = 0
for topmovie in top250:
    # First, retrieve the movie object using its ID
    movie = ia.get_movie(topmovie.movieID)
    cast = movie.get('cast')
    directors = movie.get('director')
    topActors = 3
    i = i + 1
    actor_names = [actor['name'] for actor in cast[:topActors]]
    # Keep only the first listed director
    director_names = [director['name'] for director in directors[:1]]
    if i <= 10:
        print(movie, '   ;    ', ' | '.join(movie['genres']),
              '   ;    ', ' | '.join(actor_names),
              '   ;    ', ' | '.join(director_names))
    else:
        break

Thanks!

Davide Alberani On BEST ANSWER

movie['director'] is a list of Person objects; casting it to str gives you a single string like "[<Object1>, <Object2>]". The join method then treats that string as an iterable of characters and puts the separator between every one of them.
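The effect can be reproduced without IMDbPY or network access. This is a minimal sketch: `FakePerson` is a hypothetical stand-in invented for illustration, not the library's real class, and its repr only loosely imitates the output shown in the question.

```python
class FakePerson:
    """Hypothetical stand-in for a person object (not the real IMDbPY class)."""

    def __init__(self, name):
        self.name = name

    def __getitem__(self, key):
        # Support person['name'] lookups, like the objects in the question.
        if key == 'name':
            return self.name
        raise KeyError(key)

    def __repr__(self):
        return '<Person name:_{}_>'.format(self.name)


directors = [FakePerson('Frank Darabont')]

# str() turns the whole list into ONE string; join() then iterates over
# that string character by character.
broken = ' | '.join(str(directors))

# Extracting the names first hands join() a list of strings instead.
fixed = ' | '.join([d['name'] for d in directors])

print(broken[:13])  # starts with: [ | < | P | e
print(fixed)        # Frank Darabont
```

The same thing happens with any string passed to join: `' | '.join('abc')` gives `a | b | c`.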

You should get the directors' names exactly like you do with the cast names.

For example:

print(movie, ';', ' | '.join(movie['genres']),
      ';', ' | '.join(actor_names),
      ';', ' | '.join([d['name'] for d in movie['director']]))
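If some titles might lack director data, the lookup could be wrapped in a small helper. This is only a sketch: `director_names` is a hypothetical function, not part of IMDbPY, and plain dictionaries stand in for the movie and person objects so the snippet runs offline. It assumes the movie object offers a dict-style `get`, as used in the question's code.

```python
def director_names(movie, limit=None):
    """Return the directors' names as a list of strings.

    Hypothetical helper: `movie` is anything with a dict-style get();
    a missing or empty 'director' entry yields an empty list.
    """
    directors = movie.get('director') or []
    # limit=None keeps every director; limit=1 keeps only the first.
    return [d['name'] for d in directors[:limit]]


# Plain dicts used as stand-ins for real movie/person objects (assumption).
movie = {
    'title': 'Example Movie',
    'director': [{'name': 'First Director'}, {'name': 'Second Director'}],
}

print(' | '.join(director_names(movie)))
print(' | '.join(director_names(movie, limit=1)))
print(repr(' | '.join(director_names({}))))
```

Joining an empty list simply produces an empty string, so the print call does not fail for a title without directors.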