Python TypeError: getattr(): attribute name must be string


I'm working on a Python script that uses the TMDB (The Movie Database) API to fetch movie information. However, I'm encountering a TypeError when trying to access the 'id' attribute from the API response. I'm using the TMDBv3 API wrapper in my script.

Here's the relevant code:

import pandas as pd
import numpy as np
import requests
import bs4 as bs
import urllib.request

## Extracting features of 2020 movies from Wikipedia

link = "https://en.wikipedia.org/wiki/List_of_American_films_of_2020"

source = urllib.request.urlopen(link).read()
soup = bs.BeautifulSoup(source,'lxml')

tables = soup.find_all('table',class_='wikitable sortable')

len(tables)

type(tables[0])

from io import StringIO

# Assuming 'tables' is a list containing HTML tables

df1 = pd.read_html(StringIO(str(tables[0])))[0]
df2 = pd.read_html(StringIO(str(tables[1])))[0]
df3 = pd.read_html(StringIO(str(tables[2])))[0]

# Replace the malformed '1"' with "1" before parsing the fourth table
df4 = pd.read_html(StringIO(str(tables[3]).replace("'1\"\'",'"1"')))[0]

# Stack the four quarterly tables into one DataFrame with a fresh index
df = pd.concat([df1, df2, df3, df4], ignore_index=True)

df

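# Take an explicit copy so that adding a column later does not trigger SettingWithCopyWarning
df_2020 = df[['Title','Cast and crew']].copy()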

df_2020

!pip install tmdbv3api

from tmdbv3api import TMDb, Movie
import json

tmdb = TMDb()
tmdb.api_key = 'API_KEY'

tmdb_movie = Movie()

def get_genre(x):
    genres = []
    result = tmdb_movie.search(x)
    
    if not result or not hasattr(result[0], 'id'):
        return np.NaN
    
    movie_id = result[0].id
    response = requests.get('https://api.themoviedb.org/3/movie/{}?api_key={}'.format(movie_id, tmdb_movie.api_key))
    data_json = response.json()
    
    if 'genres' in data_json and data_json['genres']:
        genre_str = " " 
        for i in range(0, len(data_json['genres'])):
            genres.append(data_json['genres'][i]['name'])
        
        return genre_str.join(genres)
    
    return np.NaN


df_2020['genres'] = df_2020['Title'].map(lambda x: get_genre(str(x)))

Error Message:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-56-ec558471f2dd> in <module>
----> 1 df_2020['genres'] = df_2020['Title'].map(lambda x: get_genre(str(x)))

c:\users\dell\appdata\local\programs\python\python39\lib\site-packages\pandas\core\series.py in map(self, arg, na_action)
   4542         dtype: object
   4543         """
-> 4544         new_values = self._map_values(arg, na_action=na_action)
   4545         return self._constructor(new_values, index=self.index, copy=False).__finalize__(
   4546             self, method="map"

c:\users\dell\appdata\local\programs\python\python39\lib\site-packages\pandas\core\base.py in _map_values(self, mapper, na_action, convert)
    921         elif isinstance(arr, ExtensionArray):
    922             # dispatch to ExtensionArray interface
--> 923             new_values = map_array(arr, mapper, na_action=na_action, convert=convert)
    924 
    925         else:

c:\users\dell\appdata\local\programs\python\python39\lib\site-packages\pandas\core\algorithms.py in map_array(arr, mapper, na_action, convert)
   1814         return lib.map_infer(values, mapper, convert=convert)
   1815     else:
-> 1816         return lib.map_infer_mask(
   1817             values, mapper, mask=isna(values).view(np.uint8), convert=convert
   1818         )

c:\users\dell\appdata\local\programs\python\python39\lib\site-packages\pandas\_libs\lib.pyx in pandas._libs.lib.map_infer()

<ipython-input-56-ec558471f2dd> in <lambda>(x)
----> 1 df_2020['genres'] = df_2020['Title'].map(lambda x: get_genre(str(x)))

<ipython-input-53-5e28b0f3e7db> in get_genre(x)
      7     if not result:
      8         return np.NaN
----> 9     movie_id = result[0].id
     10     response = requests.get('https://api.themoviedb.org/3/movie/{}?api_key={}'.format(movie_id,tmdb.api_key))
     11     data_json = response.json()

c:\users\dell\appdata\local\programs\python\python39\lib\site-packages\tmdbv3api\as_obj.py in __getitem__(self, key)
     47         return getattr(self, key)
     48     else:
---> 49         return self._obj_list[key]
     50 
     51     def __iter__(self):

TypeError: getattr(): attribute name must be string

I suspect that the structure of the response object might be causing the issue.

  • How can I modify the code to handle the TypeError and ensure I'm correctly accessing the 'id' attribute from the TMDB API response?

  • Are there any additional checks I should perform on the response object to avoid such errors?

Answer by CodeTherapy

The issue seems to be with the way the 'id' value is read from the tmdbv3api search result. Instead of testing with hasattr(), you can check whether the 'id' key exists in the first result and then read it with subscript syntax.

You can modify your code as follows:

import numpy as np
import requests

def get_genre(x):
    result = tmdb_movie.search(x)

    # No search hits: nothing to look up
    if not result:
        return np.nan

    if 'id' in result[0]:
        movie_id = result[0]['id']
        response = requests.get('https://api.themoviedb.org/3/movie/{}?api_key={}'.format(movie_id, tmdb.api_key))
        data_json = response.json()

        # Join the genre names into one space-separated string
        if 'genres' in data_json and data_json['genres']:
            return " ".join(genre['name'] for genre in data_json['genres'])

    # np.nan is used because the np.NaN alias was removed in NumPy 2.0
    return np.nan

In this updated code, we are checking if 'id' is present in the TMDB API response by using the in operator. If it exists, we access it as result[0]['id'] instead of using the hasattr() function.
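One caveat: the traceback in the question shows the exception being raised inside the wrapper's __getitem__ while result[0] itself is evaluated, so a membership test that comes after that expression may never be reached. A minimal defensive sketch, assuming nothing about tmdbv3api beyond what the traceback shows, is to wrap the indexing in try/except. The helper name first_result_id and the stand-in classes FakeSearchResult and Hit are hypothetical and exist only so the example runs without the API:

```python
def first_result_id(result):
    """Return the id of the first search hit, or None if it cannot be read."""
    try:
        first = result[0]
    except (TypeError, IndexError, KeyError, AttributeError):
        # Indexing itself failed, so treat this as "no usable hit"
        return None
    # Accept both dict-style and attribute-style hits
    if isinstance(first, dict):
        return first.get('id')
    return getattr(first, 'id', None)


class FakeSearchResult:
    """Hypothetical stand-in that fails on integer indexing, as in the traceback."""
    def __getitem__(self, key):
        # getattr() with a non-string name raises TypeError
        return getattr(self, key)


class Hit:
    """Hypothetical attribute-style search hit."""
    def __init__(self, id):
        self.id = id
```

Inside get_genre, a None return from such a helper could be mapped to a missing value before any HTTP request is made.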

The function also checks that the 'genres' key is present and non-empty in data_json before reading it, so the code doesn't break if the movie details response has no 'genres' entry.
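The genre handling can also be pulled into a small function that is easy to test without calling the API. This is only a sketch: the name genres_from_payload is hypothetical, and the payload shape (a 'genres' list of objects with a 'name' key) is assumed from the code above rather than taken from the TMDB documentation:

```python
def genres_from_payload(data_json, sep=" "):
    """Join genre names from a movie-details payload, or return None."""
    # Anything that is not a dict (for example None) has no genres to read
    genres = data_json.get('genres') if isinstance(data_json, dict) else None
    if not genres:
        return None
    # Skip malformed entries instead of raising KeyError
    names = [g['name'] for g in genres if isinstance(g, dict) and g.get('name')]
    return sep.join(names) if names else None


sample = {'genres': [{'id': 28, 'name': 'Action'}, {'id': 12, 'name': 'Adventure'}]}
print(genres_from_payload(sample))  # Action Adventure
```

Returning None rather than raising keeps the decision about the missing-value marker (for example np.nan) in the calling code.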

By making these modifications, you should be able to handle the TypeError and properly access the 'id' attribute from the TMDB API response.