I am using Google's Vertex AI SDK for Python to access the Gemini Pro model and generate content. I would like to set a timeout (e.g. 30 seconds) on the generate_content call, so that it either completes within that time or raises an exception.
Setting a timeout is easy if I use an HTTP library like Requests and query the Gemini REST endpoint directly, but how can I implement the same functionality with the Vertex AI SDK for Python?
Here is an example of the code I use to generate content:
from vertexai import init
from vertexai.preview.generative_models import GenerativeModel
from google.oauth2.service_account import Credentials

# Load service-account credentials from a JSON key file.
credentials = Credentials.from_service_account_file(
    'path/to/json/credential/file.json')

# Initialize the SDK with the project and region to use.
init(
    project=credentials.project_id, location='northamerica-northeast1',
    credentials=credentials)

# This is the call I would like to bound with a timeout.
model = GenerativeModel('gemini-pro')
response = model.generate_content("Pick a number")
print("Pick a number:", response.candidates[0].content.text)
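To make the desired behavior concrete, here is a minimal, library-agnostic sketch of what "complete within N seconds or raise" could look like. It is not a Vertex AI SDK feature: it simply runs any callable in a worker thread and stops waiting after the deadline. The names call_with_timeout and fake_generate are hypothetical, and fake_generate is only a stand-in for model.generate_content so that the example runs without credentials.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def call_with_timeout(func, timeout_s, *args, **kwargs):
    """Run func(*args, **kwargs) in a worker thread and wait at most timeout_s seconds.

    Hypothetical helper, not part of the Vertex AI SDK. On timeout it raises the
    built-in TimeoutError, but the worker thread is NOT cancelled: the underlying
    call keeps running in the background until it finishes on its own.
    """
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        raise TimeoutError(
            f"call did not finish within {timeout_s} seconds") from None
    finally:
        # Do not block here waiting for a call that may still be running.
        executor.shutdown(wait=False)


def fake_generate(prompt, delay_s):
    """Stand-in for a slow model call (assumption, for illustration only)."""
    time.sleep(delay_s)
    return f"echo: {prompt}"
```

With the real SDK, the call would presumably look like call_with_timeout(model.generate_content, 30, "Pick a number"). The obvious drawback is that the request itself is never aborted, so this only bounds how long the caller waits, which is why a timeout supported by the SDK itself would be preferable.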