I'm building an AI service using Spring Boot and Google's Vertex AI.
I have this piece of code:
public PredictedResult prediction(MultipartFile file) {
    String inferSpeciesPrompt = "my prompt";
    // A new client is created (and closed) for every request.
    try (VertexAI vertexAi = new VertexAI("my-project", "asia-northeast3")) {
        GenerationConfig generationConfig = GenerationConfig.newBuilder()
                .setMaxOutputTokens(2048)
                .setTemperature(0.1F)
                .setTopK(15)
                .setTopP(0.7F)
                .build();
        GenerativeModel model = new GenerativeModel("gemini-pro-vision", generationConfig, vertexAi);
        List<SafetySetting> safetySettings = Arrays.asList(
                SafetySetting.newBuilder()
                        .setCategory(HarmCategory.HARM_CATEGORY_HATE_SPEECH)
                        .setThreshold(SafetySetting.HarmBlockThreshold.BLOCK_ONLY_HIGH)
                        .build(),
                SafetySetting.newBuilder()
                        .setCategory(HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT)
                        .setThreshold(SafetySetting.HarmBlockThreshold.BLOCK_ONLY_HIGH)
                        .build(),
                SafetySetting.newBuilder()
                        .setCategory(HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT)
                        .setThreshold(SafetySetting.HarmBlockThreshold.BLOCK_ONLY_HIGH)
                        .build(),
                SafetySetting.newBuilder()
                        .setCategory(HarmCategory.HARM_CATEGORY_HARASSMENT)
                        .setThreshold(SafetySetting.HarmBlockThreshold.BLOCK_ONLY_HIGH)
                        .build()
        );
        // Single user turn: the image plus the text prompt.
        List<Content> contents = new ArrayList<>();
        contents.add(Content.newBuilder()
                .setRole("user")
                .addParts(PartMaker.fromMimeTypeAndData(file.getContentType(), file.getBytes()))
                .addParts(Part.newBuilder().setText(inferSpeciesPrompt))
                .build());
        GenerateContentResponse generateContentResponse = model.generateContent(contents, safetySettings);
        // Strip both the opening and closing Markdown fences before parsing the JSON.
        String text = generateContentResponse.getCandidates(0).getContent().getParts(0).getText()
                .replace("```json", "")
                .replace("```", "")
                .trim();
        ObjectMapper objectMapper = new ObjectMapper();
        PredictedResult predictedResult = objectMapper.readValue(text, PredictedResult.class);
        // Compare string contents with equals(), not ==.
        if ("false".equals(predictedResult.getLivingThings())) {
            throw new NoCreatureException();
        }
        return predictedResult;
    } catch (Exception e) {
        GeminiException exception = new GeminiException("validation", "try another image");
        exception.addValidation("image", e.getMessage());
        throw exception;
    }
}
For each request to the URL that uses Gemini, I create a new VertexAI instance to produce the prediction result. But creating a new instance on every request seems inefficient. Is there a way to use the AI model more efficiently?
I use Vertex AI's Gemini as a multimodal model: when I send an image, I get the prediction back in JSON format. Once I send an image and receive the JSON result, the AI's role ends there; there is no back-and-forth conversation.
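What I have in mind is something like the sketch below: one shared client and model instead of one per request. This is just a sketch; I'm assuming the VertexAI client and GenerativeModel are safe to share across requests, the configuration class and bean names are made up, and the package names may differ depending on the SDK version.

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerationConfig;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class VertexAiConfig {

    // One client for the whole application; Spring calls close() on shutdown
    // instead of try-with-resources closing it after every request.
    @Bean(destroyMethod = "close")
    public VertexAI vertexAi() {
        return new VertexAI("my-project", "asia-northeast3");
    }

    @Bean
    public GenerativeModel geminiProVision(VertexAI vertexAi) {
        GenerationConfig generationConfig = GenerationConfig.newBuilder()
                .setMaxOutputTokens(2048)
                .setTemperature(0.1F)
                .setTopK(15)
                .setTopP(0.7F)
                .build();
        return new GenerativeModel("gemini-pro-vision", generationConfig, vertexAi);
    }
}

The prediction method would then take the injected GenerativeModel and drop the try-with-resources block. Is that a reasonable approach?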
I'm not sure I understand your issue.
gemini-pro-vision is not conversational; it is a single-request multimodal model that takes image and text.
If you want to hold a conversation, you need to use gemini-pro and drop the images. Even in a conversation, nothing is stateful: you are still sending the full conversation history with each new input from the user, roughly as in the sketch below.
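To make that concrete, here is a rough sketch of what "sending the full history" looks like with the same SDK. The wrapper class is made up for illustration, and it assumes a text-only gemini-pro GenerativeModel plus a generateContent(List<Content>) overload like the one your code already calls with safety settings:

import com.google.cloud.vertexai.api.Content;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.api.Part;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class SimpleChat {
    // The model itself is stateless; this list is the only "memory".
    private final List<Content> history = new ArrayList<>();
    private final GenerativeModel model; // e.g. a "gemini-pro" model

    public SimpleChat(GenerativeModel model) {
        this.model = model;
    }

    public String send(String userMessage) throws IOException {
        // Append the new user turn, then send the entire history every time.
        history.add(Content.newBuilder()
                .setRole("user")
                .addParts(Part.newBuilder().setText(userMessage))
                .build());
        GenerateContentResponse response = model.generateContent(history);
        String reply = response.getCandidates(0).getContent().getParts(0).getText();
        // Record the model's turn so the next request carries the whole exchange.
        history.add(Content.newBuilder()
                .setRole("model")
                .addParts(Part.newBuilder().setText(reply))
                .build());
        return reply;
    }
}

For your use case, though, there is no conversation to keep: each image is an independent single request, so there is nothing to carry over between calls.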