How to : Foundry - Google Gemini Pro Vision - Text generation

Hello Community,

In this tutorial, we will use Google’s Gemini Pro Vision, a multimodal model, to build on the object detection techniques covered in our previous How to (link below :arrow_down_small:).
This approach combines multiple modalities, such as text and images, to give the model additional context and improve the accuracy of object detection.
We will walk you through adding it to your existing object detection workflow in Foundry so that the workflow can do more and produce richer results.

Doc references :

Previous episode : How to : Foundry - Yolov8 - Object detection