Google Cloud Vertex AI
Google Cloud's unified ML platform that includes access to Gemini models, PaLM, and other Google AI capabilities. Vertex AI provides model serving, fine-tuning, RAG (Vertex AI Search), and agent building tools. It combines traditional ML workflows with generative AI capabilities.
Implements
Concepts this tool claims to implement:
- Inference Endpoint primary
Generative AI APIs for Gemini and PaLM models. Prediction endpoints for custom models. Online and batch prediction options.
- Model Serving primary
Managed endpoints for model deployment. Auto-scaling based on traffic. Model Garden for accessing pre-trained models.
- Retrieval-Augmented Generation secondary
Vertex AI Search for enterprise search and RAG. Grounding with Google Search. Data connectors for various sources.
- Fine-Tuning secondary
Supervised fine-tuning for Gemini models. Distillation from larger to smaller models. RLHF capabilities.
- Multimodal Model secondary
Gemini models support text, image, audio, and video inputs. Multimodal embeddings and generation.
Integration Surfaces
Details
- Vendor
- Google Cloud
- License
- Proprietary
- Runs On
- cloud
- Used By
- human, agent, system
Links
Notes
Vertex AI is Google's full-stack ML platform. It's broader than just LLMs - includes AutoML, custom training, MLOps. The Gemini integration is the main draw for generative AI use cases. Good for teams already on GCP or wanting Gemini models with enterprise features.