
Google unveils a slew of Vertex AI upgrades to better cater to enterprise customers

Google has upgraded its enterprise-ready AI platform with new models, features, capabilities, and more.
Written by Sabrina Ortiz, Editor
Gemini 1.5 Pro 2M token window
Screenshot by Sabrina Ortiz/ZDNET

While Google is best known for its consumer-facing Gemini chatbot, it also offers solutions for businesses through its enterprise-ready AI platform, Vertex AI. On Thursday, Google announced Vertex AI is getting new models and updates.

For starters, Google has made highly anticipated changes to its in-house models, including moving Gemini 1.5 Flash from public preview to general availability. Gemini 1.5 Flash, announced last month at Google I/O, is the fastest Gemini model in Google's API and a more cost-efficient alternative to Gemini 1.5 Pro. Despite its low latency, Gemini 1.5 Flash is a highly competitive model with a 1-million-token context window.

Also: Gmail users can now ask Google's Gemini AI to help compose and summarize emails

Google even compared the model's performance to OpenAI's GPT-3.5 Turbo, highlighting that Gemini 1.5 Flash has a context window roughly 60 times larger, is on average 40% faster for inputs of 10,000 characters, and offers an input price up to four times lower when context caching is enabled for inputs larger than 32,000 characters.

Google also updated Gemini 1.5 Pro, its overall best-performing model, which the company announced at Google I/O. The model is now available in Vertex AI with a 2-million-token context window, doubling its previous context window size and allowing it to process two hours of video, 22 hours of audio, more than 60,000 lines of code, and over 1.5 million words.

Also: What does a long context window mean for an AI model, like Gemini?

Next, Google launched Imagen 3, its latest image generation foundation model, in preview for Vertex AI customers. Some highlights of this model include 40% faster generation, photo-realistic generation of groups of people, better prompt fidelity, multi-language support, and built-in safety features, according to Google.

In addition to updating its models, Google is adding more third-party and open models, including Gemma 2, available now, and Mistral, which is coming this summer.

Since keeping costs as low as possible is a priority for enterprises, Google is also rolling out context caching in public preview for Gemini 1.5 Pro and Gemini 1.5 Flash. Context caching lets customers reuse frequently repeated context across requests rather than resending it each time, which should lower costs. Additionally, the new provisioned throughput feature, generally available today, should help customers scale their use of Google's first-party models.

To address generative AI misinformation and hallucination concerns, Google will introduce grounding with third-party data next quarter, helping enterprises incorporate their own data into their generative AI agents.

Also: Google is backing these 20 startups to help improve the world with AI

Google also announced another grounding option: grounding with high fidelity generates a response using only the provided context, ignoring the model's broader world knowledge, to ensure high levels of factuality. Grounding with high fidelity is available in an experimental preview and is powered by a fine-tuned version of Gemini 1.5 Flash.

Lastly, to give enterprises more control over where their data is stored and processed, Google offers data residency for data at rest in 23 countries and plans to expand its ML processing commitments to eight more countries.

If your enterprise is interested in learning more about getting started with Vertex AI, visit this Google Cloud webpage.