Bard Meets Gemini: Google’s AI Leap
- AI revolution accelerates. Google releases OpenAI rival, Gemini, integrated into Bard AI today.
- Multimodal mastery. Gemini excels in understanding and combining text, images, audio, video and code.
- Advancing AI applications. Gemini’s varied models set for integration in Google Search, Ads and more.
Days after word got out that Google was postponing the release of its Gemini AI until January 2024, Google announced today, Dec. 6, that it is releasing its OpenAI rival, initially as part of its Bard AI chat application. According to the announcement, Gemini has been “built to be multimodal, can generalize and seamlessly understand, operate across and combine different types of information, including text, images, audio, video and code.”
Gemini: Google’s Multimodal AI
Google has released full details about Gemini in a Google DeepMind paper titled Gemini: A Family of Highly Capable Multimodal Models. Gemini comes in three sizes, Ultra, Pro and Nano, enabling it to run on a broad range of platforms, from data centers to mobile devices. Google Bard now uses a version of Gemini Pro that has been specifically tuned for more advanced reasoning, planning and understanding. According to the announcement, Google will introduce Bard Advanced, which will use the most capable model, Gemini Ultra, early next year.
Significant Reasoning Advancements
The Gemini Ultra model has achieved impressive benchmark results. It is the first model to reach human-expert performance on MMLU (Massive Multitask Language Understanding), an exam-style benchmark, and it also shows significant advances on multimodal reasoning tasks.
The Gemini models exhibit impressive cross-modal reasoning abilities, allowing them to understand and reason across a sequence of audio, images and text. One example in the Google DeepMind paper features Gemini solving a physics problem presented as a drawing with handwritten work, showcasing potential applications in education and other fields. Gemini not only read the student’s handwriting and interpreted their drawing, it also corrected the student’s error.
The paper also showed that Gemini is able to identify a location based on an image.
Gemini Nano: Compact AI Powers Pixel 8
Gemini Nano, the smaller model series, was designed for on-device deployment. It excels in tasks such as summarization, reading comprehension and text completion, and it displays remarkable capabilities in reasoning, STEM, coding, multimodal and multilingual tasks relative to its size. According to the announcement, Google’s Pixel 8 Pro smartphone is engineered to take advantage of Gemini Nano, using it for features such as Summarize in Recorder and Smart Reply in Gboard.
Google Integrates Gemini AI Across Services
According to the announcement, Google is beginning to experiment with Gemini in Search, where it is improving the speed of the Search Generative Experience (SGE). Gemini will soon be used in other Google products and services, including Ads, Chrome and Duet AI.
AI Revolution Gains Momentum
It is exciting to be living in the midst of an AI revolution, and the industry’s momentum is showing no signs of waning. Each day brings the announcement of innovative AI applications, sophisticated large language models, and expansive feature sets, setting the stage for an exhilarating year ahead in the realm of artificial intelligence.