Everything Google Announced at I/O 2024: Gemini, Search, Project Astra, Scam Detection

Google kicked off its annual I/O developer conference today. The company typically uses the Google I/O keynote to announce an array of new software updates and the occasional hunk of hardware. There was no hardware at this year’s I/O (Google had already announced its new Pixel 8A phone), but today’s presentation was a resplendent onslaught of AI software updates and a reflection of how Google aims to assert dominance over the generative AI boom of the past couple of years.

Here are the biggest announcements from I/O 2024.

Gemini Steps Up

Gemini Nano, Google’s on-device mobile large language model, is getting a boost. It’s now going to be called Gemini Nano with Multimodality, which Google CEO Sundar Pichai said onstage lets it “turn any input into any output.” That means it can pull information from text, photos, audio, web or social videos, and live video from your phone’s camera, then synthesize that input to summarize what’s within or answer questions you may have about it. Google showed a video demonstrating this, in which someone scanned all the books on a shelf with a camera and recorded the titles in a database to recognize them later.

Developers will have access to more computing power in Gemini than they get with the other LLMs out there.

Courtesy of Google

Gemini 1.5 Pro, Google’s beefier cloud-based AI system, is available for all developers globally today. For more about all of Google’s AI ambitions, read Will Knight’s WIRED interview with the cofounder of Google’s DeepMind, Demis Hassabis.

Better Search for Photos

Ask Photos brings some new advanced search capabilities to Google Photos.

Courtesy of Google

Google has built some robust visual search tools into Google Photos. With a new feature called Ask Photos, you can ask Gemini to search your photos and deliver more granular results than before. One example: Tell it your license plate number, and it will use context clues to find your car in all of the pics you’ve ever taken.

In a Google blog post, Google Photos software engineer Jerem Selier says the feature doesn’t collect data on your photos that can be used to serve ads or train its other Gemini AI models (aside from what’s being used in Google Photos). Ask Photos is rolling out this summer.

Gemini Goes to Work

Gmail! Remember Gmail?

Photograph: Julian Chokkattu

Google is also infusing AI into its Workspace suite of office tools. Starting today, a button to toggle Google’s Gemini AI will appear in the side panel of many Google apps, including Gmail, Google Drive, Docs, Sheets, and Slides. The Gemini helper can answer questions, help you craft emails or documents, or provide summaries of long docs or email threads.

Lest you think this stuff is all about office work, Google showed off a few features that will appeal to parents, like AI chatbots that can help students with their homework or provide a summary of the PTA meetings you may have missed. Google’s Circle to Search, which debuted earlier this year, is also getting an upgrade and will soon be used to help students with schoolwork, like detailing how to solve math problems.

Workspace integrations for Gemini took up a big chunk of today’s I/O keynote.

Photograph: Julian Chokkattu

Also embedded into apps like Docs and Gmail is a Gemini-powered AI Teammate. This is like a coworker productivity buddy, which you can name anything you want. (For the purposes of today’s demo, it was named Chip.) The AI Teammate can help you better coordinate communications between your coworkers, keep track of project files, assemble to-do lists, and follow up on assignments. It’s like a turbocharged Slackbot.

We also saw a demo of Gems, a new feature that sets automated routines for things you want Gemini to do on a regular basis. You can set it up to manage various digital chores, then run those with a voice command or a text prompt. Google calls each of those routines “Gems” as a play on the Gemini name.

Read Julian Chokkattu’s story for an even deeper dive into all the big things coming to Gemini on Android. We’ll learn more about AI Teammate and the Workspace integrations soon.

New Gemini Models

Courtesy of Google

Google has two new models of its Gemini AI, focused on different types of tasks. Gemini 1.5 Flash is the faster, lower-latency one, optimized for tasks where quickness is preferable.

A prerecorded demo shows Project Astra’s visual understanding and how you can interact with what it’s seeing by using your voice to ask questions.

Project Astra is a visual chatbot, and sort of a souped-up version of Google Lens. It lets users open their phone cameras and ask questions about nearly anything around them by pointing the camera at it. Google showed off a video demo where somebody asked Astra a variety of questions in a row based on their surroundings. Astra has a better spatial and contextual understanding, which Google says lets users identify things out in the world, like what town they are in or the inner workings of some code on a computer screen, or even come up with a clever band name for your dog. The demo showed Astra’s voice-powered interactions working through a phone’s camera as well as a camera embedded in some (unidentified) smart glasses.

Will Knight goes deeper in his news story on Project Astra from earlier today.

Creativity Tools

The creative side of Google’s AI efforts got a nod; we saw demos for a suite of tools developed by the company’s experimental AI division at Google Labs.

These llamas are not real, sorry.

Courtesy of Google

The new thing is VideoFX, a generative video model based on Google DeepMind’s video generator Veo. It creates 1080p videos based on text prompts and allows for more flexibility within the production process than before. Google has also improved ImageFX, a high-resolution image generator that Google says has fewer issues with creating unwanted digital artifacts in pictures than its previous image generation tools. It is also better at analyzing a user’s prompts and generating text.

DJ Mode in action. Turn up the French café!

Courtesy of Google

Google also showed off its new DJ Mode in MusicFX, an AI music generator that lets musicians generate song loops and samples based on prompts. (DJ Mode was shown off during the eccentric and delightful performance by musician Marc Rebillet that led into the I/O keynote.)

An Evolution in Search

From its humble beginning as a search-focused company, Google is still the most prominent player in the search industry (despite some very good, slightly more private options). Google’s newest AI updates are a seismic shift for its core product.

New contextual awareness abilities help Google Search deliver more relevant results.

Courtesy of Google

Some new capabilities include AI-organized search, which allows for more tightly presented and readable search results, as well as the ability to get better responses from longer queries and searches with photos.

We also saw AI overviews, which are short summaries that pool information from multiple sources to answer the question you entered in the search box. These summaries appear at the top of the results so you don’t even need to go to a website to get the answers you’re seeking. These overviews are already controversial, with publishers and websites fearing that a Google search that answers questions without the user needing to click any links may spell doom for sites that already have to go to extreme lengths to show up in Google’s search results in the first place. Nonetheless, these newly enhanced AI overviews are rolling out to everyone in the US starting today.

A new feature called Multi-Step Reasoning lets you find several layers of information about a topic when you’re searching for things with some contextual depth. Google used planning a trip as an example, showing how searching in Maps can help find hotels and set transit itineraries. It then went on to suggest restaurants and help with meal planning for the trip. You can deepen the search by looking for specific types of cuisine or vegetarian options. All of this info is presented to you in an organized way.

Advanced visual search in Lens.

Courtesy of Google

Lastly, we saw a quick demo of how users can rely on Google Lens to answer questions about whatever they’re pointing their camera at. (Yes, this sounds similar to what Project Astra does, but these capabilities are being built into Lens in a slightly different way.) The demo showed a woman trying to get a “broken” turntable to work, but Google identified that the record player’s tonearm simply needed adjusting, and it presented her with a few options for video- and text-based instructions on how to do just that. It even properly identified the make and model of the turntable through the camera.

WIRED’s Lauren Goode talked with Google head of search Liz Reid about all the AI updates coming to Google Search, and what it means for the internet as a whole.

Security and Safety

Scam Detection in action.

Photograph: Julian Chokkattu

One of the last noteworthy things we saw in the keynote was a new scam detection feature for Android, which can listen in on your phone calls and detect any language that sounds like something a scammer would use, like asking you to move money into a different account. If it hears you getting duped, it’ll interrupt the call and give you an onscreen prompt suggesting that you hang up. Google says the feature works on the device, so your phone calls don’t go into the cloud for analysis, making the feature more private. (Also check out WIRED’s guide to protecting yourself and your loved ones from AI scam calls.)

Google has also expanded its SynthID watermarking tool, which is meant to distinguish media made with AI. This can help you detect misinformation, deepfakes, or phishing spam. The tool leaves a watermark that can’t be seen with the naked eye but can be detected by software that analyzes the pixel-level data in an image. The new updates have expanded the feature to scan content on the Gemini app, on the web, and in Veo-generated videos. Google says it plans to release SynthID as an open source tool later this summer.
