Google is back with yet another AI service—this time, an offline dictation program using its “Gemma” architecture. But rather than include it within the Gemini app, or as a Gemini function, the company has decided to roll it out into a dedicated iPhone app, with the very catchy name of “Google AI Edge Eloquent.”
I decided to give the app a shot on release day, though the privacy policy gave me pause. Google says that your location, contacts, identifiers, device diagnostics, contact info, user content, usage data, and “other” data can be linked to you, while purchases and other diagnostics can be collected but not linked to you. That’s a lot of data, especially for an app that advertises that “audio, confidential conversations, and personal data never leave your device,” and I’m not sure I’d be keen on downloading the app if I weren’t testing it. But, as the saying goes, if a service is free, you are the product. I’ve reached out to Google for clarification here, and will update this story if I hear back.
How to try Google’s new AI transcription app
Once you download the app, setup is easy: you record a sample phrase the app tells you to say, then choose between two modes. “On-device mode” is fully offline and stores your conversations on your device only. “Enhanced text polishing” keeps the audio on your device but uses Gemini to “polish” your text, which requires sending data to the cloud (and is presumably where all that aforementioned privacy policy data is going). You won’t need to keep Gemini on for the app to do a basic edit of your transcript, though: by design, the app removes “filler” words like “um.” Keep in mind that the app seems to open in “Enhanced text polishing” mode by default; at least, that’s how it worked on my end. But a simple tap of a toggle in the top-right corner of the main screen switches you into “On-device mode.”
I had some trouble getting the app up and running: Every time I tried to test it, it claimed I didn’t speak at all. But after pairing AirPods with my iPhone and unpairing them, the app seemed to work. To test the app, I played the intro of this Audio University YouTube video, which is entirely dialogue-based. Once the app was working, it immediately started transcribing the video, with near-perfect accuracy, at least by the end. I would watch the app enter incorrect words, then retract and replace them as subsequent words provided context. Once the recording was finished, the transcript was nearly identical to the video’s transcript, save for a couple of quirks: It mistakenly thought “If this is our first time meeting” was “This is our first time meeting,” and recorded a single sentence twice. But other than that, this is a totally usable transcript of the beginning of the video.
From here, you have a number of options—especially if you invite Gemini to help. Off the bat, you can tap a pencil icon over the transcript to manually edit it, in case you want to correct any of the text the AI “polished” wrong. Above this, you can view “Usage stats,” including the number of words spoken, the words spoken per minute, and the number of edits the AI made. If you do switch on Gemini, you’ll have access to additional AI editing tools, including “Key Points,” “Formal,” “Short,” and “Long.” When you’re satisfied with the transcription, you can tap the copy button to move the text to your clipboard to paste elsewhere. In the “History” tab, you can view your previous transcriptions, and return to them to edit them (manually or with AI). In the “Dictionaries” tab, you can add obscure words that you frequently use but the AI might not pick up on, improving the accuracy of your recordings going forward.
In my brief testing, the app does work well, and I do appreciate the option to use it on-device only. I would definitely consider using it over iOS’ built-in transcriptions if it seemed quicker or more accurate, especially since there are some more robust features here—assuming that on-device really does mean keeping my data out of Google’s hands.
