Noted Warriors critic Charles Barkley provided a sobering assessment of Golden State for the 2024-25 season.
July 2024
Microsoft Won’t Let You Use Its New AI Voice Tool
It’s no secret that AI is getting pretty darn realistic: Companies like OpenAI are making tools that can replicate images, audio, and videos in ways that are becoming increasingly difficult to identify as such on the fly. But while it’s bad enough that some of these programs are available to the public already, it’s concerning to hear about a tool that’s so good, it’s being kept from the rest of us.
Vall-E 2 can steal your voice
As reported by TechSpot, Microsoft has created a new version of its “neural codec language model,” Vall-E, appropriately now called Vall-E 2. Microsoft detailed Vall-E 2’s advances in a blog post, highlighting some key milestones with this latest model. Chiefly, Vall-E 2 achieves “human parity,” which seems to be a fancy way of saying, “Our model’s outputs sound like real humans.” Be afraid.
Vall-E 2 apparently achieves two key enhancements over Vall-E. First, the new model doesn’t have the “infinite loop” issue the original had when processing repeating tokens: It accounts for repeating tokens, and is thus able to decode a sample that contains them. Second, Vall-E 2 shortens the length of a given sequence by grouping codec codes, which Microsoft says both increases inference speed and sidesteps issues that arise from modeling long sequences.
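To make the grouping idea a bit more concrete, here is a minimal sketch in Python. It is not Microsoft’s implementation: the function name `group_codes`, the padding value, the group size, and the sample codes are all made up for illustration. It only shows why packing codes into groups shortens the sequence a model has to step through.

```python
from typing import List, Sequence


def group_codes(codes: Sequence[int], group_size: int, pad: int = 0) -> List[List[int]]:
    """Pack a flat sequence of codec codes into fixed-size groups.

    Hypothetical helper for illustration only; the padding value and
    group size are assumptions, not details from Microsoft's post.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    padded = list(codes)
    remainder = len(padded) % group_size
    if remainder:
        # Pad the tail so the last group is full.
        padded.extend([pad] * (group_size - remainder))
    # Each group becomes a single step in the shortened sequence.
    return [padded[i:i + group_size] for i in range(0, len(padded), group_size)]


# Made-up codec codes, including a run of repeated values.
codes = [101, 57, 57, 57, 23, 908, 14, 14, 300, 12]
grouped = group_codes(codes, group_size=2)
print(len(codes), "->", len(grouped))  # 10 -> 5
```

With a group size of two, the model would take half as many steps over the same audio, which is the intuition behind the claimed speedup on long sequences.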
If that’s all a bit technical, perhaps this won’t be: Vall-E 2 improves upon Vall-E in “speech robustness, naturalness, and speaker similarity,” and, according to Microsoft, is the first of its class to achieve human parity in these categories. In fact, the company says, “VALL-E 2 can generate accurate, natural speech in the exact voice of the original speaker, comparable to human performance.”
It’s not just theory
You don’t just have to read about Vall-E 2 to believe how good it is: Microsoft offers examples of how Vall-E 2 can take a sample recording of a voice, and replicate it when prompted with new text. The company also provided examples of the model completing a sentence after being given segments of a sample recording, in three, five, and 10-second chunks. This demonstrates the model’s ability to take a very short example of a voice, and replicate it with text that doesn’t appear in the original sample recording.
There are still plenty of the quirks you’d expect to find with any text-to-speech model (incorrect pronunciations, stuttered speech, etc.), but there’s no doubt that the Vall-E 2 examples are not only often realistic, but match the voice of the original sample quite closely. It does especially well when given a longer recording of a voice: If given three seconds of a recording, the output is still impressive, but when given a five- or, especially, a 10-second recording, the output can be remarkably realistic.
If you click through the examples yourself, check out how well Vall-E 2 matches the 10-second recording when reciting “My life has changed a lot” under “VCTK Samples.” I don’t have any experience with training AI systems, but to my ear, the model nails the raspy voice of the speaker in the sample, especially after receiving the full 10-second clip. It’s jarring to hear the original speaker reading a certain sentence, then hear the model speak a new sentence in a voice that essentially matches the speaker’s.
Vall-E 2’s risks
But if you’re a bit freaked out by this whole thing, you aren’t alone. Microsoft is aware its model could be dangerous if used maliciously: In an ethics statement at the bottom of the post, the company acknowledges that, while Vall-E 2 could be used for a variety of positive tasks, it could also be used to impersonate a specific person. Microsoft says the model is meant to be used with consenting users who understand their voice is being replicated, and that the model should have a protocol to check for consent before processing a request. That said, it doesn’t seem like such a protocol actually exists right now, which is likely why Microsoft currently has “no plans to incorporate VALL-E 2 into a product or expand access to the public.”
The examples here are based on voice samples from the LibriSpeech and VCTK datasets, not on samples Microsoft recorded itself. As such, as an outside observer, I can’t tell how this model would actually perform if given recordings of, say, President Biden, Elon Musk, or your boss. However, if we assume that Vall-E 2 can generate a realistic output when given a 10-second sample, imagine how realistic its output could be when fed hours of samples. Couple that with a solid AI video model, and you have the perfect storm for generating misinformation, just in time for election seasons across the globe.
Twins’ Willi Castro named to AL All-Star team as replacement for Jose Altuve
Minnesota Twins utilityman Willi Castro has played at least 20 games at five positions this season.
44% Of Americans Reach ‘Super Saver Status’ For Their Retirement Savings — Are You Among Them?
Inside MLB’s inaugural All-Star skills competition: Who will compete, how it will work and more
Inspired in part by a video game, the MLB Futures Skills Showcase will debut Saturday in Texas.
Stephen A. Smith fires back at Jaylen Brown after Team USA snub
Stephen A. Smith saw an opportunity to take a victory lap after Jaylen Brown wasn’t chosen as Kawhi Leonard’s replacement on Team USA.
Putter change lifts Justin Thomas to first-round lead at Scottish Open
A new putter helped put Justin Thomas in what used to be a familiar position. Atop the leaderboard.
Biden-Harris Administration Invests $110 Million in Meat and Poultry Processing to Strengthen Food Supply Chain, Increase Competition, and Lower Food Costs
WASHINGTON, July 11, 2024 – U.S. Department of Agriculture (USDA) Secretary Tom Vilsack announced today that the Biden-Harris Administration is making investments that will strengthen American food supply chains, increase independent meat and poultry processing capacity, create more, new and better markets for producers, and lower food costs.
Dodgers’ lackluster performance vs. Phillies provides sobering reminder they’re no longer NL favorites
With a sweep of L.A., Philadelphia showed that there’s a legitimate gap between these playoff contenders.