More

russ · on Aug 27, 2024

It’s insanely fast.

Here’s an AI voice assistant I built that uses it:

https://cerebras.vercel.app

joshhug · on Sept 10, 2024

That was interesting. I asked it to try to say something in another language, and she read it in a thick American accent. No surprise. Then I asked her to sing, and she said something like "asterisk in a robotic singing voice asterisk...", and then later explained that she's just text to speech. Ah, ok, that's about what I expected.

But then I asked her to integrate sin(x) * e^x and got this bizarre answer that started out as speech sounds but then degenerated into chaos. Out of curiosity, why and how did she end up generating samples that sounded rather unlike speech?

Here's a recording: https://youtu.be/wWhxF7ybiAc

FWIW, I can get this behavior pretty consistently if I chat with her a while about her voice capabilities and then go into a math question.

wenc · on Aug 28, 2024

This is pretty amazing. It's fast enough to converse with, and I can interrupt the model.

The underlying model is not voice trained -- she says things like "asterisk one" (reading out point form) -- but this is a great preview for when ChatGPT GAs their Voice Mode.

unraveller · on Aug 28, 2024

Fantastic demo. Do you know what's the difference between your stack and the livekit demo? [1] it shows your voice as text so you can see when you have to correct it.

Llama3 with ears just dropped (direct voice token input) which should be awesome with cerebras [2]

[1]: https://kitt.livekit.io [2]: https://homebrew.ltd/blog/llama3-just-got-ears

bilater · on Aug 27, 2024

Nice! What are the other pieces of the stack you'r using?

bcherry · on Aug 27, 2024

LiveKit, Cartesia, Deepgram, and Vercel

bilater · on Aug 28, 2024

awesome - might try it out

innagadadavida · on Aug 28, 2024

Oh wow, this is insanely good. Are there any model details?

russ · on Aug 27, 2024

Here’s an AI voice assistant we built this weekend that uses it:

https://x.com/dsa/status/1828481132108873979?s=46&t=uB6padbn...

russ · on May 13, 2024

LiveKit Agents, which OpenAI uses in voice mode is also open source:

https://github.com/livekit/agents

russ · on Feb 21, 2024

We recently made it a lot easier to build your own KITT too: https://github.com/livekit/agents

russ · on Feb 22, 2024

But we don't handle interruptions yet, that's some cool stuff @yanyan_evie!

AmuVarma · on Feb 22, 2024

There are a lot of good VAD open source models that are easily configurable, and can be integrating in a day or 2 - checkout the silero vad model

russ · on Sept 5, 2023

100%. My dad was an electrical engineer in the 70s and 80s and ended up transitioning to sales, given that it paid much more.

infecto · on Sept 5, 2023

Is that not still true. I have many acquaintances with EE degrees that left to work in software because of the comparatively low salary.

russ · on Aug 14, 2023

Zoom does use a custom protocol. This is why it doesn’t work nearly as well when you take a call in the browser client. Not because WebRTC isn’t up to the task, but because Zoom hasn’t invested in it.

Ignoring costs, while having someone host infra for you will always be easier than managing it yourself, I think we’ve really improved the DX of hosting your own WebRTC infra with LiveKit: https://github.com/livekit/livekit

russ · on Aug 14, 2023

You can do that with LiveKit Egress: https://github.com/livekit/egress

russ · on Aug 14, 2023

HLS is a client-driven protocol, so it has high-scale but variable latency. You could build a mostly one-way webinar experience using it, but definitely not a conference call experience.

The primary issue with traditional WebRTC media servers and services is they didn’t horizontally scale. That’s changed recently. You can get pretty high numbers of users in a single WebRTC session now.

russ · on Aug 14, 2023

It really depends on the use case. In vanilla WebRTC, all media is transmitted directly between peers. In practice, this doesn’t scale beyond 5-10 users in a session. Most home internet connections can’t sustain that amount of upstream bandwidth.

russ · on July 4, 2023

There’s a big graveyard of products/companies that have tried to kill Larry over the years.

Meta has been trying since 2009. Back then, a former, well-known Facebook employee once told me not to join Twitter (thankfully, I didn’t listen). He said they, “have a wall at the office with a list of all the things Twitter does well. Every week someone checks another item off. We’re going to kill Twitter.”

This moment is probably Meta’s best chance. I’d say, if it doesn’t happen this try, it probably never will.

romwell · on July 4, 2023

This would not be Meta killing Twitter, as much as Twitter doing a seppuku and Meta walking over Twitter's corprse.

silverlyra · on July 4, 2023

except seppuku is considered an honorable death.

russ · on July 4, 2023

Twitter has been and still is, its own worst enemy.

HN For You