Hacker News | umaar's comments

Hey HN, have seen a lot of AI demos recently using libraries/frameworks for what is effectively an API call - so thought to showcase how this could be done with vanilla JavaScript. Turns out response.body is an async iterable, so you can do something like:

  for await (const chunk of response.body) {
    // use chunk
  }
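
A fuller sketch of the same pattern, runnable in Node 18+ where Response is global; a locally constructed Response stands in for a real fetch() result so no network call is needed:

```javascript
// response.body is an async iterable of Uint8Array chunks, so no SDK
// is needed to consume a streamed API response.
async function readStream(response) {
  const decoder = new TextDecoder();
  let text = '';
  for await (const chunk of response.body) {
    text += decoder.decode(chunk, { stream: true });
  }
  return text;
}

// A Response built locally stands in for a fetch() result here.
readStream(new Response('streamed hello')).then(text => {
  console.log(text); // → streamed hello
});
```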


This blog post shows S3 Glacier Deep Archive costing $2.05 USD for 2 TB, but the AWS pricing page shows 2,000 GB * $0.0018 per GB = $3.60?

Edit: the $0.0018 rate is London pricing, which I guess is a bit higher.
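
A back-of-envelope check of the two figures: the $0.0018/GB-month rate is the London (eu-west-2) one quoted above, while the ~$0.00099/GB-month us-east-1 rate is my assumption and roughly matches the blog post's number:

```javascript
// Rates are USD per GB-month; region pricing differs, which explains
// the gap between the two totals.
const gb = 2000;
const usEast = gb * 0.00099; // assumed us-east-1 Deep Archive rate
const london = gb * 0.0018;  // the rate quoted above
console.log(usEast.toFixed(2)); // → 1.98 (close to the post's $2.05)
console.log(london.toFixed(2)); // → 3.60
```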


This is a great project and it was fun playing around with it.

When Meta started discussing this in 2021, the HN sentiment [1] was quite... cautious, drawing connections to how Meta can't be trusted with kids' drawings. Has something changed?

[1] https://news.ycombinator.com/item?id=29580619


We've moved on to a different moral panic


I mean it gets exhausting. We had Covid (and Q), Musk taking over Twitter, war in Ukraine, fear of war in Taiwan, ... accelerated climate change and so on. There are real limits to how many things one can be upset about.


The terms of use seem to remain the same, and the demo page's privacy policy points to the common Facebook privacy policy.

I hope the terms are limited to the demo and there's no telemetry within the source.


If I had to guess it's because this one is unbranded and released as open source


I have stuck with cheap VPSes for as long as I can remember. It takes 5 minutes to deploy a full-stack Node.js app along with a database, and I've yet to exhaust the resources on my VPS, even with all my side projects (production-grade and hobby stuff).

I've always found it weird how so many Heroku-style hosting providers charge _per app_; things get costly quickly when you have lots of small apps like I do.

Just yesterday I realised I'll need a database to store job queues for https://chatbling.net - ChatGPT helped me figure out how to install it, have it persist to disk, ensure it starts up with the system etc. It's nice to not be fearful of unexpected charges hitting my card.

To anyone reading: even if it's just for learning, every now and then skip vercel/fly.io/netlify/planetscale/upstash/render/railway or whatever cool hosting provider is out there, and give a $5 VPS a try.


I think I want to do the same. Can you describe your stack please? How much downtime do you get and how do you deal with app updates and system updates to the vps machine itself? What about monitoring?


These PaaS cost an arm and a leg, and each one has its own DSL you have to learn.

Easiest is to just provision your own VPS and run docker-compose or k8s.
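
For what it's worth, the docker-compose setup mentioned above can be as small as this; every image name, port, and credential here is a placeholder, not something from the thread:

```yaml
# Hypothetical docker-compose.yml: a small Node.js app plus Postgres on a VPS.
services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      DATABASE_URL: postgres://app:secret@db:5432/app
    depends_on:
      - db
    restart: unless-stopped
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: app
    volumes:
      - db-data:/var/lib/postgresql/data
    restart: unless-stopped
volumes:
  db-data:
```

`docker compose up -d` brings both containers up, and `restart: unless-stopped` keeps them running across reboots.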


Who needs a DSL when you can just have a massively nested undocumented yaml file?


Hey HN, I made Chat Bling, an AI chatbot that works with the official WhatsApp APIs. You can:

- Chat with ChatGPT directly from WhatsApp.

- Generate pretty cool images (see the home page for screenshots).

- Transcribe your WhatsApp voice messages - speech to text using Whisper.

No signup needed. Thank you, and let me know if you have any questions or feedback.


Made something similar recently, but for WhatsApp: https://chatbling.net/

What behaviour would users prefer when uploading a voice message, a) the voice message is transcribed, so speech to text? Or b) the voice message is treated as a query, so you receive a text answer to your voice query?

I've done a) for now as mobile devices already let you type with your voice.


I'd quite like a Twilio script I could host that enables voice-to-voice with ChatGPT over a phone call, but for messaging apps (I'm gonna try yours, though I'd prefer Signal) I'd personally prefer to stick with typing and use Apple's transcription (the default microphone on the iOS keyboard) for any voice stuff - still wanting text back.

This is (in addition to the fact that Apple's works pretty well for me) mostly because that way I get to see the words appear as I'm speaking, and can fix any problems in real-time rather than waiting until I've finished leaving a voice note to find out it messed up. Bing AI chat, for example, trying to use their microphone button just leads to frustration as it regularly fails to understand me. But maybe Whisper is so good that I'd hardly ever need to care about errors?

I do suspect I'm an outlier in terms of how I use dictation, checking as I go - at least based on family members, they seem to either speak a sentence then look at it, or speak and then send without looking - so for them, off-device transcription would probably be welcome as long as it even slightly improves accuracy rates.


I see my server has restarted a few times! I imagine it's folks here, since I haven't shared Chat Bling elsewhere yet. Sorry to anyone who started generating images but hasn't received a response. The 'jobs' for image generations are stored entirely in memory, so a server restart loses all of them.

Going forward, I'll explore storing image jobs in Redis or something, which will be more resilient to server crashes.

As for conversation history, I'll continue to keep that in memory for now (messages are evicted after a short time period, or if messages consume too many OpenAI tokens) - even that's lost during a server restart/crash. Feels like quite a big decision to store sensitive chat history in a persistent database, from a privacy standpoint.
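
The eviction policy described above (drop messages that are too old or that blow the token budget) can be sketched roughly like this; the TTL, budget, and chars-per-token ratio are illustrative assumptions:

```javascript
const MAX_AGE_MS = 10 * 60 * 1000; // assumption: 10-minute TTL
const MAX_TOKENS = 1000;           // assumption: per-chat token budget

function evict(history, now = Date.now()) {
  // Drop expired messages first.
  const kept = history.filter(m => now - m.at < MAX_AGE_MS);
  // Then drop the oldest messages until the token budget fits.
  const tokens = m => Math.ceil(m.text.length / 4); // rough estimate
  while (kept.reduce((n, m) => n + tokens(m), 0) > MAX_TOKENS) kept.shift();
  return kept;
}

const now = Date.now();
const history = [
  { text: 'old message', at: now - 11 * 60 * 1000 }, // past the TTL
  { text: 'recent message', at: now - 1000 },
];
console.log(evict(history, now).length); // → 1
```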


You could have a default "will be wiped after <x time>" policy / notification up front, plus an option to change this (in either direction, one way to "only store this in RAM not the DB, and wipe it as soon as I close this window - or maybe after an hour of inactivity", the other way to "please never delete (we reserve the right to delete anyway but will keep for at least Y days/months/whatever)". And also a "delete now" button to override. And then a cron job checking what's due to be deleted and wiping them from the DB/memory?

Of course, it maybe also adds more pressure to keep the server more secure without private conversations being accessible after a reboot...
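
The retention-policy-plus-cron idea above might look something like this; the policy names and durations are invented for illustration:

```javascript
// Each conversation gets an expiresAt from the user's chosen policy;
// a periodic sweep (the "cron job") deletes whatever is due.
const POLICIES = {
  ephemeral: 60 * 60 * 1000,          // wipe after an hour
  week: 7 * 24 * 60 * 60 * 1000,      // keep for a week
};

function withExpiry(convo, policy, now = Date.now()) {
  const ttl = policy === 'never' ? Infinity : POLICIES[policy];
  return { ...convo, expiresAt: now + ttl };
}

function sweep(convos, now = Date.now()) {
  return convos.filter(c => c.expiresAt > now); // keep only unexpired
}

const now = Date.now();
const convos = [
  withExpiry({ id: 1 }, 'ephemeral', now - 2 * 60 * 60 * 1000), // expired
  withExpiry({ id: 2 }, 'week', now),
];
console.log(sweep(convos, now).map(c => c.id)); // → [ 2 ]
```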


Agreed, giving the user a choice would be best here. Something tells me most users would not change it from whatever the default is, but yeah still good to expose this as a setting which should be doable. Thanks for the input!


Np - and you're probably right that I'm in the minority of people who'd care about having as much granular control as possible... maybe most people would rather something closer to a browser's privacy mode, so just a toggle on and off between very private and don't care about private?


How did you get Meta to approve? Been trying for so long.


This is very cool. I tested it with a quick reminder request and it seemed to work. I'm a bit terrified by the privacy issue though. Combining OpenAI with WhatsApp seems like a marriage made in hell.

I guess the only solution will be to move to local bots and models on the phone which will interface out only when needed.


dude how did you get Meta to approve your WA Business? I couldn't get verified after like two weeks of trying and gave up :(


Incredible results to my questions. Do these work by finding similar pieces of text from a vector DB, and then embedding those similar pieces of text in the prompt? The answers I'm getting seem to be comprehensive, as if it has considered large amounts of book text, curious how this works as there's an OpenAI token limit. I've heard this is what tools like langchain can help with, so maybe I should play around with that as this all seems like a mystery to me.
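
The retrieval pattern asked about above is usually: embed the question, rank pre-embedded snippets by cosine similarity, and paste only the top few into the prompt to stay under the token limit. A toy sketch, with made-up 3-dimensional "embeddings" standing in for real ones:

```javascript
function cosine(a, b) {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = v => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Rank stored snippets against the query vector, keep the top k.
function topSnippets(queryVec, snippets, k) {
  return snippets
    .map(s => ({ ...s, score: cosine(queryVec, s.vec) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

const snippets = [
  { text: 'Chapter on habits...', vec: [0.9, 0.1, 0.0] },
  { text: 'Chapter on sleep...',  vec: [0.1, 0.9, 0.0] },
];

const best = topSnippets([0.8, 0.2, 0.1], snippets, 1);
// Only the most relevant snippet goes into the prompt.
const prompt = `Answer using only this context:\n${best.map(s => s.text).join('\n')}\n\nQ: ...`;
console.log(best[0].text); // → Chapter on habits...
```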


For more context, hn post by maker here: https://news.ycombinator.com/item?id=34635338


I remember using Splunk (where the author used to work) at a previous job.

In the spirit of sharing What I learned at X, here's What I learned at Shazam:

Part 1: https://umaar.com/blog/lessons-learned-from-working-at-shaza...

Part 2: https://umaar.com/blog/lessons-learned-from-working-at-shaza...


Great posts! More engineers should read and write such things :)


Thank you!


Wouldn't this approach be quite brittle? For example, where would one define snippet boundaries - isn't it possible that extracting a snippet at arbitrary points may change the information within that snippet?

But then you have the issue of GPT3 token limits, so you're limited in how many of these relevant snippets you can embed into a prompt. Wondering if there's a better way to go about this (for your first example, rather than OPs use case).
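
One common mitigation for the boundary problem raised above is to make chunks overlap, so a fact cut at one chunk's edge still appears whole in its neighbour; the sizes here are arbitrary assumptions:

```javascript
// Split text into fixed-size chunks where each chunk repeats the last
// `overlap` characters of the previous one.
function chunkWithOverlap(text, size = 20, overlap = 5) {
  const chunks = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}

const chunks = chunkWithOverlap('a'.repeat(50));
console.log(chunks.length); // → 3
```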


It works surprisingly well, and you can see examples if you look up the documentation of GPT-Index or Langchain (both are libraries designed to enable these types of use-cases, among others). Also, you can get fancy: for instance, you can have GPT3 (or any LLM) create multiple "layers" of snippets (snippets of the actual text, then summaries of a section, then summaries of a chapter, and embed all those and pull in the relevant pieces). Or, you can go back and forth with the prompt multiple times to give/get more information.

I'm sure the techniques will evolve over time, but for now, these sorts of patterns (pre-index, then augmenting the prompt at query-time) seem to work best for feeding information/context into the model that it doesn't know about. The other broad family of techniques is around trying to train the model with your custom information ("fine-tuning", etc), but I think most practitioners will agree that's currently less effective for these sorts of use-cases. (Disclaimer: I'm not an expert by any means, but I've played around with both techniques and try to keep up-to-date on what the experts are saying).


Excited to see what comes of it. Lots of people will have a private corpus, and the idea that we can semantically query it sounds so interesting.

Like asking 'what streaming services am I paying for and how much have I spent on them to date?', and having some tool go over your bank statements to pick out Spotify, Netflix, etc. I could see that being useful.

https://simonwillison.net/2023/Jan/13/semantic-search-answer...


Umar Hansa: https://www.youtube.com/@UmarHansa

Made a few videos to help front-end web developers, such as how to use DevTools, a bit about web performance, and other related things.

