I just got it to install git and clone (the non-existent) repo https://github.com/openai/assistant, and am now browsing its own interpretation of a repo with a lot of Python code, including directories like “training”, “output”, “parsing”, and files with content like this:
import json
from collections import Counter
from typing import Any, Dict, List, Optional, Tuple

import numpy as np

from openai_secret_manager import get_secrets
from assistant.constants import MAX_OUTPUT_LENGTH
from assistant.utils.string_utils import strip_html
from assistant.utils.text_utils import split_text_into_lines


class Output:
    def __init__(
        self,
        generated_text: str,
        response: Optional[Dict[str, Any]] = None,
        score: Optional[float] = None,
    ):
        self.generated_text = generated_text
        self.response = response or {}
        self.score = score
On a side note it feels like each command takes longer to process than the previous - almost like it is re-doing everything for each command (and that is how it keeps state).
>On a side note it feels like each command takes longer to process than the previous - almost like it is re-doing everything for each command (and that is how it keeps state).
That's because it's probably redoing everything.
But that's probably to keep the implementation simple. They are probably just appending the new input and re-running the whole network.
The typical data dependency structure in a transformer architecture is the following:
output_t0   output_t1   output_t2   output_t3   | output_t4
feat_L4_t0  feat_L4_t1  feat_L4_t2  feat_L4_t3  | feat_L4_t4
feat_L3_t0  feat_L3_t1  feat_L3_t2  feat_L3_t3  | feat_L3_t4
feat_L2_t0  feat_L2_t1  feat_L2_t2  feat_L2_t3  | feat_L2_t4
feat_L1_t0  feat_L1_t1  feat_L1_t2  feat_L1_t3  | feat_L1_t4
input_t0    input_t1    input_t2    input_t3    | input_t4
The features at layer Li at time tj depend only on the features of layer L(i-1) at times t <= tj.
If you append a new input at the next time t4 and recompute everything from scratch, it doesn't change any feature values for times t < t4.
To compute the features and output at time t4, you need the values from all previous times at all layers.
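That prefix invariance is easy to check with a toy causal self-attention layer. This is a minimal NumPy sketch of my own (nothing from the actual model): appending a new timestep and recomputing from scratch leaves the features of all earlier timesteps unchanged.

```python
# Toy single-layer causal self-attention: recomputing over a longer
# sequence does not change the features of earlier positions.
import numpy as np

rng = np.random.default_rng(0)
d = 8
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

def causal_attention(x):
    # x: (T, d). Position t may attend only to positions <= t.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)                    # (T, T)
    future = np.triu(np.ones_like(scores), k=1)      # mask out t' > t
    scores = np.where(future.astype(bool), -np.inf, scores)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

x4 = rng.normal(size=(4, d))                         # inputs t0..t3
x5 = np.vstack([x4, rng.normal(size=(1, d))])        # append input t4

out4 = causal_attention(x4)
out5 = causal_attention(x5)
assert np.allclose(out4, out5[:4])                   # t0..t3 unchanged
```

This is exactly why naive "recompute everything" works at all: the appended timestep cannot alter any past feature.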
The alternative to recomputing would be to preserve the previously generated features and incrementally build the last chunk by stitching it onto the previous features. If you have your AI assistant running locally, that's something you can do, but when you are serving plenty of different sessions, you will quickly run out of memory.
With simple transformers, the time horizon used to be limited because the attention was scaling quadratically (in compute), but they are probably using an attention that scales in O(n*log(n)), something like the Reformer, which allows them to handle very long sequences cheaply and probably explains the boost in performance compared to previous GPTs.
GPT-3 cannot run on a hobbyist-level GPU yet. That's the difference (compared to Stable Diffusion, which could run on a 2070 even with a not-so-carefully-written PyTorch implementation), and the reason why I believe that while ChatGPT is awesome and has made more people aware of what LLMs can do today, this is not a moment like what happened with diffusion models.
What makes you say this? Rerunning the whole thing, which it appears they're doing, avoids the need to hold onto state, so memory is not used. In other words, they're not having this problem because they're not doing it that way.
If you generate only a single timestep, you can recompute layer by layer during inference; you don't need to preserve the features of the previous layers, since each layer depends only on the layer immediately below. So your memory needs don't depend on the number of layers.
But in a standard transformer architecture you typically generate multiple timesteps by feeding each output back in as the input for the next timestep, so you need to preserve all the features to avoid recomputing them at every timestep. Then your memory again depends on the number of layers of your network.
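The caching trade-off can be sketched in a few lines (a toy of my own, not production code): keep the per-layer keys/values for the past, so each new timestep computes only one position, at the cost of memory that grows with sequence length times number of layers per session.

```python
# Toy incremental decoding with a per-layer key/value cache: each step
# computes only the newest position, but the cache grows with t * layers.
import numpy as np

rng = np.random.default_rng(1)
d, n_layers = 8, 3
# Scale weights by 1/sqrt(d) to keep activations well-behaved.
params = [tuple(rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
          for _ in range(n_layers)]

def step(x_t, cache):
    """Advance one timestep. cache[i] holds (K, V) for layer i."""
    h = x_t                                        # (1, d)
    for i, (Wq, Wk, Wv) in enumerate(params):
        q, k, v = h @ Wq, h @ Wk, h @ Wv
        K = np.vstack([cache[i][0], k]) if cache[i] else k
        V = np.vstack([cache[i][1], v]) if cache[i] else v
        cache[i] = (K, V)                          # memory grows with t
        s = q @ K.T / np.sqrt(d)                   # (1, t)
        w = np.exp(s - s.max())
        h = (w / w.sum()) @ V                      # attend over all past
    return h, cache

cache = [None] * n_layers
for t in range(5):                                 # generate 5 timesteps
    out, cache = step(rng.normal(size=(1, d)), cache)
# After 5 steps the cache holds 5 keys/values in every layer:
assert all(K.shape == (5, d) for K, V in cache)
```

Multiply that cache by thousands of concurrent sessions and the memory pressure the parent comment describes becomes clear.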
But if you are memory constrained, you can modify your architecture a little (and the training procedure) to put yourself back in the first situation, where you only generate a single timestep: use the transformer to extract a fixed-size context vector per layer summarizing all of the past (including your most recent input prompt), and use another transformer to generate the words in sequence based on that context vector.
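A hypothetical sketch of that memory-bounded idea follows. The comment suggests a learned extraction; here a simple running mean stands in for it, purely to illustrate why memory per session stays fixed regardless of sequence length.

```python
# Hypothetical sketch: compress all past features at each layer into one
# fixed-size context vector per layer (here via a running mean, standing
# in for a learned extractor), so per-session memory is O(layers * d).
import numpy as np

d, n_layers = 8, 3

def update_context(context, new_feats, t):
    """context: (n_layers, d) running summaries of timesteps 1..t-1.
    new_feats: (n_layers, d) features of the newest timestep.
    A running mean keeps the summary the same size at every t."""
    return context + (new_feats - context) / t

rng = np.random.default_rng(2)
context = np.zeros((n_layers, d))
for t in range(1, 101):                 # 100 timesteps, constant memory
    context = update_context(context, rng.normal(size=(n_layers, d)), t)

assert context.shape == (n_layers, d)   # size independent of sequence length
```

A decoder conditioned only on `context` would then generate the next chunk without ever touching the full per-timestep feature history.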
In my experience, you can get it to change its mind by troubleshooting the connectivity issues. E.g. if you use dig to get the IP and then tell curl to use that IP instead of doing a DNS lookup, it works for me.
I did `curl icanhazip.com` and it spit out the "local" private IP. I told ChatGPT that icanhazip would never do that, and it revised the answer to 37.48.80.166, which is an IP owned by LeaseWeb.
OK, fair enough! But it would be interesting to add a link to the real Internet in the next release. Sadly, the model’s global state is not immediately updated; there are snapshots… but I think it would be interesting to watch it conversing in real time here on Hacker News.
Why do you think this? I don't think there's any reason it would be able to reproduce its own code. It's never seen it so it's not in the weights, and it doesn't have that type of reflection so it can't look it up dynamically.
ChatGPT output:
"I am not sure which specific programming languages or libraries were used to train my language model, as I do not have access to that information. Language models are typically trained using a combination of various programming languages and tools, and the specific technologies that are used can vary depending on the specific model and the research team that developed it. I am a large language model trained by OpenAI, and I use artificial intelligence (AI) and natural language processing (NLP) techniques to generate responses to text-based queries."
Can’t help you with the keys or ID (yet), but I exclusively use the stored cards on my Apple Watch for payment. It is so reliable (in Norway) that I haven’t brought my wallet on normal days in 2+ years.
Even on vacation in Northern Europe (Belgium, Netherlands, France, Germany) and on a business trip to the US (California+Texas) this year, I very rarely had to use the physical cards. NFC just works. Everywhere.
I still bring the cards on important occasions or when going further than a normal drive, though - a testament to the fact that the day you’re longing for is not _quite_ here yet.
This one? I remember it as an earnest description of the difficulties the WSL team had with the speed of NTFS - and I think it was one of the reasons for the switch to virtualisation in WSL2.
My takeaway from that comment is that there are some important performance issues that apply generally to all filesystems on Windows. Maybe we can partially test whether that's the case by playing with WSL1 on ReFS, ExFAT (if that's even supported, given its limited permissions support), or ZFS, once OpenZFS on Windows stabilizes a bit.
When it comes to the problem of targeting, one interesting and promising technology is photochemical internalisation [https://en.m.wikipedia.org/wiki/Photochemical_internalizatio...], where you put the mRNA inside photosensitive molecules (rather than lipids) and then shine light on the tissue/organ where you want the mRNA delivered. Where activated by the light, the molecules enter the cells, dissolve, and deliver the mRNA.
The Norwegian company PCI Biotech has a tech they call fimaNAc for doing this with naked mRNA.
What happens to the unactivated mRNA in that case? I was under the impression that it was typically the actual use of the mRNA by the ribosomes that broke it down, but I could be off base there.
Sure, it isn't destroyed immediately and the mRNA is used multiple times, but that's still ultimately the ribosomes damaging the mRNA through use. My question is about how it breaks down if there aren't ribosomes involved (i.e. if the capsules above aren't opened because they weren't exposed to the light trigger).
I've had 920, 930 and I'm now using a 950 as my main phone: The hardware is superb, and I really like the OS (running Insider Preview Slow Ring). There has been a steady stream of Windows 10 Mobile Insider Preview updates throughout the last year, bringing both new features and stability.
In user interaction and interface consistency it is now much closer to iOS (or what iOS tries to be) than Android is.
That being said, it is pretty obvious it is a minuscule platform; apps are often lagging behind their iOS/Android counterparts, and there are some obvious ones missing (like Snapchat and Pokemon Go).
It is kind of sad, really; I think it would be healthy to have more than two major players, and Windows 10 users would probably feel quite at home in Windows 10 Mobile.
To the people suggesting ELK: I just want to ask if you have actually used it in production. Like for real bughunting and investigating support requests?
As much as we absolutely love ElasticSearch for our other indexing needs, we find it quite hard to get the LK part of the stack to deliver as promised. Kibana may serve up nice graphs and charts, but when you need to drill down into a large amount of log data, we often feel like we're losing both overview _and_ detail.
It might very well be that we are to blame, and that we are just doing it wrong (tm) - but I would love to hear how other people are leveraging the ELK stack in production environments?
We use ElasticSearch and Kibana in production for real bughunting and support requests. Logstash was too frustrating to deal with so we wrote our own simple wrapper around an open source ElasticSearch client library to log ourselves.
We log every request (everything but the body, usually) and response. If an error occurs, it's logged as part of the request. We can practically replay actions taken by users and easily drill down to the exact requests pertaining to an error.
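A minimal sketch of that kind of wrapper (my own illustration, not the commenter's actual code): build one document per request/response pair, attach any error to the same document, and ship it to ElasticSearch. The field names and index name here are made up; the `es.index(...)` call is left as a comment since it needs a running cluster.

```python
# Sketch of a per-request log document: the error travels with its
# request, so drilling down to failing requests is a single query.
import datetime
import json

def build_log_doc(method, path, status, duration_ms, error=None):
    doc = {
        "@timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "method": method,
        "path": path,                  # body deliberately omitted
        "status": status,
        "duration_ms": duration_ms,
    }
    if error is not None:
        doc["error"] = repr(error)     # attach the error to the request
    return doc

doc = build_log_doc("GET", "/api/users/42", 500, 12.3,
                    error=KeyError("user_id"))
print(json.dumps(doc, indent=2))

# With the official Python client this would be shipped with something
# like (hypothetical index name):
#   es.index(index="requests-2016.01", document=doc)
```

Keeping the document flat like this also makes Kibana filtering (`status:500 AND path:"/api/users/42"`) straightforward.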