Not complaining too loudly because the improvement is magical, but trying to stay on top of model cards and knowing which one to use for specific cases is a bit tedious.
I think the end game is a decent local model that does 80% of the work, and that also knows when to call the cloud, and which models to call.
Yeah, mapping Chinese characters into the linear UTF-8 byte space throws a lot of information away. Each language brings some ideas for text processing: the SentencePiece inventor is Japanese, for example, and Japanese doesn't have explicit word delimiters.
It's not throwing any information away, because the text can be faithfully reconstructed (via an admittedly arduous process); no entropy has been lost if you consider the sum of both the "input bytes" and "knowledge of UTF-8 encoding/decoding".
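A quick round trip in Python makes the point (standard library only):

    text = "漢字"                          # two Chinese characters
    data = text.encode("utf-8")            # each becomes 3 bytes, 6 bytes total
    print(list(data))                      # [230, 188, 162, 229, 173, 151]
    assert data.decode("utf-8") == text    # decoding recovers the original exactly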
I've seen many comments that they are great for OCR stuff, and in my use case of receipt photo processing it does do better than ChatGPT, Claude, or Grok.
Maybe that's an apt analogy in more ways than one, given the recent research out of MIT on AI's impact on the brain, and previous findings about GPS use deteriorating navigation skills:
> The narrative synthesis presented negative associations between GPS use and performance in environmental knowledge and self-reported sense of direction measures and a positive association with wayfinding. When considering quantitative data, results revealed a negative effect of GPS use on environmental knowledge (r = −.18 [95% CI: −.28, −.08]) and sense of direction (r = −.25 [95% CI: −.39, −.12]) and a positive yet not significant effect on wayfinding (r = .07 [95% CI: −.28, .41]).
The OpenAI cookbook says LLMs understand XML better than Markdown text, so maybe that's a factor too? Although XML is more specified and structured, which HTML isn't.
> OpenAI cookbook says LLMs understand XML better than Markdown text.
Yes, for prompts. Given how little XML is out on the public internet, it'd be surprising if it also applied to data ingestion from web-scraping functions. It'd be odd if Markdown worked better than HTML, to be honest, but maybe Markdown also changes the content being served, e.g. there's no menu, header, or footer sent with the body content.
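For context, the prompt-side advice is about wrapping sections in explicit XML-ish tags. A minimal Python sketch (the tag names here are arbitrary, not anything OpenAI prescribes):

    # Made-up tag names; the point is explicit, unambiguous section boundaries.
    page_text = "Example body text extracted by a scraper."
    prompt = (
        "<instructions>Summarize the document in one sentence.</instructions>\n"
        f"<document>{page_text}</document>"
    )
    print(prompt)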
I also had a compiler-related description come to mind after using Copilot. It lets you partially generate imperative code declaratively, by writing a comment like
    // now I will get rows X, Y, Z from ContentsProvider
then tab-tab to complete. You can then even tweak the generated code, very useful!
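Something like this, as a self-contained Python sketch; the ContentsProvider stub and its data are made up just so it runs, Copilot completes against whatever API your codebase actually has:

    class ContentsProvider:
        # Hypothetical stand-in for the real data source, only here so the sketch runs.
        def __init__(self):
            self._rows = [{"X": 1, "Y": 2, "Z": 3}, {"X": 4, "Y": 5, "Z": 6}]

        def query(self, columns):
            return [{c: row[c] for c in columns} for row in self._rows]

    provider = ContentsProvider()

    # now I will get rows X, Y, Z from ContentsProvider
    rows = provider.query(columns=["X", "Y", "Z"])  # the kind of line tab-tab fills in
    print(rows)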
You need proof of layoff (離職票) to collect your unemployment benefits. It is illegal not to issue one, but the company can still cause you some pain in issuing it.
Here we have an interesting pattern where childhood is actually a brittle thing:
If you have a good childhood, you grow up into a well-integrated adult with no issues. But if you have a bad childhood, not only is the process disrupted for you, it also cascades into your family, in this case siblings. That's because childhood is so social, and often the bad kind of social.
Many non-neurotypical (or even physically disabled) children could grow up into integrated adults; it would just take somewhat more time. However, they often don't get that time, as they are isolated and sidelined by the social structure.
Adults don't care about each other or meddle in each other's affairs, but children do. That's often deleterious.
It's probably more training-compute intensive, but they can do dropout, right?
That's the strategy they used for ImageNet recognition, back when they were using supervised learning and training data was scarce.
Dropout is one strategy for regularization but doesn't guarantee avoiding overfitting, especially now that modern AI models generalize much better than they did during the ImageNet days. Many of the big LLMs use a dropout of 0.1 though.
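For reference, this is roughly how dropout shows up in a PyTorch model (a minimal sketch; the layer sizes are arbitrary):

    import torch
    import torch.nn as nn

    # Tiny MLP with dropout between layers; p=0.1 matches the rate mentioned above.
    model = nn.Sequential(
        nn.Linear(64, 128),
        nn.ReLU(),
        nn.Dropout(p=0.1),  # randomly zeroes 10% of activations during training
        nn.Linear(128, 10),
    )

    x = torch.randn(8, 64)
    model.train()        # dropout active: some activations are zeroed
    y_train = model(x)
    model.eval()         # dropout becomes a no-op at inference time
    y_eval = model(x)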