Incredibly fast, on my 5090 with CUDA 13 (& the latest diffusers, xformers, transformers, etc...), 9 samplig steps and the "Tongyi-MAI/Z-Image-Turbo" model I get:
It stays around 26Gb at 512x512. I still haven't profiled the execution or looked much into the details of the architecture but I would assume it trades off memory for speed by creating caches for each inference step
Did you use PyTorch Native or Diffusers Inference? I couldn't get the former working yet so I used Diffusers, but it's terribly slow on my 4080 (4 min/image). Trying again with PyTorch now, seems like Diffusers is expected to be slow.
Uh, not sure? I downloaded the portable build of ComfyUI and ran the CUDA-specific batch file it comes with.
(I'm not used to using Windows and I don't know how to do anything complicated on that OS. Unfortunately, the computer with the big GPU also runs Windows.)
Unless you know and trust person X, you don't want to authorize and interact with such contracts. Scammers will leave loopholes in code so they can, for example, grab all funds deposited to the contract.
Normal contracts that involve money operations would have safeguards that disallow the owner to touch balance that is not theirs. But there's billion of creative attack vectors to bypass that, either by that person X, or any 3rd party
The end effect certainly gives off "understanding" vibe. Even if method of achieving it is different. The commenter obviously didn't mean the way human brain understands
Idk, Sonnet 4.5 score better than Sonnet 4.0 on that benchmark, but is markedly worse in my usage. The utility of the benchmark is fading as it is gamed.
Maybe if you confirm to its expectations for how you use it. 4.5 is absolutely terrible for following directions, thinks it knows better than you, and will gaslight you until specifically called out on its mistake.
I have scripted prompts for long duration automated coding workflows of the fire and forget, issue description -> pull request variety. Sonnet 4 does better than you’d expect: it generates high quality mergable code about half the time. Sonnet 4.5 fails literally every time.
I'm very happy with it TBH, it has some things that annoy me a little bit:
- slower compared to other models that will also do the job just fine (but excels at more complex tasks),
- it's very insistent on creating loads of .MD files with overly verbose documentation on what it just did (not really what I ask it to do),
- it actually deleted a file twice and went "oops, I accidentaly deleted the file, let me see if I can restore it!", I haven't seen this happen with any other agent. The task wasn't even remotely about removing anything
The last point is how it usually fails in my testing, fwiw. It usually ends up borking something up, and rather than back out and fix it, it does a 'git restore' on the file - wiping out thousands of lines of unrelated, unstaged code. It then somehow thinks it can recover this code by looking in the git history (??).
And yes, I have hooks to disable 'git reset', 'git checkout', etc., and warn the model not to use these commands and why. So it writes them to a bash script and calls that to circumvent the hook, successfully shooting itself in the foot.
Sonnet 4.5 will not follow directions. Because of this, you can't prevent it like you could with earlier models from doing something that destroys the worktree state. For longer-running tasks the probability of it doing this at some point approaches 100%.
> The last point is how it usually fails in my testing, fwiw. It usually ends up borking something up, and rather than back out and fix it, it does a 'git restore' on the file - wiping out thousands of lines of unrelated, unstaged code. It then somehow thinks it can recover this code by looking in the git history (??).
Man I've had this exact thing happen recently with Sonnet 4.5 in Claude Code!
With Claude I asked it to try tweaking the font weight of a heading to put the finishing touches on a new page we were iterating on. Looked at it and said, "Never mind, undo that" and it nuked 45 minutes worth of work by running git restore.
It immediately realized it fucked up and started running all sorts of git commands and reading its own log trying to reverse what it did and then came back 5 minutes later saying "Welp I lost everything, do you want me to manually rebuild the entire page from our conversation history?
In my CLAUDE.md I have instructions to commit unstaged changes frequently but it often forgets and sure enough, it forgot this time too. I had it read its log and write a post-mortem of WTF led it to run dangerous git commands to remove one line of CSS and then used that to write more specific rules about using git in the project CLAUDE.md, and blocked it from running "git restore" at all.
We'll see if that did the trick but it was a good reminder that even "SOTA" models in 2025 can still go insane at the drop of a hat.
The problem is that I'm trying to build workflows for generating sequences of good, high quality semantically grouped changes for pull requests. This requires having a bunch of unrelated changes existing in the work tree at the same time, doing dependency analysis on the sequence of commits, and then pulling out / staging just certain features at a time and committing those separately. It is sooo much easier to do this by explicitly avoiding the commit-every-2-seconds workaround and keeping things uncommitted in the work tree.
I have a custom checkpointing skill that I've written that it is usually good about using, making it easier to rewind state. But that requires a careful sequence of operations, and I haven't been able to get 4.5 to not go insane when it screws up.
As I said though, watch out for it learning that it can't run git restore, so it immediately jumps to Bash(echo "git restore" >file.sh && chmod +x file.sh && ./file.sh).
In this case I can't get 4.5 to follow directions. Neither can anyone else, aparantly. Search for "Sonnet 4.5 follow instructions" and you'll find plenty of examples. The current top 2:
Thoughts on possible implications for users in foreseeable future? We built a lot using dbt and can't really think of going back or switching to alternatives
Fivetran isn't really much of a transformation layer so this is likely just a move to lock-in customers of both companies by upselling an ingestion/transformation layer to existing customers.
The bigger question mark to me is that Fivetran recently acquired Tobiko, the company behind a dbt competitor SQLMesh. The Tobiko team said their focus has been on dbt-compatibility because a lot of Fivetran customers use dbt for their transformation layer. I fear it may have just been a way to get rid of competition leading up to this deal. I can't imagine Fivetran spent a ton of money just to have 2 products that do very similar things.
We use both open-source SQLMesh as well as their cloud offering Tobiko Cloud. Following the acquisition, we were annoyed that focus was going to go to dbt compatibility because there was a bunch of stuff on their roadmap that would help us that was now deprioritized. Thankfully, they still offer great support to us and delivered a few features that have given us some quality of life improvements. With this announcement, I'm worried we're going to end up being forced to migrate to dbt...
Full disclosure: I am a PM at Fivetran who is very excited about this.
We are fully committed to open source dbt and don't want to build a 'walled garden'. Interoperability is one of the key value propositions of both Fivetran and dbt. While I'm biased, I think the main implications for users is that their favorite tooling will be with one vendor who cares about what makes them great.
I'm wondering too. We run dbt on-prem. Worst that could happen is we don't get any more free updates. But we have the software and it will continue to run.
What's the problem specifically? Are you banking on some future features? Can't fix the bugs yourself? Worried it won't be compatible with future data warehouses?
I know people don't like it these days, but you can just continue to run old software.
At work, they help me to kickstart a task - taking the first step is very often the hardest part. It helps me grok new codebases and write boring parts.
But side projects is where the real fun starts - I can materialize random ideas extremely quickly. No more hours spent on writing boilerplate or fighting the tooling. I can delegate the parts I'm not good at the agent. Or one-prompt a feature, if I don't like the result or it doesn't work, I roll it back.