Hacker News: submissions from transformer-circuits.pub
Emotion Concepts and Their Function in a Large Language Model (transformer-circuits.pub)
9 points by Anon84 6 hours ago | past | discuss
Emotion Concepts and Their Function in a Large Language Model (transformer-circuits.pub)
6 points by majkinetor 17 hours ago | past | discuss
Emotion Concepts and Their Function in a Large Language Model (transformer-circuits.pub)
3 points by stared 1 day ago | past | discuss
Anthropic's Interpretability Research Blog (transformer-circuits.pub)
3 points by philipfweiss 74 days ago | past | 1 comment
Emergent introspective awareness in large language models (transformer-circuits.pub)
1 point by lawrenceyan 5 months ago | past
Emergent Introspective Awareness in Large Language Models (transformer-circuits.pub)
30 points by famouswaffles 5 months ago | past | 4 comments
When models manipulate manifolds: The geometry of a counting task (transformer-circuits.pub)
98 points by vinhnx 5 months ago | past | 17 comments
When Models Manipulate Manifolds: The Geometry of a Counting Task (transformer-circuits.pub)
4 points by 1wheel 5 months ago | past
Visual Features Across Modalities: SVG and ASCII Art Cross-Modal Understanding (transformer-circuits.pub)
12 points by vismit2000 5 months ago | past | 1 comment
LLMs extract high-level semantic concepts from SVG and ASCII art (transformer-circuits.pub)
3 points by neuronerd1 5 months ago | past | 1 comment
When Models Manipulate Manifolds: The Geometry of a Counting Task (transformer-circuits.pub)
2 points by tanelpoder 5 months ago | past
When Models Manipulate Manifolds: The Geometry of a Counting Task (transformer-circuits.pub)
5 points by e_ameisen 5 months ago | past
Transformer Circuits: reverse-engineering transformers into graspable programs (transformer-circuits.pub)
1 point by dvrp 8 months ago | past
So You Want to Work in Mechanistic Interpretability? (transformer-circuits.pub)
2 points by jxmorris12 9 months ago | past
Circuit Tracing: Revealing Computational Graphs in Language Models (Anthropic) (transformer-circuits.pub)
173 points by ydnyshhh on March 31, 2025 | past | 27 comments
The Biology of a Large Language Model (transformer-circuits.pub)
117 points by frozenseven on March 28, 2025 | past | 19 comments
Circuit Tracing: Revealing Computational Graphs in Language Models (transformer-circuits.pub)
8 points by mfiguiere on March 27, 2025 | past
The Biology of a Large Language Model (transformer-circuits.pub)
3 points by mfiguiere on March 27, 2025 | past
Insights on Cross-Coder Model Diffing (transformer-circuits.pub)
1 point by gregorymichael on Feb 24, 2025 | past
Transformer Circuits Thread (transformer-circuits.pub)
1 point by fzliu on Feb 5, 2025 | past
Definitions and Motivation: Features, Directions, and Superposition (transformer-circuits.pub)
4 points by Bluestein on Dec 27, 2024 | past
Toy Models of Superposition (2022) (transformer-circuits.pub)
45 points by tessierashpool9 on Nov 8, 2024 | past
Transformer Circuits Thread (transformer-circuits.pub)
2 points by plurby on Nov 3, 2024 | past
Sparse Crosscoders for Cross-Layer Features and Model Diffing (transformer-circuits.pub)
2 points by benocodes on Oct 25, 2024 | past
A collection of small updates from the Anthropic Interpretability team (transformer-circuits.pub)
2 points by daralthus on July 31, 2024 | past
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet (transformer-circuits.pub)
22 points by Anon84 on May 23, 2024 | past | 1 comment
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet (transformer-circuits.pub)
2 points by wrycoder on May 22, 2024 | past | 1 comment
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet (transformer-circuits.pub)
1 point by smaddox on May 22, 2024 | past | 1 comment
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet (transformer-circuits.pub)
2 points by veryluckyxyz on May 22, 2024 | past
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet (transformer-circuits.pub)
10 points by tosh on May 21, 2024 | past
