Hacker News
Enhancing DeepSeek Models with MLA and FP8 Optimizations in VLLM (neuralmagic.com)
2 points by hochmartinez on Feb 24, 2025
Multimodal Model Quantization Support Through LLM Compressor by Neural Magic (neuralmagic.com)
1 point by BUFU on Feb 17, 2025
What happens if we remove 50 percent of Llama? (neuralmagic.com)
231 points by BUFU on Nov 26, 2024 | 132 comments
We Ran Over Half a Million Evaluations on Quantized LLMs (neuralmagic.com)
12 points by eldar_ciki on Oct 18, 2024 | 2 comments
Pushing the Boundaries of Mixed-Precision LLM Inference with Marlin (neuralmagic.com)
2 points by mwitiderrick on June 11, 2024
Fast Llama 2 on CPUs with Sparse Fine-Tuning and DeepSparse (neuralmagic.com)
238 points by mwitiderrick on Nov 23, 2023 | 26 comments
Build Scalable NLP and Computer Vision Pipelines with DeepSparse (neuralmagic.com)
1 point by mwitiderrick on June 8, 2023
Achieving 1,000X CPU Performance Boost with Sparse Models in MLPerf (neuralmagic.com)
1 point by NM_Ricky on April 5, 2023 | 1 comment
SparseGPT: Remove 100B Parameters for Free (neuralmagic.com)
3 points by homarp on March 24, 2023 | 1 comment
SparseGPT: Remove 100B Parameters for Free (neuralmagic.com)
2 points by todsacerdoti on March 24, 2023
Sparsify Image Classification Models Faster with SparseML and Deep Lake (neuralmagic.com)
1 point by mwitiderrick on March 16, 2023
YOLOv8 Detection 10x Faster with DeepSparse (neuralmagic.com)
1 point by mwitiderrick on Jan 19, 2023
Image Segmentation: Your Ultimate Guide to Easy Deployment and Fast Inferencing (neuralmagic.com)
2 points by mwitiderrick on Jan 5, 2023 | 2 comments
Search Documents Quickly with Extractive Question Answering (neuralmagic.com)
1 point by mwitiderrick on Dec 15, 2022 | 1 comment
Accelerate Customer Review Classification with Sparse Transformers (neuralmagic.com)
1 point by mwitiderrick on Nov 22, 2022 | 1 comment
Neural Network inference on commodity CPUs using sparsity (neuralmagic.com)
2 points by atylerrice on Sept 21, 2022 | 3 comments
Using compound sparsification for faster BERT on CPUs with better accuracy (neuralmagic.com)
4 points by szpcela on Sept 24, 2021
YOLOv5 on CPUs: Sparsifying to Achieve GPU-Level Performance (neuralmagic.com)
121 points by T-A on Sept 10, 2021 | 53 comments
Show HN: YOLOv3 – Pruning and Quantizing to Improve Object Detection Performance (neuralmagic.com)
4 points by markurtz on June 23, 2021
A Software Architecture for the Future of ML (neuralmagic.com)
2 points by beefman on May 29, 2021