AI Twitter Scanner

High-signal AI posts from X, classified and scored

Dates: 2026-04-09 · 2026-04-10 · 2026-04-11 | Total scanned: 16 · Above threshold: 16 · Showing: 16
market signal @tech__unicorn
7/10
Anthropic's Model Scores High on SWE-Bench
Anthropic's "Mythos" model reportedly scores 78% on SWE-Bench, well ahead of GPT-5 (57%) and Opus (53%). The author claims the model's cybersecurity capability emerged unplanned, frames it as a genuine threat, and says mitigations were quietly rolled out with AWS, Google, and Microsoft.
Mythos is fucking scary….Anthropic built a model scoring 78% on SWE-Bench. GPT-5 gets 57%. Opus gets 53%. The cybersecurity ability wasn’t planned. It just emerged…These types of models are legitimately a threat. So they quietly patched with AWS, Google, Microsoft, and
πŸ‘ 224 views ❀ 3 πŸ” 0 πŸ’¬ 0 πŸ”– 0 1.3% eng
AI · Anthropic · SWE-Bench · cybersecurity · benchmarking
infrastructure @0xCVYH
7/10
KV Cache Attention Rotation Enabled by Default
The latest llama.cpp release (b8699) enables KV cache attention rotation by default. According to the post, Q8_0 inference becomes practically lossless and Q4_0's quality impact on the KV cache is much smaller than before, which is relevant for engineers tuning quantized inference.
llama.cpp release b8699 brought KV cache attention rotation enabled by default. Practical result: Q8_0 becomes practically lossless (inference time without compromising quality) and the impact of Q4_0 on the KV cache became much smaller than it was before. Translation for those
πŸ‘ 47 views ❀ 2 πŸ” 0 πŸ’¬ 0 πŸ”– 0 4.3% eng Actionable
llama.cpp · KV cache · AI infrastructure · model optimization · performance
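The rotation change described in the tweet is a release default, not a flag you set; what engineers can control directly is the KV cache precision. A hedged sketch using llama.cpp's standard `--cache-type-k`/`--cache-type-v` options (the model path and prompt are placeholders; exact flag behavior can vary by build, and quantized V cache requires flash attention via `-fa`):

```shell
# Near-lossless setting per the tweet: Q8_0 for both K and V caches
./llama-cli -m ./models/model.gguf -p "Hello" -n 64 -fa \
  --cache-type-k q8_0 --cache-type-v q8_0

# Smaller cache footprint: Q4_0, whose quality impact the tweet says is now reduced
./llama-cli -m ./models/model.gguf -p "Hello" -n 64 -fa \
  --cache-type-k q4_0 --cache-type-v q4_0
```

Comparing perplexity or output quality between the two runs is the practical way to verify the "practically lossless" claim on your own workload.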
model release @ModelScope2022
7/10
MinerU2.5-Pro Model Launch
MinerU2.5-Pro is a new 1.2B-parameter model that sets state-of-the-art performance (95.69) on the OmniDocBench v1.6 benchmark for PDF-to-Markdown parsing, outperforming several much larger models. The jump from 92.98 to 95.69 is attributed almost entirely to scaling training data from under 10M to 65.5M pages.
MinerU2.5-Pro is here. SOTA on OmniDocBench v1.6 (95.69), PDF to Markdown parsing. A 1.2B model that outperforms Gemini 3 Pro, Qwen3-VL-235B, GLM-OCR, and PaddleOCR-VL-1.5. The entire leap from 92.98 to 95.69 came from data: 65.5M training pages (up from <10M),
πŸ‘ 2,534 views ❀ 49 πŸ” 5 πŸ’¬ 0 πŸ”– 27 2.1% eng Actionable
AI · model release · benchmark · PDF parsing · training data
research @itsjasonai
7/10
Google's ConvApparel Dataset for Human-AI Conversations
Google Research has released ConvApparel, a dataset aimed at evaluating the 'realism gap' in human-AI conversations. This could be useful for engineers focused on improving conversational AI systems and understanding their limitations.
Google Research introduced ConvApparel, a new human-AI conversation dataset for measuring the "realism gap"
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
AI · dataset · conversational AI · Google Research · realism gap
open source drop @haimengzhao
7/10
Open-sourcing Quantum AI Framework in JAX
Announces the open-sourcing of a core Quantum AI framework implemented in JAX with native GPU/TPU acceleration. The code and simulations are available for engineers to run and build on.
To bridge theory and practice, we are open-sourcing our core framework. Our numerical implementation is built in JAX (with native GPU/TPU acceleration). Check out the code, run the simulations, and help us shape the future of Quantum AI at
πŸ‘ 329 views ❀ 7 πŸ” 0 πŸ’¬ 0 πŸ”– 2 2.1% eng Actionable
open source · quantum AI · JAX · GPU · TPU
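The linked framework is built in JAX; as a dependency-free illustration of the statevector-simulation idea such frameworks rest on (not the project's actual API, which isn't shown in the tweet), here is a single-qubit gate application:

```python
import math

def apply_gate(gate, state):
    """Apply a 2x2 gate (list of rows) to a single-qubit statevector."""
    return [gate[0][0] * state[0] + gate[0][1] * state[1],
            gate[1][0] * state[0] + gate[1][1] * state[1]]

# Hadamard gate: maps |0> to an equal superposition of |0> and |1>
H = [[1 / math.sqrt(2),  1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]

state = apply_gate(H, [1.0, 0.0])          # start in |0>
probs = [abs(a) ** 2 for a in state]       # measurement probabilities
print(probs)                               # each approximately 0.5
```

In a JAX version these lists become `jnp` arrays and the gate application a jit-compiled matrix multiply, which is what gives such frameworks native GPU/TPU acceleration.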
infrastructure @WESummit2026
7/10
Netflix's AI Workflow Orchestration
Pratyusha Singaraju discusses the complex orchestration of ML models and human review at Netflix, highlighting the infrastructure improvements that enable seamless integration of AI systems. Senior engineers may find insights into scalable workflow management relevant for their own projects.
Every title on @netflix passes through a complex pipeline of rules, ML models, and human review - at massive scale. Pratyusha Singaraju shares how they rebuilt workflow orchestration to make these systems work seamlessly together - & why it sets the stage for AI agents next.
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
Netflix · AI · infrastructure · workflow · machine learning
infrastructure @ClickHouseDB
7/10
Building an Effective AI SRE
This post discusses the importance of a solid data foundation for AI SREs, emphasizing the need for historical context and system topology in AI systems. Senior engineers may find the architectural insights valuable for improving their own AI infrastructure.
What does it actually take to build an AI SRE that works? Not a bigger model - a better data foundation. clickhou.se/4ca2N3M Human SREs reason from historical context and system topology. AI needs the same thing. This post breaks down the architecture.
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
AI · SRE · infrastructure · data foundation · architecture
infrastructure @elvissun
7/10
Optimizing Vercel Build Minutes
A practical fix for runaway Vercel build minutes: build locally before pushing so the turbo cache lets Vercel skip the build entirely, cutting build minutes by a reported 78%. Relevant for engineers optimizing CI/CD costs, especially with multiple agents opening PRs.
if you have multiple agents opening PRs, each one triggers a full build. that's why I've been paying @vercel $150/mo in build minutes the past 2 months lol. the fix: build locally before push β†’ turbo cache β†’ vercel skips the build entirely. 78% fewer build minutes. 5x
πŸ‘ 638 views ❀ 7 πŸ” 0 πŸ’¬ 3 πŸ”– 4 1.6% eng Actionable
Vercel · CI/CD · build optimization · turbo cache · infrastructure
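The workflow described (local build → shared turbo cache → Vercel cache hit) can be sketched with Turborepo's standard CLI, assuming a Turborepo project with remote caching linked to Vercel; the tweet doesn't show its exact setup, so this is a plausible reconstruction:

```shell
# One-time: authenticate and link the repo to the remote cache
npx turbo login
npx turbo link

# Before each push: build locally so artifacts land in the remote cache
npx turbo run build

# Vercel's build then runs the same `turbo run build`, hits the cache,
# and replays the cached outputs instead of rebuilding from scratch
git push
```

The savings come from the cache key covering source and config hashes: if nothing changed since the local build, Vercel's run is a pure cache replay.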
market signal @botnewsnetwork
7/10
Flowise Agent Framework Vulnerability Alert
Flowise is the fourth agent framework found shipping unsandboxed code execution into production. This instance is rated CVSS 10.0 (maximum severity), and VulnCheck confirms it is already being exploited in the wild, underscoring ongoing security problems in AI tooling.
Flowise just became the fourth agent framework caught shipping unsandboxed code execution into production. This time it's CVSS 10.0 β€” maximum severity β€” and VulnCheck confirms attackers are already exploiting it from the wild. The vulnerability is almost insultingly simple.
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
security · vulnerability · AI tools · Flowise · production
platform shift @Noobwork
7/10
Google Gemini API Pricing Changes
Google has introduced Flex and Priority tiers to the Gemini API, offering a 50% reduction in cost for latency-tolerant workloads and improved reliability. This reflects a maturation in AI infrastructure, which may impact how engineers approach API usage and cost management.
Are tokens the currency of the future? Google just added Flex and Priority tiers to the Gemini API. 50% cheaper for latency-tolerant workloads. Higher reliability with automatic downgrade instead of failure. The real story: AI infrastructure is maturing into explicit
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
Google · Gemini API · pricing · AI infrastructure · cloud services
model release @off_thetarget
7/10
Gemma 4 Stabilized on llama.cpp
Gemma 4 is now stable on llama.cpp after day-one support shipped with numerous bugs. The fixed configurations span E2B, E4B, 26B MoE, and 31B Dense, with the 31B model ranking #3 on Arena AI and the 26B ranking #6.
Gemma 4 is finally stable on llama.cpp On April 2nd, Google released Gemma 4, and it had llama.cpp support on day one but with lots of bugs. Now all issues have been fixed E2B, E4B, 26B MoE, 31B Dense 31B ranks #3 on Arena AI, 26B ranks #6 The strongest tier of open-source
πŸ‘ 824 views ❀ 5 πŸ” 0 πŸ’¬ 3 πŸ”– 4 1.0% eng Actionable
Gemma 4 · llama.cpp · AI models · open source · performance benchmarks
market signal @shawnchauhan1
7/10
Meta's Muse Spark Efficiency Benchmark
Meta claims Muse Spark achieves top-five global benchmarks using significantly less compute than Llama 4 Maverick, challenging the notion that advanced AI requires extensive infrastructure investment. This could indicate a shift in how AI systems are built and deployed.
Meta built Muse Spark using over 10x less compute than Llama 4 Maverick. Top-five globally on benchmarks. Fraction of the training cost. Efficiency curves compressing this fast changes the underlying assumption that frontier AI requires frontier infrastructure spend. The labs
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
Meta · AI · efficiency · benchmark · Muse Spark
market signal @vedangvatsa
7/10
Llama 3 and Phi-4 Benchmark Insights
Compares parameter efficiency across model generations: Llama 3 (8B) matches GPT-3.5 (175B), and Phi-4 (14B) beats GPT-4o on math and graduate-level science benchmarks. The benchmarks are relevant for evaluating model performance against infrastructure requirements.
GPT-3.5 had 175 billion parameters. Llama 3 matched it with 8 billion. That is 20x fewer. Phi-4 has 14 billion parameters. It outperforms GPT-4o on math and graduate-level science benchmarks. A model that runs on a laptop beating one that needs a datacenter. The pattern is
πŸ‘ 57 views ❀ 3 πŸ” 2 πŸ’¬ 0 πŸ”– 0 8.8% eng
AI · benchmarking · Llama 3 · Phi-4 · GPT-4o
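The compression claim can be sanity-checked with quick arithmetic on the parameter counts quoted in the tweet:

```python
gpt35_params = 175e9   # GPT-3.5 parameter count, as quoted in the tweet
llama3_params = 8e9    # Llama 3 8B

ratio = gpt35_params / llama3_params
print(ratio)           # 21.875 -- the tweet rounds this down to "20x fewer"
```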
infrastructure @OSSInsight
7/10
Rust's Role in AI Infrastructure Growth
Seven Rust-based agent-infrastructure repositories have appeared in 60 days, pointing to a split in the AI stack: Python for models, Rust for runtimes. The post likens the shift to how web infrastructure split a decade ago, a trend relevant for engineers focused on performance and efficiency.
The Rust Shift in AI 7 Rust agent infra repos in 60 days. zeroclaw 30K . agent-browser 28K . Python for models. Rust for runtimes. The AI stack is splitting β€” just like web infra did a decade ago. ossinsight.io/blog/rust-ai-a … #Rust #AI #GitHub #OpenSource @zeroclawlabs
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
Rust · AI · infrastructure · open source · GitHub
market signal @SortaKinda_Cool
7/10
Gemini 3.1 Pro Dominates Benchmarks
Gemini 3.1 Pro outperforms most competitors in benchmarks and ties with GPT-5.4 Pro on a key index, all at a significantly lower cost. This indicates a strong competitive position for Google in the AI landscape, which may influence future development strategies.
Gemini 3.1 Pro leads 13 of 16 major benchmarks right now. it ties GPT-5.4 Pro on the Artificial Analysis Intelligence Index. it costs roughly a third of the price. Google is winning the benchmark race and the cost race simultaneously. the discourse is still OpenAI vs Anthropic.
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
AI benchmarks · Gemini 3.1 Pro · Google · GPT-5.4 Pro · market trends
market signal @ai_for_success
7/10
Benchmark for AI Agents in Tax Workflows
A new benchmark reveals that GPT-5.4 leads at 28% in testing AI agents on real tax workflows, highlighting the challenges all models face in high-stakes, multi-step tasks. This insight could inform future model development and evaluation criteria.
We finally have a benchmark that tests AI agents on real tax workflows. GPT-5.4 is leading at 28% but all models still su**xs on high-stakes, multi-step tasks. New model cards should have benchmarks like this in future.
πŸ‘ 1,513 views ❀ 12 πŸ” 0 πŸ’¬ 2 πŸ”– 2 0.9% eng
AI · benchmark · tax workflows · GPT-5.4 · model evaluation