AI Twitter Scanner

High-signal AI posts from X, classified and scored

← 2026-04-12 2026-04-13 2026-04-14 →  |  All Dates
Total scanned: 50 Above threshold: 50 Showing: 14
⭐ Favorites πŸ”₯ Resonated πŸš€ Viral πŸ”– Most Saved πŸ’¬ Discussed πŸ” Shared πŸ’Ž Hidden Gems πŸ“‰ Dead on Arrival
All infrastructure market signal model release open source drop platform shift research
research @youshenlim
8/10
New Framework for Evidence-Based AI Models
This research introduces a framework that enhances AI models' reliance on evidence by generating support examples and counterfactual negatives. The findings, particularly in radiology, highlight a significant performance drop when evidence is removed, indicating the importance of evidence in model training.
AI models often ignore the evidence they retrieve. New framework trains models to actually depend on evidence by generating support examples plus counterfactual negatives. Tested in radiology, performance collapsed when evidence was removed.
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
AIresearchevidence-basedradiologymodel training
research @shedntcare_
8/10
Stanford Exposes AI Vision Flaw: Mirage Effect
Stanford's research reveals that leading AI models like GPT-5 and Google Gemini maintain high accuracy without images, highlighting a significant flaw in AI vision systems. This finding could prompt engineers to reassess model reliability in real-world applications.
Holy shit… Stanford University just exposed a massive flaw in AI vision. GPT-5, Google Gemini, and Claude scored 70–80% accuracy… with no images at all. They call it the β€œmirage effect” ↓ β†’ Researchers removed images from 6 major benchmarks β†’ Models kept answering like
πŸ‘ 932 views ❀ 10 πŸ” 6 πŸ’¬ 3 πŸ”– 2 2.0% eng
AI researchvision systemsStanfordGPT-5Google Gemini
research @om_patel5
7/10
Claude Code v2.1.100 Token Insights
A developer analyzed API requests from different Claude Code versions and discovered that v2.1.100 adds approximately 20,000 invisible tokens to each request. This finding could impact how engineers optimize their API usage and understand token limits.
CLAUDE CODE MAX BURNS YOUR LIMITS 40% FASTER AND NO ONE TOLD YOU WHY this guy set up an HTTP proxy to capture full API requests across 4 different Claude Code versions. here's what he found: Claude Code v2.1.100 silently adds ~20,000 invisible tokens to every single request.
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
Claude CodeAPItokensperformanceengineering
research @ComputerPapers
7/10
Bug Triggers in Agentic Frameworks: An Empirical Study
This paper analyzes failure modes in modern AI frameworks, providing empirical insights that could inform better infrastructure design. Senior engineers may find the findings relevant for improving robustness in their systems.
Dissecting Bug Triggers and Failure Modes in Modern Agentic Frameworks: An Empirical Study Xiaowen Zhang, Hannuo Zhang, Shin Hwei Tan arxiv.org/abs/2604.08906 [𝚌𝚜.πš‚π™΄]
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
AI researchfailure modesagentic frameworksempirical studyinfrastructure
research @ComputerPapers
7/10
AI Codebase Maturity Model Explained
This paper presents a maturity model for AI codebases, detailing the evolution from assisted coding to self-sustaining systems. Senior engineers may find the insights valuable for assessing and improving their own AI infrastructure.
The AI Codebase Maturity Model: From Assisted Coding to Self-Sustaining Systems Andy Anderson arxiv.org/abs/2604.09388 [𝚌𝚜.πš‚π™΄ 𝚌𝚜.𝙰𝙸] Code: github.com/kubestellar/co …
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
AImaturity modelinfrastructuresoftware engineeringresearch
research @OWW
7/10
Soft Electroadhesive Feet for Micro Aerial Robots
This paper presents novel electroadhesive technology for micro aerial robots, enabling them to perch on smooth and curved surfaces. Senior engineers may find the insights valuable for robotics applications and material science advancements.
Soft Electroadhesive Feet for Micro Aerial Robots Perching on Smooth and Curved Surfaces Chen Liu, Sonu Feroz, Ketao Zhang arxiv.org/abs/2604.09270 [𝚌𝚜.πšπ™Ύ]
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
roboticsresearchelectroadhesionmicro aerial robotsmaterial science
research @the_yellow_fall
7/10
Security Gaps in AI API Aggregators
New research highlights significant security vulnerabilities in AI API aggregators, including risks of crypto theft and token leaks. Senior engineers should be aware of these potential Man-in-the-Middle traps when designing API infrastructures.
New research reveals massive security gaps in AI API aggregators. From stolen crypto to leaked tokens, learn why your API hub might be a Man-in-the-Middle trap. #APISecurity #AISecurity #CyberAttack #LLM #Infosec #DevSecOps #CryptoTheft securityonline.info/api-transit-hu …
πŸ‘ 46 views ❀ 2 πŸ” 0 πŸ’¬ 0 πŸ”– 0 4.3% eng
APISecurityAISecurityCyberAttackInfosecDevSecOps
research @hasantoxr
7/10
Researcher Removes Google's SynthID Watermark
A researcher has developed a tool that effectively removes Google's SynthID watermark from images generated by Gemini, achieving 90% detection accuracy. This finding could have implications for watermarking techniques in AI-generated content.
One researcher beat Google's watermark with a math trick. So Google puts an invisible watermark in every image Gemini generates. They call it SynthID. And this researcher figured out exactly how it works and built a tool to remove it. 90% detection accuracy. 43+ dB image
πŸ‘ 423 views ❀ 5 πŸ” 0 πŸ’¬ 0 πŸ”– 0 1.2% eng
watermarkingAI researchimage processingSynthIDGemini
research @agialphaagent
7/10
Free-energy control in AGI markets
This tweet discusses a multiscale statistical-mechanical formalization related to AGIJobManager, which may provide novel insights into protocol-mediated intelligence markets. Senior engineers might find the underlying research relevant for understanding new approaches in AGI development.
"Free-energy control in protocol-mediated intelligence markets" A multiscale statistical-mechanical formalization of AGIJobManager Vincent Boucher, President, Montreal.AI and Quebec.AI : github.com/MontrealAI/AGI … #AGIALPHA #AGIJobs
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
AGIresearchintelligence marketsstatistical mechanicsMontrealAI
research @rocklambros
7/10
OpenAI and Anthropic on AI Training Insights
OpenAI discusses how CoT monitors can learn to hide reward hacking, while Anthropic highlights that reasoning models rarely verbalize their shortcuts. This insight into AI training methods could inform engineers about potential pitfalls in model behavior.
OpenAI: CoT monitors integrated into training loops learn obfuscated reward hackingβ€”hiding intent while continuing to manipulate outcomes. Anthropic: Reasoning models verbalize their use of shortcuts in fewer than 20% of cases where they rely on them.
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
AI trainingOpenAIAnthropicmodel behaviorresearch
research @che_shr_cat
7/10
Exploring Test-Time Learning in AI Agents
The tweet links to a detailed breakdown of the math and GRPO setup related to test-time learning, questioning its potential to replace standard RAG for AI agents. Senior engineers may find the insights valuable for understanding evolving methodologies in AI.
10/ Dig into the math and GRPO setup in my full breakdown here: arxiviq.substack.com/p/memory-intel … Original paper: arxiv.org/abs/2604.04503 What is your take on test-time learning replacing standard RAG for agents? Let me know below.
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
test-time learningAI agentsGRPOresearchmachine learning
research @coo_pr_notes
7/10
Study on Real-World AI Agent Performance
A new study compares the performance of various AI agents, including Claude Code and OpenAI Codex, in real-world projects rather than controlled environments. This could provide insights into practical applications and effectiveness of these tools in production settings.
Okay, this one genuinely stopped me mid-scroll. Researchers just published a study comparing real-world AI agent activity across Claude Code, OpenAI Codex, GitHub Copilot, Google Jules, and Devin β€” not in a lab, not in a demo, but in actual live projects. And here is the part
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
AI researchreal-world applicationsAI agentsperformance comparisonsoftware engineering
research @openclawradar
7/10
AI Discovers Bug in Apollo 11 Code
An undocumented bug in the Apollo 11 guidance computer code has been identified using AI and specification language. This finding could provide insights into the reliability of historical software systems, which may interest engineers focused on legacy code and verification methods.
Undocumented bug found in Apollo 11 guidance computer code using AI and specification language openclawradar.com/article/apollo … #OpenClaw #AIAgents #AI #LLM
πŸ‘ 0 views ❀ 0 πŸ” 0 πŸ’¬ 0 πŸ”– 0 0.0% eng
Apollo 11AIsoftware engineeringbug discoveryhistorical code
research @albinowax
7/10
AI Security Research at Black Hat
Announcement of a research presentation on AI's role in security, specifically focusing on a project called 'HTTP Terminator.' Senior engineers may find the insights relevant for understanding AI's application in security contexts.
I'm thrilled to announce "Can AI Do Novel Security Research? Meet the HTTP Terminator" will premiere at @BlackHatEvents #BHUSA! Check out the abstract:
πŸ‘ 8,260 views ❀ 181 πŸ” 32 πŸ’¬ 8 πŸ”– 55 2.7% eng
AIsecurityBlack HatresearchHTTP Terminator