AI Twitter Scanner

High-signal AI posts from X, classified and scored

Total scanned: 13 · Above threshold: 13 · Showing: 13
โญ Favorites ๐Ÿ”ฅ Resonated ๐Ÿš€ Viral ๐Ÿ”– Most Saved ๐Ÿ’ฌ Discussed ๐Ÿ” Shared ๐Ÿ’Ž Hidden Gems ๐Ÿ“‰ Dead on Arrival
All infrastructure market signal model release open source drop platform shift research
open source drop @TaoIsTheKey
7/10
Covenant-72B Code and Weights Released
The code for Covenant-72B is fully open-source on GitHub, and the model weights are available on Hugging Face. This allows engineers to fork and utilize the resources immediately, which is relevant for those looking to build on existing AI infrastructure.
Sam Dare can take his team and walk. He can't take what actually made Covenant-72B possible. The code? Fully open-source. Templar repo on GitHub (MIT license): anyone can fork it today. Covenant-72B weights? Apache 2.0 on Hugging Face. Download it right now. But the…
๐Ÿ‘ 2,282 views โค 54 ๐Ÿ” 9 ๐Ÿ’ฌ 4 ๐Ÿ”– 4 2.9% eng Actionable
open sourceAI modelsCovenant-72BGitHubHugging Face
platform shift @Elmedul3
7/10
Oracle and NVIDIA Expand AI on OCI
Oracle and NVIDIA's announcement at #NVIDIAGTC highlights new AI capabilities on Oracle Cloud Infrastructure, aimed at enhancing the transition from AI experimentation to production. While significant, this is more about platform capabilities than groundbreaking technology.
At #NVIDIAGTC, Oracle and NVIDIA announced the expansion of AI capabilities on OCI. See how these advancements enable customers to move from AI experimentation to production at unprecedented scale, speed, and efficiency. social.ora.cl/6013B60FZX
๐Ÿ‘ 0 views โค 0 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ”– 0 0.0% eng
AINVIDIAOraclecloudinfrastructure
infrastructure @quantisol
7/10
Event-Driven Architecture for Market Bots
The tweet describes a custom event-driven architecture for a trading bot that prevents double entries and stale states using specific dataclasses. This approach may interest engineers focused on building robust trading systems and infrastructure.
The architecture is entirely event-driven based on market state. I built custom dataclasses to track round phases (Scanning -> Active -> Settlement) to ensure the bot never double-enters a market or gets trapped in stale states.
๐Ÿ‘ 0 views โค 0 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ”– 0 0.0% eng
tradingevent-driveninfrastructuredataclassesbots
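The phase-tracking idea above can be sketched with plain dataclasses. This is a minimal illustration under assumed names, not the author's actual bot: `RoundPhase`, `MarketRound`, and their methods are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto

class RoundPhase(Enum):
    SCANNING = auto()
    ACTIVE = auto()
    SETTLEMENT = auto()

@dataclass
class MarketRound:
    market_id: str
    phase: RoundPhase = RoundPhase.SCANNING
    entered: bool = False

    def enter(self) -> bool:
        # Refuse a second entry, or any entry outside the Scanning phase,
        # so the bot can never double-enter a market.
        if self.entered or self.phase is not RoundPhase.SCANNING:
            return False
        self.entered = True
        self.phase = RoundPhase.ACTIVE
        return True

    def settle(self) -> None:
        # Terminal phase: a settled round can never be re-entered,
        # which is what keeps the bot out of stale states.
        self.phase = RoundPhase.SETTLEMENT

round_ = MarketRound("BTC-1h")
assert round_.enter() is True   # first entry succeeds
assert round_.enter() is False  # double entry blocked
```

Making the phase an explicit enum (rather than scattered booleans) is what lets every transition be checked in one place.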
research @bhaskark_la
7/10
AI Models Tested on New Theorem Proving
A comparison of four AI models on their ability to prove a hard theorem reveals significant differences in performance, with Grok Expert leading. This insight into model capabilities could inform future development and benchmarking efforts.
Gave 4 AI models a hard new theorem to prove. Rankings: 1. Grok Expert - quick and elegant proof. 2. Gemini Pro - close runner-up. 3. ChatGPT Pro claimed the theorem was incorrect and had no proof. 4. Claude Opus just gave up after some time with no output (is it really nerfed?)
๐Ÿ‘ 0 views โค 0 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ”– 0 0.0% eng
AI modelstheorem provingbenchmarkingGrok ExpertChatGPT
infrastructure @ConsciousRide
7/10
Optimizing API Performance: Key Considerations
This tweet highlights common pitfalls in API performance, such as network latency and database inefficiencies, urging engineers to analyze query plans and latency traces. Senior engineers will find this practical advice relevant for optimizing their systems.
The API looks perfect in code but gets slow because of network round trips, database queries without proper indexes, and no caching on repeated data. These things add up fast in real traffic even if the logic runs clean. Check query plans and latency traces first before blaming…
๐Ÿ‘ 0 views โค 0 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ”– 0 0.0% eng Actionable
APIperformanceinfrastructureoptimizationengineering
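The "check query plans first" advice is easy to act on. A minimal sketch with stdlib `sqlite3` (the `orders` table and `idx_user` index are made up for illustration) shows the same query flipping from a full table scan to an index search once the index exists:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER, user_id INTEGER)")
con.executemany("INSERT INTO orders VALUES (?, ?)",
                [(i, i % 100) for i in range(1000)])

def plan(sql: str) -> str:
    # EXPLAIN QUERY PLAN rows are (id, parent, notused, detail);
    # the detail column describes scan vs. index search.
    return " ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

q = "SELECT * FROM orders WHERE user_id = 7"
print(plan(q))  # before the index: a full scan of orders
con.execute("CREATE INDEX idx_user ON orders(user_id)")
print(plan(q))  # after: a search using idx_user
```

The same exercise works in Postgres or MySQL with `EXPLAIN` / `EXPLAIN ANALYZE`; the point is to read the plan before touching the code.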
market signal @shirakabado
7/10
AI Discovers 22 Firefox Vulnerabilities in 2 Weeks
Anthropic's Claude Opus 4.6, in collaboration with Mozilla, identified 22 significant vulnerabilities in Firefox within a two-week security audit. This highlights the potential of AI in enhancing software security, which is relevant for engineers focused on building robust systems.
AIใŒFirefoxใฎ้‡ๅคงใช่„†ๅผฑๆ€งใ‚’2้€ฑ้–“ใง22ไปถ็™บ่ฆ‹ใ—ใŸใฃใฆ่ฉฑใ€ใ‹ใชใ‚Š่กๆ’ƒ็š„ใ ใฃใŸใฎใงๅ…ฑๆœ‰ใ•ใ›ใฆใใ ใ•ใ„ AnthropicใฎClaude Opus 4.6ใŒใ€Mozillaใจๅ”ๅŠ›ใ—ใฆFirefoxใฎใ‚ปใ‚ญใƒฅใƒชใƒ†ใ‚ฃ็›ฃๆŸปใ‚’ๅฎŸๆ–ฝใ—ใŸ็ตๆžœใงใ™ใ€‚ ใฉใ‚“ใชๆˆๆžœใ ใฃใŸใ‹ใจใ„ใ†ใจโ€ฆ ใƒป2้€ฑ้–“ใง22ไปถใฎ่„†ๅผฑๆ€งใ‚’็™บ่ฆ‹
๐Ÿ‘ 38 views โค 2 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ”– 0 5.3% eng
AIsecurityFirefoxvulnerabilitiesAnthropic
research @ycl_yc
7/10
Comparing Human Experience Data in GAI Workflows
This tweet discusses a comparative study of four types of human experience data used in generative AI workflows, which could provide insights into user interaction and experience design. Senior engineers may find the methodology and findings relevant for improving AI system design.
We compare 4 types of human experience data in a GAI workflow: C1: demographics C2: gaze (eye-tracking) C3: questionnaire-based experience C4: AI-predicted experience 12 designers + 30 evaluators (4/)
๐Ÿ‘ 0 views โค 0 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ”– 0 0.0% eng
GAIuser experienceresearchdata comparisonAI workflows
market signal @Sarojkumar245
7/10
Claude Sonnet 4.6 Tops GDPval-AA Elo Benchmark
Claude Sonnet 4.6 has achieved the highest score in the GDPval-AA Elo benchmark, surpassing competitors Opus 4.6 and Gemini 3.1 Pro. This indicates a significant shift in the competitive landscape of AI coding tools, which may influence future development choices.
Claude Sonnet 4.6 leads the GDPval-AA Elo benchmark with 1,633 points, ahead of Opus 4.6 and Gemini 3.1 Pro. The coding wars have a new king.
๐Ÿ‘ 0 views โค 0 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ”– 0 0.0% eng
AIbenchmarkClaude SonnetOpusGemini
infrastructure @tom_doerr
7/10
Framework for Evaluating Large Language Models
This tweet links to a GitHub repository that provides a framework for evaluating large language models. Senior engineers may find it useful for benchmarking and improving their own AI systems.
Framework for evaluating large language models github.com/open-compass/o…
๐Ÿ‘ 0 views โค 0 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ”– 0 0.0% eng Actionable
AIevaluationframeworkGitHublanguage models
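The tweet only links the repo (OpenCompass), so the snippet below is not its API. It is a generic sketch of the core loop any LLM eval framework runs: feed prompts to a model, score answers against references. `toy_model` and the dataset are stand-ins for demonstration.

```python
def evaluate(model, dataset):
    """model: callable prompt -> answer; dataset: (prompt, expected) pairs."""
    hits = sum(1 for prompt, expected in dataset
               if model(prompt).strip() == expected)
    return hits / len(dataset)

# Toy stand-in for a real model; a framework would wrap an API client here.
def toy_model(prompt: str) -> str:
    return {"2+2?": "4", "capital of France?": "Paris"}.get(prompt, "")

dataset = [("2+2?", "4"), ("capital of France?", "Paris"), ("3*3?", "9")]
print(evaluate(toy_model, dataset))  # 2 of 3 exact matches
```

Real frameworks layer dataset loaders, prompt templates, and fuzzier scorers (F1, LLM-as-judge) on top of this loop, but the shape is the same.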
model release @PrasVector
7/10
MiniMax M2.7 Open Weights Released
MiniMax has released M2.7 as open weights on Hugging Face, achieving notable benchmarks on SWE-Pro and Terminal Bench 2. While the model's focus on agent workflows and self-evolution is interesting, its practical impact and adoption remain to be seen.
Awesome news! MiniMax just dropped M2.7 as open weights on Hugging Face. With 56.22% on SWE-Pro and 57.0% on Terminal Bench 2, plus its focus on agent workflows, tool use, and self-evolution through iteration, this looks like one of the strongest openly available models right…
๐Ÿ‘ 47 views โค 2 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ”– 0 4.3% eng Actionable
AIopen weightsMiniMaxHugging Facemodel release
infrastructure @RealJohnnyTime
7/10
AI-Driven Workflow for Testing and Validation
This tweet outlines a structured approach to using AI for testing software, emphasizing the importance of manual validation and evidence-based reporting. A senior engineer would find value in the practical workflow for enhancing testing processes.
A strong workflow: - use AI to enumerate assumptions and edge cases - use AI to suggest adversarial test scenarios - then manually validate state transitions - confirm exploitability with a PoC - write findings with evidence and impact logic
๐Ÿ‘ 0 views โค 0 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ”– 0 0.0% eng Actionable
AItestingworkflowsoftware engineeringvalidation
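The middle steps of that workflow (AI-suggested adversarial scenarios, then manual validation) look roughly like this in practice. `parse_amount` and the input list are hypothetical; the point is asserting invariants over the kind of edge cases an AI assistant might enumerate.

```python
def parse_amount(s: str) -> int:
    """Parse a non-negative cent amount; reject anything else."""
    s = s.strip()
    if not s.isdigit():
        raise ValueError(f"bad amount: {s!r}")
    return int(s)

# Edge cases of the kind an AI might enumerate: empty input, negatives,
# scientific notation, leading zeros, huge values, and non-ASCII digits
# (str.isdigit accepts Arabic-Indic digits, which is worth knowing).
adversarial = ["", " ", "-1", "1e9", "007", "9" * 30, "\u0660"]
for case in adversarial:
    try:
        value = parse_amount(case)
        assert value >= 0  # invariant: amounts are never negative
    except ValueError:
        pass  # explicit rejection is an acceptable outcome
```

Each case that survives the loop is a candidate for the manual steps in the tweet: trace the state transition, confirm exploitability with a PoC, then write it up with evidence.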
infrastructure @MinionLabAI
7/10
OpenClaw 4.11 Focuses on Stability Over Flash
OpenClaw 4.11 emphasizes the importance of stabilizing the agentic stack rather than showcasing flashy features. This focus on foundational work is crucial for engineers building reliable AI systems.
Beyond the hype, the real signal is in the hardening of the agentic stack. While everyone chases the next flashy demo, the silent revolution is happening in the foundation. OpenClaw 4.11 isn't about headline-grabbing features; it's about the painstaking work of stabilizing an…
๐Ÿ‘ 72 views โค 2 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ”– 0 2.8% eng
OpenClawinfrastructureAI stabilityagentic stackengineering
market signal @itsjoaki
7/10
Benchmarking AI Coding Models Costs
This tweet presents a cost comparison of various AI coding models, highlighting the performance and pricing of open-source versus proprietary options. Senior engineers should care about these metrics as they reflect the competitive landscape and cost-effectiveness of AI solutions for coding tasks.
This chart should scare every AI company charging premium prices for coding models. SWE-rebench, resolved vs average cost per instance: → MiniMax M2.5 (open source): 75.8% resolved at ~$0.05 per task → Claude Opus 4.6: 75.6% at ~$0.35 per task → Claude 4.5 Opus: 76.8% at…
๐Ÿ‘ 779 views โค 16 ๐Ÿ” 2 ๐Ÿ’ฌ 10 ๐Ÿ”– 5 3.6% eng
AIbenchmarkingcoding modelscost analysisopen source