AI Twitter Scanner

High-signal AI posts from X, classified and scored

Total scanned: 988 · Above threshold: 987 · Showing: 6
infrastructure @_philschmid
7/10
Gemini API Service Tiers Optimization
The Gemini API introduces Flex and Priority service tiers, letting production workloads trade cost against latency with minimal code changes. Relevant for engineers who want cheaper or more reliable inference without reworking their infrastructure.
Optimizing continues, today Flex and Priority `service_tiers` for the Gemini API. Optimize costs, reliability and latency for production workloads with a single line change. **Flex Inference:** Pay 50% less for latency-tolerant workloads (no batch file management)
👁 2,628 views ❤ 63 🔁 2 💬 7 🔖 16 2.7% eng Actionable
Gemini API · infrastructure · service tiers · cost optimization · latency
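The "single line change" the tweet describes can be sketched as picking a tier per request. This is an illustrative sketch only: the tier names come from the tweet, but the request shape and the `service_tier` field name are assumptions, not the official SDK.

```python
# Illustrative sketch: select a Gemini service tier per request.
# The "service_tier" payload field is a hypothetical stand-in for the
# real API's mechanism; only the flex/priority tier names are from the tweet.

def build_request(prompt: str, latency_tolerant: bool) -> dict:
    """Use the cheaper Flex tier for latency-tolerant workloads,
    Priority otherwise — the one-line switch the tweet points at."""
    tier = "flex" if latency_tolerant else "priority"
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "service_tier": tier,  # hypothetical field name
    }

req = build_request("Summarize this log file.", latency_tolerant=True)
print(req["service_tier"])  # flex
```

Batch-style jobs that can wait would take the Flex path; user-facing calls would stay on Priority.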
infrastructure @PawelHuryn
7/10
Gemma 4's KV Cache Architecture Explained
The tweet explains Gemma 4's shared KV cache layers, which let the model fit on a laptop but break cache reuse in llama.cpp. Useful for engineers designing efficient inference systems.
There is a catch nobody is talking about. Gemma 4 uses shared KV cache layers - the last layers reuse K/V tensors from earlier layers instead of computing their own. That is why it fits on a laptop. But that same architecture breaks cache reuse in llama.cpp. Every request
👁 5,927 views ❤ 33 🔁 9 💬 10 🔖 39 0.9% eng
AI · infrastructure · cache · Gemma 4 · llama.cpp
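The sharing scheme can be illustrated with a toy model. This is not Gemma's real layer mapping — the layer counts and which layers share are invented — but it shows why shared K/V shrinks the cache footprint: later layers point at tensors computed by earlier layers instead of owning their own.

```python
# Toy illustration of shared KV cache layers (numbers are invented,
# not Gemma's actual architecture): the second half of the stack
# reuses K/V tensors from the first half.

NUM_LAYERS = 8
SHARED_FROM = 4  # assumption: layers 4..7 reuse K/V from layers 0..3

def kv_owner(layer: int) -> int:
    """Return the layer whose K/V tensors this layer actually reads."""
    return layer if layer < SHARED_FROM else layer - SHARED_FROM

owners = [kv_owner(l) for l in range(NUM_LAYERS)]
print(owners)            # [0, 1, 2, 3, 0, 1, 2, 3]
print(len(set(owners)))  # 4 distinct caches instead of 8
```

Only four K/V caches exist for eight layers, which is what lets the model fit on a laptop — and also why a runtime that assumes one independent cache per layer can no longer reuse entries the way it expects.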
infrastructure @elvissun
7/10
Optimizing Vercel Build Minutes
The tweet describes cutting Vercel build minutes by building locally and letting the Turborepo cache skip the remote build, yielding significant cost savings. Relevant for engineers optimizing CI/CD workflows.
if you have multiple agents opening PRs, each one triggers a full build. that's why I've been paying @vercel $150/mo in build minutes the past 2 months lol. the fix: build locally before push → turbo cache → vercel skips the build entirely. 78% fewer build minutes. 5x
👁 638 views ❤ 7 🔁 0 💬 3 🔖 4 1.6% eng Actionable
Vercel · CI/CD · build optimization · turbo cache · infrastructure
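The fix leans on Turborepo's cache: if `build` is declared cacheable with its outputs listed, a build done locally can be replayed instead of re-run. A minimal `turbo.json` sketch under that assumption (the output globs are the common Next.js example, not taken from the tweet):

```json
{
  "$schema": "https://turbo.build/schema.json",
  "tasks": {
    "build": {
      "cache": true,
      "outputs": [".next/**", "!.next/cache/**"]
    }
  }
}
```

With remote caching enabled, the local `turbo run build` populates the cache before the push, so the build triggered by each agent-opened PR hits the cache rather than burning build minutes.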
infrastructure @googledevs
7/10
Five Patterns for Building AI Agents
The tweet argues that production-grade AI agents depend more on architecture than on prompts, pointing to five patterns from the Google AI Bake-Off, including multi-agent systems and deterministic execution.
Building production-grade AI agents? It's not about better prompts, it's about better architecture. Learn five patterns from the Google AI Bake-Off, from multi-agent systems to deterministic execution. Read the blog:
👁 2,054 views ❤ 7 🔁 3 💬 0 🔖 5 0.5% eng
AI agents · architecture · Google AI Bake-Off · multi-agent systems · deterministic execution
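The blog's five patterns are not enumerated in the tweet. As one toy reading of "deterministic execution" (an illustration, not Google's definition): control flow is fixed code, and the model only fills in content inside each step, so every run has the same shape.

```python
# Toy sketch of deterministic execution for an agent: the step sequence
# is hard-coded; a (fake) model supplies only the text within each step.
# All names here are invented for illustration.

def fake_model(prompt: str) -> str:
    return f"<answer to: {prompt}>"

def run_pipeline(task: str) -> list[str]:
    """Fixed plan → act → verify sequence; the model never chooses
    which step runs next."""
    steps = ["plan", "act", "verify"]
    return [fake_model(f"{step}: {task}") for step in steps]

print(run_pipeline("migrate the database"))
```

Contrast this with a free-form loop where the model decides its own next action — the deterministic shape is what makes runs reproducible and debuggable.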
infrastructure @HuggingPapers
7/10
Tencent's DisCa for Video Diffusion Transformers
Tencent has introduced DisCa, a method that accelerates video diffusion transformers by 11.8× while preserving quality. Relevant for engineers optimizing AI video generation pipelines.
Tencent just released DisCa on Hugging Face. A distillation-compatible learnable feature caching method that accelerates video diffusion transformers by 11.8× while preserving generation quality.
👁 999 views ❤ 16 🔁 6 💬 0 🔖 6 2.2% eng
Tencent · video diffusion · AI infrastructure · performance optimization · Hugging Face
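The general idea behind feature caching in iterative samplers can be sketched with a toy loop. This is a generic fixed-interval cache, not DisCa's learnable policy: an expensive block's output is reused on some steps instead of being recomputed.

```python
# Toy sketch of feature caching in a diffusion-style loop (generic
# illustration; DisCa's actual learnable, distillation-compatible
# caching is not shown).

def expensive_block(x: float) -> float:
    return x * x  # stand-in for a heavy transformer block

def run(steps, reuse_every=2):
    """Recompute the block only every `reuse_every` steps; otherwise
    replay the cached feature."""
    cache, calls, outputs = None, 0, []
    for t, x in enumerate(steps):
        if cache is None or t % reuse_every == 0:
            cache = expensive_block(x)
            calls += 1
        outputs.append(cache)
    return outputs, calls

outs, calls = run([1.0, 1.1, 1.2, 1.3])
print(calls)  # 2 — only half the steps ran the block
```

Real methods make the reuse decision learned and per-feature rather than a fixed interval, which is how they keep quality while skipping most of the compute.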
infrastructure @rohanpaul_ai
7/10
OpenAI Launches Long-Running Agent Runtime
OpenAI's new Agents SDK allows developers to manage long-running agents with sandbox execution and direct control over memory and state, streamlining what previously required multiple components. This could simplify infrastructure for AI systems, making it relevant for engineers building complex applications.
OpenAI just turned the Agents SDK into a long-running agent runtime with sandbox execution and direct control over memory and state. Before this, developers often had to stitch together 3 separate pieces themselves: the model loop, the machine where code runs, and the memory or
👁 838 views ❤ 5 🔁 3 💬 3 🔖 7 1.3% eng Actionable
OpenAI · Agents SDK · infrastructure · AI development · runtime
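The three pieces the tweet says developers used to stitch together — a model loop, a machine where code runs, and a memory/state store — can be sketched as one object. This is an illustrative toy, not the real Agents SDK API; every class name here is invented.

```python
# Toy sketch of a "long-running agent runtime": sandboxed execution and
# persistent state behind a single interface. Invented names; not the
# OpenAI Agents SDK.

class Memory:
    def __init__(self):
        self.state = {}

class Sandbox:
    def run(self, code: str) -> dict:
        scope = {}
        exec(code, {"__builtins__": {}}, scope)  # crude isolation, demo only
        return scope

class AgentRuntime:
    """Owns the sandbox and the memory, so state survives across steps
    instead of living in three separately-wired components."""
    def __init__(self):
        self.memory = Memory()
        self.sandbox = Sandbox()

    def step(self, code: str) -> dict:
        result = self.sandbox.run(code)
        self.memory.state.update(result)
        return result

rt = AgentRuntime()
rt.step("x = 2 + 2")
rt.step("y = 10")
print(rt.memory.state)  # {'x': 4, 'y': 10}
```

The point of the sketch is the shape, not the isolation (a real runtime would use an actual sandbox): one object owns execution and state, which is what removes the stitching work the tweet describes.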