A major ERC-7702 exploit is compromising wallets, and a new free Telegram bot tool lets users instantly check if they're affected. Builders can leverage this trend to create timely content or services around wallet security.
Excellent reporting from
@MetaFinancialAI
The ERC-7702 exploit has compromised thousands of wallets.
We just shipped a free security tool on our bot: check if YOUR wallet has been delegated to a malicious contract.
/check7702 in our Telegram bot scans 6 chains instantly:
A new AI system analyzes CEO language across earnings calls to predict company performance ahead of the market, offering a potential edge for investors and builders seeking data-driven signals.
I built a system that measures what CEOs actually think, not what they say. It tracks 199 sensors across 169,000 earnings transcripts.
It detected Apple's AI collapse one quarter early.
It flagged CVNA at $11 before the 44x run.
It caught Nadella's language running ahead
26,013 views · 189 likes · 12 reposts · 10 replies · 43 bookmarks · 0.8% eng
AI, market analysis, earnings calls, sentiment, signals
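The tweet doesn't disclose how its 199 sensors work, but the general idea of scoring executive language can be sketched. A toy, purely illustrative "sensor" measuring the density of hedging words in a transcript (the word list is an assumption, not the author's method):

```python
import re

# Illustrative hedge-word list; a real system would learn or curate this.
HEDGES = {"maybe", "might", "could", "possibly", "uncertain",
          "headwinds", "challenging", "cautious"}


def hedging_ratio(transcript: str) -> float:
    """Fraction of words in the transcript drawn from the hedge list."""
    words = re.findall(r"[a-z']+", transcript.lower())
    if not words:
        return 0.0
    return sum(w in HEDGES for w in words) / len(words)
```

A production system would track many such features per speaker over time and look for quarter-over-quarter shifts rather than absolute levels.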
Major AI releases like Cursor 3 and Gemma 4 are shifting focus from single-task tools to agentic workflows, signaling a trend toward multi-agent automation. Builders should watch this shift as it opens new opportunities for scalable, automated income streams.
Every single major AI release this week is telling the same story, and most people haven't connected the dots yet.
- Cursor 3 rebuilt its entire UI around managing agent fleets, not editing files
- Google's Gemma 4 is optimized for agentic workflows and runs locally on your
7,608 views · 38 likes · 8 reposts · 3 replies · 26 bookmarks · 0.6% eng
AI agents, automation, market trend, agentic workflows
Anthropic's Claude Mythos shows significant performance advantages over OpenAI's GPT-5.4-xhigh, indicating a shift in AI capabilities that builders should monitor for potential opportunities in AI development and deployment.
Anthropic is obliterating OpenAI
Claude Mythos 77.8% on SWE-Bench Pro
20% higher than GPT-5.4-xhigh
20,263 views · 425 likes · 26 reposts · 30 replies · 35 bookmarks · 2.4% eng
A new survey breaks down how AI models are evolving from simple tool calls to complex, multi-step workflows. Builders can use these insights to spot emerging automation patterns and identify where to focus product or service development.
A new survey that helps you better understand tool use in AI
Shows how models move from single tool calls to full multi-step orchestration, covering:
- Single calls vs. long-horizon workflows
- Sequential, graph-based, re-planning, feedback loops
- Trajectory synthesis and
6,431 views · 104 likes · 31 reposts · 7 replies · 97 bookmarks · 2.2% eng
AI workflows, tool use, automation, market trends
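The patterns the survey names can be made concrete with a minimal sketch: a single tool call versus a re-planning loop that feeds each observation back into the planner before choosing the next action (the `plan`/`call_tool` interfaces are illustrative stand-ins, not from the survey):

```python
from typing import Callable, Optional


def single_call(call_tool: Callable[[str], str], task: str) -> str:
    """One-shot: pick a tool once, no feedback."""
    return call_tool(task)


def replanning_loop(plan: Callable[[str, list[str]], Optional[str]],
                    call_tool: Callable[[str], str],
                    task: str, max_steps: int = 8) -> list[str]:
    """Re-plan after every observation until plan() returns None."""
    observations: list[str] = []
    for _ in range(max_steps):
        next_action = plan(task, observations)
        if next_action is None:  # planner decides the task is done
            break
        observations.append(call_tool(next_action))
    return observations
```

Sequential and graph-based orchestration sit between these two extremes: the action order is fixed (or partially ordered) up front instead of being re-derived from feedback each step.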
Grok 4.20 has achieved the top position on the BridgeBench Reasoning benchmark, outperforming GPT 5.4 and Claude Opus 4.6. This indicates a significant advancement in reasoning capabilities, which may influence future AI model development.
Grok 4.20 Reasoning just took #1 on the new BridgeBench Reasoning benchmark.
Beating GPT 5.4 and Claude Opus 4.6.
This model keeps climbing every single week.
Hallucination #1.
Now Reasoning #1.
While Anthropic is throwing 500 errors, xAI is quietly building the most
Benchmark results indicate that Claude Opus 4.5 is outperforming its successor, 4.6, in terms of hallucination rates. This raises questions about the effectiveness of the latest model and could influence future development decisions.
Claude Opus 4.5 is now OUTPERFORMING Claude Opus 4.6 on BridgeBench Hallucination.
Read that again.
The legacy model is beating the current flagship.
We benchmarked Opus 4.5 this morning to confirm what we saw yesterday.
Claude Opus 4.6 fell from #2 to #10 with a 98%
36,211 views · 599 likes · 69 reposts · 58 replies · 84 bookmarks · 2.0% eng
The tweet highlights the adoption of Chinese open source AI models by notable companies like Cursor and Cognition, indicating a shift in the AI landscape. Senior engineers should note the implications of this trend on competition and innovation in AI infrastructure.
Silicon Valley is quietly running on Chinese open source AI models.
Here are the receipts:
- Cursor confirmed last month that Composer 2 is built on Moonshot's Kimi K2.5
- Cognition's SWE-1.6 model is likely post-trained on Zhipu's GLM
- Shopify saved $5M a year by
9,371 views · 48 likes · 5 reposts · 13 replies · 23 bookmarks · 0.7% eng
Zuckerberg's investment in a young AI researcher has led to the launch of Muse Spark, which competes strongly against established models like Opus and GPT. This indicates a significant shift in AI capabilities and potential market direction.
Zuckerberg paid $14.3 billion for a 28-year-old who had never trained a frontier model. Nine months later, that bet just shipped.
The benchmark table tells you exactly what kind of lab Wang built. Muse Spark leads or ties Opus 4.6 and GPT 5.4 on multimodal perception, health
300,886 views · 826 likes · 84 reposts · 44 replies · 561 bookmarks · 0.3% eng
Muse Spark demonstrates notable token efficiency with 58M output tokens for its Intelligence Index, outperforming several competitors. This benchmark could inform decisions on model selection for resource-constrained applications.
Muse Spark is notably token efficient for its intelligence level. It used 58M output tokens to run the Intelligence Index, comparable to Gemini 3.1 Pro Preview (57M) and notably lower than Claude Opus 4.6 (Adaptive Reasoning, max effort, 157M), GPT-5.4 (xhigh, 120M) and GLM-5
23,918 views · 143 likes · 12 reposts · 5 replies · 16 bookmarks · 0.7% eng
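Taking the tweet's figures at face value, the efficiency gap is easy to quantify as a multiple of Muse Spark's usage:

```python
# Output tokens (in millions) each model reportedly spent to run the
# Intelligence Index, per the tweet, normalized against Muse Spark.
tokens_m = {
    "Muse Spark": 58,
    "Gemini 3.1 Pro Preview": 57,
    "GPT-5.4 (xhigh)": 120,
    "Claude Opus 4.6 (max effort)": 157,
}
baseline = tokens_m["Muse Spark"]
multiples = {model: round(t / baseline, 2) for model, t in tokens_m.items()}
# e.g. Opus 4.6 at max effort spends ~2.7x the tokens for the same run
```

For resource-constrained deployments, that multiple compounds directly into per-query cost and latency, which is why token efficiency at a given intelligence level matters as much as the headline score.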
Mythos has achieved a 70.8% score on AA-Omniscience, surpassing the previous SOTA of Gemini 3.1 Pro at 55%. This indicates a significant advancement in AI capabilities that could influence future developments in the field.
Mythos scores 70.8% on AA-Omniscience
the previous SOTA was Gemini 3.1 Pro with 55%
also insanely high scores on SimpleQA Verified
10,297 views · 325 likes · 19 reposts · 4 replies · 28 bookmarks · 3.4% eng
Anthropic's mythos-preview shows significant performance benchmarks against Claude Opus, indicating a competitive edge in AI capabilities. Senior engineers should note these metrics as they reflect evolving standards in AI model performance.
you're laughing? anthropic's mythos-preview, which normies won't even get access to, is scoring 77.8% vs. 53.4% (Claude Opus 4.6) on SWE-Bench Pro, 82 vs. 65.4 on Terminal Bench 2.0, and 93.8% vs. 80.8% (Opus) on SWE-Bench Verified, and you're laughing?
5,449 views · 198 likes · 6 reposts · 12 replies · 9 bookmarks · 4.0% eng
The performance metrics of Claude Mythos and GPT-5.4-Pro highlight emerging trends in AI capabilities and pricing, providing builders with insights into competitive positioning and potential market opportunities.
Claude Mythos scores 161 on ECI
with a 95% CI from 158 to 166
GPT-5.4-Pro, which is a multi-agent system and costs $180/million, is at 158
8,548 views · 89 likes · 6 reposts · 4 replies · 11 bookmarks · 1.2% eng
AI performance, market trends, Claude Mythos, GPT-5.4-Pro, AI pricing
A PhD student evaluates OpenAI's GPT-5.4 Pro, revealing its limitations in solving advanced research problems, which may inform pricing strategies and product development for AI tools.
A mathematics PhD student tested OpenAI's GPT-5.4 Pro ($200/month)
to see if it actually justifies the price compared to the $20 plan.
Here's what he found:
- Research problems: Could not solve the hardest ones, still struggles at true PhD-level questions
- Paper review: Very
79,346 views · 668 likes · 52 reposts · 25 replies · 297 bookmarks · 0.9% eng
This analysis reveals how blocking AI crawlers impacts citation frequency in AI-generated content, offering insight into content visibility and potential traffic sources for builders leveraging AI-driven platforms.
Do News Publishers That Block AI Crawlers Get Cited Less Often by AI?
"Using data from Citation Labs' AI citation-tracking tool, XOFU, we examined 4 million citations from 3,600 prompts in ChatGPT, Gemini, AI Overviews, and AI Mode, across 10 industries."
buzzstream.com/blog/ne
12,113 views · 40 likes · 19 reposts · 7 replies · 26 bookmarks · 0.5% eng
AI citations, news publishers, content strategy, SEO, market trends
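The "blocking" side of this question is mechanical: publishers opt out via robots.txt. A sketch of how one could check a publisher's policy using Python's standard library (the user-agent list is illustrative; this is not the study's XOFU methodology):

```python
from urllib.robotparser import RobotFileParser

# Common AI crawler user agents; an illustrative subset, not exhaustive.
AI_AGENTS = ["GPTBot", "Google-Extended", "ClaudeBot", "PerplexityBot"]


def blocked_agents(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the AI user agents that this robots.txt disallows for `url`."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return [ua for ua in AI_AGENTS if not rp.can_fetch(ua, url)]
```

Running this across a publisher list, then joining against citation counts, is the shape of the correlation analysis the article describes.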
A massive 754B parameter AI model (1.51TB) is now available on Hugging Face, signaling rapid growth in open access to large-scale models. Builders should watch for new opportunities in leveraging or productizing such models.
754B parameters, 1.51TB on Hugging Face
28,317 views · 318 likes · 18 reposts · 14 replies · 51 bookmarks · 1.2% eng
AI models, Hugging Face, large language models, market trend
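The two headline numbers are internally consistent with half-precision weights, a quick sanity check:

```python
# 1.51 TB over 754B parameters implies ~2 bytes per parameter,
# i.e. fp16/bf16 weights rather than fp32 (4 bytes) or int8 (1 byte).
params = 754e9            # 754 billion parameters
size_bytes = 1.51e12      # 1.51 TB on Hugging Face
bytes_per_param = size_bytes / params  # ~2.00
```

The same arithmetic gives a rough lower bound on serving hardware: at 2 bytes/param, just holding the weights needs ~1.5 TB of accelerator memory before activations or KV cache.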
A new benchmark from Collinear AI highlights major differences in planning ability among top frontier AIs, with Claude Opus 4.6 outperforming rivals in simulated financial strategy. Builders can use this insight to spot which models are most reliable for automation or investment tools.
BREAKING: Claude Opus 4.6 turned $200K into $1.27M.
> Grok 4.20 went bankrupt twice.
> Claude Sonnet wrote the correct strategy on turn 7 and immediately ignored it for the rest of the year.
Collinear AI's new benchmark just exposed the biggest planning gap in frontier AI
5,343 views · 38 likes · 3 reposts · 8 replies · 41 bookmarks · 0.9% eng
AI benchmarks, Claude Opus, frontier models, planning, market trends
This tweet shares real-world performance comparisons between leading AI models and frameworks, highlighting Gemma 4's impressive 180 tokens/sec speed. Builders can use these insights to choose faster, more efficient models for their AI products.
GPT is waiting for the MoE model to download, Opus is installing llama-cpp-python to compare against, and Kimi thinks there's a bug in sliding attention... 180 tok/s from GPT on the little Gemma 4.
6,936 views · 92 likes · 0 reposts · 0 replies · 0 bookmarks · 1.3% eng
AI benchmarks, model comparison, Gemma 4, performance, LLM
A new tournament is forecasting how AI will impact jobs and wages through 2035, with $35,000 in prizes for predictions. Builders can use these insights to spot emerging opportunities or threats in the labor market.
How will AI reshape the labor market?
We just launched the Labor Automation Tournament to forecast how automation will affect jobs, wages, and the workforce through 2035, with $35,000 in prizes for predictions and analysis.
More info below!
2,776,404 views · 409 likes · 55 reposts · 18 replies · 36 bookmarks · 0.0% eng
VTS has introduced Asset Intelligence, an AI-powered tool for lease abstraction using massive real estate data. Builders should watch this as it signals growing demand for AI automation in property management and potential SaaS opportunities.
This week in AI for Real Estate was stacked.
Here are the 7 biggest stories I'm watching:
1) VTS just launched Asset Intelligence. AI-driven lease abstraction built on 13 billion SF of data and 600,000+ leases. You can now talk to your lease portfolio in plain English through
14,473 views · 78 likes · 10 reposts · 3 replies · 128 bookmarks · 0.6% eng
A roundup of visually striking, AI-generated websites that showcase current design and tech trends. Builders can use this as inspiration for new projects or to spot emerging aesthetics and features that may attract users.
Cursor differentiates itself by routing requests to Claude/OpenAI APIs and hosting its own Composer 2 model, raising questions about their cost structure. Builders should note this hybrid approach as a signal of evolving AI SaaS strategies and potential pricing models.
Cursor is different. They route requests to Claude/OpenAI API and host their own Composer 2 model.
I'm not sure how much they subsidize on their end.