A fundamental shift is underway in software. The centralized data-store model, where users pay platform and API access fees to a standard “system of record,” is finally being disrupted. Three forces are driving it: personalized software, foundation models, and on-device AI.
Personalized software used to mean apps that served dynamic content based on user data. In this AI wave, it’s gone further.
Using Claude Code, Codex, Cursor, or any coding agent, users can now build tailored applications and automated workflows through natural language alone. Foundation models keep improving every few months through scaling laws, reinforcement learning, and better tool use. Where a user once searched the App Store for a transcription service, they can now prompt a coding agent to implement one using an open-source model. Software that was previously a product is becoming a prompt.
Apple is magnifying this at the hardware layer. Its Foundation Models framework gives developers access to a ~3 billion-parameter on-device LLM at zero inference cost. The M5 chip, released in October 2025, is optimized specifically for on-device AI workloads.
The architecture is tiered: routine tasks run locally, while complex requests route to Apple’s Private Cloud Compute without storing user data. This vertical integration lets Apple deliver a better user experience at meaningfully lower cost, in an everyday form factor. It is also why Meta is pushing so hard on glasses: if it succeeds, it could take that device advantage away from Apple and Google.
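The tiered pattern is simple to sketch. Here is a minimal, hypothetical version in Python — the task names, the length threshold, and both backend stubs are illustrative assumptions, not Apple’s actual APIs:

```python
# Hypothetical sketch of tiered on-device/cloud routing.
# Task names, the threshold, and both backends are stand-ins,
# not real Apple APIs.

LOCAL_TASKS = {"summarize", "extract", "classify"}

def run_on_device(task: str, payload: str) -> str:
    # Stand-in for a call to a small local model.
    return f"[on-device] {task}: {payload[:40]}"

def run_in_private_cloud(task: str, payload: str) -> str:
    # Stand-in for a stateless cloud call; nothing is stored.
    return f"[cloud] {task}: {payload[:40]}"

def route(task: str, payload: str) -> str:
    # Routine, small tasks stay local; everything else escalates.
    if task in LOCAL_TASKS and len(payload) < 4_000:
        return run_on_device(task, payload)
    return run_in_private_cloud(task, payload)
```

For example, `route("summarize", "meeting notes…")` stays on-device, while an unfamiliar task or an oversized payload escalates to the cloud tier.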
OpenClaw showed what personal agents feel like in practice, and it did so essentially over a weekend. It became one of the fastest-growing repositories in GitHub history (perhaps the fastest), and OpenAI ended up hiring its creator to build some of this into the ChatGPT ecosystem.
Users texted it through WhatsApp, Telegram, or Slack, and OpenClaw executed tasks autonomously on a schedule without being asked again. It handled browser automation, email and calendar management, and much more, with users simply supplying their own API keys or routing to local models.
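The underlying pattern — run a recurring task on an interval without further prompting — can be sketched in a few lines. This is a generic illustration of the scheduler idea, not OpenClaw’s actual code:

```python
# Generic sketch of a recurring-agent-task scheduler.
# Not OpenClaw's real implementation; names are illustrative.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ScheduledTask:
    # A recurring task, e.g. "check inbox every 15 minutes".
    name: str
    interval_s: float
    action: Callable[[], str]
    next_run: float = 0.0  # due immediately on first pass

def run_due(tasks: list[ScheduledTask], now: float) -> list[str]:
    """Execute every task whose time has come, then reschedule it."""
    results = []
    for t in tasks:
        if now >= t.next_run:
            results.append(t.action())
            t.next_run = now + t.interval_s
    return results
```

A driver loop would call `run_due(tasks, time.time())` every few seconds; the user sets the task up once and it keeps firing on schedule.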
Smaller models are making this accessible to everyone. Microsoft’s Phi-4-mini (3.8 billion parameters) performs well on the standard benchmarks. Google’s Gemma 3 and Meta’s Llama 3.2 3B run on any laptop with 8GB of RAM, and some variants even run on a Raspberry Pi. Near-state-of-the-art models that run on commodity devices at negligible cost also keep proprietary providers in check, enabling better distribution and adoption of AI throughout the world.
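Back-of-the-envelope arithmetic shows why an 8GB laptop is enough: at 4-bit quantization, a 3-billion-parameter model needs roughly 1.5GB for weights alone, leaving headroom for the KV cache and the OS. A quick sketch (the quantization widths are illustrative, and this counts only the weights):

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in gigabytes."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# A 3B-parameter model at common quantization levels:
for bits in (16, 8, 4):
    print(f"{bits}-bit: {weight_footprint_gb(3, bits):.1f} GB")
# prints: 16-bit: 6.0 GB / 8-bit: 3.0 GB / 4-bit: 1.5 GB
```

Even at full 16-bit precision the weights fit in 6GB; quantized variants leave most of the machine free.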
It’s hard to know what the future looks like; it’s hard to predict even a few weeks ahead. Still, here are a few predictions:
- Every device ships with a standard on-device model calibrated to its hardware. It handles the common tasks users expect, like summarization, document extraction, and lightweight agent workflows. If the on-device model runs into a wall, it routes to a larger cloud model, gets context, and processes locally. What feels like magic today becomes mundane.
- Hardware becomes more valuable. Companies like Apple that vertically integrate silicon, OS, and model access will deliver better performance per dollar and deepen ecosystem lock-in as a result. Users will come to expect the phone to “just do things,” and anything that can’t match that speed, or that charges extra for it, will struggle to win adoption.
- Disruption hits classic systems of record. For now, they can keep API access locked down and force users to pay. But as more work shifts to an on-device-to-cloud model, user data stops flowing as readily into those systems. Their data moats erode, growth slows, and they’ll be forced to adapt.
It’s one of the most exciting times to be a builder (and investor) in AI for these reasons. I can’t wait to see the agents that will be helping me in three months, let alone a year.
For the builders out there, now’s your time.