Apple announced CoreAI at WWDC, which the post frames as a possible future replacement for CoreML and an alternative to MLX, llama.cpp, and torch for optimized on-device inference. Models still need conversion through Python scripts, and current supported models appear mostly from mid-2025. No performance data is available yet; the author expects it may trail MLX on GPU, but Apple’s 20B on-device foundation model claim suggests larger app-bundled models could become possible.
Apple’s Core AI framework is positioned as a developer stack for deploying AI models directly inside apps on Apple silicon. The documentation describes Swift APIs, `.aimodel` assets, model specialization, caching, Xcode profiling, and debugging tools. It appears aimed at developers building low-latency, privacy-conscious on-device inference workflows, though the documentation is marked as preliminary beta information.