Apple Releases Open-Source SHARP Model That Turns a 2D Photo Into a 3D View in Under a Second
JAKARTA - Apple is once again raising eyebrows in the tech community, this time not with the iPhone or Mac but with open-source AI research. The company has released a model called SHARP that can turn a single 2D photo into a photorealistic 3D view in less than a second.
The model is presented in a paper titled "Sharp Monocular View Synthesis in Less Than a Second." The idea is simple but the impact is large: from a single static image, SHARP reconstructs a 3D representation of the scene whose scale and distances are perceptually consistent, not just a visual illusion of depth.
It relies on what is called a 3D Gaussian representation. In simple terms, this is millions of tiny "blobs," each carrying color and light information, placed in three-dimensional space. Combined, these blobs reconstruct a scene that can be viewed from different angles, as long as the viewpoint stays close to the original camera position.
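To make the "blob" idea concrete, here is a minimal Python sketch. It is not taken from Apple's code: the `Gaussian3D` class, its fields, and the `project` and `parallax` helpers are hypothetical names chosen for illustration, and a simple pinhole camera is assumed. It shows why a set of blobs placed in 3D looks different from a nearby viewpoint: blobs close to the camera shift more on screen than distant ones.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Gaussian3D:
    """One hypothetical 'blob': where it sits, how big it is, and how it looks."""
    center: Vec3      # (x, y, z) in camera space; z is the distance from the camera
    scale: Vec3       # extent of the blob along each axis
    color: Vec3       # RGB values in [0, 1]
    opacity: float    # 0 = invisible, 1 = fully solid


def project(g: Gaussian3D, cam_x: float = 0.0, focal: float = 1000.0) -> Tuple[float, float]:
    """Pinhole projection of a blob's center for a camera moved sideways by cam_x."""
    x, y, z = g.center
    return (focal * (x - cam_x) / z, focal * y / z)


def parallax(g: Gaussian3D, shift: float = 0.05) -> float:
    """Horizontal on-screen movement (in pixels) when the camera moves by `shift`."""
    u_before, _ = project(g, 0.0)
    u_after, _ = project(g, shift)
    return abs(u_after - u_before)


near = Gaussian3D((0.0, 0.0, 1.0), (0.01, 0.01, 0.01), (1.0, 0.0, 0.0), 0.9)
far = Gaussian3D((0.0, 0.0, 10.0), (0.01, 0.01, 0.01), (0.0, 0.0, 1.0), 0.9)

print(parallax(near), parallax(far))
```

A real renderer would also blend each blob's size, color, and opacity into the final image; this sketch only tracks the blob centers.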
What makes SHARP stand out is its efficiency. Previous Gaussian splatting approaches typically require dozens to hundreds of photos taken from various angles to build a 3D scene. SHARP needs only one photo, processes it in a single forward pass through a neural network, and finishes in less than a second on a standard GPU.
Apple trained SHARP on a large-scale combination of synthetic and real-world data. As a result, the model can estimate depth, refine it using learned geometric patterns, and then directly predict the position and appearance of millions of 3D Gaussians, all without a slow per-scene optimization process.
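The link between estimated depth and blob positions can be illustrated with a small Python sketch. This is not SHARP's network, whose internals are not described here. It only shows the underlying geometry, assuming a pinhole camera: once each pixel has a depth value, it can be lifted to a candidate 3D position. The function name `unproject` and the parameters `focal`, `cx`, and `cy` are hypothetical.

```python
from typing import List, Tuple


def unproject(depth: List[List[float]], focal: float, cx: float, cy: float) -> List[Tuple[float, float, float]]:
    """Lift every pixel (u, v) with depth z to a 3D point under a pinhole camera model."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            x = (u - cx) * z / focal   # horizontal offset grows with depth
            y = (v - cy) * z / focal   # vertical offset grows with depth
            points.append((x, y, z))
    return points


# A toy 2x2 depth map: the top row is 2 units away, the bottom row 4 units away.
depth_map = [
    [2.0, 2.0],
    [4.0, 4.0],
]
points = unproject(depth_map, focal=1.0, cx=0.5, cy=0.5)
for p in points:
    print(p)
```

In this toy setup every pixel yields one candidate position. According to the description above, the model also predicts each Gaussian's appearance, which this sketch leaves out.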
In terms of performance, Apple claims SHARP is a significant leap. The model reportedly reduces visual error drastically compared with the previous best method while cutting synthesis time by a factor of thousands. In short: faster, more stable, and more realistic.
There is a deliberate trade-off. SHARP focuses on rendering viewpoints that stay close to the original photo rather than inventing parts of the scene that were never visible in it. Users cannot "walk far" around objects as they would in an open-world game. This limitation is precisely why SHARP can be lightning fast and still look plausible.
Interestingly, Apple is not keeping this technology closely guarded. SHARP has been released as open source on GitHub, and the community began experimenting immediately. Within days, users were already applying it to video, 4D Gaussian visualization, and other creative experiments beyond Apple's initial scenario.
This move shows a side of Apple that is rarely highlighted in public: aggressive in fundamental AI research and confident enough to open its results to the world. SHARP may not become an iOS feature tomorrow morning, but it hints at where Apple is heading in visual content, AR, and spatial computing.
One photo becomes a three-dimensional space, almost instantly. If this is only the research stage, the eventual product will probably leave other industries scratching their heads.