A Necessary Critique of Fontcuberta’s Algorithmic Photography

This March, Spanish conceptual artist and photographer Joan Fontcuberta published a new book in Italy. Immagini Latenti concludes with a chapter on AI and photography, referencing the debates surrounding Boris Eldagsen’s submission of an AI-generated image to the Sony World Photography Awards in 2023 and Miles Astray’s submission of a photograph in the AI category of the 1839 Award in 2024.

Over the past 2.5 years, we have repeatedly encountered similar lines of reasoning. They are not only logically inconsistent, but they are also unhelpful, both for photography and for democratic societies. For that reason, we find it necessary to respond collectively to what we consider a rudimentary theory. Below, we contrast excerpts from the book (used with permission) with our own perspective. The original Italian text has been translated into English using multiple AI tools (ChatGPT, Gemini, DeepL).

When I got married, some friends gave me a lemon tree […] We planted it and it grew happily. […] after twenty-five years […] the lemon tree began to produce oranges. […] A friend who is an expert in citrus fruits […] gave me a plausible explanation, […] our lemon tree had almost certainly been grafted onto a branch of an orange tree, and over time it began to reveal its true hybrid nature—non-binary and ambivalent.

Personally, I preferred to keep thinking that the tree had found the courage to come out of the closet. All the more so because it seemed to me a magnificent metaphor for what is happening to photography today, which is also going through a phase in which it is about to come out.

Let me explain. For two centuries, we have attributed to photography a descriptive accuracy of reality that guaranteed absolute documentary fidelity. Now, however, algorithmic photography is blending with optical photography, and we no longer know which way to turn.

Immediately, we encounter a semantic and terminological problem. There are photographic images produced by cameras and photo-optical recording systems. And there are others—apparently photographic—produced through generative AI visualization systems.

The former are children of chemistry and light; the latter of computing and darkness. We must therefore begin to decide whether both types of image should be considered photographic.

If we focus on the processes involved, it is obvious that they are different kinds of images. Yet the difficulty of finding a word capable of classifying photorealistic representations of algorithmic origin weakens the decisiveness of that answer.

These are images without a real referent—what we might call nemotypes.

Some have proposed the term promptography, because such images originate from a prompt—that is, natural-language instructions given to a system in order to obtain the desired photographic result.

There have been other attempts, such as syntography, but none have prevailed.

When photography was shaken by the arrival of digital technology, it became necessary to specify that there had been a previous form to which a distinguishing adjective was now added: we had analog photography—or photochemical photography—versus digital photography. At that time, there was no need to invent or assign a specific new name, and nothing disastrous happened. Therefore, we could probably proceed in the same way now and still understand one another perfectly.

Fontcuberta clearly recognizes the distinction between camera-made and AI-generated images at the level of process – but then argues that this distinction ultimately does not matter.

The problem is: it matters. Considerably.

A photograph is made by light bouncing off a real thing and hitting a sensor. An AI image is made by a computer calculating what a plausible image would look like, based on patterns learned from millions of prior examples. The outputs may appear identical on screen, but they emerge from fundamentally different processes. And it is precisely this process that grants photography its authority as evidence.

Calling AI images “Algorithmic Photography” treats this as a minor upgrade: a lemon tree simply producing oranges. But even in Fontcuberta’s own metaphor, a lemon is still a lemon and an orange is still an orange. Grafting doesn’t change what a fruit is. Two entirely different kinds of image are being given the same name, and that confusion has real consequences.

By this logic, a photorealistic painting would become “Acrylic Photography”. But we still call it painting because the process matters: it was created with canvas, brushes, and paint.

Arguing that the lack of an adequate term for “photorealistic representations of algorithmic origin” justifies subsuming AI images under photography is weak. On the one hand, naming a new medium takes time. On the other hand, Fontcuberta remains confined within photographic thinking and fails to recognize what this new medium actually is: LATENT SPACE.

It is the representation space that an AI model learns from its training data, in which all media are encoded as vectors. In Latent Space, different art forms are no longer separate materials. They become different projections of the same underlying structure. A melody can morph into an image. A text description can generate a video. A sketch can become a sculpture. Latent space is a meta-medium.

This is why prompts have become multimodal. The prompt is a control interface to latent space, navigating probability.

And that is precisely why I suggested the term “promptography”. It encompasses everything produced with a prompt: text, sound, video – not just images resembling photography, but also those resembling drawing or painting.

Because Fontcuberta limits his analysis to “photorealistic representations”, he reduces the discussion to a narrow subset of outputs—and consequently struggles with the arguments that follow.

The lemon is a tricky fruit—linguistically speaking—beyond Fontcuberta’s allegory. In his home country, Spain, a lemon is called “limón”, whereas across Latin America, “limón” means lime. Lemons, limes, oranges – they are all citrus fruits, but likening them is not unlike comparing apples and oranges.

Here’s the thing, plain and simple, all fruits aside: this isn’t about wishy-washy linguistic interpretations of imagery and art; this is about solid scientific fact. Photography is written with light; AI imagery is written with code. The former captures the real world, the latter conjures imaginary worlds.

A linguistic disagreement on terminology does not translate into a scientific dispute around the factual difference between the processes involved in creating images, from paintings to photographs to AI pictures. There is a science to art, and it’s in the process.

The difference between analogue and digital photography could easily be summed up by a prefix because the underlying photographic process (capturing light) hadn’t changed, only the means of how it was captured and stored (chemically vs. electronically, film vs. sensor). However, to arrive at an AI image, you have to take a completely different procedural route, which deserves a completely different name.

To disregard a giant procedural difference between two mediums in lieu of coming up with one little word to describe the new property is disproportionate and misdirected. That would be like calling every fruit that came after the banana—which came long before oranges and lemons—also banana, and that would be bananas.

[…] But the debate goes deeper: are we dealing with images belonging to different classes, or simply photographs of different rank?

[…] It is easy to imagine that everyone dreamed of inventing a technique capable of producing faithful representations independent of human skill—as if nature could represent itself without the mediation of pencil or brush.

The camera eventually fulfilled that role, producing rigorous and detailed visual records. Since then, billions of photographs have been produced, and these images now constitute the very material used to train generative neural networks.

In fact, AI functions like an ogre forced to devour enormous quantities of images in order to produce plausible results.

Thus, algorithmic photographic images, although derived from the visual heritage of the entire history of photography, carry an undeniable photographic DNA. For this reason, they could reasonably be considered second-generation photographs.

Roland Barthes once wrote that every photograph awaits a text. Now the situation is reversed: it is the text that generates the photograph.

Fontcuberta’s “Barthes reversal” is rhetorically appealing but conceptually shallow. In Camera Lucida, Roland Barthes argues that photographs are unstable without language. The caption stabilizes the photograph. The same photo will change its meaning with different captions.

But Fontcuberta overlooks a crucial development: prompts are not captions. They are instructions to a probabilistic system. Moreover, it is no longer simply “text” generating images. Multimodal prompting has been standard for years. Any input modality can generate any output modality within latent space. What collapses here are the media categories themselves.

The “Second-Generation Photography” argument is elegant, but it rests on a logical error. AI models are trained on millions of photographs: that’s true. But that doesn’t make their outputs photography. What the model inherits is visual style, a set of statistical patterns. It does not inherit what defines photography: a direct physical relationship between light, a real event, and a sensor.

Reverse engineering Fontcuberta’s example and following his argument that favors rank over class, photographs of paintings “could reasonably be considered” second-generation paintings. But if we started calling that $10 Van Gogh print from the gift shop a painting, we “could reasonably be considered” madder than the Dutch master himself.

When Microsoft had an AI hallucinate “The next Rembrandt,” and a 3D printer imitate the texture of oil on canvas, we couldn’t call the result a “painting” without putting the word in quotation marks. It’s not the real deal. In the same vein, a photorealistic AI image does not become a photograph (just like a photorealistic painting does not become a photograph).

All it takes to stop this purely dialectical carousel around rank and class is common sense. We know intuitively what’s what: paintings are paintings, photographs are photographs, and AI images are AI images, because they are derived from vastly different processes and intentions.

This terminological issue—behind which lies a deeper ontological question—came to the attention of the media when the work The Electrician, belonging to the series Pseudomnesia by the German photographer Boris Eldagsen, won the Sony World Photography Award 2023 in the “Creative” category. […]

The Canadian photographer Miles Astray, specializing in nature and travel photography, reversed the logic of Eldagsen’s action: he submitted a real photograph to the newly created AI-image category of another important competition, the Color Photography Awards. […]

Indeed, both cases highlight an uncomfortable but unavoidable reality: the dividing line between human creation and that generated by artificial intelligence is rapidly fading, if it has not already disappeared entirely. […] Their intention was to reveal the unreliability of validation systems in competitions of this kind.

These may have been minor infractions, but they pointed toward a much more crucial issue: determining the status and labeling of images, their lineage, their pedigree.

Both initiatives might appear as provocations, but in reality, they offered a necessary critique: if a photograph taken with a camera can be mistaken for an image generated by a machine – or vice versa – then we must rethink how we define the boundaries between images, and also concepts of authorship, creativity, and visual truth. Rather than making us victims of deception, these gestures provide a useful conceptual shock.

What these two incidents actually exposed is that the institutions evaluating the images had no coherent framework for telling them apart.

If these cases teach us anything, it is this: the credibility of an image can no longer reside in the image itself. It must reside in the process—who made it, how, and under what conditions of accountability. Documentary authority does not disappear; it migrates. It becomes procedural.

This is precisely why Fontcuberta’s dismissal of process is problematic.

To correct all the false information in this passage—from my photographic focus and the intentions behind my stunt to the name of the competition I participated in—would go beyond the scope of this rebuttal. But it is important to point out that it is littered with false information. Facts still matter, whether they are captured in imagery or words. In fact, they matter more than ever in this post-truth epoch. If a text on the very topic of “documentary fidelity,” written by an intellectual with the best intentions, is riddled with mistakes, truth is put on its deathbed.

Admittedly, the concept of truth can be vague to begin with. Universal truths are hard to find, and personal truths—tethered to opinions—are abundant. Fontcuberta’s hybrid tree is both a lemon and an orange, depending on how you look at it. Opposing perspectives can coexist. The concept of reality is a little firmer than truth when you squeeze it; nonetheless, it remains foremost a concept as well. Oranges are not inherently orange—their color is not a physical property but the interaction of light with their surface, which will reflect some wavelengths and absorb others. Moreover, different animal species observe different wavelengths of light, perceiving diverging realities while cohabiting the same planet. And if that wasn’t enough confusion, reality collapses into a mere probability function at the quantum level.

However, once we return from these meta realms to our human dimension, pragmatism is of the essence. Society frays if we cannot agree on a universal fabric holding it together. If we cannot agree on certain facts, reality becomes optional, with real consequences. Powered by social media and supercharged by AI, the exponential spread of disinformation and misinformation is already starting to erode democracies and societal cohesion around the world.

Despite everything, the fundamental issue that troubles both specialists and the public concerns the credibility of images.

Some wonder whether a prompt-generated photograph will one day win the World Press Photo award. But perhaps the question is wrongly framed.

What should really be questioned is whether competitions like the World Press Photo still make sense.

We now live in a visual regime in which images increasingly construct the world rather than simply represent it.

[…] Perhaps we should even be grateful for their proliferation, because they remind us of the necessity of doubt.

Algorithmic photography reinforces the idea that every image is, inevitably, an illusion and forces us to reconsider the trust we place in images.

[…] Photography, therefore, has never truly been objective; we simply chose to believe that it was.

Today, with AI acting as a new demiurge, documentary photography quietly slips between historical narrative and fabricated illustration.

Deepfake technologies have opened Pandora’s box of iconography: thousands of hyperreal scenes and faces created from nothing flood our screens.

We no longer look in order to understand—we look in order to doubt.

[…] Every technology of vision has reshaped how we perceive the world.

What we are witnessing today is the transition from optical realism to informational realism—a synthetic realism summoned by commands, texts, and strings of code.

From Greek realism, to Renaissance perspective, to Enlightenment aspirations for accuracy, we have suddenly arrived at a condensed synthesis of all these visual regimes.

And now a single prompt can generate an image that might once have required centuries of technological evolution.

The claim that “every image has always been a fiction” is only half true—and half-truths are dangerous in public discourse.

Every photograph is framed, selected, edited – that’s undeniable. But a camera photograph still begins with something real: light from an actual event, recorded by a sensor. A generated image begins with statistical inference across a database of prior images. These are not the same act.

Treating them as equivalent doesn’t sharpen our critical thinking. Eliminating institutions like World Press Photo does not solve the problem either. The real task is to defend accountability: where an image comes from, who produced it, and under what conditions.

Trust is shifting—from the image to the process. Provenance, metadata, editorial chains of custody, and transparent sourcing become central. The image is no longer proof. The process is.

What is striking is that Fontcuberta does not address the democratic implications of this shift in this chapter. Public discourse depends on visual evidence. When all images become equally suspect, societies lose a crucial epistemic tool.

Doubt, in moderation, is productive. In excess, it becomes disorienting – and disorientation is easily exploited.

If any image can simulate evidence of events that never occurred, those who benefit most are those least deserving of trust. Blurring the distinction between photographic capture and synthetic generation does not liberate us from naivety. It provides cover for manipulation.

When visual evidence becomes a category of general suspicion, the burden of proof shifts in ways that favour those in power and disadvantage those trying to hold them accountable.

The answer is not to celebrate doubt as an end in itself. The answer is to construct new distinctions: between capture and synthesis, between enhancement and invention, between evidence and illustration—and to build institutions capable of maintaining those distinctions.

AI as a new visual undercurrent won’t wash away bedrock institutions like World Press Photo. It’s in the name: world. press. photo. Three pillars AI could never shake. It cannot produce real photos of a real world for real press articles.

Of course, it’s true that photography “has never truly been objective.” A photographer’s choices, like what is left out of a frame and therefore left out of the visual narrative, have always rendered accuracy an approximation, which is why captions must give context to documentary images exhibited by World Press Photo.

These are natural limitations that actually heighten a photographer’s ambition for documentary accuracy. Doubting the continued relevance of press photos, Fontcuberta diminishes these efforts by shrugging off important distinctions of image creation and equating photographic evidence with illustrative exemplification.

As much as photography might be limited in accuracy, AI is technologically fully incapable of recording actual events. It has no bearing on such photo awards other than to contribute to the notion that they are more relevant than ever.

The statement “we no longer look in order to understand—we look in order to doubt” is catchy. Unfortunately, sober facts can look pretty boring next to such sensational one-liners, which is exactly why the press is struggling to compete for attention with viral social media accounts. The boring truth is that we still look in order to understand—hardcoded thinking related to our survival did not change overnight when LLM algorithms hijacked our brainwork in 2022. What changed is that we need to doubt more now. And maybe Fontcuberta is somewhat right when he muses whether that’s a good thing—certainly, we could use more critical thinkers.

But we’re already halfway down a slippery slope here. Historically, the veracity of images was fairly easy to establish. The manipulation of photographs was a cumbersome darkroom process that took time and skill. There were few who mastered it and many who could debunk it. That balance shifted with digital postproduction software, and fully flipped with AI. No matter how many critical thinkers we can raise, no matter how well-trained they are, you don’t stop an unchecked flood of AI slop with critical thoughts alone. Institutional guardrails and entrepreneurial ethics must serve civil society to the same degree we hold governments and the private sector accountable with our voting and purchasing decisions.

If these actors act together, Fontcuberta’s “synthetic realism” remains but a catchy phrase that tries to shrink eons of visual history—from cave paintings to pictorial messages flying through the cosmos aboard our space probes—by squeezing them into one binary contemporary age of catchall imagery.

To depict humanity’s diverse tools and methods of visual creation as culminating in an artificial smoothie is a misrepresentation of their evolution: photography is not an evolutionary progression of painting that replaced its predecessor, and AI does not replace cameras; those mediums, tools, and processes coexist, and will continue to coexist as evolved forms of expression, the same way lemons and oranges coexist as they succeed their common citric ancestor.

About the authors: Boris Eldagsen is a Berlin-based photo & video artist, investigating the unconscious mind. In search of the timeless, his visual poetry unites the sublime and the uncanny. You can find more of his work on his website, Facebook, YouTube, and Instagram.

Miles Astray is an activist artist blending writing and photography inspired by slow travel. You can find more of his work on his website and Facebook.