Common Objects in Context (2022)

Working together with artist Pekko Vasantola, Common Objects in Context delves into the grammar of machine vision and the shifting nature of photography in the algorithmic age. The exhibition title comes from the state-of-the-art dataset of the same name, better known as COCO, commonly used for training computer vision tasks such as object recognition. The exhibition examines the logic of object recognition—one of bounding boxes, discrete objects with clear outlines, crowdsourced labor, and distributed seeing.

COCO consists of more than 300,000 photographs scrapped from the photo-sharing site Flickr. Each image contains five captions written by cheap crowdsourced human workers. The workers are instructed to follow seven strict rules in describing the depicted scene such that the captions are clean, direct, and ‘objective’ for machine learning.

Describe all the important parts of the scene.
Do not start the sentences with “There is”.
Do not describe unimportant details.
Do not describe things that might have happened in the future or past.
Do not describe what a person might say.
Do not give people proper names.
The sentences should contain at least 8 words.

The exhibition renders the logic of rule-based seeing into different media. The metal objects are re-shaped according to the seven rules, smoothened and polished, with captions etched onto their surfaces. Upon closer examination, the captions reveal that human vision is full of errors, assumptions, and subconscious interpretation. Others exemplify the irreducible complexity of visual understanding. Computer vision sees objects in all their shiny sheen, without the memory of the past or imagination for the future.

The installation, composed of a boulder, three sledgehammers, and three smartphones, focuses on another aspect of computer vision: labor. The collection and cleaning of data for machine learning require labor. To look is to labor. The rock is encircled by shutter-triggered hammers activated when the audience presses the capture button or automatically according to a pre-written program. As the smartphones snap a photo of the boulder, the installation translates the non-manual labor into manual labor in the form of hammering, changing the shapes of the very boulder they are photographing.

The exhibition speculates on a world where the logic of rule-based seeing is indiscriminately applied to everything. Every object shown in the exhibition reveals the reciprocal relationship between human and computer vision—humans’ train’ computer vision to ‘see,’ which, in turn, shapes the objects we can see.

We want to thank Arts Promotion Center Finland Taike, the Finnish Cultural Foundation, and the Olga and Vilho Linnamo Foundation for their kind support of the project.

We want to give special thanks to the people who help bring our vision to reality in different capacities and forms: Annika Nyström of Nordfalk for generously donating the boulder, Heikki Pullo from Teijo Makespace for your work, Teemu Lehmusruusu and Matti Niinimaki for your guidance.