The Rock: Lights, Camera, Perception, Action

By Edward W. Otten

One of my favorite action scenes of all time is the chase between a Ferrari and a Humvee through the streets of San Francisco in the 1996 movie The Rock. John Mason (Sean Connery) has just stolen a Humvee in an attempt to escape from the FBI. One of the FBI agents, Stanley Goodspeed (Nicolas Cage), commandeers a bright yellow Ferrari in an attempt to catch Mason. The chase leads the audience up and down the many hills of San Francisco, along the way destroying a milk truck, countless police cars, and a trolley. It is one of the most exciting and amazing chase scenes ever created.

But why?

Why is it exciting? Why is it amazing? Why do you feel like you could be standing there, watching this unfold before your very eyes? Why, at other times, do you feel like you are in a place where it would be impossible to exist in real life without being killed? Why do you cringe at each crunch of a vehicle? Why are you energized when Goodspeed revs the engine of the Ferrari and flies off at breakneck speed?

Most people, in thinking about those questions, would probably suggest that cognition, and specifically emotion, provides the answer. I have no doubt that they play a role. However, there is something much more important at work here, something without which it would be impossible to even watch a movie, let alone be influenced by it. That something is perception.

Perception is the process that allows us to be in contact with the environment around us. It involves the pickup of information contained within the environment and the subsequent use of that information to specify that environment and the possibilities for action (behavior) within it. Perception is critical in everything we do, from picking up your coffee mug to walking down the street to watching movies.

In effect, in raising the questions above, I am really raising one overarching question: Why do movies work? Notice I did not say “How do movies work?” I will be the first to admit that I know very little about lenses, lights, shutter speeds, and all the other mechanisms by which movies are made. My interest here lies in the perceptual mechanisms that allow people to watch movies. These mechanisms must be utilized by moviemakers in order to provide adequate information to the viewer about what is going on. What makes a movie exciting, however, is the manipulation of those mechanisms to create a specific effect. These manipulations are what define a movie’s style, and consequently, a moviemaker’s style.

In the remainder of this essay, I am going to examine a few, but certainly not all, of the perceptual mechanisms that allow movies to work, and how specifically those mechanisms were used in the movie The Rock. The Rock presents many excellent examples of the use of these perceptual mechanisms, and perhaps more interestingly, many examples of the manipulation of those mechanisms for a particular effect.

Saccadic movement

I am sure everyone has experienced the following phenomenon: you are reading something when suddenly some movement in your periphery distracts you. You shift your gaze to see what is happening there. The movement seems instantaneous, and it almost is. About one-fifth of a second after your eyes see the distraction, your eyes start moving. Your eyes, in the span of about 40 ms (milliseconds), reach speeds upwards of 500 degrees per second and then come to a screeching halt on the new target (Cutting, in press). This movement is called a saccade, and we make them constantly. In reading the previous line on this page, you made a saccade from each word to the next one. What makes saccades interesting, and applicable to movies, is what happens during a saccade. We are, in effect, blind to all visual information during a saccade, and for a short time afterwards.

Movies consist of cuts, which separate one shot from another. Shots and cuts are combined to create a scene. The right combination of shots and cuts is critical for creating a comprehensible scene. But think about this a little more closely. The presence of cuts creates discontinuities in the flow of information, even if the cuts are short. Intuitively one might think that the presence of these discontinuities would lead to confusion on the part of the viewer, because when we look around in the real world, everything seems continuous. But we don’t become confused. We are able to follow a scene without difficulty. Why? Because of saccades. Everything seems continuous in the real world, but it’s really not. We, just like a scene in a movie, have “cuts”, and consequently, “shots”. Even with these discontinuities, we are able to extract information from the environment. We are able to understand our environment. And that is why we are able to understand movies.

A great example of this occurs in the climactic standoff near the end of The Rock. General Hummel (Ed Harris) is faced with the possible mutiny of his men after the bluff of killing the population of San Francisco fails. In all, six men stand in a roughly circular pattern pointing firearms at one another, all of which Goodspeed and Mason watch from an adjoining room. Every actor in the scene has at least one line, and therefore the camera is constantly changing position and orientation to film each actor delivering his line. One might suspect that this cutting would lead to confusion as to who is talking to whom, because you generally can’t see exactly whom an actor is talking to. They are simply looking off camera. And yet, the scene is completely comprehensible. This is because we are accustomed to having cuts in the visual information in our environment. They occur all the time. (Incidentally, this scene contains about 80 cuts over 3 minutes. During the same period, you experience approximately 900 saccades.)
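For readers who like to see the arithmetic, the comparison between cuts and saccades can be worked out in a few lines of Python. This is only a rough sketch: the counts are the approximate figures quoted above for the standoff scene, and the function and variable names are made up for illustration.

```python
def rate_per_second(event_count: int, duration_s: float) -> float:
    """Average number of events per second over a stretch of time."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return event_count / duration_s


# Approximate figures quoted for the standoff scene (not measured here).
SCENE_DURATION_S = 3 * 60  # about 3 minutes
CUTS = 80                  # about 80 cuts in the scene
SACCADES = 900             # about 900 saccades in the same period

cut_rate = rate_per_second(CUTS, SCENE_DURATION_S)
saccade_rate = rate_per_second(SACCADES, SCENE_DURATION_S)
seconds_between_cuts = SCENE_DURATION_S / CUTS
saccades_per_cut = SACCADES / CUTS

print(f"Cuts per second:      {cut_rate:.2f}")
print(f"Seconds between cuts: {seconds_between_cuts:.2f}")
print(f"Saccades per second:  {saccade_rate:.2f}")
print(f"Saccades per cut:     {saccades_per_cut:.2f}")
```

By these figures, the film cuts about once every 2.25 seconds, while the eye makes about five saccades every second, or roughly eleven saccades for every cut.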


Occlusion

As we all know, film is a two-dimensional medium. However, the appearance of depth can be achieved by various techniques. One of the most common and straightforward of these techniques is that of occlusion. Basically, occlusion occurs when one object partly hides another from view. Obviously, the partially hidden object appears farther away. The use of occlusion can also help provide an understanding of the position and orientation of a scene. For example, the introduction of Stanley Goodspeed involves a scene in which he is defusing a bomb inside a glass containment unit. At various times during the scene, the camera cuts from inside the unit to the outside looking in. If the camera only filmed Goodspeed, it would be difficult to get a sense of the environment he was in, not to mention his relation to the rest of the actors in the scene. As another example, consider when Goodspeed is chasing Mason through the kitchen of the Fairmont Hotel. In various shots Goodspeed is partly blocked from view by shelves in the kitchen. Again, the occlusion of parts of his body adds depth to the scene. Now, the use of occlusion may seem very trivial, but it is in fact the most important and basic cue for creating depth. There are numerous other examples of occlusion in The Rock, including the standoff scene discussed above.

Motion perspective

The next time you are riding in a car, preferably on a highway with lots of space on either side of you, do the following. Look out toward the horizon on either the right or left side of the car. Look at the buildings, mountains, or trees at the horizon. See how slowly they seem to be moving past you? Now slowly begin to focus your gaze on objects closer and closer to the car, until you are looking straight down at the objects directly next to the car. They seem to fly by before you can even tell what they are. This phenomenon is called motion perspective. It refers to the relative motions of objects attached to the ground around a moving observer (or camera), or of moving objects around a stationary observer. Objects closer to you move faster than ones farther away, and their velocity is inversely proportional to their distance from you, meaning that objects twice as far away move half as fast (Cutting, in press).
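The inverse relationship described above can be illustrated with a small Python sketch. It rests on a simplifying assumption of my own: the object is stationary and directly beside the observer, at right angles to the direction of travel, so that its angular speed across the field of view is simply the observer's speed divided by the object's distance. The function name, the car's speed, and the distances are all invented for illustration.

```python
import math


def angular_speed_deg_per_s(observer_speed_m_s: float, distance_m: float) -> float:
    """Angular speed (degrees per second) of a stationary object that is
    directly beside a moving observer, at right angles to the direction
    of travel. Assumes angular speed = speed / distance (in radians)."""
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    return math.degrees(observer_speed_m_s / distance_m)


# Hypothetical values: a car at 30 m/s and objects at several distances.
CAR_SPEED_M_S = 30.0
for distance_m in (5, 10, 100, 1000):
    speed = angular_speed_deg_per_s(CAR_SPEED_M_S, distance_m)
    print(f"Object {distance_m:>4} m away sweeps past at {speed:8.2f} deg/s")
```

Doubling the distance in this sketch halves the angular speed, which is why the roadside blurs while the horizon barely seems to move.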

In the Humvee and Ferrari chase, many of the shots have objects (especially repeating objects, like a fence) between the cars and the camera. These objects move past the camera faster than the car and the background. This helps create the effect that the car is moving faster than it actually is, because the entire foreground appears to be moving faster than the entire background. Also, in shots where the car is very close to the camera, it appears to be moving faster than it probably was in real life (compared to shots where the car was farther from the camera).

Combining Techniques

Talking about each of these particular techniques individually is fine, but the great thing about The Rock is that many scenes contain multiple techniques at the same time. For example, many of the exterior shots of San Francisco have both occlusion and motion perspective, especially the shots where the camera moves past the Golden Gate Bridge with the city behind it. The bridge is occluding the city, and the movement of the camera causes the bridge to move past faster than the rest of the background, resulting in a visually stunning scene.

Of course, perhaps the most stunning scene is one with which I am sure everyone is familiar. It has become a trademark of Michael Bay: the low-angle rising shot that spins around (in this case) Goodspeed. Here, the occlusion is Goodspeed occluding the background, a very simple occlusion to be sure. But the movement of the camera creates a constantly changing background. Goodspeed’s body is occluding one part of the background while simultaneously revealing another part of it. This is called the accretion and deletion of the background. Also, the movement of the camera causes motion perspective, meaning that Goodspeed is moving faster than the background. This, combined with the fact that the scene is in slow motion, creates one of the most incredible shots I’ve ever seen, both perceptually and cinematically.

There are hundreds of other perceptual techniques that I could talk about, including ones dealing with color, contrast, sound, etc. I chose to focus on several that I find the most interesting, especially when discussing a Michael Bay movie. His use of camera movement is unique among all of the directors I have seen. For a very long time, it was difficult for me to understand why I thought it was unique. It wasn’t until I started studying perception that I realized that the human visual system is actually critically dependent on movement. Without the movement of our bodies, and specifically our heads, we probably would not be able to see, or at the very least, we would be very poor at it. By moving the camera, Michael Bay in many ways is simulating our own movement and therefore our own perception. Better still, he moves the camera in ways that for us would be impossible. That is what makes his movies visually stunning. He takes human perception and pushes it to its limits. For other filmmakers, it’s “lights, camera, action.” For Bay, it’s “lights, camera, perception, action.”



Cutting, J. E. (in press). Perceiving Scenes in Film and in the World. In J. D. Anderson & B. F. Anderson (Eds.), Moving Image Theory: Ecological Considerations.

Dr. Ed Otten received his Ph.D. in cognitive psychology from Miami University in Oxford, OH in 2008. His graduate research included various aspects of motion perception, including how it impacts motion sickness in flight simulators. He currently works as a Human Systems Specialist.