Using Perspective to Convey Depth in 2D Games

One of the most significant visual challenges in developing any 2D game is creating an illusion of depth. Typically, assets are two-dimensional, the camera position is locked, and character movement is limited to predetermined planes. To compensate, a sense of depth along the z-axis (towards and away from the camera) is usually created by some combination of the following: parallax scrolling; placing objects in the background (behind the character’s plane of movement) and the foreground (in front of the character); scaling assets relative to their theoretical distances from the camera; and varying the light level, focus, and saturation of assets depending on their perceived positions in space. These methods can make the limitations of 2D games relatively inconspicuous to the player, as they do in Limbo. Even so, as I began the artistic development of Violet, I was certain that a better way to create an immersive environment must exist.

Limbo uses techniques traditional to 2D platformers to create a sense of depth and space.
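
As a rough illustration of the parallax scrolling and distance-based scaling described above, here is a minimal Python sketch. It is not taken from Limbo or Violet: the Layer class, the depth values, and the rule that apparent speed and size fall off as 1 / depth are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """A hypothetical scrolling layer; depth 1.0 is the character's plane."""
    name: str
    depth: float  # > 1.0 is background, < 1.0 is foreground


def depth_factor(depth: float) -> float:
    """Assumed rule: apparent speed and size fall off as 1 / depth."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return 1.0 / depth


def layer_offset(camera_x: float, layer: Layer) -> float:
    """Horizontal shift of a layer after the camera scrolls right by camera_x."""
    return -camera_x * depth_factor(layer.depth)


def layer_scale(layer: Layer) -> float:
    """Scale to apply to art on this layer relative to the character's plane."""
    return depth_factor(layer.depth)


if __name__ == "__main__":
    layers = [Layer("mountains", 4.0), Layer("character", 1.0), Layer("grass", 0.5)]
    for layer in layers:
        print(layer.name, layer_offset(100.0, layer), layer_scale(layer))
```

Distant layers move and scale less than the character's plane, and foreground layers move and scale more. Nothing in this scheme changes the shape of an individual asset, which is the limitation that linear perspective tries to address.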

Using linear perspective struck me as a somewhat obvious solution, and I wondered why it seems so infrequently utilized in 2D games. The system of graphical perspective as we know it today has been in use since the early 15th century, and less sophisticated versions were practiced long before then. Surely this is not beyond the capabilities of modern game developers, I reasoned. However, not long into the creative process, I began running into challenges presented exclusively by my use of perspective. With a little experimentation these minor issues could be resolved, but as I progressed further I began to realize that those small issues were manifestations of a larger disconnect between the dynamic world of the game and my static implementation of linear perspective.

In order to understand the issues presented by using perspective, some basic knowledge of the concept is necessary. While an all-encompassing comprehension is ideal (and would prevent many potential problems in its implementation), I will provide you with a less technical explanation of the fundamental ideas of linear perspective, along with the disclaimer that I do not consider myself to be the ultimate authority on the subject.

Linear perspective relies upon the existence of a horizon line, which corresponds to the viewer’s eye level or the height of the camera. In one-point perspective, all lines running along the z-axis converge upon a single vanishing point located on the horizon line. This is the form of perspective I was attempting to use for Violet, as it allows the front and back faces of cube-like forms to appear flat while surfaces that extend along the z-axis remain visible. When I place the vanishing point in the center of an environment such as the interior of a building, the floor, ceiling, and side walls are visible as well as the back wall, which allows exterior doors and windows to be seen and creates an illusion of expansiveness. Linear perspective also creates an effect called foreshortening, in which objects closer to the viewer/camera appear larger than those that are farther away. This phenomenon occurs within individual objects as well, so there is a visual sense of expansion and contraction as forms extend towards or away from the camera.

An example of an interior space in one-point perspective

Once the vanishing point has been placed, all objects and assets should be constructed relative to it in order to preserve the illusion of depth accurately. That is, all edges running along the z-axis should converge upon the vanishing point. This fundamental concept, and deviation from it, is where most of the conflicts I encountered originated.
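
The convergence rule can be checked numerically. The following Python sketch assumes a simple pinhole-style model in which a point's offset from the vanishing point shrinks in proportion to its distance; the function name project and the parameters vanishing_point and focal_length are illustrative assumptions, not part of any particular engine or drawing tool.

```python
def project(point, vanishing_point, focal_length):
    """Project a 3D point (x, y, z) onto the screen in one-point perspective.

    x and y are measured from the camera's line of sight (screen y grows
    downward, so positive y is below eye level) and z is the distance in
    front of the camera. All parameter names are illustrative assumptions.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    vx, vy = vanishing_point
    scale = focal_length / z  # shrinks towards 0 as z grows
    return (vx + x * scale, vy + y * scale)


if __name__ == "__main__":
    vp = (400.0, 300.0)
    # One floor edge receding from the camera: x and y fixed, z increasing.
    for z in (1.0, 2.0, 4.0, 1000.0):
        print(z, project((2.0, 1.0, z), vp, focal_length=100.0))
```

Every point on an edge that runs along the z-axis lands on a straight screen line heading towards the vanishing point, and points that share the same z are scaled by the same amount, which is why front and back faces stay flat.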

I decided to create a hallway at one end of the interior space I had constructed. I wanted the back wall of the hallway to be visible as well as the side walls, ceiling, and floor, but this was only possible if I ignored the original vanishing point and made the hallway’s edges converge upon a second, completely unrealistic point. Because the entire interior space is never visible at once during actual gameplay, a hallway relying upon a second vanishing point can be created without looking obviously wrong to the player. However, if I add wooden floorboards that run vertically on screen (that is, along the z-axis), a problem appears: the lines of the floorboards can only converge upon one vanishing point. This conflict can be avoided by using flooring that runs horizontally rather than vertically, but the underlying issues presented by the conventions of one-point perspective remain.

The edges of this hallway converge correctly on the vanishing point, so only the left wall, ceiling, and floor are visible.
This hallway is constructed incorrectly, but all five surfaces (back wall, both side walls, ceiling, and floor) are now visible.
Because flooring that runs vertically can only converge upon one vanishing point, the hallway's inaccuracy becomes obvious.

In reality, the vanishing point moves as the viewer moves, and the appearance of every visible object changes along with it. Games that use real-time head tracking and/or 3D-modeled assets can accommodate this, because the change in perspective can be rendered dynamically as the camera moves through space. However, this presents serious problems for developers using 2D assets, as each object would have to be redrawn for every possible vanishing point.

Trine is a primarily 2D sidescroller that uses 3D assets.
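
To see why a pre-drawn 2D asset is tied to a single camera position, consider how much of a side wall is visible as the camera moves sideways. This Python sketch uses an assumed pinhole-style model in which a point's horizontal screen offset from the vanishing point is x * focal_length / z; the function side_face_width and all of its parameters are hypothetical names chosen for illustration.

```python
def side_face_width(wall_x, camera_x, z_near, z_far, focal_length):
    """On-screen width of a wall that runs straight back along the z-axis.

    A point's horizontal screen offset from the vanishing point is assumed
    to be x * focal_length / z. All names and values are illustrative.
    """
    if not 0 < z_near < z_far:
        raise ValueError("expected 0 < z_near < z_far")
    x_rel = wall_x - camera_x  # wall position relative to the camera
    near_edge = x_rel * focal_length / z_near
    far_edge = x_rel * focal_length / z_far
    return abs(near_edge - far_edge)


if __name__ == "__main__":
    # The same wall seen from three camera positions.
    for camera_x in (0.0, 1.0, 2.0):
        print(camera_x, side_face_width(2.0, camera_x, 1.0, 2.0, 100.0))
```

Under these assumptions the wall's visible width shrinks as the camera approaches it and collapses to zero when the camera is directly in line with it. A renderer working from 3D geometry recomputes this every frame, while a painted sprite is only correct for the one position it was drawn for.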

Even with 2D resources, there are still ways (many of them undiscovered or unexplored) to use certain facets of perspective while working around the issues it presents. If an entire space or scene is too large to be visible on screen during gameplay, all the assets of that scene can still be created with respect to a single vanishing point. Think of the gameplay view as being simply “zoomed in” on a segment of the entire space; this way, objects do not need to be recreated for every camera position. Separate spaces can each be given their own perspective if there is some sort of loading screen or distinct visual transition between them, because the change in perspective is then far less conspicuous. This method is used in Teenage Mutant Ninja Turtles: Turtles in Time, in which the screen fades to black between levels. Smooth transitions between perspectives are feasible, but the options are relatively unexplored; experimentation is necessary to determine the best way to transition in each individual situation.

Teenage Mutant Ninja Turtles: Turtles in Time displays a black screen to transition between a variety of perspectives.
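
The “zoomed in” idea can be sketched as a viewport that only ever translates over a scene drawn once around a fixed vanishing point. The Viewport class below and the pixel values used with it are assumptions for illustration, not the way Violet or Turtles in Time is built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Viewport:
    """The on-screen window into a larger pre-drawn scene (scene pixels)."""
    left: float
    top: float
    width: float
    height: float

    def to_screen(self, scene_point):
        """Convert scene coordinates to screen coordinates by translation only."""
        x, y = scene_point
        return (x - self.left, y - self.top)

    def contains(self, scene_point):
        """True if the scene point currently falls inside the visible window."""
        sx, sy = self.to_screen(scene_point)
        return 0 <= sx < self.width and 0 <= sy < self.height


if __name__ == "__main__":
    scene_vanishing_point = (1000.0, 300.0)  # fixed when the scene was drawn
    for left in (0.0, 800.0):
        view = Viewport(left, 0.0, 640.0, 480.0)
        print(left, view.to_screen(scene_vanishing_point),
              view.contains(scene_vanishing_point))
```

Because scrolling only shifts the window, the art never has to be redrawn. The trade-off is that the vanishing point may sit off screen for much of the scene, so everything in view leans towards one side.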

Another way to represent three-dimensional assets in a 2D game is to use an axonometric projection, in which objects are rotated about their axes relative to a predetermined plane of projection. These kinds of representations do not rely upon a vanishing point, so many of the problems associated with linear perspective do not arise. Isometric projection, the most commonly used form of axonometric projection, foreshortens all three axes equally. That is, all edges of a cube that run along the same axis remain parallel, and all of the cube’s edges are drawn at the same length. It also means that the angles between the three projected axes are equal, at 120 degrees each.

In more practical terms, isometric projections create the illusion of objects being above or below eye level, similar to the effect achieved by two-point perspective. However, objects are not foreshortened in the same way; because there is no vanishing point, objects do not seem to expand and contract. The lack of a vanishing point also removes the need for dynamic changes in the appearance of assets, as all objects are projected in the same way regardless of the position of the camera. Although isometric projections portray three-dimensionality without some of the issues associated with perspective, the lack of perspective foreshortening can make it difficult to portray depth and distance between objects.

A cube projected in two-point perspective versus an isometric projection of a cube (below).
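
For comparison with one-point perspective, here is a minimal Python sketch of an isometric-style projection. The function names and axis conventions (y up, screen y growing downward) are assumptions. A true isometric projection also shrinks every axis by the same factor (about 0.816); that uniform scale is left out here because it changes neither the angles nor the equality of the lengths.

```python
import math

COS30 = math.cos(math.radians(30))
SIN30 = math.sin(math.radians(30))


def iso_project(point):
    """Project (x, y, z) to screen (sx, sy); y is up, screen y grows downward."""
    x, y, z = point
    sx = (x - z) * COS30
    sy = (x + z) * SIN30 - y
    return (sx, sy)


def screen_angle(vector):
    """Direction of a projected axis on screen, in degrees within [0, 360)."""
    sx, sy = iso_project(vector)
    return math.degrees(math.atan2(sy, sx)) % 360


if __name__ == "__main__":
    for axis in [(1, 0, 0), (0, 1, 0), (0, 0, 1)]:
        length = math.hypot(*iso_project(axis))
        print(axis, round(length, 6), round(screen_angle(axis), 6))
```

There is no division by distance anywhere in the projection, so an edge looks the same however far back it sits. That is what makes the assets reusable from any camera position, and also why depth is harder to read.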

Although perspective conveys the dimensions of space more believably, its use in 2D platformers and sidescrollers presents additional challenges, for it creates an illusion of depth that can be misleading when the character is locked to a rigid plane of movement. Although characters can sometimes appear to move along the z-axis in 2D games, as is the case in Turtles in Time, an open expanse of ground only highlights the lack of freedom of motion in games where movement along the z-axis is not possible. If developers wish to make it possible for the character to interact with objects on a different plane, each such interaction must be animated individually. If the character is created using 2D animation software with a bones tool, this could be nearly impossible to achieve. If the character is traditionally animated, creating individual animations for each interaction is tedious but not impossible. If the character is 3D modeled, this is not really an issue. When a form of perspective more traditional to 2D sidescrollers is used, small differences in placement between foreground and background objects allow for easy interactions and for animations to be reused in multiple interactions throughout the game.

With this much depth of ground space visible, a custom animation would have to be made in order for the character to move through the z-axis and pet the cat.
Using a form of perspective more traditional to 2D sidescrollers, the character can more easily bend over and pet the cat.

When deciding upon your approach for creating depth, it is important to consider the requirements of your individual game and the qualities you value most in your work, as well as any potential problems that could arise from your choices. However, unless the universe is particularly biased in your favor, unpredicted difficulties can and will occur. Because I want to uphold a sense of continuity throughout Violet and therefore avoid transitional screens, I have tentatively decided to scrap my initial approach and implement linear perspective only on a small scale, considering assets on an individual basis and keeping any foreshortening at an unobtrusive level. Because 2D games are fundamentally unrealistic, developers are provided with the opportunity to determine not necessarily what is right or accurate, but what is best and what contributes the most to the overall quality of their product.

Follow us @purple_pwny and the author @thisislux. Thank you :)