Time-warp Explained

In the context of virtual reality, time warp is a technique to reduce the apparent latency between head movement and the the corresponding image that appears inside an HMD.

In an ideal world, the rendering engine would render an image using the measured head pose (orientation and position) immediately before the image is displayed on the screen. However, in the real world, rendering takes time, so the rendering engine uses a pose reading that is a few milliseconds before the image is displayed on the screen. During these few milliseconds, the head moves, so the displayed image lags a little bit after the actual pose reading.

Let’s take a numerical example, Let’s assume we need to render at 90 frames per second, so there are approximately 11 milliseconds for the rendering process of each frame. Let’s assume that head tracking data is available pretty much continuously but that rendering takes 10 milliseconds. Knowing the rendering time, the rendering engine starts rendering as late as possible, which is 10 milliseconds before the frame needs to be displayed. Thus, the rendering engine uses head tracking data that is 10 milliseconds old. If the head rotates at a rate of 200 degrees/second, these 10 milliseconds are equivalent to 2 degrees. If the horizontal field of view of the HMD is 100 degrees and there are 1000 pixels across the visual field, a 2-degree error means that the image lags actual head movement by about 20 pixels.

However, it turns out that even a 2 degree head rotation does not dramatically change the perspective of how the image is drawn. Thus, if there was a way to move the image by 20 pixels on the screens (e.g. 2 degrees in the example), the resultant image would be pretty much exactly what the render engine would draw if the reported head position was changed by two degrees.

That’s precisely what time-warping (or “TW” for short) does: it quickly (in less than 1 millisecond) translates the image a little bit based on how much the head rotated between the time the render engine used the head rotation reading and the time the time warping begins.

The process with time warping is fairly simple: the render engine renders and then when the render engine is done, the time-warp is quickly applied to the resultant image.

But what happens if the render engine takes more time than is available between frames? In this case, a version of time-warping, called asynchronous time-warping (“ATW”) is often used. ATW takes the last available frame and applies time-warping to it. If the render engine did not finish in time, ATW takes the previous frame, and applies time-warping to it. If the previous frame is taken, the head probably rotated even more, so a greater shift is required. While not as ideal as having the render engine finish on time, ATW on the previous frame is still better than just missing a frame which typically manifests itself in ‘judder’ – uneven movement on the screen. This is why ATW is sometimes referred to as a “safety net” for rendering, acting in case the render did not complete on time. The “Asynchronous” part of ATW comes from the fact that ATW is an independent process/thread from the main render engine, and runs at a higher priority than the render engine so that it can present an updated frame to the display even if the render engine did not finish on time.

Let’s finish with a few finer technical points:

  • The time-warping example might lead to believe that only left-right (e.g. yaw) head motion can be compensated. In practice, all three rotation directions – yaw, pitch and roll – can be compensated as well as head position under some assumptions. For instance, OSVR actually performs 6-DOF warping based in an assumption of objects that are 2 meters from the center of projection. It handles rotation about the gaze direction and approximates all other translations and rotations.
  • Moving objects in the scene – such as hands – will still exhibit judder if the render engine misses a frame, in spite of time-warping.
  • For time-warping to work well, the rendered frame needs to be somewhat bigger than the size of the display. Otherwise, when shifting the image one might end up shifting empty pixels into the visible area. Exactly how much the rendered frame needs to be larger depends on the frame rate, and the expected velocity of the head rotation. Larger frames mean more pixels to render and more memory, so time warping is not completely ‘free’
  • If the image inside the HMD is rendered onto a single display (as opposed to two displays – one per eye), time warping might want to use different warping amounts for each eye because typically one eye would be drawn on screen before the other.
  • Objects such as a menu that are in “head space” (e.g. should be fixed relative to head) need to be rendered and submitted to the time-warp code separately since they should not be modified for projected head movement.
  • Predictive tracking (estimating future pose based on previous reads of orientation, position and angular/linear velocity) can help as input to the render engine, but an actual measurement is always preferable to estimation of the future pose.
  • Depending on the configuration of the HMD displays, there may be some rendering delay between left eye and right eye (for instance, if the screen is a portrait-mode screen, renders top to bottom and the left eye maps to the top part of the screen). In this case, one can use different time warp values for each eye.

For additional VR tutorials on this blog, click here
Expert interviews and tutorials can also be found on the Sensics Insight page here

Related Posts