Contents as shown in below is referenced from Official VirtualGL site.

VirtualGL is an open source toolkit that gives any Unix or Linux remote display software the ability to run OpenGL applications with full 3D hardware acceleration. Some remote display solutions cannot be used with OpenGL applications at all. Others force OpenGL applications to use a slow, software-only renderer, to the detriment of performance as well as compatibility. The traditional method of displaying OpenGL applications to a remote X server (indirect rendering) supports 3D hardware acceleration, but this approach causes all of the OpenGL commands and 3D data to be sent over the network to be rendered on the client machine. This is not a tenable proposition unless the data is relatively small and static, unless the network is very fast, and unless the OpenGL application is specifically tuned for a remote X-Windows environment.

With VirtualGL, the OpenGL commands and 3D data are instead redirected to a 3D graphics accelerator (AKA “graphics processing unit” or “GPU”) in the application server, and only the rendered 3D images are sent to the client machine. VirtualGL thus “virtualizes” 3D graphics hardware, allowing it to be co-located in the “cold room” with compute and storage resources. VirtualGL also allows GPUs to be shared among multiple users, and it provides “workstation-like” levels of performance on even the most modest of networks. This makes it possible for large, noisy, hot 3D workstations to be replaced with laptops or even thinner clients. More importantly, however, VirtualGL eliminates the workstation and the network as barriers to data size. Users can now visualize huge amounts of data in real time without needing to copy any of the data over the network or sit in front of the machine that is rendering the data.

Normally, a Unix OpenGL application would send all of its drawing commands and data, both 2D and 3D, to an X-Windows server, which may be located across the network from the application server. VirtualGL, however, employs a technique called “split rendering” to force the 3D commands and data from the application to go to a GPU in the application server. VGL accomplishes this by pre-loading a dynamic shared object (DSO) into the OpenGL application at run time. This DSO intercepts a handful of GLX, OpenGL, and X11 commands necessary to perform split rendering. When the application attempts to use an X window for OpenGL rendering, VirtualGL intercepts the request, creates a corresponding 3D pixel buffer (“Pbuffer”) in video memory on the application server, and uses the Pbuffer for OpenGL rendering instead. When the application swaps the OpenGL drawing buffers or flushes the OpenGL command buffer to indicate that it has finished rendering a frame, VirtualGL reads back the pixels from the Pbuffer and sends them to the client.

The beauty of this approach is its non-intrusiveness. VirtualGL monitors a few X11 commands and events to determine when windows have been resized, etc., but it does not interfere in any way with the delivery of 2D X11 commands to the X server. For the most part, VGL does not interfere with the delivery of OpenGL commands to the GPU, either (there are some exceptions, such as the implementation of color index rendering.) VGL merely forces the OpenGL commands to be delivered to a GPU that is attached to a different X server (the “3D X server”) than the X server to which the 2D drawing commands are delivered (the “2D X server.”) Once the OpenGL rendering has been redirected to a Pbuffer, everything (including esoteric OpenGL extensions, fragment/vertex programs, etc.) should “just work.” If an application runs locally on a 3D server/workstation, then that same application should run remotely from that same machine using VirtualGL.