Lumberyard
Developer Guide (Version 1.11)

Streaming System

The Lumberyard streaming engine takes care of the streaming of meshes, textures, music, sounds, and animations.

Low-level Streaming System

CryCommon interfaces and structs

The file IStreamEngine.h in CryCommon contains all the important interfaces and structs used by the rest of the engine.

First of all, there is IStreamEngine itself. There is only one IStreamEngine in the application, and it controls all possible I/O streams. Most of the following information comes directly from the documentation inside the code, so it is always worth reading the actual code in IStreamEngine.h for any missing information.

The most important function in IStreamEngine is the StartRead function which is used to start any streaming request.

IStreamEngine.h

UNIQUE_IFACE struct IStreamEngine
{
public:
    // Description:
    //   Starts asynchronous read from the specified file (the file may be on a
    //   virtual file system, in pak or zip file or wherever).
    //   Reads the file contents into the given buffer, up to the given size.
    //   Upon success, calls success callback. If the file is truncated or for other
    //   reason can not be read, calls error callback. The callback can be NULL
    //   (in this case, the client should poll the returned IReadStream object;
    //   the returned object must be locked for that)
    // NOTE: the error/success/progress callbacks can also be called from INSIDE
    //   this function
    // Return Value:
    //   IReadStream is reference-counted and will be automatically deleted if
    //   you don't refer to it; if you don't store it immediately in an auto-pointer,
    //   it may be deleted as soon as on the next line of code,
    //   because the read operation may complete immediately inside StartRead()
    //   and the object is self-disposed as soon as the callback is called.
    virtual IReadStreamPtr StartRead (const EStreamTaskType tSource, const char* szFile, IStreamCallback* pCallback = NULL, StreamReadParams* pParams = NULL) = 0;
};

The following are the currently supported streaming task types. This enum should be extended if you want to stream a new object type.

IStreamEngine.h

enum EStreamTaskType
{
    eStreamTaskTypeCount     = 13,
    eStreamTaskTypePak       = 12, // Pak file itself
    eStreamTaskTypeFlash     = 11, // Flash file object
    eStreamTaskTypeVideo     = 10, // Video data (when streamed)
    eStreamTaskTypeReadAhead = 9,  // Read ahead data used for file reading prediction
    eStreamTaskTypeShader    = 8,  // Shader combination data
    eStreamTaskTypeSound     = 7,
    eStreamTaskTypeMusic     = 6,
    eStreamTaskTypeFSBCache  = 5,  // Complete FSB file
    eStreamTaskTypeAnimation = 4,  // All the possible animations types (dba, caf, ..)
    eStreamTaskTypeTerrain   = 3,  // Partial terrain data
    eStreamTaskTypeGeometry  = 2,  // Mesh or mesh lods
    eStreamTaskTypeTexture   = 1,  // Texture mip maps (currently mip0 is not streamed)
};

A callback object can be provided to the StartRead function to be informed when the streaming request has finished. The callback object should implement the following StreamAsyncOnComplete and StreamOnComplete functions.

IStreamEngine.h

class IStreamCallback
{
public:
    // Description:
    //   Signals that reading the requested data has completed (with or without error).
    //   This callback is always called, whether an error occurs or not, and is called
    //   from the async callback thread of the streaming engine, which happens
    //   directly after the reading operation
    virtual void StreamAsyncOnComplete (IReadStream* pStream, unsigned nError) {}

    // Description:
    //   Same as the StreamAsyncOnComplete, but this function is called from the main
    //   thread and is always called after the StreamAsyncOnComplete function.
    virtual void StreamOnComplete (IReadStream* pStream, unsigned nError) = 0;
};

When starting a read request, you can also provide the optional parameters shown in the following code.

IStreamEngine.h

struct StreamReadParams
{
public:
    // The user data that'll be used to call the callback.
    DWORD_PTR dwUserData;

    // The priority of this read
    EStreamTaskPriority ePriority;

    // Description:
    //   The buffer into which to read the file or the file piece
    //   if this is NULL, the streaming engine will supply the buffer.
    // Notes:
    //   DO NOT USE THIS BUFFER during read operation! DO NOT READ from it, it can lead to memory corruption!
    void* pBuffer;

    // Description:
    //   Offset in the file to read; if this is not 0, then the file read
    //   occurs beginning with the specified offset in bytes.
    //   The callback interface receives the size of already read data as nSize
    //   and generally behaves as if the piece of file would be a file of its own.
    unsigned nOffset;

    // Description:
    //   Number of bytes to read; if this is 0, then the whole file is read,
    //   if nSize == 0 && nOffset != 0, then the file from the offset to the end is read.
    //   If nSize != 0, then the file piece from nOffset is read, at most nSize bytes
    //   (if less, an error is reported). So, from nOffset byte to nOffset + nSize - 1 byte in the file.
    unsigned nSize;

    // Description:
    //   The combination of one or several flags from the stream engine general purpose flags.
    // See also:
    //   IStreamEngine::EFlags
    unsigned nFlags;
};
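The nOffset and nSize rules described in the comments of StreamReadParams can be read as a small range computation. The following is a minimal sketch of that reading, not engine code: ReadRange and ResolveReadRange are hypothetical names, and the file size is passed in explicitly rather than queried from the engine.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type; not part of IStreamEngine.h.
struct ReadRange
{
    std::uint64_t begin;  // first byte to read
    std::uint64_t size;   // number of bytes that can actually be read
    bool          error;  // true if the file is too short for the request
};

// Applies the documented rules:
//  - nSize == 0: read from nOffset to the end of the file
//  - nSize != 0: read nSize bytes starting at nOffset and
//                report an error if fewer bytes are available
inline ReadRange ResolveReadRange(std::uint64_t fileSize, std::uint32_t nOffset, std::uint32_t nSize)
{
    ReadRange r = { nOffset, 0, false };
    if (nOffset > fileSize)
    {
        r.error = true;  // the offset lies beyond the end of the file
        return r;
    }
    const std::uint64_t available = fileSize - nOffset;
    if (nSize == 0)
    {
        r.size = available;  // whole file, or the tail starting at nOffset
    }
    else
    {
        r.size  = (nSize <= available) ? nSize : available;
        r.error = (nSize > available);  // "if less, an error is reported"
    }
    assert(r.begin + r.size <= fileSize);
    return r;
}
```

For a 1000-byte file, nOffset = 200 with nSize = 0 resolves to the last 800 bytes, while nOffset = 950 with nSize = 100 resolves to 50 readable bytes and an error.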

The return value of the StartRead function is an IReadStream object, which can optionally be stored on the client. The IReadStream object is reference-counted internally. If the callback object can be destroyed before the read operation finishes, store the read stream separately and call Abort on it. Doing this cleans up the entire read request internally and also calls the async and sync callback functions.

The Wait function can be used to perform a blocking read request on the streaming engine. This function can be used from an async reading thread that uses the Lumberyard streaming system to perform the actual reading.

IStreamEngine.h

class IReadStream : public CMultiThreadRefCount
{
public:
    virtual void Abort() = 0;
    virtual void Wait( int nMaxWaitMillis=-1 ) = 0;
};

Internal flow of a read request

The Lumberyard stream engine uses extra worker and IO threads internally. For every possible IO input, a different StreamingIOThread is created which can run independently from the others.

Currently the stream engine has the following IO threads:

  • Optical – Streaming from the optical data drive.

  • Hard disk drive (HDD) – Streaming from installed data on the hard disk drive (this could be a fully installed game, or shadow copied data).

  • Memory – Streaming from packed in-memory files, which requires very little IO.

When a reading request is made on the streaming engine, it first checks which IO thread to use, and computes the sortkey. The request is then inserted into one of the StreamingIOThread objects.

After the reading operation is finished, the request is forwarded to one of the decompression threads if the data was compressed, and then to one of the async callback threads. The number of async callback threads depends on the platform, and some async callback threads are reserved for specific streaming request types such as geometry and textures. After the async callback has been processed, the finished streaming request is added to the streaming engine to be processed on the main thread. The next update of the streaming engine from the main thread then calls the sync callback (StreamOnComplete) and cleans up the temporarily allocated memory if needed.

For information regarding the IO and worker threads, see the StreamingIOThread and StreamingWorkerThread classes.
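The ordering guarantee described here (async callback first, sync callback on the next main-thread update) can be illustrated with a small mock. This is only a sketch: MockCallback and MockStreamEngine are hypothetical stand-ins, not engine classes, and everything runs on a single thread so that the result is deterministic.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Mock stand-in for IStreamCallback; records the order of the calls.
struct MockCallback
{
    std::vector<std::string> log;
    void StreamAsyncOnComplete() { log.push_back("async"); }
    void StreamOnComplete()      { log.push_back("sync"); }
};

// Mock engine. In the real engine the stages run on separate threads.
class MockStreamEngine
{
public:
    // Stands in for the IO, decompression, and async callback stages.
    void FinishRead(MockCallback* pCallback)
    {
        assert(pCallback != nullptr);
        pCallback->StreamAsyncOnComplete();  // async callback runs first
        m_finished.push_back(pCallback);     // then the request is queued for the main thread
    }

    // Stands in for the main-thread update of the streaming engine.
    void Update()
    {
        for (MockCallback* pCallback : m_finished)
        {
            pCallback->StreamOnComplete();   // sync callback
        }
        m_finished.clear();                  // finished requests are released here
    }

private:
    std::vector<MockCallback*> m_finished;
};
```

After FinishRead, the callback log contains only "async"; the "sync" entry appears after the next Update call, and further Update calls do not invoke the callback again.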

Read request sorting

Requests to the streaming engine are not processed in the same order in which they were requested. The system internally tries to optimize the order in which data is read, to maximize the read bandwidth.

When reading data from an optical disc, it is very important to reduce the number of seeks. (This is also true when reading from a hard disk drive, but to a lesser extent.) A single seek can take over 100 milliseconds, while the actual read might take only a few milliseconds. Some official statistics from the 360 XDK follow.

  • Outer diameter throughput: 12x (approximately 15 MB per second).

  • Inner diameter throughput: 5x (6.8 MB per second).

  • Average seek (1/3rd stroke) time: 110 ms typical, 140 ms maximum.

  • Full stroke seek time: 180 ms typical, 240 ms maximum.

  • Layer switch time: 75 ms.
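To put these figures in perspective, the following sketch estimates how seek time compares with transfer time. The function names are hypothetical, and the model is deliberately simple: it assumes 1 MB = 1024 × 1024 bytes, a constant throughput, and a fixed cost per seek.

```cpp
#include <cassert>

// Time in milliseconds to transfer `bytes` at `mbPerSecond`, ignoring seeks.
inline double TransferTimeMs(double bytes, double mbPerSecond)
{
    assert(mbPerSecond > 0.0);
    return bytes / (mbPerSecond * 1024.0 * 1024.0) * 1000.0;
}

// Total time in milliseconds for `requestCount` reads of `bytesPerRequest` each,
// paying for `seekCount` seeks of `seekMs` milliseconds.
inline double TotalReadTimeMs(int requestCount, double bytesPerRequest,
                              double mbPerSecond, int seekCount, double seekMs)
{
    return requestCount * TransferTimeMs(bytesPerRequest, mbPerSecond)
         + seekCount * seekMs;
}
```

With the inner diameter throughput of 6.8 MB per second, a 64 KB read transfers in roughly 9 ms, far less than a typical 110 ms seek. Under this model, 100 such reads with one seek each take about 11.9 seconds, while the same reads laid out contiguously behind a single seek take about 1 second.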

The internal sorting algorithm takes the following rules into account in the following order.

  • Priority of the request – High priority requests always take precedence, but too many of them can introduce too many extra seeks.

  • Time grouping – Requests made within a certain time are grouped together to create a continuous reading operation on the disc for every time group. The default value is 2 seconds, but can be changed using the following cvar: sys_streaming_requests_grouping_time_period. Time grouping has a huge impact on the average completion time of the requests. It increases the time of a few otherwise quick reading requests, but drastically reduces the overall completion time because most of the streaming requests are coming from random places on the disc.

  • Actual offset on disc – The actual disc offset is computed and used during the sorting. Files which have a higher offset get a higher priority, so it is important to organize the layout of the disc to reflect the desired streaming order.

For information regarding sorting, refer to the source code in CAsyncIOFileRequest::ComputeSortKey(). The essential sorting code follows.

CAsyncIOFileRequest::ComputeSortKey

void CAsyncIOFileRequest::ComputeSortKey(uint64 nCurrentKeyInProgress)
{
    // ... compute the disc offset (can be requested using CryPak) ...

    // group items by priority, then by snapped request time, then sort by disk offset
    m_nDiskOffset += m_nRequestedOffset;
    m_nTimeGroup = (uint64)(gEnv->pTimer->GetCurrTime() / max(1, g_cvars.sys_streaming_requests_grouping_time_period));
    uint64 nPrioriry = m_ePriority;
    int64 nDiskOffsetKB = m_nDiskOffset >> 10; // KB
    m_nSortKey = (nDiskOffsetKB) | (((uint64)m_nTimeGroup) << 30) | (nPrioriry << 60);
}
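The bit layout used by the sort key can be tried in isolation. The following is a minimal, self-contained sketch that packs the same three fields with the same shifts; MakeSortKey is a hypothetical free function, the time and grouping period are passed in as parameters instead of being read from the engine timer and cvar, and the masks are an addition of this sketch so that the fields cannot overlap.

```cpp
#include <cassert>
#include <cstdint>

// Bits 0-29 hold the disc offset in KB, bits 30-59 the time group,
// and bits 60-63 the priority, mirroring the shifts used by the engine code.
inline std::uint64_t MakeSortKey(std::uint64_t priority, double requestTimeSec,
                                 double groupingPeriodSec, std::uint64_t diskOffsetBytes)
{
    assert(groupingPeriodSec > 0.0);
    const std::uint64_t mask30 = (std::uint64_t(1) << 30) - 1;
    const std::uint64_t mask4  = 0xF;

    // Requests issued within the same period share a time group.
    const std::uint64_t timeGroup =
        static_cast<std::uint64_t>(requestTimeSec / groupingPeriodSec) & mask30;
    const std::uint64_t diskOffsetKB = (diskOffsetBytes >> 10) & mask30;

    return diskOffsetKB | (timeGroup << 30) | ((priority & mask4) << 60);
}
```

Because the priority occupies the highest bits, comparing two keys compares priority first, then the time group, and only then the disc offset. With a grouping period of 2 seconds, requests issued at 0.1 and 1.5 seconds fall into the same group and are ordered purely by disc offset, while a request issued at 2.1 seconds falls into the next group regardless of its offset.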

Streaming statistics

The streaming engine can be polled for streaming statistics using the GetStreamingStatistics() function.

Most of the statistics are divided into two groups, one collected during the last second, and another from the last reset (which usually happens during level loading). Statistics can also be forcibly reset during the game.

The SMediaTypeInfo struct gives information per IO input system (hard disk drive, optical, memory).

IStreamEngine.h

struct SMediaTypeInfo
{
    // stats collected during the last second
    float  fActiveDuringLastSecond;
    float  fAverageActiveTime;
    uint32 nBytesRead;
    uint32 nRequestCount;
    uint64 nSeekOffsetLastSecond;
    uint32 nCurrentReadBandwidth;
    uint32 nActualReadBandwidth;        // only taking actual reading time into account

    // stats collected since last reset
    uint64 nTotalBytesRead;
    uint32 nTotalRequestCount;
    uint64 nAverageSeekOffset;
    uint32 nSessionReadBandwidth;
    uint32 nAverageActualReadBandwidth; // only taking actual read time into account
};

The SRequestTypeInfo struct gives information about each streaming request type, such as geometry, textures, and animations.

IStreamEngine.h

struct SRequestTypeInfo
{
    int nOpenRequestCount;
    int nPendingReadBytes;

    // stats collected during the last second
    uint32 nCurrentReadBandwidth;

    // stats collected since last reset
    uint32 nTotalStreamingRequestCount;
    uint64 nTotalReadBytes;        // compressed data
    uint64 nTotalRequestDataSize;  // uncompressed data
    uint32 nTotalRequestCount;
    uint32 nSessionReadBandwidth;
    float  fAverageCompletionTime; // Average time it takes to fully complete a request
    float  fAverageRequestCount;   // Average amount of requests made per second
};

The following example shows global statistics that contain all the gathered data.

IStreamEngine.h

struct SStatistics
{
    SMediaTypeInfo hddInfo;
    SMediaTypeInfo memoryInfo;
    SMediaTypeInfo opticalInfo;
    SRequestTypeInfo typeInfo[eStreamTaskTypeCount];

    uint32 nTotalSessionReadBandwidth; // Average read bandwidth in total from reset - taking full time into account from reset
    uint32 nTotalCurrentReadBandwidth; // Total bytes/sec over all types and systems.
    int    nPendingReadBytes;          // How many bytes still need to be read
    float  fAverageCompletionTime;     // Time in seconds on average takes to complete read request.
    float  fAverageRequestCount;       // Average requests per second being done to streaming engine
    int    nOpenRequestCount;          // Amount of open requests
    uint64 nTotalBytesRead;            // Read bytes total from reset.
    uint32 nTotalRequestCount;         // Number of request from reset to the streaming engine.
    uint32 nDecompressBandwidth;       // Bytes/second for last second
    int    nMaxTempMemory;             // Maximum temporary memory used by the streaming system
};

Streaming debug information

Different types of debug information can be requested using the following CVar: sys_streaming_debug x.

Streaming and Levelcache Pak Files

As mentioned earlier, it is very important to minimize the seeks and seek distances when reading from an optical media drive. For this reason, the build system is designed to optimize the internal data layout for streaming.

The easiest and fastest approach is to not do any IO at all, but read the data from compressed data in memory. For this, small paks for startup and each level are created. These are loaded into memory during level loading. Some paks remain in memory until the end of the level. Others are only used to speed up the level loading. All small files and small read requests should ideally be diverted to these paks.

A special RC_Job build file is used to generate these paks: Bin32/rc/RCJob_PerLevelCache.xml. These paks are generated during a normal build pipeline. The internal management in the engine is done by the CResourceManager class, which uses the global SystemEvents to preload or unload the paks.

Currently, the following paks are loaded into memory during level loading (sys_PakLoadCache).

  • level.pak – Contains all actual level data, and should not be accessed again after level loading.

  • xml.pak

  • dds0.pak – Contains all lowest mips of all the textures in the level.

  • cgf.pak and cga.pak – Only load when CGF streaming is enabled.

The following paks are cached into memory during the level load process (sys_PakStreamCache).

  • dds_cache.pak – Contains all dds files smaller than 6 KB (except for dds.0 files).

  • cgf_cache.pak – Contains all cgf files smaller than 32 KB (only when CGF streaming is enabled).
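The size thresholds above can be expressed as a simple classification rule. The following sketch is illustrative only and is not the resource compiler's logic: CachePakFor and EndsWith are hypothetical helpers, 1 KB is assumed to be 1024 bytes, and the excluded files are assumed to carry the literal suffix .dds.0, so they do not match the .dds check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

inline bool EndsWith(const std::string& s, const std::string& suffix)
{
    assert(!suffix.empty());
    return s.size() >= suffix.size()
        && s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Returns the name of the stream cache pak a file would go into,
// or an empty string if the file does not qualify for either pak.
inline std::string CachePakFor(const std::string& fileName, std::size_t sizeBytes,
                               bool cgfStreamingEnabled)
{
    if (EndsWith(fileName, ".dds") && sizeBytes < 6 * 1024)
    {
        return "dds_cache.pak";
    }
    if (cgfStreamingEnabled && EndsWith(fileName, ".cgf") && sizeBytes < 32 * 1024)
    {
        return "cgf_cache.pak";
    }
    return std::string();
}
```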

Important

Be sure that these paks are available. Without them, level loading can take up to a few minutes, and streaming performance is greatly reduced.

Information about all the resources of a level is stored in resourcelist.txt and auto_resourcelist.txt. These files are generated by an automatic testing system which loads each level and executes a prerecorded playthrough on it. These resourcelist files are used during the build phase to generate the level paks.

All data not in these in-memory paks is handled through IO on the optical drive or hard disk drive, and it is also best to reduce the number of seeks here. This optimization phase is also performed during the build process using the resource compiler.

All the data which can be streamed is extracted from all the resource lists from all levels, and is removed from the default pak files (for example, objects.pak, textures.pak, animations.pak) and put into new optimized paks for streaming inside a streaming folder.

The streaming paks are created using the following rules:

  • Split by extension: Different extension files are put into different paks (for example, dds, caf, dba, cgf) so that files of the same type can be put close to each other. This enables them to be read in bursts. The paks are also used to increase the priority of certain file types during request sorting by using the disc offset.

  • Split by DDS type: Different dds types are sorted differently to increase the priority of different types (for example, diffuse maps get higher priority than normal maps). The actual distance in the pak is used during the sorting of the request.

  • Split by DDS mip: The highest mips are put into a separate pak file. They usually take more than 60% of the size of all the smaller mips and can then be streamed with a lower priority. This greatly reduces the average seek time required to read the smaller textures. The texture streaming system internally optimizes the reads to reflect this split texture data.

  • Sort alphabetically: Default alphabetical sorting is required because some of the data (such as CGFs during MP level loading) is loaded in alphabetical order. Changing this sort order can have a severe impact on loading times.
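Two of the rules above, splitting by extension and sorting alphabetically, can be sketched in a few lines. This is not the code from the resource compiler: BuildStreamingPakLayout and ExtensionOf are hypothetical names, the DDS-specific rules are left out, and one group per extension stands in for one pak.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Returns the text after the last '.', or an empty string if there is none.
inline std::string ExtensionOf(const std::string& path)
{
    const std::string::size_type dot = path.find_last_of('.');
    return dot == std::string::npos ? std::string() : path.substr(dot + 1);
}

// Groups files by extension, then sorts each group alphabetically
// so that files of one type end up next to each other in a stable order.
inline std::map<std::string, std::vector<std::string> >
BuildStreamingPakLayout(std::vector<std::string> files)
{
    std::map<std::string, std::vector<std::string> > paks;
    for (std::string& file : files)
    {
        const std::string ext = ExtensionOf(file);
        paks[ext].push_back(std::move(file));
    }
    for (auto& entry : paks)
    {
        std::sort(entry.second.begin(), entry.second.end());
    }
    return paks;
}
```

Given textures/b.dds, objects/z.cgf, textures/a.dds, and objects/a.cgf, the sketch produces two groups, cgf and dds, each listing its files in alphabetical order.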

The actual sorting code is hardcoded in the resource compiler, and can be found at: Code\Tools\RC\ResourceCompiler\PakHelpers.cpp.

Important

If you make changes to the sorting operator in the resource compiler, be sure to make the same changes to the texture streaming and streaming engine sorting operators.

Single Thread IO Access and Invalid File Access

It is very important that only a single thread accesses a particular IO device at one time. If multiple threads read from the same IO device concurrently, then the reading speed is more than halved, and it may take a number of seconds to read just a few kilobytes. This occurs because the IO reading head will partially read a few kilobytes for one thread, and then read another few kilobytes for another thread while always performing expensive seeks in between.

The solution is to exclusively read from StreamingIOThreads during gameplay. Lumberyard will by default show an Invalid File Access warning in the top left corner when reading data from the wrong thread, and will stall deliberately for three seconds to emulate the actual stall when reading from an optical drive.
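The idea behind such a check can be sketched with standard C++ thread IDs. This is not how Lumberyard implements the warning: IoThreadGuard is a hypothetical class that only records whether a file access comes from the designated IO thread and counts the accesses that do not.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical guard; remembers which thread is allowed to touch the IO device.
class IoThreadGuard
{
public:
    explicit IoThreadGuard(std::thread::id ioThread)
        : m_ioThread(ioThread), m_invalidAccessCount(0)
    {
        assert(ioThread != std::thread::id());  // must refer to a real thread
    }

    // Returns true if `caller` is the IO thread. Otherwise the access is counted
    // so that it can be reported, for example as an on-screen warning.
    bool CheckAccess(std::thread::id caller)
    {
        if (caller == m_ioThread)
        {
            return true;
        }
        ++m_invalidAccessCount;
        return false;
    }

    int InvalidAccessCount() const { return m_invalidAccessCount.load(); }

private:
    const std::thread::id m_ioThread;
    std::atomic<int>      m_invalidAccessCount;
};
```

A file wrapper would call CheckAccess(std::this_thread::get_id()) before each read and raise the warning when it returns false.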

High Level Streaming Engine Usage

It is very easy to extend the current streaming functionality using the streaming engine. In this section, a small example class is presented that shows how to add a new streaming type.

First, create a class which derives from the IStreamCallback interface, so that it is notified when streaming completes, and add some basic functionality to read a file. The file can either be read directly or through the streaming engine. When the data is read directly, the class calls the ProcessData function to parse the loaded data. The same function is also called from the async callback. Some processing can be performed on the data there if needed, because the async callback does not run on the main thread.

The default parameters are used when starting a reading request on the streaming engine. It is also possible to specify the final data storage to help reduce the number of dynamic allocations performed by the streaming engine.

The class also stores the read stream object in order to get information about the streaming request or to be able to cancel the request when the callback object is destroyed. The pointer is reset in the sync callback because after the call it will no longer be referenced by the streaming engine.

CNewStreamingType

#include <IStreamEngine.h>

class CNewStreamingType : public IStreamCallback
{
public:
    CNewStreamingType() : m_pReadStream(0), m_bIsLoaded(false) {}

    ~CNewStreamingType()
    {
        if (m_pReadStream)
            m_pReadStream->Abort();
    }

    // Start reading some data
    bool ReadFile(const char* acFilename, bool bUseStreamingEngine)
    {
        if (bUseStreamingEngine)
        {
            StreamReadParams params;
            params.dwUserData = eLoadFullData;
            params.ePriority = estpNormal;
            params.nSize = 0;      // read the full file
            params.pBuffer = NULL; // don't provide any buffer, but copy data when streaming is done
            m_pReadStream = g_pISystem->GetStreamEngine()->StartRead(eStreamTaskTypeNewType, acFilename, this, &params);
        }
        else
        {
            // old way of reading file in a sync way (blocking call!)
            const char* acData = 0;
            size_t stSize = 0;
            // ... read file directly using CryPak or fopen/fread ...
            ProcessData(acData, stSize);
            m_bIsLoaded = true;
        }
        return m_bIsLoaded;
    }

    // Check if the data is ready and loaded
    bool IsLoaded() const { return m_bIsLoaded; }

protected:
    // implement the IStreamCallback function
    void StreamAsyncOnComplete(IReadStream* pStream, unsigned nError)
    {
        if (nError)
        {
            return;
        }
        const char* acData = (char*)pStream->GetBuffer();
        size_t stSize = pStream->GetBytesRead();
        ProcessData(acData, stSize);
        m_bIsLoaded = true;
    }

    void StreamOnComplete(IReadStream* pStream, unsigned nError)
    {
        m_pReadStream = 0;
    }

    // process the actual loaded data
    void ProcessData(const char* acData, size_t stSize);

    // store the stream callback object to be sure it can be canceled when the object is destroyed
    IReadStreamPtr m_pReadStream;

    // Extra flag used to check if the data is ready
    bool m_bIsLoaded;
};