Storage System Bus Analysis – Parallel Coordinates

Steven Pungdumri
Graduate Graphics CSC 572
Zoe Wood
3/10/2011

Objective

Modeling SAS and SATA bus activity traces between host and storage systems would be very valuable for storage system developers and administrators. Today's bus analyzer software tools often focus on the bus protocol rather than giving a broad overview of what is occurring between host and storage. Analysis of such a model could lead to insights into what is efficient or inefficient about a particular system design or application. This project sets out to visualize a reasonably large dataset (~55,000 records) of bus activity.

Background

Current bus trace analysis tools graph one dimension at a time. Although these tools are simple and effective, they cannot show multiple dimensions of multiple records at once. This narrow view is less likely to convey the larger picture of the activity between host and storage.

Western Digital was kind enough to provide sample data from a bus analyzer. Ideally, this application would work with any bus analyzer that produces similar data dimensions.

Procedure

Data

A quick analysis of the provided data showed that the trace vector is the unique key of the data to display. Each vector is represented by nineteen columns, one per dimension, including start time, end time, cache hit/miss, logical block address, queue starting/ending depth, command length, command completion time, intermediate command completion time, and more.

A parser reads the data from the CSV file containing the test run. It stores all of the available data, whether or not this project uses it, so that the data remains available for future work. The data is then partitioned and stored in a SQL database for retrieval in later runs, with an option to load a different data set on each run. For rendering, only the dimensions that the user selects to view are loaded, so less memory is required when certain dimensions are not valuable for analysis.
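The load-then-select flow described above can be sketched as follows. This is a minimal illustration, not the project's actual parser: the use of SQLite, the table name trace, and the function names load_trace and fetch_dimensions are all assumptions, and the column names are borrowed from the preference file examples later in this report.

```python
import csv
import io
import sqlite3


def load_trace(csv_text, conn):
    """Store every CSV column in a table, replacing previously loaded rows.

    Assumes the first CSV row is a header and that column names contain
    no double-quote characters (they are used as quoted SQL identifiers).
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute("DROP TABLE IF EXISTS trace")  # a new load removes old records
    conn.execute(f"CREATE TABLE trace ({columns})")
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO trace VALUES ({placeholders})", reader)
    conn.commit()
    return header


def fetch_dimensions(conn, selected):
    """Read back only the dimensions chosen for display, as floats."""
    columns = ", ".join(f'"{name}"' for name in selected)
    rows = conn.execute(f"SELECT {columns} FROM trace").fetchall()
    return [tuple(float(v) for v in row) for row in rows]


# Hypothetical three-column sample; real traces have nineteen columns.
sample = "StartTime(s),EndTime(s),Length(lba)\n0.5,0.7,8\n1.0,1.4,16\n"
conn = sqlite3.connect(":memory:")
load_trace(sample, conn)
rows = fetch_dimensions(conn, ["StartTime(s)", "Length(lba)"])
```

Keeping every column in the database while selecting only the displayed ones at query time is one way to get the memory saving described above without discarding data needed for future work.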

Graphics

Parallel coordinates are used to model the data, with one axis displayed for each dimension. Each vector is then plotted as a line through every axis. The application lets the user choose which dimensions to display and one constraint to brush (color) throughout the model. Axis placement and axis value increments are calculated using linear interpolation: each value's position is its fraction of the range of values in the data set, scaled to the range of the available graph size.
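As a rough sketch of the interpolation described above, the following maps a value from its data range into the graph's pixel range and spaces the axes evenly. The function names and the choice to center a dimension whose values are all equal are assumptions made for illustration.

```python
def lerp_position(value, data_min, data_max, pixel_min, pixel_max):
    """Linearly map value from [data_min, data_max] to [pixel_min, pixel_max]."""
    if data_max == data_min:
        # Degenerate range: place the point mid-axis (an assumption).
        return (pixel_min + pixel_max) / 2.0
    t = (value - data_min) / (data_max - data_min)
    return pixel_min + t * (pixel_max - pixel_min)


def axis_x_positions(num_axes, left, right):
    """Spread num_axes vertical axes evenly between left and right."""
    if num_axes == 1:
        return [(left + right) / 2.0]
    return [lerp_position(i, 0, num_axes - 1, left, right)
            for i in range(num_axes)]


def polyline(record, ranges, left, right, bottom, top):
    """Return the (x, y) vertices of one record's line across all axes.

    ranges holds one (min, max) pair per dimension, in axis order.
    """
    xs = axis_x_positions(len(record), left, right)
    return [(x, lerp_position(v, lo, hi, bottom, top))
            for x, v, (lo, hi) in zip(xs, record, ranges)]
```

For example, a record (25, 4) with ranges [(20, 30), (0, 8)] sits halfway up both axes, so in a 100 by 200 graph its line runs from (0.0, 100.0) to (100.0, 100.0).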

Basic transparency is applied to each line drawn in order to illustrate the density of recurring trends. This is far more effective than simply drawing opaque lines that are resolved by depth with the z-buffer. As a result, lighter lines represent a single record, and darker lines represent many records that follow the same path through the visualization.
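The darkening effect can be illustrated numerically. The sketch below assumes conventional "over" blending, where each new line contributes alpha times its color plus (1 - alpha) times what is already drawn; the report does not state which blend function the program uses, and the alpha value of 0.1 is only an example.

```python
def accumulated_opacity(alpha, n_lines):
    """Opacity after n_lines overlapping lines of equal alpha are blended."""
    return 1.0 - (1.0 - alpha) ** n_lines


def blend_over(background, line_color, alpha, n_lines):
    """Blend n_lines coincident lines over a background, one at a time.

    Colors are single grayscale intensities in [0, 1] for simplicity.
    """
    value = background
    for _ in range(n_lines):
        value = alpha * line_color + (1.0 - alpha) * value
    return value


# One black line at alpha 0.1 on white stays light; fifty coincident
# lines approach solid black.
single = blend_over(1.0, 0.0, 0.1, 1)
many = blend_over(1.0, 0.0, 0.1, 50)
```

Under this assumption, a path followed by n records has opacity 1 - (1 - alpha)^n, which is why dense paths appear darker while isolated records stay faint.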

In the initial results, certain axes (Intermediate Command Completion Time, Completion Time, and Queue Completion Time) had values concentrated at the low end of their respective ranges. Replacing the linear scale with a logarithmic one on these axes made a substantial difference in visualizing the spread of the intersections.
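A logarithmic axis mapping along these lines might look like the sketch below. The report does not say how zero values or the logarithm base are handled, so shifting by the minimum and using log(1 + x) is an assumption chosen to keep the mapping defined when the smallest value is zero.

```python
import math


def log_position(value, data_min, data_max, pixel_min, pixel_max):
    """Map value to a pixel position on a logarithmic axis.

    Values are shifted so data_min maps to log(1) = 0, which keeps the
    function defined for a minimum of zero.
    """
    span = math.log1p(data_max - data_min)
    if span == 0.0:
        # Degenerate range: place the point mid-axis (an assumption).
        return (pixel_min + pixel_max) / 2.0
    t = math.log1p(value - data_min) / span
    return pixel_min + t * (pixel_max - pixel_min)
```

With a range of 0 to 1000 on a 100-pixel axis, a value of 10 lands at 1 pixel on a linear scale but roughly a third of the way up on this logarithmic scale, which spreads out the crowded low values.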

Results

The result of this project is a working program that takes a CSV input file along with user-specified preferences, and is invoked as follows:

Usage: busanalysis -i [input CSV file (optional)] -c [color preferences file (optional)] -e [enabled preferences file (optional)]

If an input file was loaded in a previous run, its vectors are still present in the SQL database. On any run, a new file can be loaded, which removes the previous records and loads the new vectors.
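For illustration only, the three optional arguments could be parsed as in the sketch below. It assumes the flags are -i, -c, and -e as in the usage line; the destination names and file names are hypothetical, and this is not the program's actual argument handling.

```python
import argparse


def build_parser():
    """Build a parser for the three optional file arguments."""
    parser = argparse.ArgumentParser(prog="busanalysis")
    parser.add_argument("-i", dest="input_csv", default=None,
                        help="input CSV file; omit to reuse stored vectors")
    parser.add_argument("-c", dest="color_prefs", default=None,
                        help="color preferences file")
    parser.add_argument("-e", dest="enabled_prefs", default=None,
                        help="enabled preferences file")
    return parser


# Hypothetical invocation: load a new trace and restrict the axes shown.
args = build_parser().parse_args(["-i", "trace.csv", "-e", "enabled.txt"])
reload_database = args.input_csv is not None
```

When -i is omitted, input_csv stays None, which corresponds to reusing the vectors already stored in the database.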

A color preferences file may specify which dimension and range to color red. The file lists each dimension on its own line, followed by an integer indicating whether or not to color it, followed by two floating point values giving the range within which to brush records. For instance:

StartTime(s) 0 20.0 30.0

EndTime(s) 0

StartID 0
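A reader for this format could be sketched as follows. Several details are assumptions not stated in the report: fields are whitespace-separated, dimension names contain no spaces, any nonzero integer means "color", the range is inclusive, and a colored dimension with no range brushes every record.

```python
def parse_color_prefs(text):
    """Parse lines of the form: name flag [range_low range_high]."""
    prefs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        name, flag = fields[0], int(fields[1])
        if len(fields) >= 4:
            value_range = (float(fields[2]), float(fields[3]))
        else:
            value_range = None
        prefs[name] = (flag != 0, value_range)
    return prefs


def is_brushed(prefs, name, value):
    """Return True if a record's value on dimension name should be colored."""
    enabled, value_range = prefs.get(name, (False, None))
    if not enabled:
        return False
    if value_range is None:
        return True
    return value_range[0] <= value <= value_range[1]


# Hypothetical file contents with brushing switched on for StartTime(s).
prefs = parse_color_prefs("StartTime(s) 1 20.0 30.0\n\nEndTime(s) 0\n")
```

With these preferences, a record starting at 25.0 seconds is brushed, while one starting at 35.0 seconds is not.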

An enabled preferences file may specify which dimensions to display (otherwise all dimensions will be displayed). This is a file containing each dimension on a new line, followed by an integer indicating whether or not to display the dimension (0 for disabled, 1 for enabled). For instance:

Length(lba) 1

Alignment 1

FUA 0

CCT(ms) 1
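The enabled preferences format can be read in a similar way. In this sketch, the whitespace-separated layout and the rule that dimensions missing from the file remain displayed are assumptions; the function names are hypothetical.

```python
def parse_enabled_prefs(text):
    """Parse lines of the form: name flag (0 for disabled, 1 for enabled)."""
    enabled = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        enabled[fields[0]] = int(fields[1]) != 0
    return enabled


def visible_dimensions(all_dimensions, enabled):
    """Keep axis order, dropping only dimensions explicitly disabled."""
    return [d for d in all_dimensions if enabled.get(d, True)]


# The example file above, applied to a hypothetical five-dimension trace.
enabled = parse_enabled_prefs(
    "Length(lba) 1\n\nAlignment 1\n\nFUA 0\n\nCCT(ms) 1\n")
shown = visible_dimensions(
    ["Length(lba)", "Alignment", "FUA", "CCT(ms)", "StartID"], enabled)
```

Here FUA is dropped, and StartID stays visible because the file does not mention it.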

Below are sample visualizations with various axes displayed. Figures 1 and 2 brush all of the writes that occurred in this dataset.

Figure 1: Parallel coordinates visualization with write commands brushed red, all axes linear.

Figure 2: Parallel coordinates visualization with write commands brushed red, with logarithmic axes.

Figure 3: Parallel coordinates visualization with logarithmic axes.

Figure 4: Parallel coordinates visualization with fewer axes displayed.

Future Work

Future work will consist of making this tool more usable, including but not limited to: dragging and rearranging axes, a more intuitive and dynamic approach to user input for brushing records and selecting relevant axes, and spline curves and edge bundling to make patterns more visible. Further iterations will also rely on feedback from target users (storage system administrators and designers).

References

Dr. Zoe Wood, Cal Poly San Luis Obispo

Western Digital Corporation

“Animation with OpenGL and GLUT.” OpenGL Programming Documentation Download Free Books. Web. 09 Mar. 2011. <http://www.opengl-doc.com/Sams-OpenGL.SuperBible.Third/0672326019/ch02lev1sec6.html>.

Lex, Alexander, Marc Streit, Christian Partl, Karl Kashofer, and Dieter Schmalstieg. Comparative Analysis of Multidimensional, Quantitative Data. Diss. 2010. Piscataway: IEEE Transactions on Visualization and Computer Graphics, 2010. Print.

“NeHe Productions: OpenGL Lesson #08.” NeHe Productions: Main Page. Web. 09 Mar. 2011. <http://nehe.gamedev.net/data/lessons/lesson.asp?lesson=08>.

“OpenGL @ Lighthouse 3D – GLUT Tutorial.” Lighthouse3d.com. Web. 09 Mar. 2011. <http://www.lighthouse3d.com/opengl/glut/index.php?bmpfont>.