
2. Introduction

SLh5 is an S-Lang module for the HDF5 file format and software library. HDF5 is widely used in large-scale science and engineering projects -- such as the FLASH simulator of thermonuclear explosions in stars (http://www.flash.uchicago.edu/website/home/) -- because it combines data and compiler portability with very high performance, scalability to terabyte-size datasets, and support for parallel I/O. The S-Lang interpreter is part of the widely used, open-source library of the same name, and provides a scripting environment well suited to scientific and engineering tasks, thanks to the powerful, compact, fast, and robust multidimensional numerical capabilities that are native to the language.

SLh5 provides a high-level interface to HDF5, supporting the easy creation of S-Lang arrays from HDF5 files or vice versa. The entire HDF5 API is presented in terms of six simple functions:

        h5_new   -  create one or more new HDF5 files
        h5_open  -  open one or more HDF5 files/groups/datasets/attributes
        h5_close -  close one or more HDF5 files

        h5_read  -  read one or more datasets or attributes
        h5_write -  write a dataset or attribute
        h5_list  -  browse an HDF5 container (analogous to h5dump utility)
although users will typically need only the latter two or three. Much of the low-level HDF5 library may also be called directly from S-Lang, though there should be little need for user-level scripts to do so (and the low-level wrappers are not documented here). There is also no need to explicitly close files, or opened datasets, attributes, groups, etc., as SLh5 will do so automatically when the variable referencing them goes out of scope. Moreover, with the exception of h5_write, the entire top-level SLh5 interface is vectorized; this allows multiple files to be opened or created, or multiple datasets or attributes to be read, in a single call.

With SLh5, users may thus interact with HDF5 files in scriptable analysis environments such as ISIS (http://space.mit.edu/cxc/isis/), with substantially smaller and cleaner code than is possible in other analysis platforms, e.g. IDL (tm), and with potentially considerable I/O performance advantages.

The present implementation is incomplete, with the major gaps being that partial I/O is not yet supported, and that compound types -- i.e. structures -- may be read but not written. S-Lang version 2.0.6 or later is required.

2.1 Examples

SLh5 can be loaded at runtime into any suitably configured application which embeds the S-Lang interpreter, using either

        require("h5");
or
        () = evalfile("h5");
Now, given the S-Lang variables
        x = [-50:50];
        y = x^2;
        z = x^3;
        attr = "StringAttribute";
one might write them to an HDF5 file as 3 datasets and an attribute via
        file = "test.h5";
        file_id = h5_new(file);
        h5_write(file_id, x, "x");
        h5_write(file_id, y, "y");
        h5_write(file_id, z, "z");
        dataset_id = H5Dopen(file_id, "z");
        h5_write(dataset_id, attr, "attr", H5I_ATTR);
        h5_close(file_id);
They may be read back in various manners, such as
        x2 = h5_read(file, "x");
or
        file_id = h5_open(file);
        y2 = h5_read(file_id, "y");
and
        attr2 = h5_read("test.h5:/z","attr");
This last statement opens the test.h5 file and the z dataset, using an identifier for the latter as the location from which the attr attribute will be read. This is equivalent to
        attr2 = h5_read("test.h5","/z/attr");
both of which are convenient alternatives to either
        f = h5_open("test.h5");
        d = H5Dopen(f, "z");
        attr2 = h5_read(d, "attr");
or
        d = h5_open("test.h5:/z");
        attr2 = h5_read(d, "attr");

The correctness of the above HDF5 I/O calls may be easily verified through the following expressions, the first two of which should evaluate to 0 and the last to 1:

        any( x != x2 );
        any( y != y2 );
        attr == attr2;
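
The element-wise checks above assume that both arrays have the same shape. As a minimal sketch, the comparison could be wrapped in a small helper that also compares dimensions first. The name arrays_match is hypothetical and not part of SLh5; the helper uses only the standard S-Lang intrinsics array_info and any, and assumes both arguments are arrays.

```slang
% Hypothetical helper (not part of SLh5): returns 1 if the two arrays
% have the same shape and identical elements, and 0 otherwise.
% Assumes both arguments are arrays.
define arrays_match(a, b)
{
   variable dims_a, dims_b;

   % array_info returns (dims, ndims, type); only dims is kept here
   (dims_a, , ) = array_info(a);
   (dims_b, , ) = array_info(b);

   % Different number of dimensions, or different extents: no match
   if (length(dims_a) != length(dims_b)) return 0;
   if (any(dims_a != dims_b)) return 0;

   % Same shape, so an element-wise comparison is now safe
   return not any(a != b);
}
```

With such a helper, the dataset checks above might be written as arrays_match(x, x2) and arrays_match(y, y2), each of which should yield 1.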
As noted earlier, SLh5 is vectorized, which allows multiple operations to be performed with a single call. For example,
        just_two = h5_read("test.h5", ["y","x"]);
returns the x and y dataset arrays, in reverse write order, while both
        three_1 = h5_read("test.h5", ["x","y","z"]);
and
        three_2 = h5_read("test.h5", "*");
read all three dataset arrays at once, the second using wildcard shorthand notation. Each of these vectorized calls returns an array of arrays, the results of which might be verified via
        any( three_1[0] != three_2[0]);
        any( y != three_1[1]);
        any(three_1[2] != three_2[2] or three_2[2] != z);
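
Because a vectorized read returns an array of arrays, the results can be walked with an ordinary loop. The following is only a sketch: summarize_datasets is a hypothetical helper, not part of SLh5, and it relies only on standard S-Lang intrinsics (length, _typeof, string, sprintf).

```slang
% Hypothetical helper (not part of SLh5): build one summary line per
% dataset, given an array of names and the matching array of arrays.
define summarize_datasets(names, data)
{
   variable n = length(names);
   variable lines = String_Type[n];
   variable i;

   for (i = 0; i < n; i++)
     {
        % data[i] is itself an array; report its size and element type
        lines[i] = sprintf("%s: %d elements of type %s",
                           names[i], length(data[i]),
                           string(_typeof(data[i])));
     }
   return lines;
}

% Usage, assuming test.h5 was written as in the earlier example:
%   variable names = ["x", "y", "z"];
%   variable lines = summarize_datasets(names, h5_read("test.h5", names));
%   foreach (lines) { variable line = (); message(line); }
```

Passing the names explicitly, rather than using the "*" wildcard, keeps each summary line paired with a known dataset name.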

Now let's look at some scientific examples using the ISIS (http://space.mit.edu/cxc/isis/) astrophysical modeling and analysis system (and assuming Isis_Append_Semicolon = 1). Consider

        isis>  slab = urand(10, 30, 50)
which generates a 3D grid of uniformly-distributed, double-precision random numbers
        isis>  slab
        Double_Type[10,30,50]
We can use volview, our prototype 3D volume visualizer, to get an idea of what this slab looks like:
        isis>  require("volview")
        isis>  volview(slab)

The slab may be saved as an HDF5 file via
        isis>  h5_write("slab.h5", slab)
and recreated from this file via
        isis>  slab_copy = h5_read("slab.h5")
Verifying that the original and copy are equivalent is again as easy as
        isis>  any(slab != slab_copy)
        0
As another example, consider the S-Lang function
    define cone()
    {
       variable r, h, nsteps;
       switch(_NARGS)
       { case 3: (r, h, nsteps) = (); }
       { usage("cone(radius, height, number_of_steps)"); }

       r = r * 1.0; h = h * 1.0;
       variable hsteps = [ 0.0 : h : h / nsteps ];
       variable rsteps = (r*hsteps/h)^2;
       variable xysteps = [-r: r: 2*r/nsteps]^2, xsq = xysteps;
       nsteps = length(xysteps);
       variable c = Double_Type[length(rsteps), nsteps, nsteps];

       variable i = 0;
       foreach(rsteps) {
          variable rsq = ();
          variable j = 0;
          foreach (xysteps) {
             variable ysq = ();
             c[i, j, *]  = (ysq + xsq <= rsq);
             j++;
          }
          i++;
       }
       return c;
    }
which defines a cone in 3D space, with a given size and discretization (number of subdivisions along the Z axis). The command
        isis>  c = cone(5, 10, 200)
creates a 200x200x200 cube
        isis>  c
        Double_Type[200,200,200]
where only the elements lying on or inside the cone have non-zero values. This can again be visualized with volview
        isis>  volview(c)


(where we've applied a copper colormap, and rotated the volume to show both the interior and exterior surfaces) and then written, restored, and verified as before:
        isis>  h5_write("cone.h5", c)
        isis>  cone_copy = h5_read("cone.h5")
        isis>  any(c != cone_copy)
        0
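
Beyond the round-trip check, the shape of the volume itself can be inspected numerically. In the cone function above, the squared radius rsq grows with the slice index, so the number of non-zero elements should never decrease from one slice to the next. The sketch below counts non-zero elements per slice along the first dimension; slice_counts is a hypothetical helper, not part of SLh5, and it assumes a 3D array.

```slang
% Hypothetical helper (not part of SLh5): count the non-zero elements
% in each slice of a 3D array, taken along its first dimension.
define slice_counts(vol)
{
   variable dims, i;

   % array_info returns (dims, ndims, type); only dims is kept here
   (dims, , ) = array_info(vol);

   variable n = dims[0];
   variable counts = Integer_Type[n];

   for (i = 0; i < n; i++)
     {
        % where() returns the indices of the non-zero entries
        counts[i] = length(where(vol[i, *, *] != 0));
     }
   return counts;
}

% Usage, assuming c was created by cone(5, 10, 200) as above:
%   variable counts = slice_counts(c);
%   variable n = length(counts);
%   any(counts[[1:n-1]] < counts[[0:n-2]]);   % should be 0
```

The same counts could be computed on cone_copy to confirm that the restored volume has the same profile as the original.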

Additional examples may be found in the regression test subdirectory (./tests) of the SLh5 distribution.

