Scripts

Several scripts have been written for running the models and post-processing the results. They proliferated in a haphazard fashion to meet immediate needs as they arose. We are now consolidating some of them into more general versions that can meet particular needs in a wider range of circumstances. The list below describes some of the main scripts. It is divided into two parts: the current set of scripts typically used, and legacy scripts that are rarely if ever used anymore.

The scripts are written in Perl (with a ".pl" file extension), Python (with a ".py" file extension), or are bash shell scripts (with a ".sh" file extension or no file extension).

Installing Scripts

The scripts are typically installed by making a copy of the directory /home/pwolberg/Documents/LungDoc/2D-ABM-scripts on helico. Eventually these will be reorganized and placed in the version control archive. To conveniently access the scripts, add the directory where they reside to your execution path.

Using Scripts

Most scripts display a brief usage text if run without any command line arguments. Some scripts must be edited before use; this is noted in the description of those scripts. Before running such a script, check that it has been edited as needed.

Main Scripts

getTimeStepsFromCsv.pl

This script extracts lines from a model run statistics file (a csv file) for a particular time step (ex. time step 117), for a particular day (the last time step of that day, ex. day 200), or for a set of time steps at a fixed interval (ex. every 100 time steps: time steps 0, 100, 200, etc.). It is used by makeLhsPRCC.pl.
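The single time step and interval selections can be sketched in Python as follows. This is only an illustration of the idea, not the Perl script itself. It assumes the statistics file has a header row and that the time step is in the first column; the function name select_rows, the step_column parameter, and the sample data are all hypothetical.

```python
import csv
import io


def select_rows(csv_text, step=None, interval=None, step_column=0):
    """Return the header plus the data rows whose time step matches.

    step        -- keep only the row for this exact time step
    interval    -- keep rows whose time step is a multiple of this value
    step_column -- assumed position of the time step field (hypothetical)
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    selected = []
    for row in data:
        t = int(row[step_column])
        if step is not None and t == step:
            selected.append(row)
        elif interval is not None and t % interval == 0:
            selected.append(row)
    return [header] + selected


# Made-up statistics, for illustration only.
sample = "time,bacteria\n0,10\n50,12\n100,15\n117,16\n200,21\n"

print(select_rows(sample, step=117))      # header plus the row for time step 117
print(select_rows(sample, interval=100))  # header plus rows for time steps 0, 100, 200
```

Selecting by day is left out of the sketch because it depends on how many time steps the model uses per day.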

lhs-qsub.sh

A PBS script for running an LHS on a cluster that uses PBS. It is invoked by lhssubmit once for each replication of an LHS. Each invocation uses the PBS job array feature to run the set of experiments for that replication. For an LHS of 100 experiments and 3 replications, this script is invoked 3 times: for replication 1 of experiments 1 to 100, replication 2 of experiments 1 to 100, and replication 3 of experiments 1 to 100.

Edit your copy of this script to set the model command line arguments. There is a section where various script variables are set to define the command line arguments. Make sure to review this carefully, especially for LHS runs, since you want the correct run options. These run options will be used for each run in an LHS.

lhsShow.pl

Shows which parameters in an LHS parameter file have a range specified. This script is used by makereportNew.pl when creating an LHS report to list the varying parameters in the report. It can also be used to see which parameters vary in an LHS instead of searching for the parameters with ranges in an editor or a tool like grep. It is not as useful as it once was (except that some other scripts depend on it) because an LHS parameter file no longer needs to have a range for all parameters. It only needs a range for those parameters that vary (those for which the max value is not the same as the min value). So now it is relatively easy to see which parameters in an LHS parameter file are being varied just by inspecting it rather than running this script.

This script expects the LHS parameter file to come from standard input. For example, use "lhsShow.pl < lhsparmfile.xml" rather than "lhsShow.pl lhsparmfile.xml". The latter will wait for the LHS parameter file on standard input. Type Ctrl-C to abort it and retype the command as in the first example.

lhssubmit

For running an LHS on a cluster that uses PBS. This is a shell script that invokes the lhs-qsub.sh PBS script once for each replication in an LHS.

Log in to the LHS head/login node of a system that has PBS installed. Edit lhs-qsub.sh as needed, and always check it to be sure the correct model run options are used. Then, from the main directory for an LHS, run lhssubmit start-experiment finish-experiment replication-count (ex. lhssubmit 1 100 3, for an LHS with 100 runs and 3 replications per run).
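The relationship between the three lhssubmit arguments and the resulting job array submissions can be sketched as below. This is a hedged illustration in Python, not the shell script itself: the function name plan_submissions is hypothetical, and the sketch only lists what would be submitted rather than calling any PBS command.

```python
def plan_submissions(start_experiment, finish_experiment, replication_count):
    """Return one (replication, first_experiment, last_experiment) tuple per
    job array submission: one submission for each replication."""
    if start_experiment > finish_experiment:
        raise ValueError("start-experiment must not exceed finish-experiment")
    if replication_count < 1:
        raise ValueError("replication-count must be at least 1")
    return [
        (replication, start_experiment, finish_experiment)
        for replication in range(1, replication_count + 1)
    ]


# Equivalent of "lhssubmit 1 100 3": three submissions, each covering
# experiments 1 to 100 for one replication.
for replication, first, last in plan_submissions(1, 100, 3):
    print(f"replication {replication}: experiments {first} to {last}")
```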

makeLhsPRCC.pl

This script is used to create an LHS matrix from the results of a set of runs, either an LHS or a set of robustness, depletion or knockout runs. The matrix is used for further analysis, often PRCC (Partial Rank Correlation Coefficient) analysis using Matlab.

This script uses getTimeStepsFromCsv.pl.

Run the script without any command line arguments to get usage directions on how to run the script.

The output produced by this script is one line for each replication of each experiment (each run of each parameter file). Each line has the run results for that run, for a particular time step, followed by the parameter values for the parameters that were varied for the LHS (for robustness runs, or other special runs such as depletion or knockout runs, the parameter values to include are determined from the LHS that the parameter file was originally created for).

The particular time step to use is determined by one of the command line arguments. Typically it is for the last time step.

By default the LHS matrix is not sorted. There is a command line argument that allows specifying a single column to sort on, by ordinal position of that column in the statistics file.
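Sorting on a single column by ordinal position can be sketched as follows. The sketch assumes that the position is 1-based and that the column is sorted numerically in ascending order; the source does not state either convention, so check the script's usage text. The name sort_matrix and the sample rows are hypothetical.

```python
def sort_matrix(rows, column=None):
    """Return the rows sorted on one column.

    column is an assumed 1-based ordinal position. When it is None the
    rows are returned in their original order, matching the unsorted default.
    """
    if column is None:
        return list(rows)
    return sorted(rows, key=lambda row: float(row[column - 1]))


# Made-up matrix rows, for illustration only.
matrix = [["3.5", "0.2"], ["1.0", "0.9"], ["2.25", "0.4"]]

print(sort_matrix(matrix))     # original order
print(sort_matrix(matrix, 1))  # ordered by the first column
```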

makemovies.pl

A script that combines png files into a movie in an mpeg or avi file. Edit the script to select which type of movie file to produce. This script will not work with the 3D lung model since it combines all png files in a directory into a movie. For the 3D lung model this would combine 2D and 3D png files into the same movie file.

makemovies-lung3d.pl

A script that combines 3D png files from the 3D lung model into a movie in an mpeg or avi file. Edit the script to select which type of movie file to produce.

makereportNew.pl

Processes model output and produces a LaTeX file. Designed for processing the results of an LHS run, but can be used for some other types of runs as well if the run parameter files follow the LHS naming convention and the result files are stored in folders using the LHS folder structure. See LHS Parameter Processing for more information on the LHS parameter file naming and folder structure.

This script requires these support tools:

lhsShow.pl
plot.sh
gnuplot

This script must be edited before it is used, to define the column positions of fields in the run result statistics files that are used in the generated report. For example, which columns have intra-cellular bacteria, which have extra-cellular bacteria, etc. See the section of the script near the start where the $model variable is defined.

Run the script without any command line arguments to get a list of the arguments.

The output of the script is a LaTeX file. The name of this output file is specified as a command line argument.

merge_csv.py

Takes multiple csv files and exports them to a Microsoft Excel 2003 XML document. Each csv file will have its own worksheet named after the filename. OpenOffice and Microsoft Office 2003 and newer support this format, though OpenOffice requires the "java-common" package to be installed (if on Ubuntu).

mkdirs.pl

Create a set of directories for an LHS. This is needed when running an LHS on a desktop system, but is not needed when running on a cluster, since the cluster run scripts create the directories as needed.

In the main directory for an LHS, do mkdirs.pl E R, where E is the number of LHS experiments (i.e. the LHS sample size, which is also the number of regular parameter files for an LHS) and R is the number of replications - how many times each LHS experiment is to be repeated. For mkdirs.pl 2 3 the following directory structure would be created:

exp1
	exp1-1
	exp1-2
	exp1-3
exp2
	exp2-1
	exp2-2
	exp2-3
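The directory layout above can be reproduced with a short Python sketch. It is an illustration of the structure only, not the Perl script; the function name make_lhs_dirs is hypothetical, and the example runs in a temporary directory so that it does not touch a real LHS directory.

```python
import os
import tempfile


def make_lhs_dirs(base_dir, experiments, replications):
    """Create expE/expE-R under base_dir for every experiment E and
    replication R, and return the created paths relative to base_dir."""
    created = []
    for experiment in range(1, experiments + 1):
        for replication in range(1, replications + 1):
            path = os.path.join(
                base_dir,
                f"exp{experiment}",
                f"exp{experiment}-{replication}",
            )
            os.makedirs(path, exist_ok=True)
            created.append(os.path.relpath(path, base_dir))
    return created


# Equivalent of "mkdirs.pl 2 3", run in a throwaway directory.
with tempfile.TemporaryDirectory() as lhs_dir:
    for path in make_lhs_dirs(lhs_dir, 2, 3):
        print(path)
```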

plot.sh

This script is not invoked directly by a user. It is invoked by the makereportNew.pl script. It is used to run gnuplot to create png files of X-Y plots of simulation results.

runAbm.pl

Used to run all or some of the experiments for an LHS run. This script is used when not using a batch queuing system. For each CPU core on a system, open a terminal window and use runAbm.pl to run a subset of the experiments for an LHS. For example, suppose an LHS has 100 experiments. If a system has 2 CPU cores, you could open 2 terminal windows, invoke runAbm.pl in the first one to run experiments 1 to 50, and invoke runAbm.pl in the second one to run experiments 51 to 100.
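Working out which experiments to give each terminal window is simple arithmetic, sketched here in Python. The helper name split_experiments is hypothetical and is not part of runAbm.pl; it only computes contiguous ranges, which you would then pass to separate runAbm.pl invocations by hand.

```python
def split_experiments(total_experiments, cores):
    """Split experiments 1..total_experiments into contiguous
    (first, last) ranges, one per core, as evenly as possible."""
    if total_experiments < 1 or cores < 1:
        raise ValueError("both arguments must be at least 1")
    base, extra = divmod(total_experiments, cores)
    ranges = []
    first = 1
    for core in range(cores):
        # The first `extra` cores each take one additional experiment.
        size = base + (1 if core < extra else 0)
        if size == 0:
            continue  # more cores than experiments
        ranges.append((first, first + size - 1))
        first += size
    return ranges


print(split_experiments(100, 2))  # [(1, 50), (51, 100)]
```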

runAbm.pl can also be used to re-run some experiments in an LHS.

runAbm.pl should be edited prior to using it, to specify the run options to use: what the simulation time period should be (ex. 100 days of simulation time), what statistics and/or graphics should be saved, how frequently to save, etc. This will vary according to the model being run.

Run the mkdirs.pl script prior to running runAbm.pl, to create the directory structure that runAbm.pl expects.

Type runAbm.pl without any command line arguments to get a list of the arguments it expects. The first command line argument is the executable for the model to run.

snapshot_lhs.py

A script for post-processing a set of model runs, such as an LHS run, to generate graphics from saved model states produced by each model run. Run it from the main directory for an LHS. Run the command "snapshot_lhs.py -h" to see the command line arguments for the script.