Running Chi 1.0 simulations on the SE cluster

This tutorial explains how to run chi 1.0 simulations on the SE cluster. The SE cluster is a collection of CPUs and a queue. You submit jobs to the queue, and the system itself determines on which CPU each job is processed.

Getting an account

To get an account on the cluster, send a request to Henk van Rooy (h.w.a.m.v.rooy@tue.nl). Once the account has been set up, use SSH to connect to the cluster. The host name of the cluster is: secluster-02.se.wtb.tue.nl.

Chi commands

On the cluster, the chi commands do not work automatically, as they do on the SE-rack system after using toolselect (or chiselect). To use the chi commands from your home directory, you have to tell the system where those commands are located. In other words, you have to make links. Step by step, do the following:

  1. In your home directory, make a bin directory: type 'mkdir bin'.
  2. Go to this directory: type 'cd bin'.
  3. Use the following three commands to create the links.
    • ln -s /share/apps/se/chi/timed-c-simulator/trunk-rev6189/bin/chic
    • ln -s /share/apps/se/chi/timed-c-simulator/trunk-rev6189/bin/startmodel
    • ln -s /share/apps/se/chi/timed-c-simulator/trunk-rev6189/bin/prettify

Now the chi commands chic, startmodel and prettify work from every directory. This is because your bin directory is in your PATH. The links you just made in the bin directory are shortcuts to the real locations of the chic, startmodel and prettify files.

You should check regularly whether a new version of chi 1.0 is available. To see which versions are available, go to the chi directory (cd /share/apps/se/chi/timed-c-simulator) and list its contents. If there is a newer version, remove the links in your bin directory and make new ones using the path of the new chi version.
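The version check and re-linking can also be scripted. The following is a minimal sketch, not part of the cluster setup itself: the function names list_chi_versions and relink_chi_tools and their parameters are made up for this example, and it assumes that every chi version directory has a bin subdirectory containing chic, startmodel and prettify.

```python
import os


def list_chi_versions(chi_root):
    """Return the sorted names of the version directories under chi_root."""
    return sorted(name for name in os.listdir(chi_root)
                  if os.path.isdir(os.path.join(chi_root, name)))


def relink_chi_tools(version_bin_dir, user_bin_dir,
                     tools=("chic", "startmodel", "prettify")):
    """Make the links in user_bin_dir point at the tools in version_bin_dir."""
    if not os.path.isdir(user_bin_dir):
        os.makedirs(user_bin_dir)
    for tool in tools:
        link = os.path.join(user_bin_dir, tool)
        # lexists() is also true for a dangling link left by a removed version
        if os.path.lexists(link):
            os.remove(link)
        os.symlink(os.path.join(version_bin_dir, tool), link)
```

For example, list_chi_versions('/share/apps/se/chi/timed-c-simulator') shows the installed versions, and relink_chi_tools('/share/apps/se/chi/timed-c-simulator/trunk-rev6189/bin', os.path.expanduser('~/bin')) recreates the three links for that version.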

Starting a chi 1.0 job

A job is specified in a file (a plain .txt file will do), which is then submitted to the queue of the cluster.

A job file has to be in the following form:

LD_LIBRARY_PATH=/share/apps/se/chi/timed-c-simulator/trunk-rev6189/lib
export LD_LIBRARY_PATH
./yourchimodel --run-model 'yourchimodelinput' | prettify

In this job, the first two lines indicate where the chi libraries are located. A library location is also included in the compiled chi model, but this location only works on the rack systems. So you have to give the location of the chi library by hand.

In the third line, ./ is used to execute yourchimodel directly. startmodel is not used, because it does not work yet. Because the model is executed directly, you have to include '--run-model' to indicate that the whole model should be executed instead of one of the subprocesses in the model. '| prettify' is required to convert your chi output to output that can be read in a text editor (startmodel does that automatically).
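When many job files are needed, the three-line form described above can be produced by a small helper. This is only a sketch: make_job_text and its parameters are hypothetical names, the '#!/bin/sh' first line follows the Python script further down this page, and the model arguments are assumed to contain no spaces or other characters that are special to the shell.

```python
def make_job_text(lib_dir, model_path, model_args):
    """Return the text of a job file that runs a compiled chi model."""
    # Build: <model> --run-model <arg1> <arg2> ... | prettify
    command = [model_path, "--run-model"] + [str(arg) for arg in model_args]
    lines = [
        "#!/bin/sh",
        "LD_LIBRARY_PATH=" + lib_dir,   # where the chi libraries are located
        "export LD_LIBRARY_PATH",
        " ".join(command) + " | prettify",
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned text to yourjob.txt gives a file that can be submitted to the queue.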

Now that we have a job file, we can submit it to the queue with the following command:

qsub -e error.out -o ct.txt yourjob.txt

-e specifies the file to which errors are written; -o specifies the output file.

For other options, type 'info qsub'.
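The same submission can be issued from Python. The sketch below only uses the -e and -o options shown above; qsub_command and submit are names made up for this example, and submit assumes that qsub is available on the PATH of the machine where it is called.

```python
import subprocess


def qsub_command(job_path, output_path, error_path="error.out"):
    """Build the argument list for submitting job_path with qsub."""
    return ["qsub", "-e", error_path, "-o", output_path, job_path]


def submit(job_path, output_path, error_path="error.out"):
    """Run qsub and return its exit status."""
    return subprocess.call(qsub_command(job_path, output_path, error_path))
```

Calling submit('yourjob.txt', 'ct.txt') runs the same command as the one typed by hand above. Passing the arguments as a list avoids problems with spaces in file names.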

It is usually convenient to generate and submit jobs automatically using Python.

Python script

The following Python script is an example of running chi simulations on the cluster. In my working directory, I created one subdirectory for jobs ('jobs') and one for simulation output ('ct').

#!/usr/bin/env python
#
import os

# chi input variables (percentages)
U = range(5,105,5)
P = range(0,105,5)

# chi input constants
te = 3.0
ce = 1.0
nr_lots = 100000
nr_idd = 20000

# number of simulation replications
nr_reps = 30

# setseed (see explanation below)
seeds = range(1,31*123456,123456)

# delete previous jobs and simulation output (if any) in dirs 'jobs' and 'ct' respectively.
os.system('rm jobs/job*.txt')
os.system('rm ct/ct*.txt')

# start main loop over chi input variables
for u in U:
    for p in P:
        # simulation specific pre-calculations.
        # p_frac is p as a fraction; the integer p is kept for the file names
        p_frac = 0.01*p
        ta_min_ss1 = 0.5 * p_frac * te
        ta_min_ss2 = (1 - p_frac) * te
        ta_min = max(ta_min_ss1, ta_min_ss2)
        ta = ta_min/(0.01*u)

        # start loop of simulation reps with the same input variables
        for i in range(0,nr_reps):
            # specify job path and output path
            jobpath = ('/home/cveeger/dispatching/CT_TH_PM_testcase2/jobs/job_%s_%s_%s.txt'
                       %(str(u),str(p),str(i)))
            ct_save = ('/home/cveeger/dispatching/CT_TH_PM_testcase2/ct/ct_%s_%s_%s.txt'
                       %(str(u),str(p),str(i)))

            # create job file
            job = open(jobpath,'w')
            seed = seeds[i]
            job.write('#!/bin/sh\n')
            # library path of the chi revision in use (here trunk-rev2303)
            job.write('LD_LIBRARY_PATH=/share/apps/se/chi/timed-c-simulator/trunk-rev2303/lib\n')
            job.write('export LD_LIBRARY_PATH\n')
            job.write('/home/cveeger/dispatching/CT_TH_PM_testcase2/dispatch --setseed ' +
                      str(seed) + ' --run-model ' + str(nr_lots) + ' ' + str(nr_idd) + ' '
                      + str(ta) + ' ' + str(te) + ' ' + str(ce) + ' ' + str(p_frac) + ' | prettify\n')
            job.close()

            # submit job to the queue
            os.system('qsub -e error.out -o ' + ct_save + ' ' + jobpath)

On the cluster, jobs may start almost simultaneously on different processors. Because the initial values of the chi distributions depend on the actual time, they may be the same for different simulation runs. As a consequence, the outcomes of those runs may be the same as well, whereas for simulation replications we are interested in the stochastic behavior of the simulation. Therefore, the initial values of the distributions have to be set by hand. This can be done with the --setseed option (see the Python script). A different seed value is used for each simulation replication with the same input variables. As a rule of thumb, the difference between seed values should be larger than four times the number of distributions used in the chi model. I used a difference of 123456 to be on the safe side =).
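The rule of thumb can be checked when the seeds are generated. The following is a sketch only: make_seeds and its parameters are hypothetical names, and nr_distributions is the number of distributions in your own chi model, which you have to count yourself.

```python
def make_seeds(nr_reps, nr_distributions, spacing=123456, first=1):
    """Return one seed per replication, each `spacing` apart."""
    # rule of thumb: the spacing must exceed 4 times the number of distributions
    if spacing <= 4 * nr_distributions:
        raise ValueError("spacing must be larger than 4 * nr_distributions")
    return [first + i * spacing for i in range(nr_reps)]
```

With the defaults, make_seeds(30, 10) starts at 1 and continues with 123457, which are the same first values as the seeds list in the script above.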

wiki/seclusterchi1.0.txt · Last modified: Friday, 27 May 2011 : 15:43:29 by hvrooy