wiki:seclusterchi1.0

wiki:seclusterchi1.0 [Friday, 27 May 2011 : 15:42:43]
hvrooy version change to rev6189
wiki:seclusterchi1.0 [Friday, 27 May 2011 : 15:43:29] (current)
hvrooy
 +====== Running Chi 1.0 simulations on the SE cluster ======
 +
 +This is a tutorial that explains how to run chi 1.0 simulations on the SE cluster. The SE cluster is a collection of CPUs and a queue. You submit jobs to the queue, and the system itself determines on which CPU each job is processed.
 +
 +
 +===== Getting an account =====
 +
 +To get an account on the cluster, send a request to Henk van Rooy (h.w.a.m.v.rooy@tue.nl). Once the account has been set up, use SSH to connect to the cluster. The host name of the cluster is: secluster-02.se.wtb.tue.nl.
 +
 +===== Chi commands =====
 +
 +On the cluster, the chi commands do not work automatically, as they do on the SE-rack system after using toolselect (or chiselect). To use the chi commands from your own account, you have to tell the system where those commands are located. In other words, you have to make symbolic links. Step by step:
 +  - In your home directory, make a bin directory: type 'mkdir bin'.
 +  - Go to this directory: type 'cd bin'.
 +  - Use the following three commands to create the links.
 +      * ln -s /share/apps/se/chi/timed-c-simulator/trunk-rev6189/bin/chic
 +      * ln -s /share/apps/se/chi/timed-c-simulator/trunk-rev6189/bin/startmodel
 +      * ln -s /share/apps/se/chi/timed-c-simulator/trunk-rev6189/bin/prettify
 +
 +Now, the chi commands chic, startmodel and prettify work from every directory, because your bin directory is included in your path. Each link in the bin directory is a shortcut to the real location of the chic, startmodel or prettify file.
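
The linking steps above can also be scripted. The following is a minimal sketch, assuming the rev6189 path given on this page; the helper name link_chi_tools and its parameters are hypothetical and not part of chi.

```python
import os

# Path taken from this page; adjust it when a newer revision is installed.
CHI_BIN = '/share/apps/se/chi/timed-c-simulator/trunk-rev6189/bin'
TOOLS = ('chic', 'startmodel', 'prettify')

def link_chi_tools(chi_bin=CHI_BIN, bin_dir=None, tools=TOOLS):
    """Create (or refresh) symbolic links to the chi tools in bin_dir."""
    if bin_dir is None:
        bin_dir = os.path.join(os.path.expanduser('~'), 'bin')
    if not os.path.isdir(bin_dir):
        os.makedirs(bin_dir)
    links = []
    for tool in tools:
        link = os.path.join(bin_dir, tool)
        # lexists is also true for a dangling link left by an old revision
        if os.path.lexists(link):
            os.remove(link)
        os.symlink(os.path.join(chi_bin, tool), link)
        links.append(link)
    return links
```

Calling link_chi_tools() without arguments would create the three links in the bin directory of your home directory; calling it again with another chi_bin replaces them.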
 +
 +You should check regularly whether a new version of chi 1.0 is available. To do so, go to the chi directory (cd /share/apps/se/chi/timed-c-simulator) and see which versions are listed there. If a newer version is available, remove the links in your bin directory and make new ones using the path of the new chi version.
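
Checking for a newer version could be automated along the following lines. This is only a sketch: it assumes that every installed version is a subdirectory named trunk-revNNNN (as with trunk-rev6189 on this page), and the function name latest_chi_revision is hypothetical.

```python
import re

def latest_chi_revision(names):
    """Return the 'trunk-revNNNN' name with the highest revision number.

    Returns None when no name matches the assumed naming pattern.
    """
    best = None
    for name in names:
        match = re.match(r'trunk-rev(\d+)$', name)
        if match:
            revision = int(match.group(1))
            # compare numerically, so rev999 is older than rev6189
            if best is None or revision > best[0]:
                best = (revision, name)
    if best is None:
        return None
    return best[1]
```

On the cluster, the list of names would come from os.listdir('/share/apps/se/chi/timed-c-simulator').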
 +
 +
 +
 +===== Starting a chi 1.0 job =====
 +
 +A job has to be specified in a file, which may be a .txt file. The job in this file has to be submitted to the queue of the cluster. ​
 +
 +A job file has to be in the following form:
 +<code>
 +LD_LIBRARY_PATH=/share/apps/se/chi/timed-c-simulator/trunk-rev6189/lib
 +export LD_LIBRARY_PATH
 +./yourchimodel --run-model 'yourchimodelinput' | prettify
 +</code>
 +
 +In this job, the first two lines indicate where the chi libraries are located. A library location is also included in the compiled chi model, but this location only works on the rack systems. So you have to give the location of the chi library by hand. 
 +
 +In the third line, ./ is used to execute yourchimodel directly. Startmodel is not used, because it does not work yet. Because the model is started directly, you have to include '--run-model' to tell the compiled model that it should execute the whole model instead of one of the subprocesses in the model. '| prettify' is required to convert your chi output to output that can be read in a text editor (startmodel does that automatically).
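
The job file described above can be generated from a small helper. This is a sketch under assumptions: the function name job_text and its parameters are hypothetical, the '#!/bin/sh' first line follows the Python script further down this page, and the model arguments are written without quotes.

```python
def job_text(model, model_args,
             chi_lib='/share/apps/se/chi/timed-c-simulator/trunk-rev6189/lib'):
    """Return the contents of a job file that runs one chi model."""
    arguments = ' '.join(str(arg) for arg in model_args)
    lines = [
        '#!/bin/sh',
        # tell the system where the chi libraries are located
        'LD_LIBRARY_PATH=' + chi_lib,
        'export LD_LIBRARY_PATH',
        # run the whole model and make the output readable
        model + ' --run-model ' + arguments + ' | prettify',
    ]
    return '\n'.join(lines) + '\n'
```

Writing the returned text to a file such as yourjob.txt gives a job that can be submitted to the queue.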
 +
 +Now that we have a job file, we can submit it to the queue with the following command:
 +<​code>​
 +qsub -e error.out -o ct.txt yourjob.txt
 +</​code>​
 +  * -e indicates where the errors should be written
 +  * -o defines the output file
 +
 +For other options, type info qsub.
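
When jobs are submitted from Python, the qsub command can be built as an argument list, which avoids quoting problems with file names. A minimal sketch, using only the -e and -o options shown above; the function names qsub_command and submit are hypothetical.

```python
import subprocess

def qsub_command(job_file, output_file, error_file='error.out'):
    """Build the qsub call as an argument list (no shell quoting needed)."""
    return ['qsub', '-e', error_file, '-o', output_file, job_file]

def submit(job_file, output_file, error_file='error.out'):
    """Submit one job to the queue and return the exit status of qsub."""
    return subprocess.call(qsub_command(job_file, output_file, error_file))
```

For example, submit('yourjob.txt', 'ct.txt') runs the same command as the one given above.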
 +
 +It is usually convenient to generate and submit jobs automatically using Python.
 +
 +
 +===== Python script =====
 +
 +The following Python script is an example of running chi simulations on the cluster. In my working directory, I created a subdirectory for jobs ('jobs') and one for simulation output ('ct').
 +
 +<code>
 +#!/usr/bin/env python
 +#
 +import os
 +
 +# chi input variables (both given in percent)
 +U = range(5,105,5)
 +P = range(0,105,5)
 +
 +# chi input constants
 +te = 3.0
 +ce = 1.0
 +nr_lots = 100000
 +nr_idd = 20000
 +
 +# number of simulation replications
 +nr_reps = 30
 +
 +# setseed (see explanation below)
 +seeds = range(1,31*123456,123456)
 +
 +# working directory and chi library location
 +# (use the same chi revision as the links in your bin directory)
 +workdir = '/home/cveeger/dispatching/CT_TH_PM_testcase2'
 +chi_lib = '/share/apps/se/chi/timed-c-simulator/trunk-rev6189/lib'
 +
 +# delete previous jobs and simulation output (if any) in dirs 'jobs' and 'ct' respectively.
 +os.system('rm -f jobs/job*.txt')
 +os.system('rm -f ct/ct*.txt')
 +
 +# start main loop over chi input variables
 +for u in U:
 +    for p in P:
 +        # simulation specific pre-calculations (p_frac is p as a fraction)
 +        p_frac = 0.01*p
 +        ta_min_ss1 = 0.5 * p_frac * te
 +        ta_min_ss2 = (1 - p_frac) * te
 +        ta_min = max(ta_min_ss1, ta_min_ss2)
 +        ta = ta_min/(0.01*u)
 +
 +        # start loop of simulation reps with the same input variables
 +        for i in range(0,nr_reps):
 +            # specify job path and output path; the integer percentages
 +            # are used in the file names to avoid float rounding
 +            jobpath = '%s/jobs/job_%d_%d_%d.txt' % (workdir, u, p, i)
 +            ct_save = '%s/ct/ct_%d_%d_%d.txt' % (workdir, u, p, i)
 +
 +            # create job file
 +            job = open(jobpath,'w')
 +            seed = seeds[i]
 +            job.write('#!/bin/sh\n')
 +            job.write('LD_LIBRARY_PATH=' + chi_lib + '\n')
 +            job.write('export LD_LIBRARY_PATH\n')
 +            job.write(workdir + '/dispatch --setseed ' +
 +                      str(seed) + ' --run-model ' + str(nr_lots) + ' ' + str(nr_idd) + ' '
 +                      + str(ta) + ' ' + str(te) + ' ' + str(ce) + ' ' + str(p_frac) + ' | prettify\n')
 +            job.close()
 +
 +            # submit job to the queue
 +            os.system('qsub -e error.out -o ' + ct_save + ' ' + jobpath)
 +</code>
 +
 +On the cluster, jobs may start almost simultaneously on different processors. Because the initial values of the chi distributions depend on the actual time, they may be the same for different simulation runs. As a consequence, the outcomes of those runs may be the same as well, whereas for simulation replications we are interested in the stochastic behavior of the simulation. Therefore, the initial values of the distributions have to be set by hand, using the setseed option (see the Python script). A different seed value is used for each simulation replication with the same input variables. As a rule of thumb, the difference between seed values should be larger than four times the number of distributions used in the chi model. I used a difference of 123456 to be on the safe side =).
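
The rule of thumb for the seed spacing can be checked when the seeds are generated. A minimal sketch: the function name make_seeds and the parameter nr_distributions (the number of distributions in your chi model, which you have to count yourself) are hypothetical.

```python
def make_seeds(nr_reps, nr_distributions, step=123456, first=1):
    """Return one seed per replication, spaced 'step' apart.

    Raises ValueError when the spacing does not exceed four times the
    number of distributions (the rule of thumb from this page).
    """
    min_step = 4 * nr_distributions
    if step <= min_step:
        raise ValueError('seed step %d should be larger than %d' % (step, min_step))
    return [first + i * step for i in range(nr_reps)]
```

With the default step, make_seeds(30, 10) gives the same first 30 seeds as the script above.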
 +
 +
 +
  
wiki/seclusterchi1.0.txt · Last modified: Friday, 27 May 2011 : 15:43:29 by hvrooy