e-BioInfra Tutorial at the NBIC Conference:

port applications to the grid and run them as part of a workflow

Date: 19 April 2011

Location: NBIC Conference, Lunteren

Instructors: Barbera van Schaik, Mark Santcroos, Aldo Jongejan, Antoine van Kampen and Silvia D. Olabarriaga

NBIC conference 2011 - all tutorials: https://wiki.nbic.nl/index.php/NBIC_Conference_2011:_BioAssist_Tutorials_Installation

URL to this tutorial: http://www.bioinformaticslaboratory.nl/foswiki/bin/view/BioLab/NBICTutorial

Description: The tutorial demonstrates how to port and run applications on the grid with the eBioScience infrastructure. The participant learns how to run an existing grid workflow, wrap an application as a workflow component and link components in a workflow.

IMPORTANT: when using the VBrowser, do not delete or copy any grid files that you did not create yourself! The LFC is a shared storage device and its use is based on trust.

Installation instructions

A virtual machine is available with all software pre-installed. Alternatively, you can install the VBrowser, a wrapper script and the Moteur2 application on your own machine. The VBrowser is used to access grid resources, copy files from and to them, and start grid workflows. The wrapper script is used to wrap executables as workflow components. The Moteur2 application can be used to connect workflow components.

Install VirtualBox and load the eBioInfra virtual machine (VM)

  1. Install the latest version of VirtualBox on your machine. Binaries for a number of platforms are available on the USB stick.
  2. Download the VM if you did not receive a USB stick: eBioInfra.ovf (16 KB), eBioInfra-disk1.vmdk (1.7 GB)
  3. Start VirtualBox and import the eBioInfra.ovf file (File > Import Appliance)
  4. Start the eBioInfra virtual machine

Installation of VBrowser, wrapper script and Moteur2 on your own system

  • Perl (for the executable wrapper)
  • Java 1.6 (for VBrowser and Moteur2)
  • Graphviz (for Moteur2)
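
Before installing the tools on your own system, it can help to check whether the prerequisites listed above are already on your PATH. The following is a minimal sketch, not part of the tutorial software. The command names (perl, java, and dot as the Graphviz layout program) are assumptions about a typical installation, and the function name find_prerequisites is hypothetical. The check does not verify the Java version.

```python
import shutil

# Commands assumed to correspond to the prerequisites listed above.
# "dot" is the Graphviz layout program; these names are assumptions
# about a typical installation and are not taken from the tutorial.
REQUIRED_COMMANDS = {
    "Perl": "perl",
    "Java": "java",
    "Graphviz": "dot",
}


def find_prerequisites(commands=REQUIRED_COMMANDS):
    """Map each tool name to the path of its command, or None if not found."""
    return {tool: shutil.which(cmd) for tool, cmd in commands.items()}


# Print one line per tool so missing prerequisites are easy to spot.
for tool, path in find_prerequisites().items():
    print(f"{tool}: {path if path else 'not found on PATH'}")
```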


Background and manuals:

Loading the grid certificate

The tutorial certificate was deactivated after the conference. Ask Aldo or Barbera how to proceed if you would like to do this tutorial.

Get acquainted with the interface

Start the VBrowser and activate the grid certificate

  1. Locate the installation of VBrowser software on your computer (tip: menu)
  2. Start the VBrowser
  3. Log in with the guest certificate by pressing the "!" button. Leave all settings unchanged and fill in the password. Push the "Create" button and then "OK"

Copy a file to the grid and examine the stored file

  1. Locate directory /grid/vlemed/NBICTutorial (tip: LFC resource)
  2. Create a subdirectory there to store your files (referred to below as MYDIR). You might need to refresh the screen before you see the new directory.
  3. Copy a (small) file from your local computer to MYDIR and observe the messages shown on the Transferring pop-up (tip: open new VBrowser window)
  4. Look at the properties of the new file. On which host was it stored?
  5. Look at the file replicas. What is the complete name of the physical file?
  6. Add a new replica for this file. What is the complete name of the new physical file?
  7. Look again at the properties of the file. On which host(s) is it stored?

Run an existing workflow on the grid

  1. Go to the /grid/vlemed/NBICTutorial directory
  2. Browse to directory HelloWorld
  3. Right-click on the HelloWorld.gwendia file and select "View with.." > "Other.." > "ViewerMoteur2"
  4. Fill in your name (without spaces). Make sure you press enter after each value.
  5. Submit the workflow by pressing the "Run workflow" button
  6. The jobs can be monitored from the VBrowser and from a web browser. Copy and paste the URL into a web browser.
  7. Monitor the workflow execution (tip: click on the workflow components)
  8. Locate the stdout, stderr and output files
  9. Look at the generated files and answer the questions
    1. On which host did the job run?
    2. How long did the job take to run? (tip: inspect stdout, stderr)
  10. Copy the result file to your desktop

Run an existing workflow with multiple input parameters

  1. Run the workflow again with more than one input value (press enter after each value!)
  2. Look at the generated files and answer the questions
    1. Where did the jobs run?
    2. Where were the files stored?
    3. Were there any errors/warnings during the execution?

Port an application to the grid

Examine the files of the hello-world workflow

  1. Take a look at the hello-world.sh file (on your desktop in directory HelloWorld)
  2. Test the shell script via the command line
  3. To port an application to the grid it is necessary to properly define the input and output parameters and files. The example shell script has one input value (inputParam) and one output file (outputFile). A wrapper has to be written for the executable.
    1. Take a look at the HelloWorld.xml file. This is the GASW wrapper for the executable that you would like to run on the grid.
    2. One or more components can be linked in a workflow. This is defined in the Gwendia language or in the Scufl language. Take a look at the HelloWorld.gwendia file to get an impression of the Gwendia language.
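
To make the idea of "one input value, one output file" concrete, here is a minimal sketch of a program with that interface. It is not the contents of hello-world.sh, which this page does not show. The function name hello_world and the greeting text are assumptions chosen for illustration.

```python
import os
import tempfile


def hello_world(input_param, output_file):
    """Write a greeting for input_param to output_file; return the text written."""
    text = f"Hello {input_param}\n"
    with open(output_file, "w", encoding="utf-8") as handle:
        handle.write(text)
    return text


# Example run: one input value and one output file in a temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "hello.txt")
hello_world("Alice", out_path)
with open(out_path, encoding="utf-8") as handle:
    print(handle.read(), end="")
```

A wrapper for such a program only needs to know these two things: which value is passed in and which file comes out.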

Port a hello-world.sh script to the grid with a wrapper script

  1. It is not necessary to write these files ourselves. They can be generated automatically with a wrapper script.
  2. Open a terminal and run the command "create_GASW_SCUFL_GWENDIA". If you are not using the virtual machine, run the Perl script "create_GASW_SCUFL_GWENDIA.pl" instead. Follow the instructions on the screen. An example is given below.
    1. About the LFC directory: fill in the grid directory name where you would like to place the files, e.g. /grid/vlemed/NBICTutorial/MYDIR
    2. About the names of the input and output parameters/files: they must not start with in, out or result
    3. About the template for the output file (/grid/vlemed/NBICTutorial/MYDIR/result/$na1_%s.txt): you can generate output file names automatically by supplying a template name.
      • $dir1 means: place the output file in the same directory as the input file (not applicable in this example)
      • $na1 is the name of the first input parameter/file, $na2 of the second parameter/file, etc.
      • %s is a random number. This prevents files from being overwritten accidentally.
  3. Copy the generated files and the hello-world.sh file to your directory on the grid
  4. Run the workflow (the .gwendia file)
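
The naming rule and the output-file template from step 2 can be illustrated with a small sketch. This is not how the wrapper script or Moteur2 implements them. It assumes that the prefix check is case-sensitive, that $naN stands for the base name of the N-th input value or path, that $dirN stands for its directory, and that %s is replaced by a random integer. The function names is_valid_name and expand_template are hypothetical.

```python
import posixpath
import random
import re

# Prefixes that parameter/file names must not start with (from the text above).
# Treating the check as case-sensitive is an assumption.
FORBIDDEN_PREFIXES = ("in", "out", "result")


def is_valid_name(name):
    """Return True if the name does not start with a forbidden prefix."""
    return not name.startswith(FORBIDDEN_PREFIXES)


def expand_template(template, inputs, rng=random):
    """Expand $dirN, $naN and %s in an output-file template (illustrative only).

    inputs is the ordered list of input values or grid paths.
    """
    def directory(match):
        # $dirN: directory of the N-th input (N is 1-based).
        return posixpath.dirname(inputs[int(match.group(1)) - 1])

    def name(match):
        # $naN: base name of the N-th input (N is 1-based).
        return posixpath.basename(inputs[int(match.group(1)) - 1])

    expanded = re.sub(r"\$dir(\d+)", directory, template)
    expanded = re.sub(r"\$na(\d+)", name, expanded)
    # %s: a random number, so repeated runs do not overwrite each other.
    return expanded.replace("%s", str(rng.randrange(10**6)))


template = "/grid/vlemed/NBICTutorial/MYDIR/result/$na1_%s.txt"
print(expand_template(template, ["Alice"]))
print(is_valid_name("name"), is_valid_name("outfile"))
```

Under these assumptions, each run with the input value Alice yields a path of the form .../result/Alice_<random number>.txt, so results from repeated runs end up in separate files.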

Example session:

ebioinfra@ebioinfraVM:~$ create_GASW_SCUFL_GWENDIA 
This script creates a Gasw xml file, the scufl AND gwendia file with the given in- and output.
The scufl and gwendia file will have the same name as you enter for the Gasw file preceded by WF_ (e.g. WF_"name".scufl or "name".gwendia), these files are written in the output directory.

Enter the name for the output directory e.g.:
"outputXML" (will be created in working directory)
"the full path"  /home/user/creategasw/outputXML)
Press enter to use the default directory: xmlfiles): 
Created directory /home/ebioinfra/Desktop/workflowfiles

Enter the name for your Gasw outputfile, e.g. myGaswFile.xml.
 If you enter an existing file(name), this file will be overwritten!!!

Opened file: /home/ebioinfra/Desktop/workflowfiles/MyHelloWorld.xml

Enter the directory on the LFC where you will store the Gasw xml file (e.g. /grid/vlemed/AMC-e-BioScience/thisworkflow/gasw)

Enter the access type for the executable (LFN or URL): 

Enter the name of the executable: 

Enter the path where the executable is stored (e.g. /grid/vlemed/user

Writing to MyHelloWorld.xml: 



Enter the number of parameters to use  (press enter if none): 

Enter the parameter name for parameter no. 1: 
Enter the parameter option for parameter no. 1 (e.g. -l) 
Writing to MyHelloWorld.xml: 


Enter the number of input files (press enter if none): 

Enter the number of output files  (press enter if none): 

Enter the description for output file no. 1: 
Enter the output filename AND directory for output file no. 1 :  (e.g. /grid/vlemed/user/outdir/Out_Filename.txt or $dir(n)/$na(n)/%s_output_file) 
Writing to MyHelloWorld.xml: