CRAB for CMS-HI Dilepton group


  • DongHo summarized the prerequisites; please visit here.



  • Step 1 : Store data at castor (without publishing on DBS)
(Note: if you access the data in castor with CRAB, please read this.)

  • Step 2 : Store data at MIT and publish on DBS

Tips for effective work

  • If you want to store data at castor, you should do the following before submitting the CRAB job. (ex: stored directory: /castor/)
    • rfrm -r /castor/
    • rfmkdir /castor/
    • rfchmod 775 /castor/
(If the directory does not give write permission to the group, you will get exit code 60307, like below.)
ID    END STATUS            ACTION       ExeExitCode JobExitCode E_HOST
----- --- ----------------- ------------  ---------- ----------- ---------
1     Y   Retrieved         Cleared       0          60307

crab:  ExitCodes Summary
 >>>>>>>>> 1 Jobs with Wrapper Exit Code : 60307 
    List of jobs: 1 
   See here for the exit code meaning

drwxr-xr-x   0 hckim    zh                          0 Sep 06 15:44 StoreResults-PyquenEvtGen_bJpsiMuMu_JPsiPt912_RECO_3111_v5
This should be changed to:
drwxrwxr-x   0 hckim    zh                          0 Sep 06 15:44 StoreResults-PyquenEvtGen_bJpsiMuMu_JPsiPt912_RECO_3111_v5
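The group-write bit in the two listings above can be checked mechanically. A minimal shell sketch (the helper name check_group_write is made up for illustration; the permission strings are the ones from the listings above):

```shell
# Check the group-write bit of an ls-style permission string, as in the
# castor directory listings above. (check_group_write is a made-up name.)
check_group_write() {
  perms="$1"
  # The 6th character (type char + rwx for user + r, then w) is group write.
  case "$perms" in
    ?????w*) echo "group-writable" ;;
    *)       echo "not group-writable: expect exit code 60307" ;;
  esac
}

check_group_write "drwxr-xr-x"   # the problematic listing above
check_group_write "drwxrwxr-x"   # the corrected listing
```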

  • At MIT, the maximum number of jobs that can be processed at once is about 2000, so it is good to reduce the number of jobs per task. In my experience, fewer than 200 jobs is OK.

  • For unknown reasons, crab submission sometimes fails with exit code 60307, which means the folder where the data would be stored is not group-writable. In that case, remove the folder, kill the job, and then create and submit a new job. (This may depend on which machine runs the job, I guess.)
  • If you get exit code 8018, do the following (ex: troubled jobs: 13, 119):
    • crab -getoutput 13,119 (or crab -get 13,119)
    • crab -resubmit 13,119
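The two recovery steps above can be wrapped in a small helper. A sketch, assuming the job list is passed exactly as CRAB expects it (the function name recover_8018 is made up; the commands are only printed here, not executed):

```shell
# Hypothetical wrapper for the exit-code-8018 recovery steps above.
# It only prints the two CRAB commands; drop the echo to actually run them.
recover_8018() {
  jobs="$1"                      # comma-separated job list, e.g. "13,119"
  echo "crab -getoutput $jobs"   # first retrieve the output
  echo "crab -resubmit $jobs"    # then resubmit the same jobs
}

recover_8018 "13,119"
```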

  • When a dataset you access with CRAB is extended (that is, more events are added to the same dataset), do the following:
    • crab -extend : creates extended CRAB jobs and adds them to the already created ones
    • crab -submit <extended job numbers> : from this step on, same as for usual jobs

Check whether all the data is stored and published successfully

  • Check the storage at castor
  • Check the storage at MIT
    • stored at /net/pstore01/d00/scratch/hckim/PrimaryDataset/User_Dataset_Name/PsetHash/
      • ex. /pnfs/
  • Check the publication on DBS

  • If you want to retrieve multiple root files, do the following in crab.cfg:
    • comment out output_files = test1.root
    • add get_edm_output = 1
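A minimal sketch of how this looks in crab.cfg, assuming these parameters belong to the [CMSSW] section as in typical CRAB 2 configurations:

```
[CMSSW]
# output_files = test1.root   (commented out)
get_edm_output = 1
```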

  • (NEW) When you are using "VarParsing" in your pset python file, please add pycfg_params = noprint under the [CMSSW] section. An example pset python file is shown below.
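The corresponding crab.cfg fragment would look like the sketch below (the pset file name is a placeholder, not from the original):

```
[CMSSW]
# your_pset.py is a placeholder name
pset = your_pset.py
# needed when the pset uses VarParsing
pycfg_params = noprint
```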

import FWCore.ParameterSet.Config as cms
process = cms.Process('Slurp')

process.source = cms.Source("PoolSource", fileNames = cms.untracked.vstring())
process.maxEvents = cms.untracked.PSet( input       = cms.untracked.int32(10) )
process.options   = cms.untracked.PSet( wantSummary = cms.untracked.bool(True) )

process.output = cms.OutputModule("PoolOutputModule",
    outputCommands = cms.untracked.vstring("keep *"),
    fileName = cms.untracked.string('outfile.root')
)
process.out_step = cms.EndPath(process.output)

  • (NEW) If a job is aborted, please test with crab -postMortem [jobid]. After this command, a CMSSW_[jobid].LoggingInfo file is created, in which you can find the reason for the abort. The reference is here. The result looks like below.
[lxplus422] ~/scratch0/CMSSW_4_1_3/src/HiAnalysis/HiOnia/crab $ crab -c crab_0_110906_185936 -postMortem 1
crab:  Version 2.7.8 running on Fri Sep  9 16:38:32 2011 CET (14:38:32 UTC)

crab:  Working options:
   scheduler           glite
   job type            CMSSW
   server              OFF
   working directory   /afs/

crab:  Logging info for job 1: 
      written to /afs/ 

[lxplus422] ~/scratch0/CMSSW_4_1_3/src/HiAnalysis/HiOnia/crab $ cd crab_0_110906_185936/job/
[lxplus422] ~/scratch0/CMSSW_4_1_3/src/HiAnalysis/HiOnia/crab/crab_0_110906_185936/job $ ls
CMSSW_1.LoggingInfo
[lxplus422] ~/scratch0/CMSSW_4_1_3/src/HiAnalysis/HiOnia/crab/crab_0_110906_185936/job $ vi CMSSW_1.LoggingInfo

Event: Abort
- Arrived                    =    Tue Sep  6 19:07:44 2011 CEST
- Host                       =
- Level                      =    SYSTEM
- Priority                   =    asynchronous
- Reason                     =    The job cannot be submitted because the blparser service is not alive


Useful link

-- HyunChulKim - 07 Jul 2010

Topic revision: r6 - 2011-09-09 - HyunChulKim