We almost always run srm-copy with the -f option, which reads the
transfers from a file containing one entry per line in the format
<sourceURL> <size> <targetURL>. One reason for using this option is
that it allows the targetURL to be specified separately for each
file. An example of such an srm-copy command looks like this:
srm-copy.linux -d -f productionHighReversedFullFieldP04ik.rndm -c /auto/u/hjort/hrm2/hrm.rc -l P04ik.log
This command is run at the destination of the transfers (pdsfgrid4 in
this example), since SRM transfers always work in "pull" mode. Each
line in the file productionHighReversedFullFieldP04ik.rndm looks like
this:
gsiftp://stargrid03.rhic.bnl.gov/star/data32/reco/productionHigh/ReversedFullField/P04ik/2004/067/st_physics_5067059_raw_2030001.MuDst.root 278979705 srm://garchive.nersc.gov/nersc/projects/starofl/reco/productionHigh/ReversedFullField/P04ik/2004/067/st_physics_5067059_raw_2030001.MuDst.root
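The three-field layout described above can be checked with a few lines of Python before a list is handed to srm-copy. This is only a sketch: the names TransferEntry and parse_entry are hypothetical and not part of srm-copy or diskOrHPSS.pl, and it assumes the fields are separated by whitespace and that the size is an integer (presumably a byte count).

```python
from typing import NamedTuple


class TransferEntry(NamedTuple):
    """One '<sourceURL> <size> <targetURL>' entry (hypothetical helper type)."""
    source_url: str
    size: int  # assumed to be an integer, presumably bytes
    target_url: str


def parse_entry(line: str) -> TransferEntry:
    """Split one file-list line into its three whitespace-separated fields."""
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    source_url, size, target_url = fields
    return TransferEntry(source_url, int(size), target_url)


# The example entry from the text, written as one logical line.
entry = parse_entry(
    "gsiftp://stargrid03.rhic.bnl.gov/star/data32/reco/productionHigh/"
    "ReversedFullField/P04ik/2004/067/st_physics_5067059_raw_2030001.MuDst.root "
    "278979705 "
    "srm://garchive.nersc.gov/nersc/projects/starofl/reco/productionHigh/"
    "ReversedFullField/P04ik/2004/067/st_physics_5067059_raw_2030001.MuDst.root"
)
print(entry.size)
```

A line with a missing size or target raises ValueError, which makes malformed entries visible before a long transfer is started.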
This file is generated by a STAR-specific script (diskOrHPSS.pl) that
compares the RCF (mirror) and PDSF file catalogs. Here the files are
sourced from NFS disks at RCF instead of from HPSS. This has proven to
be a dependable means of transferring files and usually provides
better throughput than sourcing files from HPSS. Note that this method
does not even require an HRM to be running at RCF. An HRM running at
PDSF caches the files before sinking them into HPSS. Once the files
are in HPSS, RRS is called and the files are entered into the PDSF
file catalog. The .rndm file extension denotes that the script has
randomized the order of the files to distribute the I/O load on the
NFS disks.
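The randomization step can be illustrated with a short Python sketch. It is not the code used by diskOrHPSS.pl, which is not shown here; the function name randomize_entries and the optional seed argument are assumptions made for the example.

```python
import random


def randomize_entries(lines, seed=None):
    """Return a shuffled copy of the file-list lines, leaving the input untouched."""
    shuffled = list(lines)
    # A private Random instance avoids disturbing the global random state.
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Placeholder strings stand in for full '<sourceURL> <size> <targetURL>' lines.
ordered = [f"entry-{i}" for i in range(10)]
mixed = randomize_entries(ordered, seed=1)
print(len(mixed))
```

Shuffling means consecutive transfers are less likely to hit files that sit next to each other on the same disk, which is the load-spreading effect the .rndm lists aim for.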
Files that are not available on NFS disks can be sourced from HPSS at
RCF. The syntax used for these transfers is similar to that shown
above:
srm-copy.linux -d -f productionLowFullFieldP04ik.srm -c /auto/u/hjort/hrm2/hrm.rc -l P04ik.log -at PLAIN -et GSI -al starpftp -ap "password"
and the file list for transfers from HPSS looks like this:
srm://stargrid03.rhic.bnl.gov:NSPORT/home/starreco/reco/productionHalfHigh/HalfField/P04ik/2004/059/st_physics_5059055_raw_1020009.MuDst.root?remoteobj=HRMServerBNL&msshost=hpss.rcf.bnl.gov&mssport=MSSPORT 268435346 srm://garchive.nersc.gov/nersc/projects/starofl/reco/productionHalfHigh/HalfField/P04ik/2004/059/st_physics_5059055_raw_1020009.MuDst.root
Here "NSPORT" is the port on stargrid03 that the nameserver is running
on and "MSSPORT" is the port used to access HPSS. The source URL now
begins with srm: instead of gsiftp:, and this mode of transfer
requires HRMs to be running at both RCF and PDSF.
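The structure of the HPSS source URL can be made explicit with a small Python sketch. The function hpss_source_url and its parameter names are hypothetical; only the URL layout (host, nameserver port, path, and the remoteobj, msshost and mssport query parameters) is taken from the example above. The port placeholders are kept as "NSPORT" and "MSSPORT" because the real values are not given here.

```python
def hpss_source_url(host, ns_port, path, remote_obj, mss_host, mss_port):
    """Assemble an srm: source URL in the layout shown in the HPSS file list."""
    return (
        f"srm://{host}:{ns_port}{path}"
        f"?remoteobj={remote_obj}&msshost={mss_host}&mssport={mss_port}"
    )


url = hpss_source_url(
    host="stargrid03.rhic.bnl.gov",
    ns_port="NSPORT",    # placeholder: port of the nameserver on stargrid03
    path="/home/starreco/reco/productionHalfHigh/HalfField/P04ik/2004/059/"
         "st_physics_5059055_raw_1020009.MuDst.root",
    remote_obj="HRMServerBNL",
    mss_host="hpss.rcf.bnl.gov",
    mss_port="MSSPORT",  # placeholder: port used to access HPSS
)
print(url)
```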
In practice it works best to break up large transfers (>10k files)
into smaller transfers (<10k files). The diskOrHPSS.pl script
generates a simple shell script that runs a series of srm-copy
commands, each command corresponding to a particular
trgsetupname/magscale/production of STAR MuDsts:
#!/usr/local/bin/tcsh
srm-copy.linux -d -f productionHalfHighHalfFieldP04ik.rndm -c /auto/u/hjort/hrm2/hrm.rc -l P04ik.log
srm-copy.linux -d -f productionHalfLowHalfFieldP04ik.rndm -c /auto/u/hjort/hrm2/hrm.rc -l P04ik.log
srm-copy.linux -d -f productionPPReversedFullFieldP04ik.rndm -c /auto/u/hjort/hrm2/hrm.rc -l P04ik.log
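How one command per trgsetupname/magscale/production might be derived from a combined list can be sketched in Python. This is an illustration, not the logic of diskOrHPSS.pl: the helper names are hypothetical, and it assumes that the three path components following "reco" in the target URL identify the dataset, as they do in the example entries. The example lines use shortened, made-up hosts and file names.

```python
from collections import defaultdict


def dataset_of(target_url):
    """Return (trgsetupname, magscale, production) from the path after 'reco'."""
    parts = target_url.split("/")
    i = parts.index("reco")  # raises ValueError if there is no 'reco' component
    return tuple(parts[i + 1:i + 4])


def build_commands(lines, rc_file):
    """Group file-list lines by dataset and build one srm-copy command per group."""
    groups = defaultdict(list)
    for line in lines:
        target_url = line.split()[2]
        groups[dataset_of(target_url)].append(line)
    commands = []
    for trgsetupname, magscale, production in sorted(groups):
        list_name = f"{trgsetupname}{magscale}{production}.rndm"
        commands.append(
            f"srm-copy.linux -d -f {list_name} -c {rc_file} -l {production}.log"
        )
    return dict(groups), commands


# Shortened, hypothetical entries (real lines carry full URLs and file sizes).
example_lines = [
    "gsiftp://src.example/reco/productionHalfHigh/HalfField/P04ik/a.root 1 "
    "srm://dst.example/reco/productionHalfHigh/HalfField/P04ik/a.root",
    "gsiftp://src.example/reco/productionHalfLow/HalfField/P04ik/b.root 2 "
    "srm://dst.example/reco/productionHalfLow/HalfField/P04ik/b.root",
]
groups, commands = build_commands(example_lines, "/auto/u/hjort/hrm2/hrm.rc")
for command in commands:
    print(command)
```

Each group's lines would still have to be written to the corresponding .rndm file before the generated commands are run; that step is left out of the sketch.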
Note that at present it is not possible to combine files in HPSS
and on NFS disk into a single file list. This is because the
authentication methods for disk (GSI, the default) and HPSS (PLAIN) are
different. Hopefully we'll get GSI authentication at RCF HPSS
working in the near future.
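Because disk and HPSS sources cannot share a file list, a combined set of entries has to be separated before srm-copy is run. A minimal Python sketch of that separation follows. It assumes the source URL scheme alone distinguishes the two cases (gsiftp: for NFS disk, srm: for HPSS), as in the examples above; the function name and the shortened example entries are hypothetical.

```python
def split_by_source(lines):
    """Separate file-list lines into (disk, hpss) lists by source URL scheme."""
    disk, hpss = [], []
    for line in lines:
        source_url = line.split()[0]
        if source_url.startswith("gsiftp://"):
            disk.append(line)   # NFS disk source, GSI authentication (default)
        elif source_url.startswith("srm://"):
            hpss.append(line)   # HPSS source, needs the PLAIN login options
        else:
            raise ValueError(f"unrecognized source URL: {source_url}")
    return disk, hpss


# Shortened, hypothetical entries for illustration only.
mixed_lines = [
    "gsiftp://src.example/star/data32/reco/a.root 10 srm://dst.example/reco/a.root",
    "srm://src.example:NSPORT/home/starreco/reco/b.root 20 srm://dst.example/reco/b.root",
]
disk_lines, hpss_lines = split_by_source(mixed_lines)
print(len(disk_lines), len(hpss_lines))
```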
For more information about SRM data transfers, see the
DataMover-UserGuide, which is also linked from the SDM group's
webpages.