./dripSearch.py [options] --spectra <spectra file> --digest-dir <dripDigest output directory>
dripSearch uses a Dynamic Bayesian Network for Rapid Identification of
  Peptides (DRIP) to identify peptides from tandem mass spectra. DRIP is
  primarily used for its high peptide identification accuracy and for
  the improved features it derives for peptide-spectrum matches (PSMs);
  the latter are used by dripExtract. Model parameters may also be
  learned via expectation-maximization (implemented in dripTrain) and
  used during search for improved accuracy.
  
If you use DRIP in your research, please cite:
John T. Halloran, Jeff A. Bilmes, and William S. Noble. "Learning Peptide-Spectrum Alignment Models for Tandem Mass Spectrometry". Thirtieth Conference on Uncertainty in Artificial Intelligence (UAI 2014). AUAI, Quebec City, Quebec, Canada, July 2014.
--spectra <spectra file> – The name of the file from
  which to parse the fragmentation spectra, in ms2 file format.
--digest-dir <dripDigest output directory> – Output
  directory of dripDigest (note that the protein database must be
  digested with dripDigest prior to running dripSearch).
  Default = dripDigest-output.
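
For readers unfamiliar with the ms2 format expected by --spectra, the following is a minimal reading sketch, not dripSearch's own parser. It assumes the common ms2 layout: H header lines, an S line per spectrum (first scan, second scan, precursor m/z), Z lines (charge, M+H mass), optional I and D lines, and one "m/z intensity" pair per peak line. The names Spectrum and read_ms2 are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Tuple


@dataclass
class Spectrum:
    scan: int            # first scan number from the S line
    precursor_mz: float  # precursor m/z from the S line
    # (charge, M+H mass) pairs, one per Z line
    charges: List[Tuple[int, float]] = field(default_factory=list)
    # (m/z, intensity) pairs, one per peak line
    peaks: List[Tuple[float, float]] = field(default_factory=list)


def read_ms2(lines: Iterable[str]) -> Iterator[Spectrum]:
    """Yield one Spectrum per S record; H, I, and D lines are skipped."""
    current = None
    for raw in lines:
        parts = raw.split()
        if not parts:
            continue  # ignore blank lines
        tag = parts[0]
        if tag == "S":
            # A new S line closes the previous spectrum, if any.
            if current is not None:
                yield current
            current = Spectrum(scan=int(parts[1]), precursor_mz=float(parts[3]))
        elif tag == "Z" and current is not None:
            current.charges.append((int(parts[1]), float(parts[2])))
        elif tag in ("H", "I", "D"):
            continue  # header and analysis lines are not needed here
        elif current is not None:
            # Anything else inside a spectrum is treated as a peak line.
            current.peaks.append((float(parts[0]), float(parts[1])))
    if current is not None:
        yield current
```

To read a file, pass an open file handle, e.g. `with open(path) as fh: spectra = list(read_ms2(fh))`, where `path` is the same file given to --spectra.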
The following directories will be created:
  log – directory containing DRIP results. In cluster mode
    (--cluster-mode True), cluster search results are
    written to this directory. In standalone mode
    (--cluster-mode False), GMTK output files are
    written to this directory.
  encode – directory containing GMTK input files.
  drip_collection – directory containing DRIP parameter files for GMTK.
  --precursor-window <float> – Tolerance used
    for matching peptides to spectra. Peptides must be within +/-
    'precursor-window' of the spectrum value. The units of the
    precursor window depend upon precursor-window-type. Default = 3.
  --precursor-window-type <Da|ppm> –
    Specify the units for the window that is used to select peptides
    around the precursor mass location, either in Daltons
    (Da) or parts-per-million (ppm). Default = Da.
  --charges <comma-separated-integers|all> – Precursor
    charges to search. To specify individual charges, list them
    comma-delimited, e.g., “1,2,3” to search all spectra of charge
    1, 2, or 3. Default = all.
  --high-res-ms2 <T|F> –
    Boolean, whether the search is over high-res ms2 (high-high)
    spectra. When this parameter is true, DRIP uses the real-valued
    masses of candidate peptides as its Gaussian means; for low-res
    ms2 (low-low or high-low), the observed m/z measurements are much
    less accurate, so these Gaussian means are learned from training
    data. Default = False.
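
To illustrate how --precursor-window and --precursor-window-type interact, the sketch below shows one way a candidate peptide could be tested against a spectrum's precursor mass. It is not taken from the dripSearch source: the function name is hypothetical, and measuring the ppm tolerance relative to the spectrum's precursor mass is an assumption.

```python
def within_precursor_window(peptide_mass: float,
                            precursor_mass: float,
                            window: float = 3.0,
                            window_type: str = "Da") -> bool:
    """Return True if the peptide lies within +/- window of the precursor mass."""
    if window_type == "Da":
        # Absolute tolerance in Daltons.
        tolerance = window
    elif window_type == "ppm":
        # Relative tolerance: parts-per-million of the precursor mass
        # (assumed reference point; the tool may define it differently).
        tolerance = precursor_mass * window * 1e-6
    else:
        raise ValueError("window_type must be 'Da' or 'ppm'")
    return abs(peptide_mass - precursor_mass) <= tolerance
```

For a precursor mass of 1000 Da, the default 3 Da window accepts peptides between 997 and 1003 Da, while a 10 ppm window accepts only those within 0.01 Da.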
  --high-res-gauss-dist <float> –
    m/z distance for 99.9% of m/z Gaussian mass to lie within.  Only
    available for high-res MS2 searches. Default=0.05.
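
As a rough illustration of what the --high-res-gauss-dist value implies, the sketch below converts a distance into a Gaussian standard deviation. It assumes the distance is measured on either side of the Gaussian mean (a two-sided interval); that reading, and the function name, are assumptions rather than details confirmed by the dripSearch source.

```python
from statistics import NormalDist


def sigma_for_mass_within(distance: float, mass: float = 0.999) -> float:
    """Standard deviation such that P(|X - mean| <= distance) == mass."""
    # For a two-sided interval, the upper tail holds (1 - mass) / 2,
    # so the z-score is the standard normal quantile at 0.5 + mass / 2.
    z = NormalDist().inv_cdf(0.5 + mass / 2.0)
    return distance / z


# Default distance from the option description above.
sigma = sigma_for_mass_within(0.05)
```

Under this assumption, the default distance of 0.05 corresponds to a standard deviation of roughly 0.0152 m/z units.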
  --precursor-filter <T|F> –
    boolean, when true, filter out all peaks within 1.5 Da of the
    observed precursor mass. Default=False.
  --decoys <T|F> –
    whether to create (shuffle target peptides) and search decoy
    peptides. Default = T.
  --num-threads <integer> – the number of threads to run on a multithreaded CPU. If the supplied value is greater than the number of supported threads, it defaults to the maximum number of supported threads. Multithreading is not supported for cluster use, as this is typically handled by the cluster job manager. Default = 1.
  --top-match <integer> – The number of PSMs per spectrum written to the output files. Default = 1.
  --beam <integer> – K-beam width to use to speed
    up inference. A value of 0 means exact inference. Warning:
    identification accuracy may degrade significantly if the beam
    width is too small, i.e., beam < 100. Default = 0.
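
The trade-off behind --beam can be seen in a generic k-beam pruning step, sketched below. This is an illustration of the general technique, not GMTK's implementation: at each inference step only the `beam` highest-scoring hypotheses are kept, so a small beam can discard the hypothesis that would eventually have scored best. The function name and the dictionary representation of hypotheses are hypothetical.

```python
import heapq
from typing import Dict, Hashable


def prune_to_beam(scores: Dict[Hashable, float], beam: int) -> Dict[Hashable, float]:
    """Keep only the `beam` highest-scoring hypotheses; beam <= 0 keeps all."""
    if beam <= 0 or beam >= len(scores):
        # Mirrors the convention above: 0 means no pruning (exact inference).
        return dict(scores)
    kept = heapq.nlargest(beam, scores.items(), key=lambda item: item[1])
    return dict(kept)
```

For example, `prune_to_beam({"a": -1.0, "b": -5.0, "c": -0.5}, 2)` keeps only `"a"` and `"c"`.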
  --random-wait <integer> – randomly wait up to
    specified number of seconds before writing results back to NFS
    (for cluster use). Default = 10.
  --num-jobs <integer> – the number of jobs to
    run in parallel (for cluster use). Default = 1.
  --cluster-mode <T|F> – evaluate dripSearch
    prepared data as jobs on a cluster.  Only set this to true once
    dripSearch has been run to prepare data for cluster use.  Default
    = False.
  --write-cluster-scripts <T|F> – write scripts
    to be submitted to the cluster queue.  Only used when num-jobs > 1.
    Job outputs will be written to the log subdirectory of the current
    directory. Default = True.
  --cluster-dir <string> – absolute path of
    directory to run cluster jobs. Default = /tmp.
  --merge-cluster-results <T|F> – merge
    dripSearch cluster results collected in directory log.
    Default = False.
  --output <string> – output file to which
    both target and decoy results are written. Default = none.
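
When dripSearch is driven from another script, the command line can be assembled programmatically. The sketch below only builds the argument list from the options documented above; it does not run the tool. The function name, the spectra file name example.ms2, and the output name dripSearch-example are hypothetical placeholders.

```python
import shlex
from typing import Dict, List, Optional


def build_dripsearch_command(spectra: str,
                             digest_dir: str = "dripDigest-output",
                             options: Optional[Dict[str, object]] = None) -> List[str]:
    """Assemble a dripSearch argument list: options first, then required inputs."""
    cmd = ["./dripSearch.py"]
    for name, value in (options or {}).items():
        # Each option is passed as "--name value", as in the usage line.
        cmd += [f"--{name}", str(value)]
    cmd += ["--spectra", spectra, "--digest-dir", digest_dir]
    return cmd


cmd = build_dripsearch_command(
    "example.ms2",  # hypothetical spectra file
    options={
        "precursor-window": 3,
        "precursor-window-type": "Da",
        "charges": "1,2,3",
        "num-threads": 4,
        "top-match": 1,
        "output": "dripSearch-example",  # hypothetical output name
    },
)
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once dripDigest has produced the digest directory.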
  The following examples are available in test.sh.
    Run dripDigest and dripTrain first, as necessary.