==========================================================================
meadTools, version 2.2 : tools for MEAD-based electrostatic calculations

                    www.itqb.unl.pt/simulation
==========================================================================


Contents of this README file:

  A. Authors
  B. Overview
  C. Changes from previous version
  D. Installation
  E. Distributed files
  F. Recipe for binding simulations without tautomers
  G. Recipe for binding simulations with tautomers
  H. Acknowledgments
  I. References



A. Authors
==========

Main developer: Antonio M. Baptista

Other developers: Carlos Cunha, Miguel Machuqueiro, Sara R. R. Campos,
Pedro R. Magalhaes, Catarina A. Carvalheda, Paulo J. Martel.


B. Overview
===========

meadTools is a set of software tools primarily intended to be used
with other programs (especially MEAD) in order to run binding
simulations of electrons and/or protons [1,3], using proton tautomers
[2,3] or not, based on Poisson-Boltzmann (PB) calculations.

For details on licensing, see the file LICENSE.

The documentation of meadTools consists of this README file plus the
comment-headers of its individual programs.

A tutorial (for tautomeric calculations) is now included.

If you use meadTools in your work please cite [2,3].

Please report bugs to baptista@itqb.unl.pt.



C. Changes from previous version (for previous users)
=====================================================

- A tutorial (for tautomeric calculations) was added.

- A minor bug in meadT was corrected, which made it difficult to follow
  the error messages when a run of meadT was aborted for some reason.



D. Installation
===============

To install meadTools in Linux go to an appropriate directory (eg,
/usr/local/) and type (replacing VERSION with the appropriate version
number):

  tar xzf meadTools-VERSION.tar.gz

A directory meadTools-VERSION/ will be created, which contains all the
distributed files (see below).

The meadTools package was developed for Linux, and no attempt has been
made to port it to other operating systems.  In particular, most of
the package tools are written in AWK and Bash, which are assumed to
exist in the system at the path indicated in the initial "shebang"
line (the one starting with "#!"); you may need to change those paths
to conform to your operating system.  There is also a C program, which
should be fully portable.  In addition, you may also need to install
some of the following software (whose versions tested with meadTools
are indicated):

- MEAD :
  It is assumed that all PB calculations are done using the MEAD package
  (tested versions <= 2.2.9).  MEAD is distributed under the GNU GPL and
  used to be available at http://www.scripps.edu/mb/bashford, but that site
  seems to be down; we provide a copy at
  https://www.itqb.unl.pt/simulation.

- MCRP and PETIT :
  It is assumed that the Monte Carlo (MC) simulations that follow the
  PB calculations are done using the programs MCRP (tested versions <=
  1.2) or PETIT (tested versions <= 1.6), depending on whether
  nontautomeric or tautomeric treatments are used.  MCRP and PETIT can
  be obtained at http://www.itqb.unl.pt/simulation.

- GROMACS :
  If you want to use the program makepqr, you need molecular topology
  files created with (or at least following the format used by) the
  GROMACS package (tested versions <= 2018).  GROMACS can be obtained
  at http://www.gromacs.org.

- ASC :
  If you want to use the program selectWacc, you need the program ASC to
  compute solvent accessibility surfaces (tested versions = 2.14).  ASC is
  distributed under a specific license granting free academic use and can
  be obtained at http://mendel.imp.ac.at/studies/asc.jsp.
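
The shebang paths mentioned above can be checked before first use.  The
following is a minimal sketch, not part of the distribution: the
function name check_shebangs is hypothetical, and it only inspects
regular files placed directly in the given directory.

```shell
# check_shebangs DIR: report files in DIR whose "#!" interpreter is missing.
# Sketch only; subdirectories (eg, cconvert/) are skipped.
check_shebangs() {
    rc=0
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        # Keep the first line only if it starts with "#!".
        first=$(head -n 1 "$f")
        case $first in
            '#!'*) ;;
            *) continue ;;
        esac
        # Interpreter path = first word after "#!" (options such as -f are ignored).
        interp=$(printf '%s\n' "$first" | sed 's/^#![[:space:]]*//' | awk '{print $1}')
        if [ ! -x "$interp" ]; then
            echo "$f: interpreter not found: $interp"
            rc=1
        fi
    done
    return $rc
}
```

For example, "check_shebangs /usr/local/meadTools-VERSION" prints one
line per script whose interpreter path needs to be edited, and returns
a non-zero status if any was found.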


E. Distributed files
====================

The meadTools package is just a bundle of tools which are useful for
running MEAD-based calculations of protonation and/or redox processes.
The package does not provide a fully automated interface to run those
calculations directly from a PDB file, for example, although that is
relatively simple if you have GROMACS installed in your system.  Each
tool is single-purpose and you must combine several of them to get
what you want, which actually gives you much more flexibility.  Anyway,
standard recipes for the most obvious tasks are given in the
following sections.

The programs included in this distribution fit into three categories
with respect to proton tautomerism: (1) Some programs are for
exclusive use with tautomers and do not make sense in the
non-tautomeric case (eg, addHtaut).  (2) Other programs can be used in
the non-tautomeric case (eg, meadT and convert), but they are not the
best choice for that case; instead, one should use multiflex (from
MEAD) and MCRP to run all non-tautomeric calculations.  (3) Finally,
some programs are equally useful for the tautomeric and non-tautomeric
cases (eg, makepqr and makesites).  The files currently distributed
are:

- README :
  The main documentation for meadTools (this text file), containing a
  general description of the package and the "recipes" for doing
  non-tautomeric and tautomeric simulations.

- LICENSE :
  The text of the license for using meadTools.

- makepqr :
  An AWK program to make a .pqr file from GROMACS files.  Check Usage
  and program header for details.

- addHtaut :
  An AWK program to add tautomeric protons to .pqr files.  Check Usage
  and program header for details (eg, sites supported).

- selectWacc :
  A Bash script that makes an accessibility-based selection of the
  water molecules in a .pqr file.  It uses the program ASC.  Check
  Usage and program header for details.

- makesites :
  An AWK program to create a .sites file from a .pqr file.  Works for
  both tautomeric and non-tautomeric cases.  Check Usage and program
  header for details.

- getst :
  An AWK program to get the .st files specified by a .sites file.
  Works for both tautomeric and non-tautomeric cases.  Check Usage and
  program header for details.  It has been tested but not used in many
  real-life cases, so some bugs may exist.

- stmodels :
  An AWK program to use as model compounds only the fragments
  indicated in the .st files [6].  Check Usage and program header for
  details.

- statepqr :
  An AWK program that puts all sites in a .pqr file into: (1) a
  specified reference state, (2) a set of individual states, or (3) an
  average state reflecting a set of state fractions.  Check Usage and
  program header for details.

- meadT :
  A Bash script that runs PB calculations for a set of tautomeric
  sites. It uses the program multiflex, from the MEAD package,
  producing .pkcrg and .g files.  It can use a simple single-machine
  parallelization algorithm.  Check Usage and program header for
  details.

- convert :
  An AWK program that converts the .pkcrg and .g files created by
  meadT into a single input file needed by PETIT.  This tool is needed
  for historical reasons, but meadT should obviously write its output
  directly in that final form; this will hopefully be done in a
  future version.

- cconvert/ :
  This directory contains a C version of the convert program.  It is a
  more-or-less direct translation of the AWK version, but it is _much_
  faster.  As noted for the AWK version, this should be eliminated in
  a future version.

- nulltit :
  An AWK program that computes the titration curve using the "null
  model", ie, using "typical" values in solution.  Check Usage and
  program header for details.

- renumbpqr :
  An AWK program that renumbers atoms and residues, which can
  sometimes be useful.  It reads a .pqr file from stdin and writes the
  renumbered .pqr file to stdout.  Check Usage and program header for
  details.

- st-{G53a6,G43a1,G54a7_Fit,G53a6_Fit,G43a1_Fit}/ :
  Directories with *.st files for the G54a7 (GROMOS96 54A7), G53a6
  (GROMOS96 53A6), and G43a1 (GROMOS96 43A1) force fields. In the *_Fit
  directories, the pKa values were calculated according to reference [6],
  using constant-pH MD simulations. In the other directories, Tanford pKa
  values are used (reference [7]). Inside each directory, there is a README
  with more detailed explanation. These are the directories to be given as
  an argument to getst. Note: The *_Fit directories contain .st files
  *only* for the sites that were parametrized using constant-pH MD
  simulations; if you want to use other titrable sites (e.g., ARG) or
  non-titrable sites (e.g., SER, THR; as done in rigid calculations), you
  may build your own st-XXX directory with those extra files (e.g., copied
  from a directory without "_Fit").

- st-*/*.st :
  For tautomeric sites there is one *tau?.st file per pseudo-site.  For
  non-tautomeric sites there is one *all.st and/or *avx.st file per
  site; in case of doubt of what the "avx" or "all" really stand for,
  check the actual file (there is no general rule). Note that *avx.st
  files were mostly created for st-G43a1.

- tutorial :
  Directory containing a tutorial for calculations using proton
  tautomerism, for the hen egg white lysozyme structure 4LZT.pdb. See the
  file tutorial.txt.



F. Recipe for binding simulations without tautomers
===================================================

1. Make the initial .pqr file:
------------------------------

If you are using GROMACS, you can run editconf with the option -mead
to directly create the .pqr file using the atomic radii in the file
vdwradii.dat of the GROMACS distribution.  An alternative is to
compute the radii from the force field LJ parameters, which can be
done using the program makepqr (eg, with the options "W 2RT" [5]).
Many other methods are possible, and there are other programs/packages
around that build .pqr files directly from .pdb files.  Regardless of
the method you use, be sure that you know where the charges and radii
come from.  Above all, make sure that all the potentially titrable
sites are properly set in terms of atoms and charges in the .pqr file.
For example, if you have a protein whose N-terminus is missing from
the PDB, you certainly don't want the "apparent" N-terminus in the
.pqr file to be titrable.  Another example is free Cys.  All these
cases are _your_ responsibility.  In addition, you have to make sure
that the atoms in each titrable site are consistent with the .st files
you are going to use (see below).  Note that all tools distributed
here assume that you build your .pqr using makepqr or an equivalent
program, meaning that all subsequent steps assume that some particular
atom and residue name rules are used; see the header of makepqr for
details.
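
One quick way to inspect the charges is to sum them per residue.  The
sketch below is only an illustration and makes assumptions about the
.pqr layout that you should verify against your own files:
whitespace-separated ATOM/HETATM records with the residue name in
field 4, the residue number in field 5, and the charge in the
next-to-last field.  The function name residue_charges is hypothetical.

```shell
# residue_charges FILE.pqr: print "resnum resname net_charge" per residue.
# Assumed layout: ATOM serial atom resname resnum x y z charge radius
residue_charges() {
    awk '
        $1 == "ATOM" || $1 == "HETATM" {
            key = $5 " " $4
            if (!(key in q)) order[++n] = key   # remember first-seen order
            q[key] += $(NF - 1)                 # charge is the next-to-last field
        }
        END {
            for (i = 1; i <= n; i++)
                printf "%s %.3f\n", order[i], q[order[i]]
        }
    ' "$1"
}
```

A residue whose net charge is not what you expect for the state you
intended is a good candidate for a closer look.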


2. Make the .sites file:
------------------------

This can be easily done using the makesites program.  Although the
program tries to make some guesses, it is _your_ responsibility to
check if special cases (eg, Cys, N- and C-termini, etc) are correctly
treated.  See the header of makesites for details.


3. Get the .st files:
---------------------

You can get all the needed .st files using the program getst.  As
usual, you _must_ check any unusual sites.
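
Before going on, it may be worth confirming that every .st file named
in the .sites file is actually present.  The sketch below assumes that
each line of the .sites file is a residue number followed by one or
more designations, and that a designation "name" corresponds to a file
"name.st"; check_st_files is a hypothetical helper, not a meadTools
program.

```shell
# check_st_files FILE.sites [DIR]: list the .st files named in FILE.sites
# that are missing from DIR (default: current directory).
check_st_files() {
    dir=${2:-.}
    # Collect every designation (fields 2..NF), once each, and test its file.
    missing=$(awk '{ for (i = 2; i <= NF; i++) print $i }' "$1" | sort -u |
        while read -r name; do
            [ -f "$dir/$name.st" ] || echo "missing: $dir/$name.st"
        done)
    [ -z "$missing" ] && return 0
    printf '%s\n' "$missing"
    return 1
}
```

Since all fields after the residue number are checked, the same helper
also covers tautomeric .sites files with several designations per line.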


4. Modify the .pqr and .sites files using stmodels:
---------------------------------------------------

An offset is added to the residue number of the atoms included in
the .st files. This ensures that there is no overlap of the different
fragments in MEAD calculations. 

The offset value must be at least equal to the number of residues plus 
one, to ensure that the fragments are considered independently.
 
Do not forget to revert to the normal numbering after the MEAD
calculations. Check the stmodels program header for further details. 
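
The minimum offset stated above can be computed from the .pqr file.
This is a sketch under an assumption about the file layout (residue
number in field 5 of whitespace-separated ATOM/HETATM records); the
function name min_offset is hypothetical, and how the offset is passed
to stmodels is described in that program's header, not here.

```shell
# min_offset FILE.pqr: print (number of distinct residue numbers) + 1.
min_offset() {
    awk '
        ($1 == "ATOM" || $1 == "HETATM") && !($5 in seen) { seen[$5] = 1; n++ }
        END { print n + 1 }
    ' "$1"
}
```

Any larger value also satisfies the stated minimum, so a round number
above this result may be more convenient when reverting the numbering
later.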


5. Run multiflex:
-----------------

You are ready to run the multiflex program, which is part of Donald
Bashford's MEAD package.  See MEAD documentation for details.


6. Run MCRP:
------------

After multiflex is over, you can run MCRP, a C program that is
distributed separately.  It runs MC simulations of the binding of
electrons and/or protons, taking as input the files created by
multiflex.  See the MCRP documentation for details.



G. Recipe for binding simulations with tautomers
================================================
(For a detailed example, see the tutorial/ directory)

1. Make the initial .pqr file in the usual way:
-----------------------------------------------

What is said above in point 1 of the recipe for non-tautomeric
calculations, also applies here.  In addition, you probably want to
include some water molecules in the .pqr file, which can be done using
the selectWacc program (see its Usage and header for details).  You
should also pay attention to the protonation state of all potentially
protonable sites (including alcohols, thiols and water), whose forms
_must_ be:

Amino/His     | LYS, NTXXX, HIS                | fully protonated (charged)  
Carboxyl      | ASP, GLU, CTXXX, PRA, PRD, ACE | fully deprotonated (charged)
Phenyl        | TYR                            | one proton (neutral)        
Alcohol/Thiol | SER, THR, free CYS             | one proton (neutral)        
Water         | HOH, H2O                       | fully deprotonated (only O) 

All these protons (when present) should be at their dihedral minima,
as usually placed by GROMOS, GROMACS, etc.

ATTENTION: The .pqr files obtained from MM topologies may contain
additional protons at some sites which, according to the above table,
should have no protons (eg, some acids may be protonated in a MD run).
Be sure to remove _all_ of these protons!


2. Check .st files:
-------------------

You can get all the needed .st files using the program getst.  As
usual, you _must_ check any unusual sites.


3. Make .sites file:
--------------------

Build the .sites file with all alternative tautomers (see 6(b) below).
This can be done with the makesites program:

  makesites t initial.pqr > mol.sites

Note that the "t" option must be used, to get a tautomeric .sites
file.  In that case, SER, THR and waters will be also included in the
.sites file, in order to treat them later as pseudo-titrable (ie, as
proton rotamers) during the Monte Carlo simulation.


4. Add all protons needed for tautomerism:
------------------------------------------

a) Run addHtaut:
     addHtaut initial.pqr mol.sites > aux.pqr

b) Check that all .st files are in the current directory (or another
   directory given to statepqr; see its usage).

c) Run statepqr to put everything in the charged state:
     statepqr r=c aux.pqr mol.sites > final.pqr

Only the residues listed in the above table are presently recognized.
Strictly speaking, step (c) is not needed, since meadT (see 6 below)
runs it at start.  Nevertheless, just to be sure, run it at this
stage.
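
Steps 3 and 4 can be chained in a small wrapper.  The three command
lines below are the ones given in this recipe; the wrapper itself
(the hypothetical function prepare_taut_pqr) and the assumption that
the tools return a non-zero exit status on failure are additions of
this sketch.

```shell
# prepare_taut_pqr INITIAL.pqr: run makesites, addHtaut and statepqr in order,
# writing mol.sites, aux.pqr and final.pqr in the current directory.
prepare_taut_pqr() {
    makesites t "$1" > mol.sites      || return 1
    addHtaut "$1" mol.sites > aux.pqr || return 1
    # The .st files must already be in the current directory at this point.
    statepqr r=c aux.pqr mol.sites > final.pqr || return 1
}
```

For example, "prepare_taut_pqr initial.pqr" leaves final.pqr ready for
the next step; inspect the intermediate files if anything looks odd.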


5. Modify the .pqr and .sites files using stmodels:
---------------------------------------------------

An offset is added to the residue number of the atoms included in
the .st files. This ensures that there is no overlap of the different
fragments in MEAD calculations.

The offset value must be at least equal to the number of residues plus
one, to ensure that the fragments are considered independently.
 
Do not forget to revert to the normal numbering after the MEAD
calculations. Check the stmodels program header for further details.


6. Run meadT:
-------------

This program should be used like multiflex (see MEAD documentation),
except for the following:

a) meadT has some specific command-line options, which must come
   _before_ the multiflex options.  Parallelization is supported.
   Check option -h and the program header for further details.

b) Each line of the input .sites file must contain, after the residue
   number, as many .st designations as the number of tautomers
   (instead of a single one).  Obviously, all the corresponding .st
   files must be present in the directory.

c) A .pkcrg file (pKa values with all other sites charged) is created
   instead of the usual .pkint file.

d) A .out file is created with the (appended) standard output of all
   single-site multiflex runs.

e) A .err file is created with the (appended) standard error of all
   multiflex runs.
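
After a run, a quick look at the .err file can save time.  The sketch
below uses a hypothetical helper, check_err, and assumes nothing about
how the .err file is named (pass whatever file meadT produced).  Note
that a non-empty .err file does not necessarily mean the run failed,
since warnings may also be written to standard error.

```shell
# check_err FILE.err: succeed if FILE.err is absent or empty; otherwise
# show its last lines and return a non-zero status.
check_err() {
    if [ -s "$1" ]; then
        echo "$1 is not empty; last lines:"
        tail -n 5 "$1"
        return 1
    fi
}
```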


7. Run convert:
---------------

This is an AWK program that converts the files .pkcrg and .g produced
by meadT into the new input format required by version 1.1 (and
later ones) of PETIT.  This may take from minutes to hours and a
lot of memory.  There is now a C version of this program which is
_much_ faster.  The need for this conversion should be removed in
future versions (meadT should directly create the file required by
PETIT).


8. Run PETIT:
-------------

You can now run PETIT, a C program that is distributed separately.  It
runs MC simulations of the binding of electrons and/or protons, taking
as input the file created by convert.  See the PETIT documentation for
details.



H. Acknowledgments
==================

The authors thank the following organizations:

- Instituto de Tecnologia Quimica e Biologica, Universidade Nova de
  Lisboa, Portugal, for computational resources.

- Fundacao para a Ciencia e a Tecnologia, Portugal, for funding
  through grants and fellowships: BPD/18899/98, BPD/5740/2001,
  POCTI/BME/45810/2002, SFRH/BPD/14540/2003, SFRH/BD/23506/2005,
  PTDC/BIA-PRO/104378/2008.



I. References
=============

[1] Baptista, A.M., Martel, P.J., Soares, C.M. (1999) Simulation of
    electron-proton coupling with a Monte Carlo method: application to
    cytochrome c3 using continuum electrostatics. Biophys. J. 76,
    2978-2998.

[2] Baptista, A.M., Soares, C.M. (2001) Some theoretical and
    computational aspects of the inclusion of proton isomerism in the
    protonation equilibrium of proteins. J. Phys. Chem. B 105,
    293-309.

[3] Teixeira, V.H., Soares, C.M., Baptista, A.M. (2002) Studies of the
    reduction and protonation behavior of tetraheme cytochromes using
    atomic detail. J. Biol. Inorg. Chem. 7, 200-216.

[4] Martel, P.J., Soares, C.M., Baptista, A.M., Fuxreiter, M.,
    Naray-Szabo, G., Louro, R.O., Carrondo, M.A. (1999) Comparative
    redox and pKa calculations on cytochrome c3 from several
    Desulfovibrio species using continuum electrostatic
    methods. J. Biol. Inorg. Chem. 4, 73-86.

[5] Teixeira, V.H., Cunha, C.A., Machuqueiro, M., Oliveira, A.S.F.,
    Victor, B.L., Soares, C.M., Baptista, A.M. (2005) On the use of
    different dielectric constants for computing individual and
    pairwise terms in Poisson-Boltzmann studies of protein ionization
    equilibrium. J. Phys. Chem. B 109, 14691-14706.

[6] Machuqueiro, M., Baptista, A.M. (2011) Is the prediction of pKa
    values by constant-pH molecular dynamics being hindered by
    inherited problems? Proteins 79, 3437-3447.

[7] Nozaki, Y., Tanford, C. (1967) Examination of titration behaviour.
    Methods Enzymol. 11, 715-734.

=========================================================================
