!> \file sda_flex.html
!! main page for doxygen documentation
/**
* @mainpage SDA 7 ( SDA_Flex )
* @author Michael Martinez, Julia Romanowska, Neil Bruce, Daria Kokh, Paolo Mereghetti, Razif Gabdoulline, Stefan Richter and Rebecca Wade
* @date 2014
*
*
* \tableofcontents
* Table of contents:
*
* - \ref intro "Introduction: SDA 7 Main Features"
* - \ref section_archi "Architecture of the Solute-Grid relationship"
* - \ref section_parallel "Parallelisation in SDA 7"
* - \ref Force "Computation of forces and energy in SDA 7"
* - \ref Record "Complexes and trajectories files"
* - \ref Compilation "Compiling SDA 7"
* - \ref Memory "Memory leak tools"
* - Technical documentation specific to 1 module (Reaction Criteria, PBC,..)
*
*
*
* @section intro Introduction: SDA 7 Main Features
*
*
* The functionalities from SDA 6 and SDAMM have been unified into a single codebase, making it easier to develop new functionalities for either or both codes.
*
* - The following calculations are possible for two-solute simulations only (comparable to SDA 6 simulations):
*
* - Association rates
* - Docking
* - Electron transfer
*
* - For multiple protein simulation up to high concentrations: (formerly in SDAMM)
*
* - Translational and rotational diffusion coefficients
* - Transcient oligomerization and binding
*
* - For protein diffusion to surfaces:
*
* - Docking
* - Many protein diffusion in the presence of a surface
*
* - Spherical or cubic geomtries -with or without Periodic Boundary Conditions (\ref PBC "PBC") - can be simulated
* - The following interaction types are implemented:
*
* - Electrostatic interaction, including the method of image-charge if a metal surface is present
* - Electrostatic desolvation
* - Non-polar desolvation
* - Repulsive soft-core (repulsive Lennard-Jones) interaction for use with multiple proteins (formerly in SDAMM)
* - Hydrodynamic interactions for simulating multiple proteins at high concentration (formerly in SDAMM)
* - Lennard-Jones interactions for simulating protein-surface interactions (parametrized for the ProMetCS force-field)
* - Metal desolvation for simulating protein-surface interactions (parametrized for the ProMetCS force-field)
* - Debye-Hückel approximation of the electrostatic interaction at large distance
*
*
*
*
* New features :
*
* - Solutes can be represented by more than one structure to model solute flexibility or different protonation states.
* Each solute structure is associated to a set of interaction grids and charges.
* - OpenMP parallelisation. This is efficient for multiple and two-solute simulations.
* - Additional statistics can be computed during the simulations: residence time; first passage time; radial distribution function.
* - Extended set of tools for pre and post-analysis of simulations.
*
*
*
* \image html workflow_sda7.png "Workflow for SDA 7" width=3cm
*
* @section section_archi Architecture of the Solute-Grid relationship
*
*
* SDA 7 is written in Fortran 90. The additional features of Fortran 90 allow for a more "object oriented design" for the software than was possible with Fortran 77 which was used for previous versions of SDA.
*
* - Dynamic allocation and use of pointers allow for more efficient design and better memory usage.
* - Modules and types are used as classes: allocation (constructor) functions, deallocation (destructor) functions, and member functions.
* - Each module in the code (files which begins with mod_*.f90 ) include one main "class" and functionalities, they are all included in the library libsda.so.
* - The main program sda_flex and all tools included in the software are linked to this library. This allows reusability of the code.
*
*
*
* The main new feature of SDA 7 is the introduction of 'flexibility' for one or all of the solutes.
* Flexibility is introduced by considering several rigid body representations of a protein taken from an NMR ensemble, molecular dynamics simulation, normal mode analysis, or another relevant conformational sampling technique.
* SDA 7 can then swap between these representations based on e.g. a Monte Carlo criterion.
* Thus multiple conformations can be treated within a single trajectory.
*
* The diagram below: "Diagram of Solutes, Grids and their relationship", shows the relationships between key modules and how multiple proteins and flexibility are implemented:
*
* - Each grid ( electrostatic, exclusion, etc... ) is of the type \ref mod_grid::type_grid "grid".
* - A specific conformation or structure of a solute, including atom positions, charges, electrostatic grid,
a set of reaction criteria, etc. is stored in an object of type \ref mod_setofgrid::sogrid "setofgrid".
* - A solute with multiple structure is represented by a set of conformation, regrouped in the \ref mod_flexibility::flexible "flexible" objects.
Flexible objects ensure than only one conformation is used at a given time,
* with a "current_pointer" associated to the corresponding \ref mod_setofgrid::sogrid "setofgrid".
* - Solutes, denoted equivalently \ref mod_protein::protein "protein" in the code although they can be any type of molecule,
* are defined by their position, orientation, and a pointer to \ref mod_setofgrid::sogrid "setofgrid" or to an instance of \ref mod_flexibility::flexible "flexible".
* - All solutes are stored in an array \ref mod_array_protein::array_protein_type "array_protein".
*
*
*
*
* In the diagram solute 1 is flexible, and has two possible conformations.
*
*
*
* - During the trajectory, solute 1 changes conformation, from conformation 1 to conformation 2 (algorithms are discussed below).
* The change in conformation is performed by changing the association of the pointer (\ref mod_flexibility::flexible::p_current_node "p_current_node") of the \ref mod_flexibility::flexible "flexible" object to
* the \ref mod_setofgrid::sogrid "setofgrid" for the conformation 2.
*
*
*
*
* Solutes 2 and 3 are identical solutes and are not flexible.
*
*
*
* - As they are not flexible, \ref mod_flexibility::flexible "flexible" objects do not need to be created.
* - The solutes are identical, therefore they are directly associated with a single \ref mod_setofgrid::sogrid "setofgrid", resulting in efficient use of memory.
* Only the positions and orientations of solute 2 and 3 must be stored independently.
*
*
*
* It is straightforward to specify many identical flexible solutes:
*
*
*
* - All solutes are associated to their own \ref mod_flexibility::flexible "flexible" object, associated to their current conformation.
* - They all share the same possible conformations, defined by N \ref mod_setofgrid::sogrid "setofgrid".
*
*
*
* \image html StructureModuleProteins.png "Diagram of Proteins, Grids and their relationship" width=5cm
*
*
* The design presented is efficient for the memory usage, but has one important limitation:
*
*
* Everything related to one conformation is stored in one \ref mod_setofgrid::sogrid "setofgrid", including the position of the atoms, charges and reaction criteria.
* But the positions evolve during the trajectory unless the solute is fixed. (We refer to them as the "movable atoms").
* It would be possible to retrieve the instantaneous position of these "movable atoms" from the position and orientation of the solute stored for each \ref mod_protein::protein "protein" (by rigid-body transformation).
* However, it is not efficient in practice. In particular, rotations require a matrix multiplication which is computationally expensive.
* The compromise made in SDA (since its origin in 1998) is to apply these rotations at every step, when the Brownian Dynamics update is applied in \ref make_bd_move_2prot.f90 "make_bd_move".
* Only the rotation is applied, not the translation.
*
*
* The translation is indeed recomputed each time that the instantaneous position of a movable atom is needed.
* The full transformations (rotation and translation) are required at different stages, e.g. for computing the interactions, checking overlap with the exclusion grid, checking reaction criteria, etc.,
* and many other functionalities.
*
* There were many reasons to use this implementation in SDA 6 (and previous versions):
*
* - Only two solutes were simulated and the larger solute, solute 1, is kept fixed throughout the simulation,
* so this transformation was needed only for the smaller, second solute.
* - SDAMM and SDA 7 use the same approach, but a copy of the movable atoms must be present in the \ref mod_protein::protein "protein" object; they cannot be shared in one \ref mod_setofgrid::sogrid "setofgrid".
* So the solutes have additional pointers to movable arrays (\ref mod_protein::protein::p_pos_charge , \ref mod_protein::protein::p_list_atom_surface and \ref mod_protein::protein::p_pos_rcriteria).
* - By default, during the initialisation, these arrays are first allocated and populated in a \ref mod_setofgrid::sogrid "setofgrid" and the pointers of the proteins are associated to them.
* Then, in \ref initialize_movable_atoms, an independent copy for each solute is made when needed.
*
*
*
* @section section_parallel Parallelisation in SDA 7
*
* @subsection subsection_sdamm Multiple molecule simulation (SDAMM) parallelisation
*
*
* Different parallelisation schemes are used in SDA 7. To help understand the parallelisation schemes,
* we briefly discuss each simulation type, and then the strategy used for parallisation with some pseudo codes.
*
*
* In SDAMM simulations, only one trajectory is generated and the trajectory has a predefined number of steps.
* The most computationally expensive subroutines are :
*
* - Force calculations: can be easily more than 95 % of the time, algorithm of O(N^2) with N being the number of solutes
* - Brownian Dynamic update: linear scaling with N
* - Energy calculations: only performed before writing the configuration in the trajectory output file
*
*
*
* The method of parallelisation is a straightforward application of OpenMP, with the main loop split between threads in theses specific subroutines.
*
*
* For the force and energy calculations:
*
* - The force and energy calculations are computed by a double loop over all pairs of solutes, the sum of the interactions
* of each solute is added to an array "total_interaction".
* \verbatim
do s1 = 1, n
do s2 = s1 + 1, n
interaction12, interaction21 = compute_interaction_between_solute1_and_solute2()
total_interaction ( s1 ) = interaction ( s1 ) + interaction12
total_interaction ( s2 ) = interaction ( s2 ) + interaction21
end do
end do
\endverbatim
* - The first loop is split over different threads, the REDUCTION keyword assures the correction synchronization
* when updating the "total_interaction" array.
* \verbatim
!$OMP PARALLEL
!$OMP REDUCTION(+:all_interaction)
!$OMP DO SCHEDULE( DYNAMIC )
do s1 = 1, n ! only the first loop is split over the threads
do s2 = s1 + 1, n
interaction12, interaction21 = compute_interaction_between_solute1_and_solute2()
! secure because of REDUCTION
total_interaction ( s1 ) = interaction ( s1 ) + interaction12
total_interaction ( s2 ) = interaction ( s2 ) + interaction21
end do
end do
!$OMP END DO
!$OMP END PARALLEL
\endverbatim
*
*
*
* For the BD update : there is only one loop over all proteins
*
*
* Parallel Performance
*
*
* The scaling of parallel calculations can be heavily system dependent,
* so you should test for your system before carrying out large calculations, especially at high levels of parallelisation.
* You can expect that larger systems will scale better on more cores.
* We tested a 128 lysozyme molecules on a shared memory AMD Magny Cours machine with 4 CPUs of 12 cores for a total of 48 cores. We present the following results as a guideline.
*
*
* - Force and energies scale relatively well up to 24 CPUs, the BD update does not scale after approx. 10 CPUs.
* - Without accounting for the overhad of thread creation and synchronization, the calculation rapidly becomes limited by the Amdahl's law,
* .i.e. the serial section in the loop becomes the bottleneck to scaling.
* - With larger systems (512 lisozymes), we obtain a scaling of 80 % with 24 CPUs.
*
*
*
* @subsection subsection_sda Two solute simulation (sda_2proteins) parallelisation
*
*
* The implementation of the parallelisation with two solutes is very different from that for SDAMM:
*
* - For association rate calculations or docking, we want to perform many trajectories to achieve good statistics.
* - We do not know the time required for a trajectory to complete (timemax or escape)
* - There is no clear "most expensive function", it depends on the simulation performed.
* - Force and energy cannot be parallelised in the same way as for SDAMM, the parallelized loops should be
* inside the interaction calculations
*
*
*
* The approach chosen for parallelisation is to split the total number of trajectories across the different threads.
* Each thread is responsible for running a full trajectory. When one trajectory is finished, the thread will start calculating a new trajectory,
* until no more trajectories need to be conducted.
* The difficulties with this approach are:
*
* - An independent copy of the \ref mod_protein::protein "protein" object is needed for each thread because
* the solute's position, orientation and movable atoms cannot be shared.
* However, the grids are fixed and only used for reading, so only one copy of the \ref mod_setofgrid::sogrid "setofgrid" is required, no matter how many threads are used.
* - The synchronization of shared objects. Every function must be thread-safe and the update of most of the shared arrays (e.g. complexes)
* must be protected from concurrent writes.
*
*
*
* The following choices were made for a thread safe implementation for two-solute calculations:
*
* - An independent copy of Solute 2 (the moving solute) is assigned to each thread. It includes a copy of the movable arrays
* even if the solute is not flexible, and a copy of the associated \ref mod_flexibility::flexible "flexible" object in this case.
* - A local copy of the list of complexes is also assigned to each thread. They are merged into a global list of complexes
* at user-defined intervals (e.g. every 100 trajectories).
*
* - The insertion of a complex into the list of complexes is not thread-safe.
* - The copy and merging into a global list of complexes is probably the best compromise.
* The local list of complexes are typically smaller than the global one.
* Additionally, checking and inserting a new complex into the local list is faster.
* Merging local into global lists is protected by a CRITICAL section, but it is infrequent (every 50-100 trajectories done by the thread).
* After a few trajectories have been performed, the desynchronisation between threads becomes an advantage (attractive feature).
*
* - For the update of global arrays (association rates, residence time...), a temporary and local array is owned by each thread throughout an entire trajectory.
* The update of the shared array is only done at the end of the trajectory. The update of global arrays are usually simple updates adapted for the use of the ATOMIC section,
* which tends to have minimal performance overheads.
* \verbatim
!$OMP ATOMIC
global( i ) = global ( i ) + local ( i )
\endverbatim
* - Some arrays, like those used for the bootstrap or to store the number of steps per trajectory,
* are designed to store a unique result on a per trajectory basis.
* Therefore there is no need to protect them against concurrent writes.
*
*
*
*
* A schematic outline of \ref MainLoop_sda_2proteins_omp.f90 (variables assignment as SHARED or PRIVATE are omitted)
* is given below:
*
*
* All objects have been previously initialized in \ref read_input_file.f90
*
* \verbatim
MainLoop( tab_protein_unique )
!$OMP PARALLEL
! allocate local array, OpenMP or not
allocate ( local_escape, local_sumtime,...)
! if OpenMP make a copy of the Solutes
!$ if ( tid > 0 ) then
!$ call copy_array_protein ( tab_protein_unique, ptab_protein, type_calc )
! normal if not OpenMP or master
!$ else
! write (*,*) 'Associate ptab_protein to tab_protein_unique'
ptab_protein => tab_protein_unique
!$ end if
! if OpenMP make a local copy of the list of complexes ( here empty, only copy data members )
!$ if ( tid >= 0 ) then
!$ call copy_record ( o_complexes, p_local_complexe, .false., size_thread_complexe )
! else normal if not openMP
!$ else
! associate the global pointer to o_complexe if not openmp
if ( associated ( o_complexes % pcomplexe ) ) then
p_local_complexe => o_complexes
end if
!$ end if
! not necessary but convenient, simpler to refer to solute 1 and 2
protein1 => ptab_protein % array ( 1 )
protein2 => ptab_protein % array ( 2 )
!$OMP BARRIER
!$OMP DO SCHEDULE( DYNAMIC )
LOOP_RUN: &
do run = 1, total_traj
!$ if need to merge complexes every 100 runs
!$OMP CRITICAL ( crit_merge_p_complexe )
!$ call merge_complexe ( o_complexes, p_local_complexe )
!$OMP END CRITICAL ( crit_merge_p_complexe )
!$ endif
call init_position_protein()
! begin BD loop
LOOP_STEP: &
do
call force_epedhdlj_2proteins_fast( protein1, protein2 )
call make_bd_move_2prot( protein1, protein2 )
call check_reaction_pairs( protein1, protein2 )
! add to the list of complexes
if ( complexe_to_record .eqv. .true. )
call energy_epedhdlj_2proteins_fast( protein1, protein2, energy )
call add_record ( p_local_complexe, ptab_protein % array ( 2:2 ), energy )
end if
! use atomic inside
call update_residence_time ( resid_time, protein2 % position, dist, dtnow )
end do LOOP_STEP
! only at the end of the trajectory
call update_escape()
end do LOOP_RUN
! deallocate Solutes for all threads, except the MASTER
nullify ( protein1, protein2 )
!$ if ( tid > 0 ) then
!option true for NOT deleting the grid
!$ call delete_array_protein ( ptab_protein, .true. )
! now need explicit deallocation of the pointer
!$ deallocate ( ptab_protein, stat = ierr )
!$ nullify ( ptab_protein )
!$ end if
! same deallocation for local_complexes
\endverbatim
*
* The use of OpenMP pragma allows a compilation with or without OpenMP (without loss of performance).
* With this implementation a high level of parallelization is achieved because there is no sequential code in the main loop.
* The merging of the lists of complexes is still the main limitation, but their frequency can be tuned for a specific system and computer architecture.
*
*
* \subsubsection subsubsection_endeffect Run completion (end effect) issues
*
* A demonstration simulation with 6 trajectories and 3 available threads can be represented by the scheme below
* \image html EndEffect.png "End Effect : threads are waiting at the end of the simulation" width=1.8cm
*
* Each thread runs calculations for one trajectory. When a trajectory is completed, the thread will run the next trajectory.
* At the very end of the calculation there are no more trajectories to perform and N-1 CPUs are in a waiting state
* This may be a problem for association rate calculations where trajectories can, in principle, last for an infinite amount of time.
* In practice this effect is usually minor.
* It is difficult to estimate the end effect during runs to test the performance of SDA 7
* (relatively short runs with X trajectories, where X is generally small ~ 100),
* so the scaling is generally underestimated compared to that expected for a real simulation (> 10,000 trajectories).
*
* In the most favorable case, association rates calculations (with no complexes recorded) have scaling reaching 80 % on 48 CPUs.
*
*
* @section Force Computation of forces and energy in SDA 7
*
*
* The implementation of computations of forces and of energies are similar in sda_2protein and SDAMM,
* but they are nevertheless implemented in different functions for optimization.
* The differences are the double-loop over all pairs of solutes in the case of SDAMM (as seen in the \ref section_parallel "parallel" section)
* and the fact than the first solute is always kept fixed in sda_2proteins.
*
*
* In the following sections, we use the example of the computation of the energy of the charges on solute 1 in the potential grid of solute 2.
*
*
* \subsection subsection_charge_grid Charges and Grids
*
*
* The following five types of interaction are described as "charges" interacting with a potential grid.
*
* - Coulombic electrostatic interaction : The effective charges of solute 1 and the potential grid (minus the derivative of the potential for the force) of solute 2.
* - Electrostatic desolvation interaction : The square of the effective charges and the electrostatic desolvation grid.
* - Non-polar desolvation : The surface accessibility of atoms and the non-polar grid.
* - Repulsive soft-core (Lennard-Jones): The surface accessibility of atoms and the repulsive soft-core Lennard-Jones grid.
* In this case, no "charges" are needed in the computation, only the position of atoms.
* But to have an identical algorithm, we introduced a dummy_array which contains only values of 1 for the charges.
* - Lennard-Jones for surface: The same dummy_array is used for the charges, and a LJ grid is generated for each type of atoms on the surface.
* - Image-charge: Use both effective charges and the electrostatic grid of the same solute
*
*
*
*
* The other energy terms: metal desolvation, and the correction to the image-charge have a different implementation and
* will not be considered in this section.
*
*
*
* All energies are computed as :
* \f[
* E_{12} = \sum_{i=1}^n q_1^i(x,y,z) \, U_2(x,y,z)
* \f]
* where \f$E_{12}\f$ is the energy of the effective charges from solute 1, \f$q_1\f$, in the potential of solute 2, \f$U_2\f$.
* With the exception of the electrostatic energy, the total energy is evaluated as the sum of those components, \f$E = E_{12} + E_{21}\f$.
*
* For the electrostatic interaction energy, the definition is sligthly different. The total electrostatic energy should satisfy:
* \f[
* E_{el} = E_{12} = E_{21}
* \f]
* But due to approximations in the electrostatic potential grid and effective charges computations, the results of \f$E_{12}\f$ do not exactly equal \f$E_{21}\f$.
* In order to use the same algorithm for each type of interaction, we estimate the total electrostatic energy as the average of the two components:
* \f[
* E_{el} = \frac{1}{2} ( E_{12} + E_{21} )
* \f]
* Where the factor \f$\frac{1}{2}\f$ is applied on the electrostatic potential grids when they are read into memory.
* It can be adjusted in the SDA7 input file with the parameter epfct (default 0.5)
*
*
* The same applies for the forces, with the difference that we need the derivatives of the energy:
* \f[
* \vec{f_{12}} = \sum_{i=1}^n - \, q_1^i(x,y,z) \, \nabla U_2(x,y,z)
* \f]
*
* The values of the energy and the derivatives are extrapolated from the grids (see functions \ref mod_force_energy::potential_atom_uhbd and \ref mod_force_energy::force_atom_uhbd ).
* Note that pointers to the grids are used, rather than the original grid. The use of pointers allows for these functions to be used on any 3-dimensional grid of real*8.
* It is therefore be relatively straighforward to implement calculations of new energy and force terms.
*
*
* \subsection subsection_choice_grid Choice of the Grids
*
* All grids and charges have a type associated with them (enumerated in \ref mod_gridtype), all grids derive from the same class \ref mod_grid.
* The types are used to select the correct association between charges and grids. This is performed by the function \ref mod_protein::select_charge.
* Given the grid_number, the \ref mod_protein::select_charge function associates pointers to the charges and the grids.
* Pointers are then used for computing the interactions by the same subroutine.
*
* A simple and general method for evaluating the forces is summarized here schematically. It is simplified pseudo-code, from \ref mod_compute_force::force_epedhdlj_2proteins, which does not take into account
* asymmetric interactions.
* \verbatim
subroutine energy_epedhdlj_2proteins ( prot1, prot2,...)
! loop over grid, sure to be associated because a pre-sorting is done in param_force_energy
GRID: do i = 1, param_force_energy % nb_grid
! extract the grid number, identical to accessing the sogrid directly
grid_number = param_force_energy % array_grid_calc ( i, 1 )
call compute_force_sda_2proteins ( prot1, prot2, grid_number,..)
end do
end subroutine
subroutine compute_energy_sda_2proteins ( prot1, prot2, grid_number,..)
! assign correctly the pointers from the grid_number
call select_charge ( grid_number, prot1, prot2, nq1, nq2, p_charge1, p_charge2, &
p_charge_position1, p_charge_position2, dummy_array )
! compute force, discussed in detail the next chapter
Energy 12 = Sum of p_charge1 * U ( p_charge_position1, prot1 % p_sogrid % pset ( grid_number ) )
Energy 21 = Sum of p_charge2 * U ( p_charge_position2, prot2 % p_sogrid % pset ( grid_number ) )
end subroutine
\endverbatim
*
*
* \subsection subsection_normal_algorithm Simple Algorithm for computing forces and energies
*
*
* There are several possible methods for calculating the force (and energy), here we present the simplest one.
* Its advantage is, as its name suggests, simplicity, and in addition generality.
* This method should be used as the first when a new type of interaction is being added to the code, since it greatly reduces the risk of introducing errors.
* The principal difficulty is dealing with the coordinate transformation between frames of reference, because the grids are fixed and never rotated.
*
*
* \image html ForceDrawFinal2.png "Computation of forces" width=4cm
*
*
* The figure "Computation of forces" shows the general computation of \f$f_{21}\f$, the force applied on solute 2,
* when both solutes can rotate (e.g. the case of SDAMM).
*
* The grid shown with the dashed line represents the grid orientation in the fixed frame of reference R', and is stored in memory.
* The solid lines represents the grid orientation corresponding to the instantaneous orientation of solute 1 (frame R).
*
* The simple algorithm is a loop over the charges of solute 1:
*
* - The position of the charge \f$i\f$ is computed in the fixed frame of reference of the grid (R') --- the dashed "solute 2".
* - The force is computed in this frame as \f$f'= - q.\nabla U\f$.
* - f' is transformed back into R, it is shown as f.
* - The torque, t, is computed as the cross product of the position and the force : \f$\tau = r \wedge f \f$
* (not shown in the picture).
* - f and t are added to the total force and total torque on solute 2.
*
*
* Below is an extract from the SDA 7 code (\ref mod_compute_force_sdamm::force_epedhdlj_sdamm) for the computation of the force and the torque on solute 2.
*
* \verbatim
subroutine compute_force_sdamm_2proteins( prot1, prot2, pos_relat,..)
! get the orientation of the first solute
ex = prot1 % orientation_x
ey = prot1 % orientation_y
ez = prot1 % orientation_z
! inverse the matrix, it is a unity matrix : transposed equal the inverse
xe(1) = ex(1)
xe(2) = ey(1)
xe(3) = ez(1)
ye(1) = ex(2)
ye(2) = ey(2)
ye(3) = ez(2)
ze(1) = ex(3)
ze(2) = ey(3)
ze(3) = ez(3)
! loop over the charges of solute 2
FORCE2: do j=1,nq2
! pos_relat is the relative position of the center of geometry of the solutes, and p_charge_position2 have the correct orientation
p2 = p_charge_position2 (:,j)
! relative position of the charge in the frame of reference R
rto = p2 + pos_relat
! make the trnasformation between the frames of reference, rtn is the relative position in the frame R'
call tr ( rto, rtn, ex, ey, ez )
! get the derivative of the potential at the position rtn
call force_atom_uhbd ( rtn, force, prot1 % p_sogrid % pset_grid ( grid_number) )
! transform back into R
call tr ( force, force_o, xe, ye, ze )
! multiply by the charge to get f in R. Back transformation and multiplication can be inverted here
qf = force_o * p_charge2(j)
! call the cross product to get the torque, directly in R
call cross (tmp2, p2, qf )
! add to the total torque and force
tt2 = tt2 + tmp2
ff2 = ff2 + qf
end do FORCE2
\endverbatim
*
*
* With the code presented, we could compute all interactions, both for sda_2proteins and SDAMM.
* There are nevertheless different implementations for optimizing the calculations.
* In sda_2proteins, solute 1 is always fixed, so that some computations are simpler:
* the charges of solute 2 do not need to be expressed in the rotated frame.
*
*
* \subsection subsection_fast_algorithm Fast Algorithm
*
*
* Two improvements of the simple algorithm are made:
*
* - We used two transformations between the frames of reference for each atom.
* Profiling of the code shows that the function \ref tr() is the most called function.
* This includes cross() which is called by tr().
* This way the forces are always added in the frame of reference (R).
* These steps are unnecessary: forces and torques can be added in the frame R', and then transformed back into the frame R.
* It divides the number of calls to the tr() and cross() functions by roughly two.
* - Both electrostatic potential and electrostatic desolvation calculations use the effective charge positions.
* In the former case with the effective charges and in the latter with the square of the effective charges.
* This implies that exactly the same geometric transformation is performed twice.
* Therefore, we can compute the electrostatic and the desolvation terms at the same time. Again, we can divide the number of calls to tr() by two.
* - We can use the same logic as above for the case of the non-polar desolvation and repulsive soft core (Lennard-Jones), which both use the positions of the surface atom.
*
*
*
*
* The functions that implement these improvements have an equivalent name to the functions implementing the simple algorithm, with the "fast" suffix added.
* The module \ref mod_force_energy "mod_force_energy" stores many information about the required calculations.
* In particluar, it always tries to use the optimized functions.
*
* The fast algorithm is more difficult to maintain than the simple algorithm, but in the best case (SDAMM with all four interactions) there is a 50% gain in performance.
*
*
* \section Record Complexes and trajectories files
*
* The implementation of those main outputs has been largely modified, compared to SDA6. The specifications for the new design were :
*
* - Reuse of the code: All subprogrammes in SDA 7 (sda_2proteins, SDAMM, sda_koff,..) use the same output formats.
* Furthermore, many tools use complexes or trajectories files as input and/or output. The common tasks, such as file opening, reading, and copying should be easy to implement in the tools
* - Easy to maintain: A change in the format, like adding new information (new types of interaction energy, new descriptor for a solute...)
* should not make it necessary to modify all existing tools
* - Consise and optimized in size: As interactions or new properties are added, the number of fields increases.
* At a point, a simple docking with only electrostatic interactions was producing a docked complex line with more than 25 fields, half of them filled with zeros.
* The option to produce and read binary files format has also been included.
*
* The first release of SDA 7 does not fully support all those requirements (features are implemented but interfaces should be simplified), but it already presents considerable advantages.
* A versioning has also been included, to maintain the compatibility with previously generated files.
*
*
* \subsection design_record Design of the module record
*
* The requirements cited above should apply to both complexes and trajectories files. Their functions are indeed very similar:
* they record information about the solute (solute number, conformation number...), the solute's position and orientation,
* the energy interactions terms and additional statistics dependent on the type of simulation.
*
*
* But their usage differs a lot. Data for trajectories are always appended into the file.
* They are usually large files with a simple and chronological ordering.
* On an other hand, complexes are ordered by energy values. The best complexes (with a minimum energy) are at the top of the file
* and those with large energies are discarded, the idea being to maintain a maximum number of relevant complexes,
* in order to analyse them afterwards with external tools like the clustering.
*
*
* The implementation of complexes and trajectories output in SDA 7 groups the similarities into a main class \ref mod_record::record "record".
* All the functions/tools that use complexes and trajectories objects, should call functions from \ref mod_record::record "record".
* Moreover, each "line" of those files is defined in class \ref mod_onecomplexe::one_complexe "one_complex" (it's called 'complex' but it refers to one solute).
*
* The main difference between complexes and trajectories files is in the implementation of intermediate types: \ref mod_record::listcomplexe "ListComplexes" and \ref mod_record::trajectory "Trajectory".
* There are different algorithms for collecting the objects of type \ref mod_onecomplexe::one_complexe "one_complex" into complexes and trajectories:
*
* - An array of \ref mod_onecomplexe::one_complexe "one_complex" is used for trajectories,
* it is filled sequentially, written to disk and reset when the array is full.
* - A list is implemented for storing the \ref mod_onecomplexe::one_complexe "one_complex" of the complexes file.
* The insertion or deletion of a \ref mod_onecomplexe::one_complexe "one_complex" at any position is then optimized.
*
*
* The scheme below shows a schematic design of those modules:
*
* \image html record_diagram.png "Diagram of the module Record" width=4cm
*
* \subsection design_onecomplex Variable fields in records
*
* The information stored in each \ref mod_onecomplexe::one_complexe "one complex" can vary but it always follows the same order:
*
* - Integers array \ref mod_onecomplexe::one_complexe::int_ocompl "int_ocompl": stores the description of the trajectory and solute. Any of those fields can be printed or not, but their order is fixed:
*
* - trajectory number
* - step number
* - solute number
* - conformation number
*
* - Position and orientation of the solute: 9 fields of real*8 (3 for translation and 6 for the orientation of the inner x and y axis) are mandatory
* - Total energy: the sum of the different energy terms, this field is always present
* - Real array \ref mod_onecomplexe::one_complexe::real_ocompl "real_ocompl": stores the energy interaction terms.
* Just like the integer array, these are variable fields but their order is fixed (following the numbering in mod_gridtype.f90 )
*
* - Coulombic electrostatic (ES)
* - Desolvation electrostatic (ED)
* - Nonpolar/hydrophobic desolvation (HD)
* - Repulsive soft-core(Lennard-Jones) (rLJ)
* - Full Lennard-Jones (LJ)
* - Electrostatic image charge (ImgChg)
* - Metal desolvation (MetDes)
*
* - One more real array \ref mod_onecomplexe::one_complexe::other_real_ocompl "other_real_ocompl": stores additional data, depending on the type of calculation
*
* - In the complexes file, 3 additional fields: the occurence of each complex, its average total energy and the standard deviation of the total energy
* - In the trajectories files: the time in ps
*
* - Three integers representing the information about the \ref mod_onecomplexe::one_complexe::box "box": when PBC are used in SDAMM, these fields are used to write in which copy of the box the solute resides currently.
*
*
*
* The information about what is written to disk in each file is coded in the header of the complexes/trajectories file (defined in \ref mod_record::record::description_column_str "description_column_str"), if these are in ascii format. The header contains all informations to read the following lines. It is constructed in \ref mod_record::record::header_str "header_str" or directly printed from \ref mod_record::write_header_bin "write_header_bin", and is composed of:
*
* - The version number: currently, the newest is "#version0.2"
* - The type of calculation: either the enumeration code from \ref mod_record::record::record_type_calc "record_type_calc" or its string version
* - The type of record: complex or trajectory, written also either with its enumeration value or with a string
* - The total number of solutes: as given by the user in the input for SDAMM, fixed at 2 for sda_2proteins
* - The number of types of solutes: used to split the energy terms between the types of solutes in SDAMM
* - Five integers: they define all parametrizable options for printing the complexes. These are describded in the next paragraph
* - Optionally, for sda_2proteins, the centers of geometry of the solutes are written in the next 2 lines
*
* After the header, follow the \ref mod_onecomplexe::one_complexe "one_complex" data, one per line in ascii format.
*
*
* \subsection param_options Coding the input options
*
* The five mysterious integer values present in the header describe the following options:
*
* - The value of \ref mod_record::record::bit_integer "bit_integer": indicates which of the integer variable fields of the \ref mod_onecomplexe::one_complexe "one_complex" object are present
* - The value of \ref mod_record::record::bit_energy "bit_energy": indicates which of the energy terms were computed
* - The value of \ref mod_record::record::nb_other_real "nb_other_real": how many fields are used for the additional information. Currently, the complexes file expects nb_other_real=3 and trajectories nb_other_real=1
* - The boolean value \ref mod_record::record::opt_sum "opt_sum": if true (1), the energy terms representing interactions between different types of solutes are summed up, otherwise, they are output separately
* - The boolean value \ref mod_record::record::box_info "box_info": if the box information should be output (in SDAMM, if PBC is set)
*
*
*
* The first 2 integers are read and written as bits. The idea was to store the information about the presence of a term or not and avoid repetitive numbers.
* The implemented solution uses the binary representation of those integers as logical values.
* As an example of bit_energy, the value of 5 in binary representation is written as:
* \verbatim
5
00000101 <- we have to read from the right to the left
\endverbatim
* This value codes the following information (remember that the order of the terms is fixed):
*
* - first term 1: electrostatic energy was computed
* - second term 0: electrostatic desolvation is not present in the file
* - third term 1: apolar desolvation was computed
* - fourth term 0: repulsive soft-core (Lennard-Jones) was not computed
* - fith term 0: Lennard-Jones was not computed
* - (all other fields are 0 and are not currently used)
*
*
*
* Another example for a bit_integer of 14 - its binary format is:
* \verbatim
14
00001110
\endverbatim
* Which means that:
*
* - first term 0: No trajectory number is present
* - second term 1: Step number is present
* - third term 1: Solute numbers are present
* - fourth term 1: Conformation numbers are present
* - (all other fields are 0 and are not currently used)
*
*
*
* In principle, all the classes using the complexes/trajectories, should use the interface functions implemented in mod_record and mod_onecomplexe, and not deal directly with the BITS functions themselves!
*
*
* \subsection example_record Example of using the record module
*
* Below, we show a very simple examplary code. Real examples can be found in tools (specifically in \ref read_record.f90 and in \ref trajectory2dcd.f90).
* The code reads an existing record (complex or trajectory), extracts useful information and writes the new complexes or trajectory file to the disk.
*
* \verbatim
! simple initialization, nullify pointers. clean for allocated objects
call init_record ( o_record, opt_iocmx = io_record )
! read complex or trajectory (information in header), ascii or binary
! set all internal fields and bits from the header
! if optional nb_line, count the number of lines and assign it to nb_line
call read_header_record ( in_record, filename, cm, nb_line )
! finalize a correct initialization, optional .true. do not write the header
call initialize_record ( o_record, tab_prot, .true. )
!!!!!!!!!!!! Read data
! can read all data at once
! for traj can proceed by steps or multiple of steps
call read_file_record ( o_record )
! reset position, complexe or trajectory
call reset_position ( out_record )
! can loop over each one_complexe's
! p_tmp_onecomplexe is a convenient shortcut (for o_record % p_onecompl),
! will be updated with the getNext optional argument
p_tmp_onecomplexe => o_record % p_onecompl
! make a loop over all entries
do while ( ( ierr == 0 ) )
! if correctly set_up applies for sda 2proteins and sdamm
do i = 1, nb_prot_in
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! Here make whatever you want
!!! Read input
! e.g., get the conformation number, method 1 with indice_nconf data member of record
if ( iflex ) then
tmp_conf = p_tmp_onecomplexe % int_ocompl ( o_record % indice_nconf )
end if
! get integer and energy information from the current one_complexe
! order is natural : 4 integer, as many as necessary real
call return_compatible_array_wrap ( o_record, array_int, array_real, other_real )
! e.g., get position and orientation
!
! Here need to declare tab_prot explicitely before, see example in trajectory2dcd.f90
tab_prot % array ( tmp_prot ) % position = p_tmp_onecomplexe % position
tab_prot % array ( tmp_prot ) % orientation_x = p_tmp_onecomplexe % orient_x
tab_prot % array ( tmp_prot ) % orientation_y = p_tmp_onecomplexe % orient_y
!!! Write output
! allocate onecomplexe, with different fileds (coming from record_comp)
allocate ( p_new_onecomplexe, stat = ierr )
call new_onecomplexe ( p_new_onecomplexe, record_comp % nb_int, record_comp % nb_real, record_comp % nb_other_real )
call set_onecomplexe_wrap ( record_comp, p_tmp_onecomplexe % position, p_tmp_onecomplexe % box, &
p_tmp_onecomplexe % orient_x, p_tmp_onecomplexe % orient_y, &
array_int, array_real, other_real, opt_p_onecomplexe = p_new_onecomplexe )
! always insert onecomplexe at the end
call insert_new_node ( record_comp % pcomplexe % p_last, p_new_onecomplexe )
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! go to the next one_complexe
! optionaly increment p_tmp_onecomplexe to the next element
call getNext( record_traj, p_tmp_onecomplexe, ierr )
end do ! loop over solutes
! or write the data here in case of sdamm, after having read all solutes
! increment the loop
nloop = nb_line + 1
end do ! loop over nb_line
! clean memory from the record
call delete_record ( o_record, .true. )
! write output into disk and clean the memory
call write_record ( record_comp, .false. )
call delete_record ( record_comp, .true. )
\endverbatim
*
*
* \subsection Compilation Compiling SDA 7
*
* We have tested SDA 7 compilation on Linux with:
*
* - gnu fortran : 4.4 to 4.8.3
* - intel fortran: 11 and 14
* - portland fortran: version 12 and 13.4 (only partly tested!)
* - XL Fortran: even less tested, but compile
*
*
* We had errors during the link stage with a MacOS (maverick XOS with gfortran 4.8).
* So we provide also a version without the use of the intermediate library libsda.so. and a static build
* All options are similar, only Makefile.static must be called: make -f Makefile.static option
* We could also test the compilation on an IBM power7, it is in Makefile.xlf
.
*
*
* In the provided Makefile, the default options for the intel compiler (-warn all
) are very strict.
* The declaration of interfaces (for all the functions not contained in a given module) is mandatory in this case.
* Fortran does not use header files, so without those interfaces it is impossible for the compiler to notice a wrong or even missing argument of a function.
* It is convenient to check the integrity of the code, and may save (a lot by experience) debugging time.
*
*
*
* \subsection Memory Memory leak tools
*
* SDA 7 uses extensive memory allocation. It requires explicit allocation of and deletion of objects from the heap memory.
* Valgrind ( http://valgrind.org/ ) has been used during the development of the software. It is invaluable for finding mistakes (forget to free memory) or bugs (writing outside an array) in the code.
* In 90 % of the cases, it answers to the question (and gives the origin of the error if the code was compiled with "debug=yes" option): Why does my program crash on the cluster? It was working fine on my workstation!!
*
*
* We recommend to use this tool when modifying the SDA7 code or developing a new module/tool to avoid accumulation of mistakes and errors.
* The released version gives 0 errors and 0 memory leaks in the serial version, when compiled with gfortran (as far as tested), however there are a few errors related to the use of OpenMP.
* Unfortunately, when compiling with the intel compiler, more errors are found by Valgrind, many related to the use of SSE instructions.
*
*
*/