PForce Manual
                  Vadim A. Markel
                 vmarkel@upenn.edu

===========================
===========================

TABLE OF CONTENTS

0. Introduction
1. Some Notations
2. Brief Summary of Programs
3. Parameter File PF.par
4. Parameter File LS.par
5. Application Notes for LS

===========================
===========================

0. Introduction
===============

The codes in this package can be used to generate all data displayed
in the figures of the Paper or run custom simulations (within the same
model) of the physical properties of the ionic lattice of NaCl.

The code LS.exe performs lattice summation in order to compute the
electric field at the location of individual ions. This code may be of
more general interest. It uses brute force summation. The Paper
provides further analytical approximations, which work well in the
range of parameters 0 < x <= 2, where x is distance to a sub-layer of
ions in units of one lattice unit h (fixed in the lateral directions
X and Y) to the point of observation.

The codes have been tested only with Intel's compiler ifort. Other
compilers may or may not work. The codes use some specific type
conversion and quadruple-precision functions, which may be Intel's
extensions of the standard language and not available in other
compilers.

1. Some Notations
=================

The following quantities are dimensionless:

kappa = q^2 / k*h^3  -- The main parameter of the model. Lattice is stable for 0 <= kappa <= ~0.25
       
                   q -- Absolute value of the charge of an ion
                   k -- Spring constant for the chemical binds
                   h -- Fixed lattice unit in the XY plane

p = h * P_ext / k    -- Dimensionless external pressure

e = h^2 * E_ext / 1  -- Dimensionless applied field (in atomic units)
 
L(e,p)               -- Dimensionless width of the slab in units of h
                        L = ( zA(N) - zA(1) + zB(N) - zB(1) ) / (2*h)


2. Brief Summary of Programs
============================

Program  | Function                       | Interpretation of pars | Output, Comments and Definitions
================================================================================================================
PF_u     | Computes the lattice unit      | kappa_min <-- 0        | u.dat : kappa, u1, u2 : u1 = z(N)-z(1)]/(N-1)
         | u(kappa) at p=e=0              | kappa_max <-- kappa    |                       : u2 = z(Nh+1)-z(Nh)
         |                                | e=p       <-- 0        |                       : Nh = N/2
----------------------------------------------------------------------------------------------------------------
PF_p     | Computes the ratio L(p)/L(0)   | p_min     <-- p_min    | L0.dat : L(p)/L(0) <-- without electrostatic interaction
         | at a given kappa and e=0       | p_max     <-- p_max    | L.dat  : L(p)/L(0) <-- with    electrostatic interaction
         |                                | kappa     <-- kappa    |
         |                                | e         <-- 0        |
----------------------------------------------------------------------------------------------------------------
         | Computes                       | kappa     <-- kappa    | L.dat   : e, L(p,e), 1-L(p,e)/L(p,0)
PF_e     | L(e),  1 - L(e)/L(0)           | p         <-- p_min    | eps.dat : e, eps
         | eps(e) [Eq.(50a)]              | e_min     <-- e_min    | Dip.dat : e, D 
         | D(e)   [Eq.(50b)]              | e_max     <-- e_max    | err.dat : e, error of equilibrium
         | for given kappa and p          |                        |           error = sqrt[ Sum(Eq.39)^2 / N ]
----------------------------------------------------------------------------------------------------------------
         | Computes eps(kappa) for p=0    | kappa_min <-- 0        | eps.dat : kappa, eps
PF_eps   | using two points e=0 and e=de  | kappa_max <-- kappa    | err.dat : kappa, error
         | Here eps is given by Eq.(50a)  | p         <-- p_min    |           error of equilibrium
         |                                | de        <-- e_max    |           error = sqrt[ Sum(Eq.39)^2 / N ]
----------------------------------------------------------------------------------------------------------------
         | Computes Young's modulus       | kappa_min <-- 0        | eta.dat : kappa, h*eta/k 
PF_eta   | eta(kappa) at e=0 using two    | kappa_max <-- kappa    |                             
         | pressure points p=0 and p=dp   | dp        <-- p_max    |           h*eta/k := dp / [1 - L(dp)/L(0)]
         |                                | e         <-- 0        |                                           
----------------------------------------------------------------------------------------------------------------
         | Computes the coefficients      | kappa_min <-- 0        | alpha.dat : kappa, alpha
PF_alpha | alpha(kappa) [Eq.(11)]         | kappa_max <-- kappa    | mu.dat    : kappa, k*mu/h                            
         | k*mu(kappa)/h [mu=alpha/eta]   | dp        <-- p_max    |          
         | using four points              | de        <-- e_max    |                                           
         | e=0, e=de, p=0, p=dp           |                        |
----------------------------------------------------------------------------------------------------------------
         | Computes the lattice sums      | L         <-- L        | S.dat : x, S_s(x), S_d(x)
LS       | Sigma_s(x), Sigma_d(x) [Eq.21] | x_min     <-- x_min    |
         | and then determines            | x_max     <-- x_max    |
         | S_d(x), S_s(x) [Eq.23]         | x_c       <-- x_c      | --- Critical value of x above which
         |                                |                        |     computations are done in quad precision
---------------------------------------------------------------------------------------------------------------- 

3. Parameter File PF.par
========================

Programs that start with PF utilize the parameter file PF.par. An
example of PF.par is given below:

150                   ! N            [Number of atom layers]
0.1                   ! kappa        [q^2/kh^3 -- parameter of the model]
0.0,0.2               ! e_min, e_max [e = E*h^2/q where E is external electric field]
0.0,0.0               ! p_min, p_max [p = P*h/k   where P is external pressure]
101                   ! Np           [Number of points for various scans]
1.0d-8, 1.0d-8        ! th1, th2     [Small constants for the stop conditions]
0                     ! iter_max     [Give 0 for iter_max=1e9 (almost unlimited)]
F                     ! Vocal        [T or F, controls run-time behavior]

Some parameters are interpreted differently by different
codes. Detailed description of all parameters is as follows.


N           -- Number of atom layers


kappa       -- Main dimensionless parameter of the model, kappa = q^2 / k * h^3
               Interpreted as follows:

               	   PF_u, PF_eps, PF_eta, PF_alpha : 
	             kappa is scanned from kappa_min=0 to kappa_max where the value of 
              	     kappa_max is taken from the parameter kappa. Np points in kappa are 
              	     sampled as determined by the parameter Np

               	   PF_p, PF_e :
	       	     kappa is fixed to the value determined by the parameter


e_min, e_max -- Min and max values of the dimensionless field [e = E_ext*h^2/q]
                Interpreted as follows:

                   PF_u, PF_p, PF_eta :
		      Not used

		   PF_e :
		      e is scanned from e_min to e_max using Np equidistant sampling 
                      points as determined by the parameter Np

		   PF_eps :
                      eps(kappa) for each value of kappa is computed using two points e=0 and e=de,
		      where de is determined by the parameter e_max (should be <<1); e_min is not used. 

		   PF_alpha :
		      For each kappa, coefficients alpha(kappa) and mu(kappa) are computed using eps(kappa,p) 
                      at two values of the pressure, p=0 and p=dp. At each kappa and p, eps(kappa,p) is computed
                      using two points e=0 and e=de where de is determined by the parameter e_max (should be <<1); 
                      e_min is not used. 


p_min, p_max -- Min and Max values of dimensionless external pressure [p = P*h/k]
                Interpreted differently:

		   PF_u, PF_eps : 
                      Not used

		   PF_P : 
                      p is scanned from p_min to p_max using Np equidistant points (determined by Np)

		   PF_e : 
                      Computations are done at a fixed pressure p determined from the parameter p_min;
                      p_max is not used

		   PF_eta, PF_alpha :
		      Computations utilize two values of pressure, p=0 and p=dp, where dp is determined
                      by the parameter p_max; p_min is not used


Np           -- Number of points for sampling. Can be sampling of kappa, e or p

th1, th2     -- Small constants used in the stop conditions for the iterations. Codes PF_u, PF_p, PF_eta
                are based on Algorithm 1 and utilize only th1. Codes PF_e, PF_eps and PG_alpha are
                based on Algorithm 2 and utilize both constants. In these cases, th1 is used in the inner loop 
                for a fixed atom layer and th2 in an external loop over all layers. Typically, these constants 
                should be the same. Selecting th1=th2=1.0d-8 will suffice under most circumstances. Accurate 
                determination of the strain as a function of applied field e, i.e., 1-L(e)/L(0) for small e,
                may require the use of th1=th2=1.0d-12, as is implemented in the demo directory. Generally,
                execution ls faster when th1, th2 are larger.


iter_max     -- Maximum number of allowed iterations in each loop whose other exit condition is that some error 
                is <= th1 or th2. Specifying iter_max=0 will set iter_max = 1e9. Selecting a positive value
                for iter_max will assign this value to iter_max. Selecting a finite and not very large value
                to iter_max is useful for scans over kappa since, for some values of kappa and p, equilibrium 
                may not exist and the loop exit condition based on equilibrium error will not be met. Exiting  
                due to exceeding the maximum number of iterations does not result in errors since the computations
                are assumed to fail in such cases and no data are written to the output files.

Vocal        -- If T (true), prints some extra information to the screen. Should be suppressed under normal 
                use by selecting F (false).



4. Parameter File LS.par
========================

Program LS utilizes the parameter file LS.par. An example of LS.par is
given below:

0.01, 5.0, 6.0   ! x_min, x_max, x_c
101              ! Np
100000           ! L
T                ! Vocal


x_min, x_max, x_c -- Min, Max and critical values of the parameter x over which the functions S_s(x), S_d(x)
                     are sampled. The argument x takes Np equidistant points from x_min to x_max inclusively.
                     If x < x_c, double precision is used for summation. If x >= x_c, quadruple precision is used.

Np                -- Number of points in which S_s(x) and S_d(x) are sampled

L                 -- Truncation order for summation. Series in Eq.(21) are truncated by the condition |n|,|m| <= L

Vocal             -- If T (true), will display information about work-sharing loads for individual threads. This is only 
                     needed for fine-tuning and troubleshooting. In most cases, should be set to F (false)

5. Application Notes for LS
===========================

5.1. The code computes truncated series Sigma_s(x), Sigma_d(x) defined in
Eq.(21) and then computes the functions S_s(x), S_d(x) defined in terms
of the former two functions in Eq.(23). 

5.2. The larger is x, the more difficult it is to compute S_s(x) and
S_d(x) with good relative precision. To be sure, these functions are
small at x>2 and numerical sums may oscillate near zero at the level
of 1e-4 or less. However, the two functions, if computed precisely,
remain positive and go to zero exponentially with x. To capture this
behavior numerically for x>2, large values of L and high precision of
summation are needed.

5.3. For large values of L, summation in double precision may not be
enough. To overcome this problem, quadruple precision can be used when
necessary. The code uses quadruple precision for all x >= x_c where
x_c is determined from the parameter file.

5.4. However, quadruple precision can be much slower than double
precision. A factor of 15 is typical. Therefore, quadruple precision
should be used for verification and in extreme cases. All figures in
the paper can be computed accurately using double precision, but have
been verified in quadruple precision.

5.5. If quadruple precision is used, parallelization becomes
essential. Recomputing Fig.4 of the Paper in quadruple precision in a
single thread can easily take two weeks.

5.6. Luckily, the code is effectively parallelized using OpenMP
directives and will scale almost linearly with the number of physical
threads (do not use logical threads). To use parallelization, set the
environmental variable OMP_NUM_THREADS to the number of available
physical threads. The run time will scale accordingly.