PForce Manual Vadim A. Markel vmarkel@upenn.edu =========================== =========================== TABLE OF CONTENTS 0. Introduction 1. Some Notations 2. Brief Summary of Programs 3. Parameter File PF.par 4. Parameter File LS.par 5. Application Notes for LS =========================== =========================== 0. Introduction =============== The codes in this package can be used to generate all data displayed in the figures of the Paper or run custom simulations (within the same model) of the physical properties of the ionic lattice of NaCl. The code LS.exe performs lattice summation in order to compute the electric field at the location of individual ions. This code may be of more general interest. It uses brute force summation. The Paper provides further analytical approximations, which work well in the range of parameters 0 < x <= 2, where x is distance to a sub-layer of ions in units of one lattice unit h (fixed in the lateral directions X and Y) to the point of observation. The codes have been tested only with Intel's compiler ifort. Other compilers may or may not work. The codes use some specific type conversion and quadruple-precision functions, which may be Intel's extensions of the standard language and not available in other compilers. 1. Some Notations ================= The following quantities are dimensionless: kappa = q^2 / k*h^3 -- The main parameter of the model. Lattice is stable for 0 <= kappa <= ~0.25 q -- Absolute value of the charge of an ion k -- Spring constant for the chemical binds h -- Fixed lattice unit in the XY plane p = h * P_ext / k -- Dimensionless external pressure e = h^2 * E_ext / 1 -- Dimensionless applied field (in atomic units) L(e,p) -- Dimensionless width of the slab in units of h L = ( zA(N) - zA(1) + zB(N) - zB(1) ) / (2*h) 2. Brief Summary of Programs ============================ Program | Function | Interpretation of pars | Output, Comments and Definitions ================================================================================================================ PF_u | Computes the lattice unit | kappa_min <-- 0 | u.dat : kappa, u1, u2 : u1 = z(N)-z(1)]/(N-1) | u(kappa) at p=e=0 | kappa_max <-- kappa | : u2 = z(Nh+1)-z(Nh) | | e=p <-- 0 | : Nh = N/2 ---------------------------------------------------------------------------------------------------------------- PF_p | Computes the ratio L(p)/L(0) | p_min <-- p_min | L0.dat : L(p)/L(0) <-- without electrostatic interaction | at a given kappa and e=0 | p_max <-- p_max | L.dat : L(p)/L(0) <-- with electrostatic interaction | | kappa <-- kappa | | | e <-- 0 | ---------------------------------------------------------------------------------------------------------------- | Computes | kappa <-- kappa | L.dat : e, L(p,e), 1-L(p,e)/L(p,0) PF_e | L(e), 1 - L(e)/L(0) | p <-- p_min | eps.dat : e, eps | eps(e) [Eq.(50a)] | e_min <-- e_min | Dip.dat : e, D | D(e) [Eq.(50b)] | e_max <-- e_max | err.dat : e, error of equilibrium | for given kappa and p | | error = sqrt[ Sum(Eq.39)^2 / N ] ---------------------------------------------------------------------------------------------------------------- | Computes eps(kappa) for p=0 | kappa_min <-- 0 | eps.dat : kappa, eps PF_eps | using two points e=0 and e=de | kappa_max <-- kappa | err.dat : kappa, error | Here eps is given by Eq.(50a) | p <-- p_min | error of equilibrium | | de <-- e_max | error = sqrt[ Sum(Eq.39)^2 / N ] ---------------------------------------------------------------------------------------------------------------- | Computes Young's modulus | kappa_min <-- 0 | eta.dat : kappa, h*eta/k PF_eta | eta(kappa) at e=0 using two | kappa_max <-- kappa | | pressure points p=0 and p=dp | dp <-- p_max | h*eta/k := dp / [1 - L(dp)/L(0)] | | e <-- 0 | ---------------------------------------------------------------------------------------------------------------- | Computes the coefficients | kappa_min <-- 0 | alpha.dat : kappa, alpha PF_alpha | alpha(kappa) [Eq.(11)] | kappa_max <-- kappa | mu.dat : kappa, k*mu/h | k*mu(kappa)/h [mu=alpha/eta] | dp <-- p_max | | using four points | de <-- e_max | | e=0, e=de, p=0, p=dp | | ---------------------------------------------------------------------------------------------------------------- | Computes the lattice sums | L <-- L | S.dat : x, S_s(x), S_d(x) LS | Sigma_s(x), Sigma_d(x) [Eq.21] | x_min <-- x_min | | and then determines | x_max <-- x_max | | S_d(x), S_s(x) [Eq.23] | x_c <-- x_c | --- Critical value of x above which | | | computations are done in quad precision ---------------------------------------------------------------------------------------------------------------- 3. Parameter File PF.par ======================== Programs that start with PF utilize the parameter file PF.par. An example of PF.par is given below: 150 ! N [Number of atom layers] 0.1 ! kappa [q^2/kh^3 -- parameter of the model] 0.0,0.2 ! e_min, e_max [e = E*h^2/q where E is external electric field] 0.0,0.0 ! p_min, p_max [p = P*h/k where P is external pressure] 101 ! Np [Number of points for various scans] 1.0d-8, 1.0d-8 ! th1, th2 [Small constants for the stop conditions] 0 ! iter_max [Give 0 for iter_max=1e9 (almost unlimited)] F ! Vocal [T or F, controls run-time behavior] Some parameters are interpreted differently by different codes. Detailed description of all parameters is as follows. N -- Number of atom layers kappa -- Main dimensionless parameter of the model, kappa = q^2 / k * h^3 Interpreted as follows: PF_u, PF_eps, PF_eta, PF_alpha : kappa is scanned from kappa_min=0 to kappa_max where the value of kappa_max is taken from the parameter kappa. Np points in kappa are sampled as determined by the parameter Np PF_p, PF_e : kappa is fixed to the value determined by the parameter e_min, e_max -- Min and max values of the dimensionless field [e = E_ext*h^2/q] Interpreted as follows: PF_u, PF_p, PF_eta : Not used PF_e : e is scanned from e_min to e_max using Np equidistant sampling points as determined by the parameter Np PF_eps : eps(kappa) for each value of kappa is computed using two points e=0 and e=de, where de is determined by the parameter e_max (should be <<1); e_min is not used. PF_alpha : For each kappa, coefficients alpha(kappa) and mu(kappa) are computed using eps(kappa,p) at two values of the pressure, p=0 and p=dp. At each kappa and p, eps(kappa,p) is computed using two points e=0 and e=de where de is determined by the parameter e_max (should be <<1); e_min is not used. p_min, p_max -- Min and Max values of dimensionless external pressure [p = P*h/k] Interpreted differently: PF_u, PF_eps : Not used PF_P : p is scanned from p_min to p_max using Np equidistant points (determined by Np) PF_e : Computations are done at a fixed pressure p determined from the parameter p_min; p_max is not used PF_eta, PF_alpha : Computations utilize two values of pressure, p=0 and p=dp, where dp is determined by the parameter p_max; p_min is not used Np -- Number of points for sampling. Can be sampling of kappa, e or p th1, th2 -- Small constants used in the stop conditions for the iterations. Codes PF_u, PF_p, PF_eta are based on Algorithm 1 and utilize only th1. Codes PF_e, PF_eps and PG_alpha are based on Algorithm 2 and utilize both constants. In these cases, th1 is used in the inner loop for a fixed atom layer and th2 in an external loop over all layers. Typically, these constants should be the same. Selecting th1=th2=1.0d-8 will suffice under most circumstances. Accurate determination of the strain as a function of applied field e, i.e., 1-L(e)/L(0) for small e, may require the use of th1=th2=1.0d-12, as is implemented in the demo directory. Generally, execution ls faster when th1, th2 are larger. iter_max -- Maximum number of allowed iterations in each loop whose other exit condition is that some error is <= th1 or th2. Specifying iter_max=0 will set iter_max = 1e9. Selecting a positive value for iter_max will assign this value to iter_max. Selecting a finite and not very large value to iter_max is useful for scans over kappa since, for some values of kappa and p, equilibrium may not exist and the loop exit condition based on equilibrium error will not be met. Exiting due to exceeding the maximum number of iterations does not result in errors since the computations are assumed to fail in such cases and no data are written to the output files. Vocal -- If T (true), prints some extra information to the screen. Should be suppressed under normal use by selecting F (false). 4. Parameter File LS.par ======================== Program LS utilizes the parameter file LS.par. An example of LS.par is given below: 0.01, 5.0, 6.0 ! x_min, x_max, x_c 101 ! Np 100000 ! L T ! Vocal x_min, x_max, x_c -- Min, Max and critical values of the parameter x over which the functions S_s(x), S_d(x) are sampled. The argument x takes Np equidistant points from x_min to x_max inclusively. If x < x_c, double precision is used for summation. If x >= x_c, quadruple precision is used. Np -- Number of points in which S_s(x) and S_d(x) are sampled L -- Truncation order for summation. Series in Eq.(21) are truncated by the condition |n|,|m| <= L Vocal -- If T (true), will display information about work-sharing loads for individual threads. This is only needed for fine-tuning and troubleshooting. In most cases, should be set to F (false) 5. Application Notes for LS =========================== 5.1. The code computes truncated series Sigma_s(x), Sigma_d(x) defined in Eq.(21) and then computes the functions S_s(x), S_d(x) defined in terms of the former two functions in Eq.(23). 5.2. The larger is x, the more difficult it is to compute S_s(x) and S_d(x) with good relative precision. To be sure, these functions are small at x>2 and numerical sums may oscillate near zero at the level of 1e-4 or less. However, the two functions, if computed precisely, remain positive and go to zero exponentially with x. To capture this behavior numerically for x>2, large values of L and high precision of summation are needed. 5.3. For large values of L, summation in double precision may not be enough. To overcome this problem, quadruple precision can be used when necessary. The code uses quadruple precision for all x >= x_c where x_c is determined from the parameter file. 5.4. However, quadruple precision can be much slower than double precision. A factor of 15 is typical. Therefore, quadruple precision should be used for verification and in extreme cases. All figures in the paper can be computed accurately using double precision, but have been verified in quadruple precision. 5.5. If quadruple precision is used, parallelization becomes essential. Recomputing Fig.4 of the Paper in quadruple precision in a single thread can easily take two weeks. 5.6. Luckily, the code is effectively parallelized using OpenMP directives and will scale almost linearly with the number of physical threads (do not use logical threads). To use parallelization, set the environmental variable OMP_NUM_THREADS to the number of available physical threads. The run time will scale accordingly.