UserGuide RectDisp 1.4 Vadim A. Markel vmarkel@upenn.edu Reference: V.A.Markel, M.Schöbinger and K.Hollaus, A fast method to compute dispersion diagrams of three-dimensional photonic crystals with rectangular geometry, Comp.Phys.Comm. 279, 108441 (2022). https://doi.org/10.1016/j.cpc.2022.108441 CONTENTS 0. Introduction 1. Environment, Language and Compiling, Parallelism 2. Input and Output 3. Parameter File RectDisp.par 4. Defining Permittivities 5. Output File disp.dat 6. Units 7. Application Notes 8. Demo Directory 9. Limitations 0. Introduction =============== 0.1. Codes in this package are provided as is without any implicit or explicit warranty. Execution of the codes depends on several parameters. It is the responsibility of the user to select the correct values for these parameters, verify convergence, etc. 0.2. Codes are free to use to any non-profit organization and individual researcher working for a nonprofit organization. Commercial use is prohibited. 0.3. If these codes are used in any scientific publication, attribution and/or notification of the author would be appreciated. 0.4. This UserGuide is not exhaustive; in case of questions or if you need help, please contact the author at the e-mail above. 0.5. Current limitations: (a) Only propagation in the crystallographic planes XZ or YZ or strictly along Z-axis are currently supported. This means that either of three cases are allowed (see Point 5.2 for definition of parameters fx, fy): (i) fx=fy=0 (ii) fx=/=0 and fy=0 (ii) fx=0 and fy=/=0 (b) Several useful options for the root-searching algorithms are currently not adjustable through parameter selection. (c) Dispersion formulas for the inclusion and host permittivities are not adjustable through parameters. Instead, one can edit the files cfun_i.f and cfun_h.f directly to change the laws of dispersion. The form of these files is very self-explanatory. Considering that there are many ways in which dispersion can be defined, it is too tedious to control through input parameters. An alternative approach that allows full flexibility is to read the permittivities from file by using PERM_i='FILE' or PERM_h='FILE' (see Section 3, description of Parameters [2] and [3]). 1. Environment, Language and Compiling, Parallelism =================================================== 1.1. The codes have been developed and tested under Linux. Codes were not tested under Windows or Mac OS, but are expected to work with proper compilation. 1.2. Codes do not use any external libraries. 1.3. Codes are written in modern Fortran. Compilation was tested using Intel's Fortran (ifort) and Gnu Fortran (gfortran). Intel's compiler is preferred and yields better performance and easier/more efficient parallelization. 1.4. To compile the codes using ifort, simply change to the root directory ~/RectDisp/ and type "make all " or "make RectDisp ". To clean the directory and re-compile from scratch, type "make clean " and then "make all ". 1.5. To compile using gfortran, copy the file Makefile.gnu to Makefile using the command "cp Makefile.gnu Makefile ". Then type "make all ". 1.6. In case of successful compilation, the executable will be placed in ~/RectDisp/bin/. From ~/RectDisp/work/, run the code by typing "../bin/RectDisp.exe " 1.7. If compiled with ifort, the executable will be parallelalized automatically, and the resulting executable will be scalable. However, scaling is much better for large values of Lx, Ly, Lz. For L=8, it is not recommended using more than 2 or 3 threads. For L=64, 16 threads can provide good performance. 1.8 Users can control the number of threads by setting the value of the environmental variable OMP_NUM_THREADS with an appropriate shell command. For example, under csh or tcsh, type something like "setenv OMP_NUM_THREADS 16 " (16 here is an example). The computer should have 16 physical (not logical) threads or else there will be no expected scaling. 1.9. To parallelize under gfortran, uncomment the line "lopt = -fopenmp". Then go to the subdirectory ~/RectDisp/sub/ and edit the file mult_W.f. The user will need to insert appropriate OpenMP directives to parallelize the loops. Scalability was not verified and is not guaranteed with gfortran, and parallelization effectiveness will depend on User's expertise. 2. Input and Output =================== 2.1. The following file is required in the working directory: RectDisp.par. An example of this file is placed in ~/RectDisp/work/. This file contains several parameters to control the problem set-up and execution. The parameters must be readable and in some sense correct. All parameters are checked for sanity at run-time. If some meaningless parameters are entered (example: a negative number of frequency points), the code will report an error and quit. 2.2. If PERM_i='FILE', then the input file eps_i.dat is required in the working directory. If PERM_h='FILE' the input file eps_h.dat is required in the working directory. The files, when required, must have the correct format (see Section 4). 2.3. If PERM_i='DISP', then the file eps_i.dat will be written to the working directory. If PERM_h='DISP', then the file eps_h.dat will be written to the working directory. These files can be used in the future as input files. 2.4. The PC dispersion data are written to the file disp.dat. This is the main result of the codes. 2.5. Convergence data are written to the file conv.dat. These data are for development and debugging and users should not display or interpret the numbers in conv.dat. 3. Parameter file RectDisp.par ============================== 3.1. IMPORTANT: when editing this file: a) Do not use tabs, use single spaces only b) The text after ! symbols are comments. Do not move ! symbols; all actual parameters should fit in the space before the ! symbols. 3.2. The file RectDisp.par contains parameters that will control the problem set-up and execution of the codes. The example file has comments that should be self-explanatory. The following is a more in-depth description. Numbers below correspond to the sequential parameter numbers in ~/RectDisp/work/RectDisp.par. [1] Method : character*8 : The method to solve the nonlinear equation. 'rEPS' -- hybrid method to be used with purely real permittivities 'fixP' -- fixed-point iteration 'ratA' -- rational approximation These options cover all the possibilities implemented so far. However, some spelling variations are allowed; see file ~/RectDisp/inc/values.f for details. [2] PERM_i : character*8 : The method for defining permittivity of inclusions. 'DISP' -- dispersion formula (see Section 4) 'FIXED' -- fixed values read from RectDisp.par 'FILE' -- read values from file perm_i.dat These options cover all the possibilities implemented so far. However, some spelling variations are allowed; see file ~/RectDisp/inc/values.f for details. [3] PERM_h : character*8 : The method for defining permittivity of host. 'DISP' -- dispersion formula (see Section 4) 'FIXED' -- fixed values read from RectDisp.par 'FILE' -- read values from file perm_h.dat These options cover all the possibilities implemented so far. However, some spelling variations are allowed; see file ~/RectDisp/inc/values.f for details. [4] ceps_i_fix : complex*16 : Fixed permittivity of inclusions to be used if PERM_i='FIXED' Enter in the form (2.0, 0.20) where the first and second number are the real and imaginary parts of eps_i [5] ceps_h_fix : complex*16 : Fixed permittivity of the host to be used if PERM_h='FIXED' Enter in the form (2.0, 0.20) where the first and second number are the real and imaginary parts of eps_h [6] Pol : character*8 : Polarization label. Is used to select the polarization mode as described in the paper. Following values can be given but are not always consistent with other parameters (code will let you know if not): 's', 'p', 'x', 'y', 'xz', 'yz' For example, if propagation is in XZ plane, the labels 'xz' and 'p' are equivalent. However, if propagation is strictly along Z, then the labels 's' or 'p' are ambiguous; give 'x' or 'y', which can yield different results in anisotropic PCs. For spelling variations, see ~/RectDisp/inv/values.f [7] hx, hy, hz : real*8 : Dimensions of unit cell in arbitrary units. See Section 6 for more information about units. [8] Lx, Ly, Lz : integer*4 : Size of the reciprocal lattice grid. This is the most important set of parameters controlling convergence and performance. The total number of plane waves used for the basis is (2*Lx+1)(2*Ly+1)(2*Lz+1). If Lx=Ly=Lz=L, increasing L by the factor of 2 will increase the computation time by the factor of 16. Memory requirements will increase by the factor of 8. However, the codes use very little memory and are more likely to become slow before users run out of memory. [9] ax, ay, az : real*8 : Size of inclusion in same units are Parameters hx, hy, hz. IMPORTANT: the inclusion size is (2*ax)*(2*ay)*(2*az). Therefore, it is required that 2*ax <= hx. If 2*ax=hx, the medium is continuous in the X-direction and it is sufficient to set Lx=0; same for other directions. See Point 7.4. [10] rk_min, rk_max : real*8 : Minimum and maximum normalized frequencies for a scan. Here rk = k*hz/pi, where k=omega/c is the free-space wave number. Thus, these quantities are units-independent and dimensionless. Not used if either PERM_i='FILE' or PERM_h='FILE' [11] fkx, fky : real*8 : Dimensionless parameters defining projection of the Bloch wave number onto the XY plane. At any frequency used, we have kx = k*fkx, ky=k*fky where k=omega/c. The Bloch wave vector is of the form (kx, ky, qz). At every frequency considered, kx and ky are defined from the above condition and qz is computed. [12] Np : integer*4 : Number of frequency points. If PERM_i='FILE' or PERM_h='FILE', the actual number of frequency points can be smaller than Np and defined by the number(s) of lines in eps_i.dat and eps_h.dat. But the number of frequency points will never be larger than Np: if any of the above files has more lines, only the first Np lines will be read and used. [13] Nord : integer*4 : Order of continued fraction. Value of 16 is almost always sufficient. Can be set however to a generous value (say, 64) without significant loss of performance. The code will determine if the expansion converges at smaller Nord and will not perform calculations that are not needed. [14] eps : real*8 : Precision with which the equation should be satisfied, i.e., for solving F(qz)=0 the stop condition is |F(qz)|<=eps. Usually the value eps=0.001 or eps=0.0001 is sufficient for good precision. Making eps smaller might not improve the precision of obtained solutions due to other imprecision involved. [15] itm_1, itm_2, itm_3 : integer*4 : Maximum numbers of iterations used in iterative solution of the nonlinear equation. For Method='fixP' or Method='ratA', only itm_1 is used. For Method='rEPS', all three parameters are used. Recommended values are 50,25,25, but, in most cases, this is excessive. The parameters are supplied so that the code does not loop indefinitely in the case when the solution can not be found. [16] Nqth : integer*4 : Minimum distance from the edge of first Brillouin zone (FBZ) in units of 2*pi/hz for which to re-check precision and run additional iterations of solver as needed. If a solution qz is found such that |Re(q)| > pi/hz, the code will translate it to FBZ as qz <-- qz - nq*2*pi/hz where nq = floor[ (Re(qz) + pi/hz) / (2*pi/hz) ] Theoretically, if there is a root at qz, there is also a root at any qz + n*2*pi/hz, where n is an integer. But due to the truncation used in numerical algebra, this property does not hold precisely. Therefore, if |nq| > Nqth, the code will recheck precision and, if necessary, run additional iterations to make sure that the precision goal is met. Nqth=0 --- Precision goal will be enforced for all points in disp.dat. Nqth=1 --- Some points in disp.dat may have not met precision goal. However, the difference with Nqth=0 is most often small and not visually discernible. Faster execution compared to Nqth=0 Nqth=2 --- Even more potential to lose precision but can be not different from Nqth=1 Nqth>2 --- A warning will be issued about possible loss of precision Under many circumstances, Nqth=1 provides the best balance between performance and precision. If some points look off, change to Nqth=0 (but a fix is not guaranteed). Selecting Nqth>2 is not recommended but this option is available. By default, Nqth=1 is set in all parameter files supplied with this package; it is recommended to change this setting only with full understanding of what this entails. 4. Defining Permittivities ========================== 4.1. There are three methods to define the permittivities: a) Fixed values from RectDisp.par b) Dispersion formula coded in subroutines cfun_i.f and cfun_h.f c) Reading values from input file(s) 4.2. The functions coded in cfun_i.f and cfun_h.f can not be tuned by changing parameters. Edit these files directly if you wish to change the dispersion formula. This should be very easy and self-explanatory. 4.3. If either fixed permittivities or dispersion formulas are used for both host and inclusions, then the frequencies for the scan are determined using the parameters rk_min, rk_max and Np from the parameter file RectDisp.par, according to: drk = (rk_max-rk_min)/(Np-1) for i=1,Np : rk = rk_min + drk*(i-1) Here rk = k*hz/pi, where k=omega/c. Thus rk is by definition dimensionless. 4.4. Dispersion formulas in cfun_i.f and cfun_h.f compute complex permittivities eps_i and eps_h as functions of the dimensionless frequency rk. The formula is eps(rk) = 1 + w0*w0*(eps(0)-1) / (w0*w0 - rk*rk - i*g*w0*rk) where w0 is the dimensionless resonance frequency, w0=omega_0*hz/pi, i is the complex unit, eps(0) is the static permittivity (the limit when rk-->0) and g=gamma/omega_0 is the dimensionless parameter specifying the width of the resonance. Values of w0, eps(0) and g are specified in the files cfun_i.f and cfun_h.f for the inclusions and the host separately. All these constants are dimensionless. 4.5. If at least one of the permittivities is read from file, the parameters rk_min and rk_max are not used to determine the frequency samples. Specifically, the following approach is used: a) If PERM_i='FILE' and PERM_h='FILE' (both permittivities are read from files): i) The code will attempt to read Np first lines from both files. If one of the files has Mp". 8.3. Note that functionality of the demo examples relies on several symbolic links. Files to which these links point should exist. All such files are within the directory tree with the root ~/rectDisp/. In particular, the executable RectDisp.exe should exist in ~/RectDisp/bin/ subdirectory. 8.4. Each subdirectory ~/ExampleName/ has a further subdirectory ~/Data/, where pre-computed data and graphics produced from these data are placed. Users can reproduce the data by running the example as explained in Point 8.2. To re-create the graphics from data, type "./cmd " from the ~/Data/ subdirectory. This assumes that the original data files are in place, or user-created data files with the same names are in place. 9. Limitations and Future Development ===================================== 9.1. At higher frequencies, there might exist multiple solutions for qz at a given normalized frequency rk. This usually happens above the second band gap. The code currently finds only one solution for a given frequency and it can therefore "jump" from one branch to another. The resultant dispersion can look noisy or irregular. In future development, this issue will be addressed by applying polynomial approximation to the root problem and finding all solutions qz in the FBZ. 9.2. More general propagation directions or restrictions of the Bloch variety will be implemented in the future.