

Debugging and Interfacing

GNU Fortran currently generates code that is object-compatible with the f2c converter. Also, it avoids limitations in the current GBE, such as the inability to generate a procedure with multiple entry points, by generating code that is structured differently (in terms of procedure names, scopes, arguments, and so on) than might be expected.

As a result, writing code in other languages that calls on, is called by, or shares in-memory data with g77-compiled code generally requires some understanding of the way g77 compiles code for various constructs.

Similarly, using a debugger to debug g77-compiled code, even if that debugger supports native Fortran debugging, generally requires this sort of information.

This section describes some of the basic information on how g77 compiles code for constructs involving interfaces to other languages and to debuggers.

Caution: Much or all of this information pertains to only the current release of g77, sometimes even to using certain compiler options with g77 (such as `-fno-f2c'). Do not write code that depends on this information without clearly marking said code as nonportable and subject to review for every new release of g77. This information is provided primarily to make debugging of code generated by this particular release of g77 easier for the user, and partly to make writing (generally nonportable) interface code easier. Both of these activities require tracking changes in new versions of g77 as they are installed, because new versions can change the behaviors described in this section.

Names

Fortran permits each implementation to decide how names are represented in other contexts, such as in debuggers and when interfacing to other languages, and especially how casing is handled.

External names--names of entities that are public, or "accessible", to all modules in a program--normally have an underscore (`_') appended by g77, to generate code that is compatible with f2c. External names include names of Fortran things like common blocks, external procedures (subroutines and functions, but not including statement functions, which are internal procedures), and entry point names.

However, use of the `-fno-underscoring' option disables this kind of transformation of external names (though inhibiting the transformation certainly increases the chances of colliding with incompatible externals written in other languages--but that might be intentional).

When `-funderscoring' is in force, any name (external or local) that already has at least one underscore in it is implemented by g77 by appending two underscores. External names are changed this way for f2c compatibility. Local names are changed this way to avoid collisions with external names that are different in the source code--f2c does the same thing, but there's no compatibility issue there except for user expectations while debugging.

For example:

Max_Cost = 0

Here, a user would, in the debugger, refer to this variable using the name `max_cost__' (or `MAX_COST__' or `Max_Cost__', as described below). (We hope to improve g77 in this regard in the future--don't write scripts depending on this behavior! Also, consider experimenting with the `-fno-underscoring' option to try out debugging without having to massage names by hand like this.)

g77 provides a number of command-line options that allow the user to control how case mapping is handled for source files. The default is the traditional UNIX model for Fortran compilers--names are mapped to lower case. Other command-line options can be specified to map names to upper case, or to leave them exactly as written in the source file.

For example:

Foo = 3.14159

Here, it is normally the case that the variable assigned will be named `foo'. This would be the name to enter when using a debugger to access the variable.

However, depending on the command-line options specified, the name implemented by g77 might instead be `FOO' or even `Foo', thus affecting how debugging is done.

Also:

Call Foo

This would normally call a procedure that, if it were in a separate C program, would be defined starting with the line:

void foo_()

However, g77 command-line options could be used to change the casing of names, resulting in the name `FOO_' or `Foo_' being given to the procedure instead of `foo_', and the `-fno-underscoring' option could be used to inhibit the appending of the underscore to the name.
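The naming rules above can be sketched as a small C helper. This is only an illustration: `g77_visible_name' is a hypothetical name, not part of g77 or libf2c, and the sketch assumes the default lower-case mapping with `-funderscoring' in force.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of g77 or libf2c.  Writes into `out' the
   name a debugger or C caller would use for a Fortran name, assuming the
   default lower-case mapping and -funderscoring.
   Returns 0 on success, -1 if `out' is too small. */
int g77_visible_name(const char *fortran_name, int is_external,
                     char *out, size_t out_size)
{
    size_t len = strlen(fortran_name);
    /* A name that already contains an underscore gets two appended,
       whether it is external or local; otherwise only external names
       get a single underscore. */
    size_t extra = strchr(fortran_name, '_') != NULL ? 2
                 : (is_external ? 1 : 0);
    size_t i;

    if (len + extra + 1 > out_size)
        return -1;
    for (i = 0; i < len; i++)
        out[i] = (char)tolower((unsigned char)fortran_name[i]);
    for (i = 0; i < extra; i++)
        out[len + i] = '_';
    out[len + extra] = '\0';
    return 0;
}
```

With these assumptions, the local variable `Max_Cost' maps to `max_cost__', the external procedure `Foo' maps to `foo_', and a local variable `Foo' stays `foo'.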

Main Program Unit (PROGRAM)

When g77 compiles a main program unit, it gives it the public procedure name `MAIN__'. The libf2c library has the actual `main()' procedure as is typical of C-based environments, and it is this procedure that performs some initial start-up activity and then calls `MAIN__'.

Generally, g77 and libf2c are designed so that you need not include a main program unit written in Fortran in your program--it can be written in C or some other language. This is especially true for I/O handling, although g77-0.5.16 includes a bug fix for libf2c that solved a problem with using the `OPEN' statement as the first Fortran I/O activity in a program without a Fortran main program unit.

However, if you don't intend to use g77 (or f2c) to compile your main program unit--that is, if you intend to compile a `main()' procedure using some other language--you should carefully examine the code for `main()' in libf2c, found in the source file `gcc/f/runtime/libF77/main.c', to see what kinds of things might need to be done by your `main()' in order to provide the Fortran environment your Fortran code is expecting.

For example, libf2c's `main()' sets up the information used by the `IARGC' and `GETARG' intrinsics. Bypassing libf2c's `main()' without providing a substitute for this activity would mean that invoking `IARGC' and `GETARG' would produce undefined results.

When debugging, one implication of `main()' being in libf2c is that the debugged program "starts", from the debugger's point of view, at a point you won't recognize as your Fortran code.

The standard way to get around this problem is to set a break point (a one-time, or temporary, break point will do) at the entrance to `MAIN__', and then run the program.

After doing this, the debugger will see the current execution point of the program as at the beginning of the main program unit of your program.

Of course, if you really want to set a break point at some other place in your program and just start the program running, without first breaking at `MAIN__', that should work fine.

Arrays (DIMENSION)

Fortran uses "column-major ordering" in its arrays. This differs from other languages, such as C, which use "row-major ordering". The difference is that, with Fortran, array elements adjacent to each other in memory differ in the first subscript instead of the last; `A(5,10,20)' immediately follows `A(4,10,20)', whereas with row-major ordering it would follow `A(5,10,19)'.

This consideration affects not only interfacing with and debugging Fortran code; it can also greatly affect how code is designed and written, especially when code speed and size are a concern.

Fortran also differs from C, a popular language both for interfacing and for direct support in debuggers, in the way arrays are treated. In C, arrays are single-dimensional and have interesting relationships to pointers, neither of which is true for Fortran. As a result, dealing with Fortran arrays from within an environment limited to C concepts can be challenging.

For example, accessing the array element `A(5,10,20)' is easy enough in Fortran (use `A(5,10,20)'), but in C some difficult machinations are needed. First, C would treat the A array as a single-dimension array. Second, C does not understand low bounds for arrays as does Fortran. Third, C assumes a low bound of zero (0), while Fortran defaults to a low bound of one (1) and can support an arbitrary low bound. Therefore, calculations must be done to determine what the C equivalent of `A(5,10,20)' would be, and these calculations require knowing the dimensions of `A'.

For `DIMENSION A(2:11,21,0:29)', the calculation of the offset of `A(5,10,20)' would be:

  (5-2)
+ (10-1)*(11-2+1)
+ (20-0)*(11-2+1)*(21-1+1)
= 4293

So the C equivalent in this case would be `a[4293]'.
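The same calculation can be written as a short C routine. This is a minimal sketch: `struct f_dim' and `f_offset' are hypothetical names introduced here, and the routine simply walks the dimensions in column-major order, first subscript varying fastest.

```c
#include <stddef.h>

/* One Fortran dimension, as in `DIMENSION A(lo:hi)'.  Hypothetical type. */
struct f_dim {
    long lo;
    long hi;
};

/* Hypothetical helper: element offset (in elements, not bytes) of the
   Fortran element with subscripts subs[0..rank-1] from the first element
   of the array, using column-major ordering. */
long f_offset(const struct f_dim *dims, const long *subs, size_t rank)
{
    long offset = 0;
    long stride = 1;   /* elements spanned by one step in dimension i */
    size_t i;

    for (i = 0; i < rank; i++) {
        offset += (subs[i] - dims[i].lo) * stride;
        stride *= dims[i].hi - dims[i].lo + 1;   /* extent of dimension i */
    }
    return offset;
}
```

For `DIMENSION A(2:11,21,0:29)', passing the bounds `{2,11}', `{1,21}', `{0,29}' and the subscripts `{5,10,20}' reproduces the offset 4293 computed by hand above.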

When using a debugger directly on Fortran code, the C equivalent might not work, because some debuggers cannot understand the notion of low bounds other than zero. However, unlike f2c, g77 does inform the GBE that a multi-dimensional array (like `A' in the above example) is really multi-dimensional, rather than a single-dimensional array, so at least the dimensionality of the array is preserved.

Debuggers that understand Fortran should have no trouble with non-zero low bounds, but for non-Fortran debuggers, especially C debuggers, the above example might have a C equivalent of `a[4305]'. This calculation is arrived at by eliminating the subtraction of the lower bound in the first parenthesized expression on each line--that is, for `(5-2)' substitute `(5)', for `(10-1)' substitute `(10)', and for `(20-0)' substitute `(20)'. Actually, the implication of this can be that the expression `*(&a[2][1][0] + 4293)' works fine, but that `a[20][10][5]' produces the equivalent of `*(&a[0][0][0] + 4305)' because of the missing lower bounds.

Come to think of it, perhaps the behavior is due to the debugger internally compensating for the lower bounds by offsetting the base address of `a', leaving `&a' set lower, in this case, than `&a[2][1][0]' (the address of its first element as identified by subscripts equal to the corresponding lower bounds).
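If that guess is right (and it is only a guess), the two offsets should differ by exactly the contribution of the lower bounds. The following sketch checks the arithmetic using integer offsets only. The names `struct f_bounds', `f_zero_based_offset', and `f_origin_bias' are hypothetical, and nothing here reflects what any particular debugger actually does.

```c
#include <stddef.h>

/* One Fortran dimension, as in `DIMENSION A(lo:hi)'.  Hypothetical type. */
struct f_bounds {
    long lo;
    long hi;
};

/* Offset computed as if every lower bound were zero (subscripts used
   directly), which is what a C-minded debugger might do. */
long f_zero_based_offset(const struct f_bounds *dims, const long *subs,
                         size_t rank)
{
    long offset = 0, stride = 1;
    size_t i;

    for (i = 0; i < rank; i++) {
        offset += subs[i] * stride;
        stride *= dims[i].hi - dims[i].lo + 1;
    }
    return offset;
}

/* Number of elements by which the base address would have to be lowered
   to compensate for the lower bounds. */
long f_origin_bias(const struct f_bounds *dims, size_t rank)
{
    long bias = 0, stride = 1;
    size_t i;

    for (i = 0; i < rank; i++) {
        bias += dims[i].lo * stride;
        stride *= dims[i].hi - dims[i].lo + 1;
    }
    return bias;
}
```

For `DIMENSION A(2:11,21,0:29)' the bias is 2 + 1*10 + 0*210 = 12 elements, and 4305 - 12 gives back 4293.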

You know, maybe nobody really needs to use arrays.

Procedures (SUBROUTINE and FUNCTION)

Procedures that accept `CHARACTER' arguments are implemented by g77 so that each `CHARACTER' argument has two actual arguments.

The first argument occupies the expected position in the argument list and has the user-specified name. This argument is a pointer to an array of characters, passed by the caller.

The second argument is appended to the end of the user-specified calling sequence and is named `__g77_length_x', where x is the user-specified name. This argument is of the C type `ftnlen' (see `gcc/f/runtime/f2c.h.in' for information on that type) and is the number of characters the caller has allocated in the array pointed to by the first argument.

A procedure will ignore the length argument if `X' is not declared `CHARACTER*(*)', because for other declarations, it knows the length. Not all callers necessarily "know" this, however, which is why they all pass the extra argument.

The contents of the `CHARACTER' argument are specified by the address passed in the first argument (named after it). The procedure can read or write these contents as appropriate.

When more than one `CHARACTER' argument is present in the argument list, the length arguments are appended in the order the original arguments appear. So `CALL FOO('HI','THERE')' is implemented in C as `foo("hi","there",2,5);', ignoring the fact that g77 does not provide the trailing null bytes on the constant strings (f2c does provide them, but they are unnecessary in a Fortran environment, and you should not expect them to be there).
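A C routine meant to be called this way might look like the following sketch. Several things here are assumptions: `ftnlen' is typedef'd to `long' only for illustration (check your `f2c.h' for the real definition), `fchar_to_cstr' and `foo_last_call' are hypothetical names, and the pointers are declared `const' only because this stand-in never writes through them.

```c
#include <stddef.h>
#include <string.h>

typedef long ftnlen;   /* assumption; the real definition is in f2c.h */

/* Hypothetical helper: copy a Fortran CHARACTER argument (no trailing
   null, possibly blank padded) into a null-terminated C string, dropping
   trailing blanks.  Returns the number of characters copied. */
size_t fchar_to_cstr(const char *src, ftnlen src_len,
                     char *dst, size_t dst_size)
{
    size_t n = src_len > 0 ? (size_t)src_len : 0;

    if (dst_size == 0)
        return 0;
    while (n > 0 && src[n - 1] == ' ')
        n--;
    if (n > dst_size - 1)
        n = dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}

/* Records what the last call to foo_ received, for illustration only. */
char foo_last_call[64];

/* C stand-in for `SUBROUTINE FOO(A, B)' with two CHARACTER*(*) dummies.
   The two lengths come after all the user-specified arguments, in the
   order the CHARACTER arguments appear. */
void foo_(const char *a, const char *b, ftnlen a_len, ftnlen b_len)
{
    char abuf[24], bbuf[24];

    fchar_to_cstr(a, a_len, abuf, sizeof abuf);
    fchar_to_cstr(b, b_len, bbuf, sizeof bbuf);
    strcpy(foo_last_call, abuf);
    strcat(foo_last_call, "/");
    strcat(foo_last_call, bbuf);
}
```

Because the routine relies only on the passed lengths, it behaves the same whether or not the caller happened to supply trailing null bytes.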

Note that the above information applies to `CHARACTER' variables and arrays only. It does not apply to external `CHARACTER' functions or to intrinsic `CHARACTER' functions. That is, no second length argument is passed to `FOO' in this case:

CHARACTER X
EXTERNAL X
CALL FOO(X)

Nor does `FOO' expect such an argument in this case:

SUBROUTINE FOO(X)
CHARACTER X
EXTERNAL X

Because of this implementation detail, if a program has a bug such that there is disagreement as to whether an argument is a procedure, and the type of the argument is `CHARACTER', subtle symptoms might appear.

Adjustable Arrays (DIMENSION)

Adjustable and automatic arrays in Fortran require the implementation (in this case, the g77 compiler) to "memorize" the expressions that dimension the arrays each time the procedure is invoked. This is so that subsequent changes to variables used in those expressions, made during execution of the procedure, do not have any effect on the dimensions of those arrays.

For example:

REAL ARRAY(5)
DATA ARRAY/5*2/
CALL X(ARRAY, 5)
END
SUBROUTINE X(A, N)
DIMENSION A(N)
N = 20
PRINT *, N, A
END

Here, the implementation should, when running the program, print something like:

20   2.  2.  2.  2.  2.

Note that this shows that while the value of `N' was successfully changed, the size of the `A' array remained at 5 elements.

To support this, g77 generates code that executes before any user code (and before the internally generated computed `GOTO' to handle alternate entry points, as described below) that evaluates each (nonconstant) expression in the list of subscripts for an array, and saves the result of each such evaluation to be used when determining the size of the array (instead of re-evaluating the expressions).

So, in the above example, when `X' is first invoked, code is executed that copies the value of `N' to a temporary. And that same temporary serves as the actual high bound for the single dimension of the `A' array (the low bound being the constant 1). Since the user program cannot (legitimately) change the value of the temporary during execution of the procedure, the size of the array remains constant during each invocation.
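In C terms, the effect is roughly as sketched below. This is not the code g77 emits. It assumes `INTEGER' maps to C `int' and `REAL' to `float', and it adds a hypothetical third argument, `TOTAL', that receives the sum of `A' in place of the `PRINT' statement so the result can be inspected from C.

```c
/* Rough C rendering of

       SUBROUTINE X(A, N, TOTAL)
       DIMENSION A(N)
       N = 20
       TOTAL = (sum of all elements of A)
       END

   The TOTAL argument is an addition made for this sketch. */
void x_(float *a, int *n, float *total)
{
    /* Executed before any user code: save the bound expression. */
    const int a_high_bound = *n;
    float sum = 0.0f;
    int k;

    *n = 20;                           /* user code: N = 20 */

    /* The size of A comes from the saved temporary, not from N. */
    for (k = 0; k < a_high_bound; k++)
        sum += a[k];
    *total = sum;
}
```

Called with a 5-element array of 2.0 and `N' equal to 5, the routine leaves `N' set to 20 but still sums only 5 elements.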

For alternate entry points, the code `g77' generates takes into account the possibility that a dummy adjustable array is not actually passed to the actual entry point being invoked at that time. In that case, the public procedure implementing the entry point passes to the master private procedure implementing all the code for the entry points a `NULL' pointer where a pointer to that adjustable array would be expected. The g77-generated code doesn't attempt to evaluate any of the expressions in the subscripts for an array if the pointer to that array is `NULL' at run time in such cases. (Don't depend on this particular implementation by writing code that purposely passes `NULL' pointers where the callee expects adjustable arrays, even if you know the callee won't reference the arrays--nor should you pass `NULL' pointers for any dummy arguments used in calculating the bounds of such arrays or leave undefined any values used for that purpose in COMMON--because the way g77 implements these things might change in the future!)

Alternate Returns (SUBROUTINE and RETURN)

Subroutines with alternate returns (e.g. `SUBROUTINE X(*)' and `CALL X(*50)') are implemented by g77 as functions returning the C `int' type. The actual alternate-return arguments are omitted from the calling sequence. Instead, the caller uses the return value to do a rough equivalent of the Fortran computed-`GOTO' statement, as in `GOTO (50), X()' in the example above (where `X' is quietly declared as an `INTEGER' function), and the callee just returns whatever integer is specified in the `RETURN' statement for the subroutine. (For example, `RETURN 1' is implemented as `X = 1' followed by `RETURN' in C, and `RETURN' by itself is `X = 0' and `RETURN'.)
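A C sketch of both sides of such a call follows. To give the callee something to decide on, the sketch adds an ordinary `FLAG' argument that is not in the example above. `call_x' is a hypothetical caller, and `INTEGER' is assumed to map to C `int'.

```c
/* Callee, roughly

       SUBROUTINE X(FLAG, *)
       IF (FLAG .NE. 0) RETURN 1
       RETURN
       END

   The `*' dummy does not appear in the C argument list. */
int x_(int *flag)
{
    if (*flag != 0)
        return 1;   /* RETURN 1 */
    return 0;       /* plain RETURN */
}

/* Caller, roughly `CALL X(FLAG, *50)'.  Returns 50 if control reached
   the statement labeled 50, or 0 if execution simply continued. */
int call_x(int flag)
{
    switch (x_(&flag)) {    /* stands in for the computed GOTO */
    case 1:
        goto label_50;
    default:
        break;              /* 0 or out of range: fall through */
    }
    return 0;

label_50:
    return 50;
}
```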

Functions (FUNCTION and RETURN)

g77 handles in a special way functions that return the following types:

For `CHARACTER', g77 implements a subroutine (a C function returning `void') with two arguments prepended: `__g77_result', which the caller passes as a pointer to a `char' array expected to hold the return value, and `__g77_length', which the caller passes as an `ftnlen' value specifying the length of the return value as declared in the calling program. For `CHARACTER*(*)', the called function uses `__g77_length' to determine the size of the array that `__g77_result' points to; otherwise, it ignores that argument.

For `COMPLEX' and `DOUBLE COMPLEX', when `-ff2c' is in force, g77 implements a subroutine with one argument prepended: `__g77_result', which the caller passes as a pointer to a variable of the type of the function. The called function writes the return value into this variable instead of returning it as a function value. When `-fno-f2c' is in force, g77 implements a `COMPLEX' function as gcc's `__complex__ float' function, returning the result of the function in the same way as gcc would, and implements a `DOUBLE COMPLEX' function similarly.

For `REAL', when `-ff2c' is in force, g77 implements a function that actually returns `DOUBLE PRECISION' (usually C's `double' type). When `-fno-f2c' is in force, `REAL' functions return `float'.
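Seen from C, with `-ff2c' in force, the three cases might look like the sketch below. The function names are made up for illustration, `ftnlen' is assumed to be `long', and `f2c_complex' is a stand-in for the complex type in `f2c.h'. The parameter names `result' and `result_len' are used instead of `__g77_result' and `__g77_length' because identifiers beginning with two underscores are reserved in C.

```c
#include <string.h>

typedef long ftnlen;                          /* assumption; see f2c.h */
typedef struct { float r, i; } f2c_complex;   /* stand-in for f2c.h's type */

/* CHARACTER*(*) FUNCTION GREET(): the result pointer and length are
   prepended.  The result is blank padded, never null terminated. */
void greet_(char *result, ftnlen result_len)
{
    static const char text[] = "HELLO";
    size_t n = result_len > 0 ? (size_t)result_len : 0;
    size_t m = sizeof text - 1;

    memset(result, ' ', n);
    memcpy(result, text, m < n ? m : n);
}

/* COMPLEX FUNCTION CCONJ(Z) under -ff2c: a result pointer is prepended
   and nothing is returned as a function value. */
void cconj_(f2c_complex *result, const f2c_complex *z)
{
    result->r = z->r;
    result->i = -z->i;
}

/* REAL FUNCTION HALF(X) under -ff2c: the value comes back as a double. */
double half_(const float *x)
{
    return (double)(*x * 0.5f);
}
```

A C caller of `greet_' has to supply its own buffer and decide the length, just as a Fortran caller's declaration would.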

Common Blocks (COMMON)

g77 names and lays out `COMMON' areas the same way f2c does, for compatibility with f2c.

Currently, g77 does not emit any debugging information for items in a `COMMON' area, due to an apparent bug in the GBE.

Moreover, g77 currently implements a `COMMON' area such that its type is an array of the C `char' data type.

So, when debugging, you must know the offset into a `COMMON' area for a particular item in that area, and you have to take into account the appropriate multiplier for the respective sizes of the types (as declared in your code) for the items preceding the item in question as compared to the size of the `char' type.

For example, using default implicit typing, the statement

COMMON I(15), R(20), T

results in a public 144-byte `char' array named `_BLNK__' with `I' placed at `_BLNK__[0]', `R' at `_BLNK__[60]', and `T' at `_BLNK__[140]'. (This is assuming that the target machine for the compilation has 4-byte `INTEGER' and `REAL' types.)
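The offsets can be double-checked from C by mirroring the layout in a structure. This is only a sketch: `struct blank_common' and `common_real_at' are hypothetical names, the structure stands in for the real `_BLNK__' array, and the figures hold only where `int' and `float' are both 4 bytes wide.

```c
#include <stddef.h>
#include <string.h>

/* Mirror of `COMMON I(15), R(20), T' with default implicit typing,
   assuming 4-byte INTEGER (C int) and 4-byte REAL (C float). */
struct blank_common {
    int   i[15];   /* byte offset 0   */
    float r[20];   /* byte offset 60  */
    float t;       /* byte offset 140 */
};

/* Hypothetical helper: read a REAL stored at a byte offset into a COMMON
   area that is visible only as an array of char. */
float common_real_at(const char *area, size_t byte_offset)
{
    float value;

    memcpy(&value, area + byte_offset, sizeof value);
    return value;
}
```

For instance, `R(4)' lives at byte offset 60 + 3*4 = 72, so `common_real_at(area, 72)' would fetch it. In a debugger the same arithmetic applies when indexing the `char' array by hand.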

Local Equivalence Areas (EQUIVALENCE)

g77 treats storage-associated areas involving a `COMMON' block as explained in the section on common blocks.

A local `EQUIVALENCE' area is a collection of variables and arrays connected to each other in any way via `EQUIVALENCE', none of which are listed in a `COMMON' statement.

Currently, g77 does not emit any debugging information for items in a local `EQUIVALENCE' area, due to an apparent bug in the GBE.

Moreover, g77 implements a local `EQUIVALENCE' area such that its type is an array of the C `char' data type.

The name g77 gives this array of `char' type is `__g77_equiv_x', where x is the name of the first item listed in the `EQUIVALENCE' statements for that area that is placed at the beginning (offset 0) of this array.

When debugging, you must therefore access members of `EQUIVALENCE' areas by specifying the appropriate `__g77_equiv_x' array section with the appropriate offset. See the explanation of debugging `COMMON' blocks for info applicable to debugging local `EQUIVALENCE' areas.

(Note: g77 version 0.5.16 fixed a bug in how certain `EQUIVALENCE' cases were handled. The bug caused the debugger to not know the size of the array if any variable or array in the `EQUIVALENCE' was given an initial value via `DATA' or similar.)

Alternate Entry Points (ENTRY)

The GBE does not understand the general concept of alternate entry points as Fortran provides via the `ENTRY' statement. g77 gets around this by using an approach to compiling procedures having at least one `ENTRY' statement that is almost identical to the approach used by f2c. (An alternate approach could be used that would probably generate faster, but larger, code that would also be a bit easier to debug.)

Information on how g77 implements `ENTRY' is provided for those trying to debug such code. The choice of implementation seems unlikely to affect code (compiled in other languages) that interfaces to such code.

g77 compiles exactly one public procedure for the primary entry point of a procedure plus each `ENTRY' point it specifies, as usual. That is, in terms of the public interface, there is no difference between

SUBROUTINE X
END
SUBROUTINE Y
END

and:

SUBROUTINE X
ENTRY Y
END

The difference between the above two cases lies in the code compiled for the `X' and `Y' procedures themselves, plus the fact that, for the second case, an extra internal procedure is compiled.

For every Fortran procedure with at least one `ENTRY' statement, g77 compiles an extra procedure named `__g77_masterfun_x', where x is the name of the primary entry point (which, in the above case, using the standard compiler options, would be `x_' in C).

This extra procedure is compiled as a private procedure--that is, a procedure not accessible by name to separately compiled modules. It contains all the code in the program unit, including the code for the primary entry point plus the code for every `ENTRY' point. (The code for each public procedure is quite short, and explained later.)

The extra procedure has some other interesting characteristics.

The argument list for this procedure is invented by g77. It contains a single integer argument named `__g77_which_entrypoint', passed by value (as in Fortran's `%VAL()' intrinsic), specifying the entry point index--0 for the primary entry point, 1 for the first entry point (the first `ENTRY' statement encountered), 2 for the second entry point, and so on.

It also contains, for functions returning `CHARACTER' and (when `-ff2c' is in effect) `COMPLEX' functions, and for functions returning different types among the `ENTRY' statements (e.g. `REAL FUNCTION R()' containing `ENTRY I()'), an argument named `__g77_result' that is expected at run time to contain a pointer to where to store the result of the entry point. For `CHARACTER' functions, this storage area is an array of the appropriate number of characters; for `COMPLEX' functions, it is the appropriate area for the return type (currently either `COMPLEX' or `DOUBLE COMPLEX'); for multiple-return-type functions, it is a union of all the supported return types (which cannot include `CHARACTER', since combining `CHARACTER' and non-`CHARACTER' return types via `ENTRY' in a single function is not supported by g77).

For `CHARACTER' functions, the `__g77_result' argument is followed by yet another argument named `__g77_length' that, at run time, specifies the caller's expected length of the returned value. Note that only `CHARACTER*(*)' functions and entry points actually make use of this argument, even though it is always passed by all callers of public `CHARACTER' functions (since the caller does not generally know whether such a function is `CHARACTER*(*)' or whether there are any other callers that don't have that information).

The rest of the argument list is the union of all the arguments specified for all the entry points (in their usual forms, e.g. `CHARACTER' arguments have extra length arguments, all appended at the end of this list). This is considered the "master list" of arguments.

The code for this procedure has, before the code for the first executable statement, code much like that for the following Fortran statement:

       GOTO (100000,100001,100002), __g77_which_entrypoint
100000 ...code for primary entry point...
100001 ...code immediately following first ENTRY statement...
100002 ...code immediately following second ENTRY statement...

(Note that invalid Fortran statement labels and variable names are used in the above example to highlight the fact that it represents code generated by the g77 internals, not code to be written by the user.)

It is this code that, when the procedure is called, picks which entry point to start executing.

Getting back to the public procedures (`X' and `Y' in the original example), those procedures are fairly simple. Their interfaces are just like they would be if they were self-contained procedures (without `ENTRY'), of course, since that is what the callers expect. Their code consists of simply calling the private procedure, described above, with the appropriate extra arguments (the entry point index, and perhaps a pointer to a multiple-type-return variable, local to the public procedure, that contains all the supported returnable non-character types). For arguments that are not listed for a given entry point that are listed for other entry points, and therefore that are in the "master list" for the private procedure, null pointers (in C, the `NULL' macro) are passed. Also, for entry points that are part of a multiple-type-returning function, code is compiled after the call of the private procedure to extract from the multi-type union the appropriate result, depending on the type of the entry point in question, returning that result to the original caller.
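Put together, the private procedure and its public wrappers have roughly the following shape in C. This is a sketch for a made-up program unit in which `X' takes a `REAL' argument `A' and the entry point `Y' takes a `REAL' argument `B'. The name `masterfun_x' stands in for `__g77_masterfun_x', and a `switch' stands in for the internally generated computed `GOTO'.

```c
#include <stddef.h>

/* Sketch for

       SUBROUTINE X(A)
       A = 1.0
       RETURN
       ENTRY Y(B)
       B = 2.0
       END

   The private procedure takes the entry point index followed by the
   "master list" of arguments (here A and B). */
static void masterfun_x(int which_entrypoint, float *a, float *b)
{
    switch (which_entrypoint) {   /* picks where execution starts */
    case 0: goto entry_x;
    case 1: goto entry_y;
    default: return;
    }

entry_x:
    *a = 1.0f;
    return;

entry_y:
    *b = 2.0f;
}

/* Public procedures: same interfaces as self-contained subroutines.
   Arguments not listed for an entry point are passed as NULL. */
void x_(float *a) { masterfun_x(0, a, NULL); }
void y_(float *b) { masterfun_x(1, NULL, b); }
```

The `NULL' pointers are harmless here only because each path through the private procedure touches just the arguments of its own entry point.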

When debugging a procedure containing alternate entry points, you can either set a break point on the public procedure itself (e.g. a break point on `X' or `Y') or on the private procedure that contains most of the pertinent code (e.g. `__g77_masterfun_x'). If you do the former, you should use the debugger's command to "step into" the called procedure to get to the actual code; with the latter approach, the break point leaves you right at the actual code, skipping over the public entry point and its call to the private procedure (unless you have set a break point there as well, of course).

Further, the list of dummy arguments that is visible when the private procedure is active is going to be the expanded version of the list for whichever particular entry point is active, as explained above, and the way in which return values are handled might well be different from how they would be handled for an equivalent single-entry function.
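The multiple-return-type case mentioned earlier (`REAL FUNCTION R()' containing `ENTRY I()') can be sketched the same way. The names `union r_result' and `masterfun_r' are hypothetical, `INTEGER' is assumed to map to C `int', and `r_' returns `double' on the assumption that `-ff2c' is in force.

```c
/* Union of all the supported return types for this program unit. */
union r_result {
    float r;   /* result of R */
    int   i;   /* result of I */
};

/* Sketch of the private procedure for

       REAL FUNCTION R()
       R = 1.5
       RETURN
       ENTRY I()
       I = 7
       END  */
static void masterfun_r(int which_entrypoint, union r_result *result)
{
    if (which_entrypoint == 0)
        result->r = 1.5f;   /* R = 1.5 */
    else
        result->i = 7;      /* I = 7 */
}

/* Public procedures: call the private procedure, then extract the
   member that matches the type of the entry point. */
double r_(void)
{
    union r_result result;

    masterfun_r(0, &result);
    return (double)result.r;
}

int i_(void)
{
    union r_result result;

    masterfun_r(1, &result);
    return result.i;
}
```

When stopped inside the private procedure, a debugger would show the result as a member of the union rather than as an ordinary function result.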

Assigned Statement Labels (ASSIGN and GOTO)

For portability to machines where a pointer (such as to a label, which is how g77 implements `ASSIGN' and its cousin, the assigned `GOTO') is wider (bitwise) than an `INTEGER', g77 does not necessarily use the same memory location to hold the `ASSIGN'ed value of a variable as it does the numerical value in that variable, unless the variable is wide enough (can hold enough bits).

In particular, while g77 implements

I = 10

as, in C notation, `i = 10;', it might implement

ASSIGN 10 TO I

as, in GNU's extended C notation (for the label syntax), `__g77_ASSIGN_I = &&L10;' (where `L10' is just a massaging of the Fortran label `10' to make the syntax C-like; g77 doesn't actually generate the name `L10' or any other name like that, since debuggers cannot access labels anyway).
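The idea can be tried out directly with GNU C's labels-as-values extension (supported by gcc and compatible compilers, but not part of standard C). In this sketch `assign_demo' is a hypothetical function and `g77_assign_i' stands in for `__g77_ASSIGN_I'. It does not show the code g77 generates.

```c
/* Roughly:

       I = 5
       ASSIGN 10 TO I     (or ASSIGN 20 TO I)
       GOTO I

   The label address lives in a separate variable, so the numeric
   value of I is untouched by the ASSIGN. */
int assign_demo(int take_second)
{
    int i = 5;            /* I = 5 */
    void *g77_assign_i;   /* separate home for the ASSIGNed label */

    g77_assign_i = take_second ? &&L20 : &&L10;   /* ASSIGN ... TO I */
    goto *g77_assign_i;                           /* GOTO I */

L10:
    return i + 10;   /* i is still 5 here */
L20:
    return i + 20;
}
```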

While this currently means that an `ASSIGN' statement might not overwrite the numeric contents of its target variable, do not write any code depending on this feature. g77 has already changed this implementation across versions and might do so again in the future. This information is provided only to make debugging Fortran programs compiled with the current version of g77 somewhat easier. If there's no debugger-visible variable named `__g77_ASSIGN_I' in a program unit that does `ASSIGN 10 TO I', that probably means g77 has decided it can store the pointer to the label directly into `I' itself.

(Currently, g77 always chooses to make the separate variable, to improve the likelihood that `-O -Wuninitialized' will diagnose failures to do things like `GOTO I' without `ASSIGN 10 TO I' despite doing `I=5'.)

