From: Robert.Corbett@Eng.Sun.COM (Robert Corbett)
Subject: Re: is lex useful?
Date: 30 Jun 1996
In article <96-06-129@comp.compilers>,
>[Yup, that's what I said. Fortran needs a multi-pass lexer to correctly
>recognize that REAL*4HELLO doesn't contain the string constant 'ELLO'. -John]
A poor example. A lexer can recognize this case in a single
left-to-right scan with one character lookahead. The sequence
letter+ * digit+
at the start of a statement can be followed only by an identifier.
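The rule above can be sketched as a single left-to-right scan. This is only an illustration, not code from the thread: the function names are hypothetical, the input is assumed to be upper case with any statement label and insignificant blanks already removed, and quoted character constants are not handled.

```python
def type_length_prefix_end(text: str) -> int:
    """Return the index just past a leading letter+ '*' digit+ run, or 0 if absent."""
    i = 0
    while i < len(text) and text[i].isalpha():
        i += 1
    if i == 0 or i >= len(text) or text[i] != "*":
        return 0
    j = i + 1
    while j < len(text) and text[j].isdigit():
        j += 1
    return j if j > i + 1 else 0


def first_hollerith(text: str):
    """Return (start, chars) for the first Hollerith constant nHxxx, or None."""
    # Skip a prefix such as REAL*4: only an identifier can follow it,
    # so the 4H in REAL*4HELLO must not be read as a Hollerith count.
    i = type_length_prefix_end(text)
    while i < len(text):
        if not text[i].isdigit():
            i += 1
            continue
        j = i
        while j < len(text) and text[j].isdigit():
            j += 1
        # Digits that continue a name (as in A4H) are part of an identifier.
        inside_name = i > 0 and text[i - 1].isalnum()
        if j < len(text) and text[j] == "H" and not inside_name:
            count = int(text[i:j])
            return (i, text[j + 1:j + 1 + count])
        i = j
    return None
```

For example, `first_hollerith("REAL*4HELLO")` finds nothing, while `first_hollerith("X=4HELLO")` reports the constant containing `ELLO`.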
A better example is
DO10I = expr1, expr2
Since the length of expr1 is bounded only by the number of characters
allowed in a statement, either a multipass lexer or practically
unbounded lookahead is needed.
Because Fortran limits the maximum size of a statement, a lexer for
Fortran can analyze any Fortran statement in constant time.
Sincerely,
Bob Corbett
[Right, thanks for the correction. In the DO10I example, note that just
looking ahead for a comma isn't sufficient. You have to look for a comma
not enclosed in parens, which lex can't do, because REs can't count. -John]
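The counting that a regular expression cannot do takes only a depth counter in hand-written code. The sketch below is an illustration under stated assumptions: the names `top_level_positions` and `looks_like_do` are hypothetical, and the statement text is assumed to be upper case with blanks removed and no character or Hollerith constants in it.

```python
def top_level_positions(text: str, target: str):
    """Yield the indexes of target that lie outside every pair of parentheses."""
    depth = 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == target and depth == 0:
            yield i


def looks_like_do(text: str) -> bool:
    """True if text starts with DO and has a top-level '=' followed by a top-level ','."""
    if not text.startswith("DO"):
        return False
    eq = next(top_level_positions(text, "="), None)
    if eq is None:
        return False
    # The '=' sits at depth 0, so the remainder can be scanned from depth 0 again.
    return next(top_level_positions(text[eq + 1:], ","), None) is not None
```

With this check, `DO10I=1,10` is taken as a DO statement, while `DO10I=1.10` and `DO10I=F(1,2)` are taken as assignments to a variable named `DO10I`.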
From: Robert.Corbett@Eng.Sun.COM (Robert Corbett)
Subject: Re: Is Fortran90 LL(1)?
Date: 18 Apr 1996
>[To parse Fortran, you have to tell whether a statement is an
>assignment (or statement function) or something else. First, if you
>accept the old 3Hfoo Hollerith constants, you strip them out, being
>careful not to be confused by REAL*4HELLO. Then you look for an equal
>sign not protected by parentheses, and not followed by a comma which
>also must not be protected by parentheses. If you find the equal
>sign, and no comma, it's an assignment or a statement function. If
>not, it's something else. Once you've decided that, the lexing and
>parsing are pretty straightforward, with the parser at each stage
>having to tell the lexer what kind of tokens to look for. See my
>sample Fortran subset parser in the archives for an example of all
>this nonsense. -John]
There are still some tricky points in writing a Fortran grammar.
One problem is distinguishing a complex constant from an implied
DO-loop in a WRITE statement.
WRITE (*, *) (1.0, 0.0)
and
WRITE (*, *) (1.0, I = 1, 10)
look quite similar up to the second comma. A common way of dealing
with this case is to relax the restriction that the first part of a
complex constant must be a real or integer constant. Semantic routines
can report the error later if desired.
Sincerely,
Bob Corbett
[Yeah, in my compiler I allowed (exp,exp) as a general complex constructor,
and enforced the constant restriction semantically in places where I had to,
e.g. data statements. -John]
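One way to picture the relaxed approach is to classify a parenthesized item by shape alone and leave the "must be constants" rule to a separate check. This is a rough sketch, not the code of either compiler mentioned here: the function names and the `NUMERIC_CONSTANT` pattern are assumptions, and the item is assumed to be upper case with blanks removed.

```python
import re

# Simplified pattern for an integer or real constant (an assumption for this sketch).
NUMERIC_CONSTANT = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([ED][+-]?\d+)?")


def split_top_level(text: str, sep: str = ","):
    """Split text at each sep that is not enclosed in parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == sep and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return parts


def classify_parenthesized(item: str) -> str:
    """Classify an item such as '(1.0,0.0)' without checking that its parts are constants."""
    parts = split_top_level(item[1:-1])
    if any(len(split_top_level(p, "=")) > 1 for p in parts):
        return "implied-do"            # a top-level '=' marks the loop control
    if len(parts) == 2:
        return "complex-constructor"   # accepted as (exp,exp), as in the relaxed grammar
    return "other"


def is_complex_constant(item: str) -> bool:
    """Semantic check, applied only where a true constant is required."""
    parts = split_top_level(item[1:-1])
    return len(parts) == 2 and all(NUMERIC_CONSTANT.fullmatch(p) for p in parts)
```

Here `(1.0,0.0)` and `(X,Y)` are both classified as complex constructors, and only the first passes `is_complex_constant`; `(1.0,I=1,10)` is classified as an implied DO.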