Issue SHARP-STAR-DELIMITER Writeup

Issue: SHARP-STAR-DELIMITER

Forum: Editorial

References: *READ-SUPPRESS* (p345), #* (p355)

Category: CLARIFICATION

Edit history: 05-Mar-91, Version 1 by Pitman
              15-Mar-91, Version 2 by Pitman

Status: For X3J13 consideration

Problem Description:

What constitutes a delimiter at the end of #* ?

In the description of #* on p355, CLtL says:

``A series of binary digits (0 and 1) preceded by #* is read

as a simple bit-vector. ... If an unsigned decimal integer

appears between the # and *, it specifies explicitly the

length of the vector. In that case, it is an error if too

many bits are specified, and if too few are specified the

last one (it is an error if there are none in this case) is

used to fill all remaining elements of the bit-vector.

... The notation #* denotes an empty bit-vector, as does

#0* (which is legitimate because it is not the case that

too few elements are specified.)''

This seems to imply that the bit vector ends when the sequence

of 0's and 1's ends.
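
The length-prefix rules in the quoted passage can be checked at a listener. The following is a minimal sketch, not part of the original writeup. The helper name READ-BITS is hypothetical, and each input ends in a space so that the result should not depend on which delimiter rule an implementation follows.

```lisp
;; READ-BITS is a hypothetical helper, not part of CLtL or of this issue.
(defun read-bits (text)
  "Read one object from the string TEXT and return it, dropping the position value."
  (values (read-from-string text)))

;; Too few bits: the last bit given fills the remaining elements.
(read-bits "#6*101 ")         ; a 6-element bit-vector: 1 0 1 1 1 1

;; Both notations denote the empty bit-vector.
(length (read-bits "#* "))    ; 0
(length (read-bits "#0* "))   ; 0
```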

In the discussion of *READ-SUPPRESS* on p345, it says:

``The #* construction always scans over a following token and

produces the value NIL. It will not signal an error even if

the token does not consist solely of the characters 0 and 1.''

This seems to imply that the bit vector ends at the first normal

delimiter.
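
One way to see which reading an implementation takes is to ask where a suppressed read stops. This is only a sketch: SUPPRESSED-READ-END is a hypothetical name, and the index it returns is implementation-dependent, which is exactly the point of the issue.

```lisp
;; SUPPRESSED-READ-END is a hypothetical helper for illustration only.
(defun suppressed-read-end (text)
  "Read one object from TEXT with *READ-SUPPRESS* true.
Return the index in TEXT at which the reader stopped."
  (let ((*read-suppress* t))
    ;; READ-FROM-STRING returns the object and, as a second value,
    ;; the index of the first character that was not read.
    (nth-value 1 (read-from-string text))))

;; No expected value is given: an index just past "#*01" suggests the
;; reader stopped at the first non-bit character, while an index at or
;; past the space suggests it scanned a whole token.
(suppressed-read-end "#*012 3")
```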

Proposal (SHARP-STAR-DELIMITER:NORMAL-DELIMITER):

Specify that a bit vector is delimited like any normal numeric

token, and that an error of type READER-ERROR is signalled if any

character in the token is not a 0 or 1. Clarify

that | and \ are not permitted as part of the token.
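
The rule can be modelled on plain strings, without touching the reader. The sketch below is an illustration under stated assumptions, not an implementation of #*: the function names are hypothetical, the delimiter set is the whitespace and terminating macro characters of standard syntax, and a plain ERROR stands in for READER-ERROR. Because | and \ are neither 0 nor 1, they are rejected along with every other stray character.

```lisp
;; Hypothetical model of NORMAL-DELIMITER; names are illustrative only.
(defun normal-delimiter-p (char)
  "True if CHAR is whitespace or a terminating macro character
in standard syntax (an assumption of this sketch)."
  (or (member char '(#\Space #\Newline #\Tab #\Page #\Return #\Linefeed))
      (find char "\"'(),;`")))

(defun scan-bits-normal-delimiter (text &optional (start 0))
  "Return the token of TEXT beginning at START and the index of its delimiter.
Signal an error if the token contains anything other than 0 and 1."
  (let* ((end (or (position-if #'normal-delimiter-p text :start start)
                  (length text)))
         (token (subseq text start end)))
    ;; The real proposal calls for READER-ERROR; plain ERROR is used here
    ;; because this model works on strings rather than on a stream.
    (unless (every (lambda (c) (find c "01")) token)
      (error "Token ~S contains characters other than 0 and 1." token))
    (values token end)))

(scan-bits-normal-delimiter "0101 3")   ; returns "0101" and 4
;; (scan-bits-normal-delimiter "012 3") signals an error: the token is "012".
```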

Rationale:

This will seem most natural to people already familiar with the

parsing of other tokens in the language. This is most consistent

with the wording in *READ-SUPPRESS*, which is slightly more explicit

than the wording in #* itself.

Also, this is safest for interchange with other dialects since it

forces users not to rely on non-standard delimiters (like "2" in

Test Case #1 below), and therefore it makes it more likely that

when in a read-suppress context in another dialect, the tokenization

a CL program has used will be the same as the tokenization such an

`other dialect' expects.

Proposal (SHARP-STAR-DELIMITER:NOT-ZERO-OR-ONE):

Specify that a bit vector is delimited by any character that

is not a 0 or 1. Correct the description of *READ-SUPPRESS*

to indicate that #* stops reading and returns NIL as soon as

any character other than 0 or 1 is encountered.
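
This rule can also be modelled on plain strings. The sketch below is illustrative only: SCAN-BITS-NOT-ZERO-OR-ONE is a hypothetical name, and the function works on a string rather than on the reader's input stream.

```lisp
;; Hypothetical model of NOT-ZERO-OR-ONE; the name is illustrative only.
(defun scan-bits-not-zero-or-one (text &optional (start 0))
  "Return the run of 0 and 1 characters in TEXT beginning at START,
and the index of the first character that is neither."
  (let ((end (or (position-if-not (lambda (c) (find c "01")) text :start start)
                 (length text))))
    (values (subseq text start end) end)))

;; The scan never signals an error; it simply stops early.
(scan-bits-not-zero-or-one "012 3")    ; returns "01" and 2
(scan-bits-not-zero-or-one "01#*01")   ; returns "01" and 2
```

Under this model the character that stops the scan is left for the next read, which is why "2" in the first example would begin a new token.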

Rationale:

This prefers a very literal reading of the wording in CLtL's description

of #*, and reverses the behavior of *read-suppress* to be consistent.

Test Cases:

These should signal an error under NORMAL-DELIMITER, and

should return 3 under NOT-ZERO-OR-ONE:

1. (LENGTH '(#*012 3))

2. (LENGTH '(#*0123 4))

These should return 1 under NORMAL-DELIMITER, and

should return 2 under NOT-ZERO-OR-ONE:

3. (LENGTH '(#+NO-SUCH-FEATURE #*012 3))

4. (LENGTH '(#+NO-SUCH-FEATURE #*0123 4))

These should signal an error under NORMAL-DELIMITER since

# is not a terminating readmacro, and should return 2 under

proposal NOT-ZERO-OR-ONE. (Note that in case 5 the two

tokens are both bit-vectors under NOT-ZERO-OR-ONE, but in

cases 6 and 7 the second token is a symbol.)

5. (LENGTH '(#*01#*01))

6. (LENGTH '(#*012#*012))

7. (LENGTH '(#*0123#*0123))

These should return 0 under NORMAL-DELIMITER, and

should return 1 under proposal NOT-ZERO-OR-ONE. (Note that

in case 8 the token that is seen is a bit-vector under

NOT-ZERO-OR-ONE, but in cases 9 and 10 it is a symbol.)

8. (LENGTH '(#+NO-SUCH-FEATURE #*01#*01))

9. (LENGTH '(#+NO-SUCH-FEATURE #*012#*012))

10. (LENGTH '(#+NO-SUCH-FEATURE #*0123#*0123))
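
The ten cases can be run without aborting on the first reader error by reading each one from a string. This is a sketch, not part of the original writeup: the variable and function names are hypothetical, it assumes NO-SUCH-FEATURE is absent from *FEATURES*, and the results depend on the implementation being tested.

```lisp
;; Hypothetical harness; the strings are the quoted lists from cases 1-10.
(defparameter *sharp-star-cases*
  '("(#*012 3)"
    "(#*0123 4)"
    "(#+NO-SUCH-FEATURE #*012 3)"
    "(#+NO-SUCH-FEATURE #*0123 4)"
    "(#*01#*01)"
    "(#*012#*012)"
    "(#*0123#*0123)"
    "(#+NO-SUCH-FEATURE #*01#*01)"
    "(#+NO-SUCH-FEATURE #*012#*012)"
    "(#+NO-SUCH-FEATURE #*0123#*0123)"))

(defun run-sharp-star-case (text)
  "Read the list in TEXT and return its length, or :ERROR if reading fails."
  (handler-case (length (read-from-string text))
    (error () :error)))

(defun run-sharp-star-cases ()
  "Return a list of (case-number result) pairs for all ten cases."
  (loop for text in *sharp-star-cases*
        for number from 1
        collect (list number (run-sharp-star-case text))))

(run-sharp-star-cases)
```

An implementation following NORMAL-DELIMITER would be expected to report :ERROR for cases 1, 2, 5, 6, and 7; one following NOT-ZERO-OR-ONE would be expected to report a length for every case.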

Current Practice:

Symbolics Genera 8.1 implements NOT-ZERO-OR-ONE.

Symbolics Cloe implements neither (being closer to the confusing

thing that CLtL actually demands).

Specific results:

       Cloe   Genera
 #1     3       3
 #2     3       3
 #3     1       2
 #4     1       2
 #5     2       2
 #6     2       2
 #7     2       2
 #8     0       1
 #9     0       1
#10     0       1

Moon, commenting on the test cases for the previous version of this

writeup, says MCL 2.0 is similar to Genera, but differs on one

or two examples. [New sample data for this version not available.]

JonL says Lucid has always supported NORMAL-DELIMITER.

Cost to Implementors:

For implementations that are not already compatible, the cost is

probably relatively small.

Cost to Users:

Problem situations could be mechanically detected, and semi-automatically

corrected in a straightforward way.

Cost of Non-Adoption:

Implementations could differ on how #* expressions were parsed,

causing portability problems.

Benefits:

Cost of non-adoption is avoided.

Aesthetics:

The effect of NOT-ZERO-OR-ONE, while seemingly what CLtL intends,

is often surprising (i.e., unintuitive) to new users.

The NORMAL-DELIMITER proposal is probably more aesthetic since it uses

conventional rules for delimiters.

Discussion:

Moon, Barmar, and Pitman support NORMAL-DELIMITER.

Moon says ``if we vote for NOT-ZERO-OR-ONE then I think we're

inconsistent if we don't say that (length '(#o12399)) is 2.''

Barmar disagrees that this particular consistency is at issue.

Moon cited another test case of #o2+2.

JonL noted that Lucid barfs not only on #o2+2 but even on #o12399.
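
The #o inputs mentioned in this discussion can be probed by reading every object from a string and counting what comes back. This is a sketch only: READ-ALL-OBJECTS is a hypothetical name, *READ-BASE* is assumed to be 10, and the results for the disputed inputs are implementation-dependent.

```lisp
;; READ-ALL-OBJECTS is a hypothetical helper for illustration only.
(defun read-all-objects (text)
  "Return a list of every object read from TEXT, or :ERROR if reading fails."
  (handler-case
      (with-input-from-string (stream text)
        ;; The stream itself serves as a unique end-of-file marker.
        (loop for object = (read stream nil stream)
              until (eq object stream)
              collect object))
    (error () :error)))

;; Uncontroversial input: two tokens separated by a space.
(read-all-objects "#o123 99")   ; returns (83 99)

;; Disputed inputs: no expected values are given here.
(read-all-objects "#o12399")
(read-all-objects "#o2+2")
```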


Copyright 1996, The Harlequin Group Limited. All Rights Reserved.