cpython · commit 5c07d9b0
Authored May 14, 1998 by Fred Drake
Updated markup style (got rid of \verb@...@, mostly).
Parent: 2094e044

Showing 6 changed files with 203 additions and 197 deletions:
Doc/ref/ref1.tex  +12  -12
Doc/ref/ref2.tex  +34  -34
Doc/ref/ref3.tex   +2   -2
Doc/ref/ref5.tex  +82  -78
Doc/ref/ref7.tex  +63  -61
Doc/ref/ref8.tex  +10  -10
Doc/ref/ref1.tex
...
...
@@ -43,20 +43,20 @@ name: lc_letter (lc_letter | "_")*
 lc_letter: "a"..."z"
 \end{verbatim}
-The first line says that a \verb@name@ is an \verb@lc_letter@ followed by
-a sequence of zero or more \verb@lc_letter@s and underscores.  An
-\verb@lc_letter@ in turn is any of the single characters `a' through `z'.
-(This rule is actually adhered to for the names defined in lexical and
-grammar rules in this document.)
+The first line says that a \code{name} is an \code{lc_letter} followed by
+a sequence of zero or more \code{lc_letter}s and underscores.  An
+\code{lc_letter} in turn is any of the single characters \character{a}
+through \character{z}.  (This rule is actually adhered to for the
+names defined in lexical and
+grammar rules in this document.)
 
 Each rule begins with a name (which is the name defined by the rule)
-and a colon.  A vertical bar (\verb@|@) is used to separate
+and a colon.  A vertical bar (\code{|}) is used to separate
 alternatives; it is the least binding operator in this notation.  A
-star (\verb@*@) means zero or more repetitions of the preceding item;
-likewise, a plus (\verb@+@) means one or more repetitions, and a
-phrase enclosed in square brackets (\verb@[ ]@) means zero or one
+star (\code{*}) means zero or more repetitions of the preceding item;
+likewise, a plus (\code{+}) means one or more repetitions, and a
+phrase enclosed in square brackets (\code{[ ]}) means zero or one
 occurrences (in other words, the enclosed phrase is optional).  The
-\verb@*@ and \verb@+@ operators bind as tightly as possible;
+\code{*} and \code{+} operators bind as tightly as possible;
 parentheses are used for grouping.  Literal strings are enclosed in
 quotes.  White space is only meaningful to separate tokens.
 
 Rules are normally contained on a single line; rules with many
...
...
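Aside (not part of the commit): the name rule quoted above is easy to exercise mechanically. A minimal Python sketch, using a regular expression equivalent to `name: lc_letter (lc_letter | "_")*`; the helper name NAME_RULE is invented for illustration.

import re

# name: lc_letter (lc_letter | "_")*      lc_letter: "a"..."z"
# i.e. one lowercase letter followed by zero or more lowercase letters
# or underscores.
NAME_RULE = re.compile(r"[a-z][a-z_]*\Z")

for candidate in ("lc_letter", "name", "_leading", "Name"):
    print(candidate, bool(NAME_RULE.match(candidate)))
# lc_letter True, name True, _leading False, Name False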
@@ -66,11 +66,11 @@ first beginning with a vertical bar.
 In lexical definitions (as the example above), two more conventions
 are used: Two literal characters separated by three dots mean a choice
 of any single character in the given (inclusive) range of \ASCII{}
-characters.  A phrase between angular brackets (\verb@<...>@) gives an
+characters.  A phrase between angular brackets (\code{<...>}) gives an
 informal description of the symbol defined; e.g. this could be used
 to describe the notion of `control character' if needed.
 \index{lexical definitions}
-\index{ASCII}
+\index{ASCII@\ASCII{}}
 
 Even though the notation used is almost the same, there is a big
 difference between the meaning of lexical and syntactic definitions:
...
...
Doc/ref/ref2.tex
 \chapter{Lexical analysis}
 
-A Python program is read by a {\em parser}.  Input to the parser is a
-stream of {\em tokens}, generated by the {\em lexical analyzer}.  This
+A Python program is read by a \emph{parser}.  Input to the parser is a
+stream of \emph{tokens}, generated by the \emph{lexical analyzer}.  This
 chapter describes how the lexical analyzer breaks a file into tokens.
 \index{lexical analysis}
 \index{parser}
...
...
@@ -19,7 +19,7 @@ syntax (e.g. between statements in compound statements).
 \subsection{Comments}
 
-A comment starts with a hash character (\verb@#@) that is not part of
+A comment starts with a hash character (\code{\#}) that is not part of
 a string literal, and ends at the end of the physical line.  A comment
 always signifies the end of the logical line.  Comments are ignored by
 the syntax.
...
...
@@ -31,7 +31,7 @@ the syntax.
 \subsection{Explicit line joining}
 
 Two or more physical lines may be joined into logical lines using
-backslash characters (\verb/\/), as follows: when a physical line ends
+backslash characters (\code{\e}), as follows: when a physical line ends
 in a backslash that is not part of a string literal or comment, it is
 joined with the following forming a single logical line, deleting the
 backslash and the following end-of-line character.  For example:
...
...
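Aside (not part of the commit): what explicit line joining looks like in practice. The backslash and the end-of-line character are deleted, so the two physical lines below form one logical line.

# The trailing backslash joins the two physical lines into one logical line.
total = 1 + 2 + \
        3 + 4
print(total)   # 7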
@@ -91,7 +91,7 @@ turn is used to determine the grouping of statements.
 First, tabs are replaced (from left to right) by one to eight spaces
 such that the total number of characters up to there is a multiple of
-eight (this is intended to be the same rule as used by {\UNIX}).  The
+eight (this is intended to be the same rule as used by \UNIX{}).  The
 total number of spaces preceding the first non-blank character then
 determines the line's indentation.  Indentation cannot be split over
 multiple physical lines using backslashes.
...
...
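Aside (not part of the commit): a small sketch of the tab rule described above. The helper expand_tabs is invented for illustration and behaves like str.expandtabs(8).

def expand_tabs(line, tabsize=8):
    # Each tab becomes one to eight spaces, enough to make the character
    # count up to that point a multiple of eight.
    col, out = 0, []
    for ch in line:
        if ch == "\t":
            pad = tabsize - col % tabsize
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += 1
    return "".join(out)

print(repr(expand_tabs("ab\tc")))   # the tab pads out to column 8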
@@ -107,7 +107,7 @@ the stack will always be strictly increasing from bottom to top. At
 the beginning of each logical line, the line's indentation level is
 compared to the top of the stack.  If it is equal, nothing happens.
 If it is larger, it is pushed on the stack, and one INDENT token is
-generated.  If it is smaller, it {\em must} be one of the numbers
+generated.  If it is smaller, it \emph{must} be one of the numbers
 occurring on the stack; all numbers on the stack that are larger are
 popped off, and for each number popped off a DEDENT token is
 generated.  At the end of the file, a DEDENT token is generated for
...
...
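Aside (not part of the commit): a rough Python sketch of the stack algorithm this hunk describes. The function indent_tokens and its list-of-indentation-levels input are invented for illustration.

def indent_tokens(indent_levels):
    stack = [0]            # indentation levels; strictly increasing
    tokens = []
    for level in indent_levels:        # indentation of each logical line
        if level > stack[-1]:
            stack.append(level)
            tokens.append("INDENT")
        else:
            while level < stack[-1]:   # pop larger levels, one DEDENT each
                stack.pop()
                tokens.append("DEDENT")
            if level != stack[-1]:
                raise IndentationError("unindent does not match any level")
    while stack[-1] > 0:               # end of file
        stack.pop()
        tokens.append("DEDENT")
    return tokens

print(indent_tokens([0, 4, 8, 4, 0]))
# ['INDENT', 'INDENT', 'DEDENT', 'DEDENT']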
@@ -145,7 +145,7 @@ The following example shows various indentation errors:
 (Actually, the first three errors are detected by the parser; only the
 last error is found by the lexical analyzer --- the indentation of
-\verb@return r@ does not match a level popped off the stack.)
+\code{return r} does not match a level popped off the stack.)
 
 \section{Other tokens}
...
...
@@ -174,10 +174,10 @@ Identifiers are unlimited in length. Case is significant.
 \subsection{Keywords}
 
-The following identifiers are used as reserved words, or {\em keywords}
-of the language, and cannot be used as ordinary
-identifiers.  They must be spelled exactly as written here:
-\index{keyword}
+The following identifiers are used as reserved words, or \emph{keywords}
+of the language, and cannot be used as ordinary
+identifiers.  They must be spelled exactly as written here:%
+\index{keyword}%
+\index{reserved word}
 
 \begin{verbatim}
...
...
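Aside (not part of the commit): current CPython exposes its reserved words through the standard keyword module (the list itself has changed since 1998).

import keyword

# The reserved words the paragraph refers to; none may be used as an
# ordinary identifier.
print(keyword.kwlist)
print(keyword.iskeyword("while"))   # True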
@@ -212,13 +212,13 @@ shortstringchar: <any ASCII character except "\" or newline or the quote>
 longstringchar: <any ASCII character except "\">
 escapeseq: "\" <any ASCII character>
 \end{verbatim}
-\index{ASCII}
+\index{ASCII@\ASCII{}}
 
 In ``long strings'' (strings surrounded by sets of three quotes),
 unescaped newlines and quotes are allowed (and are retained), except
 that three unescaped quotes in a row terminate the string.  (A
 ``quote'' is the character used to open the string, i.e. either
-\verb/'/ or \verb/"/.)
+\code{'} or \code{"}.)
 
 Escape sequences in strings are interpreted according to rules similar
 to those used by Standard C.  The recognized escape sequences are:
...
...
@@ -230,32 +230,32 @@ to those used by Standard C. The recognized escape sequences are:
 \begin{center}
 \begin{tabular}{|l|l|}
 \hline
-\verb/\/{\em newline} & Ignored \\
-\verb/\\/ & Backslash (\verb/\/) \\
-\verb/\'/ & Single quote (\verb/'/) \\
-\verb/\"/ & Double quote (\verb/"/) \\
-\verb/\a/ & \ASCII{} Bell (BEL) \\
-\verb/\b/ & \ASCII{} Backspace (BS) \\
-%\verb/\E/ & \ASCII{} Escape (ESC) \\
-\verb/\f/ & \ASCII{} Formfeed (FF) \\
-\verb/\n/ & \ASCII{} Linefeed (LF) \\
-\verb/\r/ & \ASCII{} Carriage Return (CR) \\
-\verb/\t/ & \ASCII{} Horizontal Tab (TAB) \\
-\verb/\v/ & \ASCII{} Vertical Tab (VT) \\
-\verb/\/{\em ooo} & \ASCII{} character with octal value {\em ooo} \\
-\verb/\x/{\em xx...} & \ASCII{} character with hex value {\em xx...} \\
+\code{\e} \emph{newline} & Ignored \\
+\code{\e\e} & Backslash (\code{\e}) \\
+\code{\e'} & Single quote (\code{'}) \\
+\code{\e"} & Double quote (\code{"}) \\
+\code{\e a} & \ASCII{} Bell (BEL) \\
+\code{\e b} & \ASCII{} Backspace (BS) \\
+%\code{\e E} & \ASCII{} Escape (ESC) \\
+\code{\e f} & \ASCII{} Formfeed (FF) \\
+\code{\e n} & \ASCII{} Linefeed (LF) \\
+\code{\e r} & \ASCII{} Carriage Return (CR) \\
+\code{\e t} & \ASCII{} Horizontal Tab (TAB) \\
+\code{\e v} & \ASCII{} Vertical Tab (VT) \\
+\code{\e} \emph{ooo} & \ASCII{} character with octal value \emph{ooo} \\
+\code{\e x} \emph{xx...} & \ASCII{} character with hex value \emph{xx...} \\
 \hline
 \end{tabular}
 \end{center}
-\index{ASCII}
+\index{ASCII@\ASCII{}}
 
-In strict compatibility with Standard C, up to three octal digits are
+In strict compatibility with Standard \C, up to three octal digits are
 accepted, but an unlimited number of hex digits is taken to be part of
 the hex escape (and then the lower 8 bits of the resulting hex number
 are used in all current implementations...).
 
 All unrecognized escape sequences are left in the string unchanged,
-i.e., {\em the backslash is left in the string.}  (This behavior is
+i.e., \emph{the backslash is left in the string.}  (This behavior is
 useful when debugging: if an escape sequence is mistyped, the
 resulting output is more easily recognized as broken.  It also helps a
 great deal for string literals used as regular expressions or
...
...
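Aside (not part of the commit): the rule about unrecognized escape sequences is easy to observe; note that newer Python versions additionally warn about unrecognized escape sequences in string literals.

print(len("\n"))   # 1 -- recognized escape, a single linefeed character
print(len("\q"))   # 2 -- unrecognized escape, the backslash stays in the string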
@@ -331,8 +331,8 @@ Some examples of floating point literals:
 \end{verbatim}
 
 Note that numeric literals do not include a sign; a phrase like
-\verb@-1@ is actually an expression composed of the operator
-\verb@-@ and the literal \verb@1@.
+\code{-1} is actually an expression composed of the operator
+\code{-} and the literal \code{1}.
 
 \section{Operators}
...
...
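Aside (not part of the commit): modern Python's tokenize module shows directly that -1 is two tokens, the operator - followed by the literal 1.

import io
import tokenize

for tok in tokenize.generate_tokens(io.StringIO("-1").readline):
    print(tokenize.tok_name[tok.type], repr(tok.string))
# OP '-'
# NUMBER '1'
# NEWLINE ''
# ENDMARKER ''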
@@ -345,7 +345,7 @@ The following tokens are operators:
 <       ==      >       <=      <>      !=      >=
 \end{verbatim}
 
-The comparison operators \verb@<>@ and \verb@!=@ are alternate
+The comparison operators \code{<>} and \code{!=} are alternate
 spellings of the same operator.
 
 \section{Delimiters}
...
...
@@ -363,7 +363,7 @@ meaning:
 The following printing \ASCII{} characters are not used in Python.  Their
 occurrence outside string literals and comments is an unconditional
 error:
-\index{ASCII}
+\index{ASCII@\ASCII{}}
 
 \begin{verbatim}
 @       $       ?
...
...
Doc/ref/ref3.tex
...
...
@@ -220,14 +220,14 @@ read from a file.
 \obindex{string}
 \index{character}
 \index{byte}
-\index{ASCII}
+\index{ASCII@\ASCII{}}
 
 (On systems whose native character set is not \ASCII{}, strings may use
 EBCDIC in their internal representation, provided the functions
 \function{chr()} and \function{ord()} implement a mapping between \ASCII{} and
 EBCDIC, and string comparison preserves the \ASCII{} order.
 Or perhaps someone can propose a better rule?)
-\index{ASCII}
+\index{ASCII@\ASCII{}}
 \index{EBCDIC}
 \index{character set}
 \indexii{string}{comparison}
...
...
Doc/ref/ref5.tex
(diff collapsed; not shown)
Doc/ref/ref7.tex
(diff collapsed; not shown)
Doc/ref/ref8.tex
...
...
@@ -13,9 +13,9 @@ While a language specification need not prescribe how the language
 interpreter is invoked, it is useful to have a notion of a complete
 Python program.  A complete Python program is executed in a minimally
 initialized environment: all built-in and standard modules are
-available, but none have been initialized, except for \verb@sys@
-(various system services), \verb@__builtin__@ (built-in functions,
-exceptions and \verb@None@) and \verb@__main__@.  The latter is used
+available, but none have been initialized, except for \module{sys}
+(various system services), \module{__builtin__} (built-in functions,
+exceptions and \code{None}) and \module{__main__}.  The latter is used
 to provide the local and global name space for execution of the
 complete program.
 \refbimodindex{sys}
...
...
@@ -29,7 +29,7 @@ The interpreter may also be invoked in interactive mode; in this case,
 it does not read and execute a complete program but reads and executes
 one statement (possibly compound) at a time.  The initial environment
 is identical to that of a complete program; each statement is executed
-in the name space of \verb@__main__@.
+in the name space of \module{__main__}.
 \index{interactive mode}
 \refbimodindex{__main__}
...
...
@@ -59,7 +59,7 @@ This syntax is used in the following situations:
 \item
 when parsing a module;
 
 \item
-when parsing a string passed to the \verb@exec@ statement;
+when parsing a string passed to the \keyword{exec} statement;
 
 \end{itemize}
...
...
@@ -81,14 +81,14 @@ end of the input.
 There are two forms of expression input.  Both ignore leading
 whitespace.
 
-The string argument to \verb@eval()@ must have the following form:
+The string argument to \function{eval()} must have the following form:
 \bifuncindex{eval}
 
 \begin{verbatim}
 eval_input: condition_list NEWLINE*
 \end{verbatim}
 
-The input line read by \verb@input()@ must have the following form:
+The input line read by \function{input()} must have the following form:
 \bifuncindex{input}
 
 \begin{verbatim}
...
...
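Aside (not part of the commit): a quick illustration of the eval() input form described above; condition_list is the old grammar name for what later documentation calls an expression list.

# An expression (condition) list with trailing newlines is accepted ...
print(eval("1 + 2, 3 < 4\n\n"))    # (3, True)

# ... but a statement is not.
try:
    eval("x = 1")
except SyntaxError as exc:
    print("SyntaxError:", exc.msg)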
@@ -96,10 +96,10 @@ input_input: condition_list NEWLINE
 \end{verbatim}
 
 Note: to read `raw' input line without interpretation, you can use the
-built-in function \verb@raw_input()@ or the \verb@readline()@ method
+built-in function \function{raw_input()} or the \method{readline()} method
 of file objects.
 \obindex{file}
 \index{input!raw}
 \index{raw input}
-\bifuncindex{raw_index}
-\ttindex{readline}
+\bifuncindex{raw_input}
+\withsubitem{(file method)}{\ttindex{readline()}}