diff --git a/peps/pep-0822.rst b/peps/pep-0822.rst
index 29eeabe44ab..dec2a7ef402 100644
--- a/peps/pep-0822.rst
+++ b/peps/pep-0822.rst
@@ -18,23 +18,25 @@ multiline string literals.
 Dedented multiline strings use a new prefix "d" (shorthand for "dedent") before
 the opening quote of a multiline string literal.
 
-Example (spaces are visualized as ``_``):
+Example (spaces are visualized as ``.``):
 
 .. code-block:: python
 
-    def hello_paragraph() -> str:
-    ____return d"""
-    ________<p>
-    __________Hello, World!
-    ________</p>
-    ____"""
+    def hello_paragraph() -> str:
+    ....return d"""
+    ........<p>
+    ..........Hello, World!
+    ........</p>
+    ...."""
 
-The closing triple quotes control how much indentation would be removed.
-In the above example, the returned string will contain three lines:
+Unlike ``textwrap.dedent()``, indentation before closing quotes is also
+considered when determining the amount of indentation to be removed.
+Therefore, the string returned in the example above consists of the following
+three lines.
 
-* ``"____<p>\n"`` (four leading spaces)
-* ``"______Hello, World!\n"`` (six leading spaces)
-* ``"____</p>\n"`` (four leading spaces)
+* ``"....<p>\n"``
+* ``"......Hello, World!\n"``
+* ``"....</p>\n"``
 
 
 Motivation
@@ -43,7 +45,7 @@ Motivation
 When writing multiline string literals within deeply indented Python code,
 users are faced with the following choices:
 
-* Accept that the content of the string literal will be left-aligned.
+* Write the contents of the string without indentation.
 * Use multiple single-line string literals concatenated together instead of a
   multiline string literal.
 * Use ``textwrap.dedent()`` to remove indentation.
@@ -51,14 +53,15 @@ users are faced with the following choices:
 All of these options have drawbacks in terms of code readability and
 maintainability.
 
-* Left-aligned multiline strings look awkward and tend to be avoided.
+* Writing multiline strings without indentation in deeply indented code
+  looks awkward and tends to be avoided.
   In practice, many places including Python's own test code choose other
   methods.
 * Concatenated single-line string literals are more verbose and harder to
-  maintain.
+  maintain. Writing ``"\n"`` at the end of each line is tedious.
+  It's easy to miss the semicolons between many string concatenations.
 * ``textwrap.dedent()`` is implemented in Python so it requires some runtime
-  overhead.
-  It cannot be used in hot paths where performance is critical.
+  overhead. Moreover, it cannot be used to dedent t-strings.
 
 This PEP aims to provide a built-in syntax for dedented multiline strings that
 is both easy to read and write, while also being efficient at runtime.
@@ -99,30 +102,41 @@ Specification
 =============
 
 Add a new string literal prefix "d" for dedented multiline strings.
-This prefix can be combined with "f", "t", and "r" prefixes.
+This prefix can be combined with "f", "t", "r", and "b" prefixes.
 
 This prefix is only for multiline string literals.
 So it can only be used with triple quotes (``"""`` or ``'''``).
-Using it with single or double quotes (``"`` or ``'``) is a syntax error.
 
 Opening triple quotes needs to be followed by a newline character.
 This newline is not included in the resulting string.
+The content of the d-string starts from the next line.
 
-The amount of indentation to be removed is determined by the whitespace
-(``' '`` or ``'\t'``) preceding the closing triple quotes.
-Mixing spaces and tabs in indentation raises a ``TabError``, similar to
-Python's own indentation rules.
+Indentation is leading whitespace characters (spaces and tabs) of each line.
 
-The dedentation process removes the determined amount of leading whitespace
-from every line in the string.
-Lines that are shorter than the determined indentation become just an empty
-line (e.g. ``"\n"``).
-Otherwise, if the line does not start with the determined indentation,
-Python raises an ``IndentationError``.
+The amount of indentation to be removed is determined by the longest common
+indentation of lines in the string.
+Lines consisting entirely of whitespace characters are ignored when
+determining the common indentation, except for the line containing the closing
+triple quotes.
+
+Spaces and tabs are treated as different characters.
+For example, ``" hello"`` and ``"\thello"`` have no common indentation.
+
+The dedentation process removes the determined indentation from every line in
+the string.
+
+* Lines that are longer than or equal in length to the determined indentation
+  must start with the determined indentation.
+  Otherwise, Python raises an ``IndentationError``.
+  The determined indentation is removed from these lines.
+* Lines that are shorter than the determined indentation (including
+  empty lines) must be a prefix of the determined indentation.
+  Otherwise, Python raises an ``IndentationError``.
+  These lines become empty lines.
 
 Unless combined with the "r" prefix, backslash escapes are processed after
-removing indentation.
-So you cannot use ``\\t`` to create indentation.
+the dedentation process.
+So you cannot use ``\\t`` in indentations.
 And you can use line continuation (backslash at the end of line) and remove
 indentation from the continued line.
 
@@ -130,102 +144,177 @@ Examples:
 
 .. code-block:: python
 
-    # Whitespace is shown as _ and tab is shown as ---> for clarity.
-    # Error messages are just for explanation. Actual messages may differ.
-
-    s = d""  # SyntaxError: d-string must be a multiline string
-    s = d"""Hello"""  # SyntaxError: d-string must be a multiline string
-    s = d"""Hello
-    __World!
-    """  # SyntaxError: d-string must start with a newline
-
-    s = d"""
-    __Hello
-    __World!"""  # SyntaxError: d-string must end with an indent-only line
-
-    s = d"""
-    __Hello
-    __World!
-    """  # Zero indentation is removed because closing quotes are not indented.
-    print(repr(s))  # '__Hello\n__World!\n'
-
-    s = d"""
-    __Hello
-    __World!
-    _"""  # One space indentation is removed.
-    print(repr(s))  # '_Hello\n_World!\n'
-
-    s = d"""
-    __Hello
-    __World!
-    __"""  # Two spaces indentation are removed.
-    print(repr(s))  # 'Hello\nWorld!\n'
-
-    s = d"""
-    __Hello
-    __World!
-    ___"""  # IndentationError: missing valid indentation
-
-    s = d"""
-    --->Hello
-    __World!
-    __"""  # IndentationError: missing valid indentation
-
-    s = d"""
-    --->--->__Hello
-    --->--->__World!
-    --->--->"""  # Tab is allowed as indentation.
-    # Spaces are just in the string, not indentation to be removed.
-    print(repr(s))  # '__Hello\n__World!\n'
-
-    s = d"""
-    --->____Hello
-    --->____World!
-    --->__"""  # TabError: mixing spaces and tabs in indentation
-
-    s = d"""
-    __Hello \
-    __World!\
-    __"""  # line continuation works as ususal
-    print(repr(s))  # 'Hello_World!'
-
-    s = d"""\
-    __Hello
-    __World
-    __"""  # SyntaxError: d-string must starts with a newline.
-
-    s = dr"""
-    __Hello\
-    __World!\
-    __"""  # d-string can be combined with r-string.
-    print(repr(s))  # 'Hello\\\nWorld!\\\n'
-
-    s = df"""
-    ____Hello, {"world".title()}!
-    ____"""  # d-string can be combined with f-string and t-string too.
-    print(repr(s))  # 'Hello, World!\n'
-
-    s = dt"""
-    ____Hello, {"world".title()}!
-    ____"""
-    print(type(s))  #