From a5e110155b4da270afe38cd699b24fcb22afc8d4 Mon Sep 17 00:00:00 2001
From: Inada Naoki
Date: Wed, 21 Jan 2026 01:21:36 +0900
Subject: [PATCH 1/2] update pep 0822

---
 peps/pep-0822.rst | 401 ++++++++++++++++++++++++++++++++--------------
 1 file changed, 279 insertions(+), 122 deletions(-)

diff --git a/peps/pep-0822.rst b/peps/pep-0822.rst
index 29eeabe44ab..e44b2590994 100644
--- a/peps/pep-0822.rst
+++ b/peps/pep-0822.rst
@@ -22,19 +22,21 @@ Example (spaces are visualized as ``_``):
 
 .. code-block:: python
 
-    def hello_paragraph() -> str:
-    ____return d"""
-    ________<p>
-    __________Hello, World!
-    ________</p>
-    ____"""
+    def hello_paragraph() -> str:
+    ____return d"""
+    ________<p>
+    __________Hello, World!
+    ________</p>
+    ____"""
 
-The closing triple quotes control how much indentation would be removed.
-In the above example, the returned string will contain three lines:
+Unlike ``textwrap.dedent()``, indentation before closing quotes is also
+considered when determining the amount of indentation to be removed.
+Therefore, the string returned in the example above consists of the following
+three lines.
 
-* ``"____<p>\n"`` (four leading spaces)
-* ``"______Hello, World!\n"`` (six leading spaces)
-* ``"____</p>\n"`` (four leading spaces)
+* ``"____<p>\n"``
+* ``"______Hello, World!\n"``
+* ``"____</p>\n"``
 
 
 Motivation
@@ -43,7 +45,7 @@ Motivation
 When writing multiline string literals within deeply indented Python code,
 users are faced with the following choices:
 
-* Accept that the content of the string literal will be left-aligned.
+* Write the contents of the string without indentation.
 * Use multiple single-line string literals concatenated together instead
   of a multiline string literal.
 * Use ``textwrap.dedent()`` to remove indentation.
@@ -51,14 +53,15 @@ users are faced with the following choices:
 All of these options have drawbacks in terms of code readability and
 maintainability.
 
-* Left-aligned multiline strings look awkward and tend to be avoided.
+* Writing multiline strings without indentation in deeply indented code
+  looks awkward and tends to be avoided.
   In practice, many places including Python's own test code choose other
   methods.
 * Concatenated single-line string literals are more verbose and harder to
-  maintain.
+  maintain. Writing ``"\n"`` at the end of each line is tedious.
+  It's easy to miss the semicolons between many string concatenations.
 * ``textwrap.dedent()`` is implemented in Python so it requires some runtime
-  overhead.
-  It cannot be used in hot paths where performance is critical.
+  overhead. Moreover, it cannot be used to dedent t-strings.
 
 This PEP aims to provide a built-in syntax for dedented multiline strings that
 is both easy to read and write, while also being efficient at runtime.
@@ -99,30 +102,41 @@ Specification
 =============
 
 Add a new string literal prefix "d" for dedented multiline strings.
-This prefix can be combined with "f", "t", and "r" prefixes.
+This prefix can be combined with "f", "t", "r", and "b" prefixes.
 
 This prefix is only for multiline string literals.
 So it can only be used with triple quotes (``"""`` or ``'''``).
-Using it with single or double quotes (``"`` or ``'``) is a syntax error.
 
 Opening triple quotes needs to be followed by a newline character.
 This newline is not included in the resulting string.
+The content of the d-string starts from the next line.
 
-The amount of indentation to be removed is determined by the whitespace
-(``' '`` or ``'\t'``) preceding the closing triple quotes.
-Mixing spaces and tabs in indentation raises a ``TabError``, similar to
-Python's own indentation rules.
+Indentation is the leading whitespace characters (spaces and tabs) of each line.
 
-The dedentation process removes the determined amount of leading whitespace
-from every line in the string.
-Lines that are shorter than the determined indentation become just an empty
-line (e.g. ``"\n"``).
-Otherwise, if the line does not start with the determined indentation,
-Python raises an ``IndentationError``.
+The amount of indentation to be removed is determined by the longest common
+indentation of lines in the string.
+Lines consisting entirely of whitespace characters are ignored when
+determining the common indentation, except for the line containing the closing
+triple quotes.
+
+Spaces and tabs are treated as different characters.
+For example, ``" hello"`` and ``"\thello"`` have no common indentation.
+
+The dedentation process removes the determined indentation from every line in
+the string.
+
+* Lines that are longer than or equal in length to the determined indentation
+  must start with the determined indentation.
+  Otherwise, Python raises an ``IndentationError``.
+  The determined indentation is removed from these lines.
+* Lines that are shorter than the determined indentation (including
+  empty lines) must be a prefix of the determined indentation.
+  Otherwise, Python raises an ``IndentationError``.
+  These lines become empty lines.
 
 Unless combined with the "r" prefix, backslash escapes are processed after
-removing indentation.
-So you cannot use ``\\t`` to create indentation.
+the dedentation process.
+So you cannot use ``\\t`` in indentations.
 And you can use line continuation (backslash at the end of line) and remove
 indentation from the continued line.
 
@@ -130,102 +144,177 @@ Examples:
 
 .. code-block:: python
 
-    # Whitespace is shown as _ and tab is shown as ---> for clarity.
-    # Error messages are just for explanation. Actual messages may differ.
-
-    s = d"" # SyntaxError: d-string must be a multiline string
-    s = d"""Hello""" # SyntaxError: d-string must be a multiline string
-    s = d"""Hello
-    __World!
-    """ # SyntaxError: d-string must start with a newline
-
-    s = d"""
-    __Hello
-    __World!""" # SyntaxError: d-string must end with an indent-only line
-
-    s = d"""
-    __Hello
-    __World!
-    """ # Zero indentation is removed because closing quotes are not indented.
-    print(repr(s)) # '__Hello\n__World!\n'
-
-    s = d"""
-    __Hello
-    __World!
-    _""" # One space indentation is removed.
-    print(repr(s)) # '_Hello\n_World!\n'
-
-    s = d"""
-    __Hello
-    __World!
-    __""" # Two spaces indentation are removed.
-    print(repr(s)) # 'Hello\nWorld!\n'
-
-    s = d"""
-    __Hello
-    __World!
-    ___""" # IndentationError: missing valid indentation
-
-    s = d"""
-    --->Hello
-    __World!
-    __""" # IndentationError: missing valid indentation
-
-    s = d"""
-    --->--->__Hello
-    --->--->__World!
-    --->--->""" # Tab is allowed as indentation.
-    # Spaces are just in the string, not indentation to be removed.
-    print(repr(s)) # '__Hello\n__World!\n'
-
-    s = d"""
-    --->____Hello
-    --->____World!
-    --->__""" # TabError: mixing spaces and tabs in indentation
-
-    s = d"""
-    __Hello \
-    __World!\
-    __""" # line continuation works as ususal
-    print(repr(s)) # 'Hello_World!'
-
-    s = d"""\
-    __Hello
-    __World
-    __""" # SyntaxError: d-string must starts with a newline.
-
-    s = dr"""
-    __Hello\
-    __World!\
-    __""" # d-string can be combined with r-string.
-    print(repr(s)) # 'Hello\\\nWorld!\\\n'
-
-    s = df"""
-    ____Hello, {"world".title()}!
-    ____""" # d-string can be combined with f-string and t-string too.
-    print(repr(s)) # 'Hello, World!\n'
-
-    s = dt"""
-    ____Hello, {"world".title()}!
-    ____"""
-    print(type(s)) # <class 'string.templatelib.Template'>
-    print(s.strings) # ('Hello, ', '!\n')
-    print(s.values) # ('World',)
-    print(s.interpolations)
-    # (Interpolation('World', '"world".title()', None, ''),)
+    # d-string must start with a newline.
+    s = d"" # SyntaxError: d-string must be triple-quoted
+    s = d"""""" # SyntaxError: d-string must start with a newline
+    s = d"""Hello""" # SyntaxError: d-string must start with a newline
+    s = d"""Hello
+    __World!
+    """ # SyntaxError: d-string must start with a newline
+
+    # d-string removes the longest common indentation from each line.
+    # Empty lines are ignored, but closing quotes line is always considered.
+    s = d"""
+    __Hello
+    __World!
+    __"""
+    print(repr(s)) # 'Hello\nWorld!\n'
+
+    s = d"""
+    __Hello
+    __World!
+    _"""
+    print(repr(s)) # '_Hello\n_World!\n'
+
+    s = d"""
+    __Hello
+    __World!
+    """
+    print(repr(s)) # '__Hello\n__World!\n'
+
+    s = d"""
+    __Hello
+    _
+
+    __World!
+    ___""" # Longest common indentation is '__'.
+    print(repr(s)) # 'Hello\n\n\nWorld!\n_'
+
+    # Closing quotes can be on the same line as the last content line.
+    # In this case, the string does not end with a newline.
+    s = d"""
+    __Hello
+    __World!"""
+    print(repr(s)) # 'Hello\nWorld!'
+
+    # Tabs are allowed as indentation.
+    # But tabs and spaces are treated as different characters.
+    s = d"""
+    --->__Hello
+    --->__World!
+    --->"""
+    print(repr(s)) # '__Hello\n__World!\n'
+
+    s = d"""
+    --->Hello
+    __World!
+    __""" # There is no common indentation.
+    print(repr(s)) # '\tHello\n__World!\n__'
+
+    # Line continuation with backslash works as usual.
+    # But you cannot put a backslash right after the opening quotes.
+    s = d"""
+    __Hello \
+    __World!\
+    __"""
+    print(repr(s)) # 'Hello_World!'
+
+    s = d"""\
+    __Hello
+    __World
+    __""" # SyntaxError: d-string must start with a newline.
+
+    # d-string can be combined with r-string, b-string, f-string, and t-string.
+    s = dr"""
+    __Hello\
+    __World!\
+    __"""
+    print(repr(s)) # 'Hello\\\nWorld!\\\n'
+
+    s = db"""
+    __Hello
+    __World!
+    __"""
+    print(repr(s)) # b'Hello\nWorld!\n'
+
+    s = df"""
+    ____Hello, {"world".title()}!
+    ____"""
+    print(repr(s)) # 'Hello,_World!\n'
+
+    s = dt"""
+    ____Hello, {"world".title()}!
+    ____"""
+    print(type(s)) # <class 'string.templatelib.Template'>
+    print(s.strings) # ('Hello,_', '!\n')
+    print(s.values) # ('World',)
 
 
 How to Teach This
 =================
 
-In the tutorial, we can introduce d-string with triple quote string literals.
-Additionally, we can add a note in the ``textwrap.dedent()`` documentation,
-providing a link to the d-string section in the language reference or
-the relevant part of the tutorial.
+The main difference between ``textwrap.dedent("""...""")`` and d-string can be
+explained as follows:
+
+* ``textwrap.dedent()`` is a regular function, but d-string is part of the
+  language syntax. d-string has no runtime overhead, and it can remove
+  indentation from t-strings.
+
+* When using ``textwrap.dedent()``, you need to start with ``"""\`` to avoid
+  including the first newline character, but with d-string, the string content
+  starts from the line after ``d"""``, so no backslash is needed.
+
+  .. code-block:: python
+
+      import textwrap
+
+      s1 = textwrap.dedent("""\
+          Hello
+          World!
+          """)
+      s2 = d"""
+          Hello
+          World!
+          """
+      assert s1 == s2
+
+* ``textwrap.dedent()`` ignores all blank lines when determining the common
+  indentation, but d-string also considers the indentation of the closing
+  quotes.
+  This allows d-string to preserve some indentation in the result when needed.
+
+  .. code-block:: python
+
+      import textwrap
+
+      s1 = textwrap.dedent("""\
+          Hello
+          World!
+        """)
+      s2 = d"""
+          Hello
+          World!
+        """
+      assert s1 != s2
+      assert s1 == 'Hello\nWorld!\n'
+      assert s2 == '  Hello\n  World!\n'
+
+* Since d-string removes indentation before processing escape sequences,
+  when using line continuation (backslash at the end of a line), the next line
+  can also be dedented.
+
+  .. code-block:: python
+
+      import textwrap
+
+      s1 = textwrap.dedent("""\
+          Lorem ipsum dolor sit amet, consectetur adipiscing elit, \
+      sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
+          Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris \
+      nisi ut aliquip ex ea commodo consequat.
+          """)
+      s2 = d"""
+          Lorem ipsum dolor sit amet, consectetur adipiscing elit, \
+          sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
+          Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris \
+          nisi ut aliquip ex ea commodo consequat.
+          """
+      assert s1 == s2
+
 
 
 Other Languages having Similar Features
-========================================
+=======================================
 
 Java 15 introduced a feature called `text blocks `__.
 Since Java had not used triple qutes before, they introduced triple quotes for
@@ -242,14 +331,14 @@ PHP 7.3 introduced `Flexible Heredoc and Nowdoc Syntaxes `__
+that removes indent from lines in heredoc.
+
+Java, Julia, and Ruby use the least-indented line to determine the amount of
 indentation to be removed.
 Swift, C#, and PHP uses the indentation of the closing triple quotes or
 closing marker.
 
-This PEP chose the Swift and C# approach because it is simpler and easier to
-explain.
-
 
 Reference Implementation
 ========================
@@ -312,6 +401,74 @@ Therefore, `many people preferred the new string prefix
Date: Wed, 21 Jan 2026 18:14:28 +0900
Subject: [PATCH 2/2] replace '_' with '.' or just space.
---
 peps/pep-0822.rst | 136 +++++++++++++++++++++++-----------------------
 1 file changed, 68 insertions(+), 68 deletions(-)

diff --git a/peps/pep-0822.rst b/peps/pep-0822.rst
index e44b2590994..dec2a7ef402 100644
--- a/peps/pep-0822.rst
+++ b/peps/pep-0822.rst
@@ -18,25 +18,25 @@ multiline string literals.
 Dedented multiline strings use a new prefix "d" (shorthand for "dedent")
 before the opening quote of a multiline string literal.
 
-Example (spaces are visualized as ``_``):
+Example (spaces are visualized as ``.``):
 
 .. code-block:: python
 
     def hello_paragraph() -> str:
-    ____return d"""
-    ________<p>
-    __________Hello, World!
-    ________</p>
-    ____"""
+    ....return d"""
+    ........<p>
+    ..........Hello, World!
+    ........</p>
+    ...."""
 
 Unlike ``textwrap.dedent()``, indentation before closing quotes is also
 considered when determining the amount of indentation to be removed.
 Therefore, the string returned in the example above consists of the following
 three lines.
 
-* ``"____<p>\n"``
-* ``"______Hello, World!\n"``
-* ``"____</p>\n"``
+* ``"....<p>\n"``
+* ``"......Hello, World!\n"``
+* ``"....</p>\n"``
 
 
 Motivation
@@ -149,94 +149,94 @@ Examples:
     s = d"""""" # SyntaxError: d-string must start with a newline
     s = d"""Hello""" # SyntaxError: d-string must start with a newline
     s = d"""Hello
-    __World!
+    ..World!
     """ # SyntaxError: d-string must start with a newline
 
     # d-string removes the longest common indentation from each line.
     # Empty lines are ignored, but closing quotes line is always considered.
     s = d"""
-    __Hello
-    __World!
-    __"""
+    ..Hello
+    ..World!
+    .."""
     print(repr(s)) # 'Hello\nWorld!\n'
 
     s = d"""
-    __Hello
-    __World!
-    _"""
-    print(repr(s)) # '_Hello\n_World!\n'
+    ..Hello
+    ..World!
+    ."""
+    print(repr(s)) # '.Hello\n.World!\n'
 
     s = d"""
-    __Hello
-    __World!
+    ..Hello
+    ..World!
     """
-    print(repr(s)) # '__Hello\n__World!\n'
+    print(repr(s)) # '..Hello\n..World!\n'
 
     s = d"""
-    __Hello
-    _
+    ..Hello
+    .
 
-    __World!
-    ___""" # Longest common indentation is '__'.
-    print(repr(s)) # 'Hello\n\n\nWorld!\n_'
+    ..World!
+    ...""" # Longest common indentation is '..'.
+    print(repr(s)) # 'Hello\n\n\nWorld!\n.'
 
     # Closing quotes can be on the same line as the last content line.
     # In this case, the string does not end with a newline.
     s = d"""
-    __Hello
-    __World!"""
+    ..Hello
+    ..World!"""
     print(repr(s)) # 'Hello\nWorld!'
 
     # Tabs are allowed as indentation.
     # But tabs and spaces are treated as different characters.
     s = d"""
-    --->__Hello
-    --->__World!
+    --->..Hello
+    --->..World!
     --->"""
-    print(repr(s)) # '__Hello\n__World!\n'
+    print(repr(s)) # '..Hello\n..World!\n'
 
     s = d"""
     --->Hello
-    __World!
-    __""" # There is no common indentation.
-    print(repr(s)) # '\tHello\n__World!\n__'
+    ..World!
+    ..""" # There is no common indentation.
+    print(repr(s)) # '\tHello\n..World!\n..'
 
     # Line continuation with backslash works as usual.
     # But you cannot put a backslash right after the opening quotes.
     s = d"""
-    __Hello \
-    __World!\
-    __"""
-    print(repr(s)) # 'Hello_World!'
+    ..Hello \
+    ..World!\
+    .."""
+    print(repr(s)) # 'Hello World!'
 
     s = d"""\
-    __Hello
-    __World
-    __""" # SyntaxError: d-string must start with a newline.
+    ..Hello
+    ..World
+    ..""" # SyntaxError: d-string must start with a newline.
 
     # d-string can be combined with r-string, b-string, f-string, and t-string.
     s = dr"""
-    __Hello\
-    __World!\
-    __"""
+    ..Hello\
+    ..World!\
+    .."""
     print(repr(s)) # 'Hello\\\nWorld!\\\n'
 
     s = db"""
-    __Hello
-    __World!
-    __"""
+    ..Hello
+    ..World!
+    .."""
     print(repr(s)) # b'Hello\nWorld!\n'
 
     s = df"""
-    ____Hello, {"world".title()}!
-    ____"""
-    print(repr(s)) # 'Hello,_World!\n'
+    ....Hello, {"world".title()}!
+    ...."""
+    print(repr(s)) # 'Hello,.World!\n'
 
     s = dt"""
-    ____Hello, {"world".title()}!
-    ____"""
+    ....Hello, {"world".title()}!
+    ...."""
     print(type(s)) # <class 'string.templatelib.Template'>
-    print(s.strings) # ('Hello,_', '!\n')
+    print(s.strings) # ('Hello, ', '!\n')
     print(s.values) # ('World',)
 
 
@@ -413,15 +413,15 @@ newline like below:
 .. code-block:: python
 
     s = d"""
-    ____Hello
-    ____World!
-    __""" # "__Hello\n__World!" (no trailing newline)
+        Hello
+        World!
+      """ # "  Hello\n  World!" (no trailing newline)
 
     s = d"""
-    ____Hello
-    ____World!
+        Hello
+        World!
 
-    __""" # "__Hello\n__World!\n" (has a trailing newline)
+      """ # "  Hello\n  World!\n" (has a trailing newline)
 
 However, including a newline at the end of the last line of a multiline string
 literal is a very common case, and requiring an empty line at the end would
@@ -444,21 +444,21 @@ you can use workarounds such as line continuation or ``str.rstrip()``.
 .. code-block:: python
 
     s = d"""
-    ____Hello
-    ____World!"""
+        Hello
+        World!"""
     assert s == "Hello\nWorld!"
 
     s = d"""
-    ____Hello
-    ____World!\
-    __"""
-    assert s == "__Hello\n__World!"
+        Hello
+        World!\
+      """
+    assert s == "  Hello\n  World!"
 
     s = dr"""
-    ____Hello
-    ____World!
-    __""".rstrip()
-    assert s == "__Hello\n__World!"
+        Hello
+        World!
+      """.rstrip()
+    assert s == "  Hello\n  World!"
 
 While these workarounds are not ideal, the drawbacks are considered smaller
 than the confusion that would result from automatically removing the trailing