2.6. Unpack Assignment Expression

  • Since Python 3.8: PEP 572 -- Assignment Expressions

  • Also known as "Walrus operator"

  • Also known as "Named expression"

During discussion of this PEP, the operator became informally known as "the walrus operator". The construct's formal name is "Assignment Expressions" (as per the PEP title), but they may also be referred to as "Named Expressions" (e.g. the CPython reference implementation uses that name internally). [1]

Guido van Rossum stepped down as BDFL (Benevolent Dictator For Life) after accepting PEP 572 -- Assignment Expressions:

../../_images/unpack-assignmentexpr-bdfl.png

2.6.1. Syntax

Scalar:

(x := <VALUE>)

Comprehension:

result = [<RETURN>
          for <VARIABLE1> in <ITERABLE>
          if (<VARIABLE2> := <EXPR>)]
result = [<RETURN>
          for <VARIABLE1> in <ITERABLE>
          if (<VARIABLE2> := <EXPR>)
          and (<VARIABLE3> := <EXPR>)]
result = [<RETURN>
          for <VARIABLE1> in <ITERABLE>
          if (<VARIABLE2> := <EXPR>)
          and (<VARIABLE3> := <EXPR>)
          or (<VARIABLE4> := <EXPR>)]

2.6.2. Example

  • First binds a value to an identifier

  • Then evaluates to (returns) the value of that identifier

  • Both operations happen in a single expression

>>> x = 1
>>> print(x)
1
>>> print(x = 1)
Traceback (most recent call last):
TypeError: 'x' is an invalid keyword argument for print()
>>> print(x := 1)
1
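
Because an assignment expression evaluates to the assigned value, it can appear wherever an expression is allowed, for example inside a comparison. A minimal sketch (the names data, n and message are made up for illustration):

```python
data = [1, 2, 3, 4, 5]

# Bind the length to `n` and compare it within the same expression.
if (n := len(data)) > 3:
    message = f'List is too long ({n} elements, expected at most 3)'
else:
    message = 'List is fine'

print(message)
# List is too long (5 elements, expected at most 3)
```

The parentheses matter here: without them, n would be bound to the result of the whole comparison (True) rather than to the length.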

2.6.3. What It Is Not

  • It is not a substitute for the assignment statement (=)

>>> x = 1
>>> print(x)
1
>>> x := 1
Traceback (most recent call last):
SyntaxError: invalid syntax

2.6.4. Processing Streams

  • Processing streams in chunks:

>>> file = open('myfile.txt')
>>> chunk = file.read(100)
>>>
>>> while chunk:
...     print(chunk)
...     chunk = file.read(100)
>>>
>>> file = open('myfile.txt')
>>>
>>> while chunk := file.read(100):
...     print(chunk)
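
The same loop can be tried without a file on disk. The sketch below assumes an in-memory io.StringIO object as a stand-in for the opened file, and a chunk size of 4 chosen only to keep the output short:

```python
from io import StringIO

# StringIO stands in for a real file object (assumption for this sketch).
stream = StringIO('abcdefghij')
chunks = []

# read() returns '' at the end of the stream; '' is falsy and ends the loop.
while chunk := stream.read(4):
    chunks.append(chunk)

print(chunks)
# ['abcd', 'efgh', 'ij']
```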

2.6.5. Checking Match

>>> import re
>>>
>>> DATA = 'mwatney@nasa.gov'

Typically, the result of a regular expression search must be checked for None before it is used further:

>>> result = re.search(r'@nasa.gov', DATA)
>>>
>>> if result:
...     print(result)
<re.Match object; span=(7, 16), match='@nasa.gov'>

An assignment expression merges these two separate lines into one coherent statement:

>>> if result := re.search(r'@nasa.gov', DATA):
...     print(result)
<re.Match object; span=(7, 16), match='@nasa.gov'>
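
The pattern is most useful when several regular expressions are tried in turn, because each branch can bind and test its own match without extra nesting. A minimal sketch; the function classify() and both patterns are hypothetical and serve only as an illustration:

```python
import re


def classify(text):
    # Each branch binds `match` and enters only if its pattern matched.
    if match := re.fullmatch(r'(\d+)', text):
        return 'integer', int(match.group(1))
    elif match := re.fullmatch(r'(\w+)@(\w+)\.(\w+)', text):
        return 'email', match.group(1)
    else:
        return 'unknown', None


print(classify('42'))                # ('integer', 42)
print(classify('mwatney@nasa.gov'))  # ('email', 'mwatney')
print(classify('???'))               # ('unknown', None)
```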

2.6.6. Comprehensions

Let's define data:

>>> DATA = ['Mark Watney',
...         'Melissa Lewis',
...         'Rick Martinez']

A typical comprehension would call str.split() multiple times for each element:

>>> result = [{'firstname': fullname.split()[0],
...            'lastname': fullname.split()[1]}
...           for fullname in DATA]
>>>
>>> print(result)  
[{'firstname': 'Mark', 'lastname': 'Watney'},
 {'firstname': 'Melissa', 'lastname': 'Lewis'},
 {'firstname': 'Rick', 'lastname': 'Martinez'}]

An assignment expression defines a variable that can be reused within the comprehension. This is clearer and more readable, and it avoids repeating the call, which saves time when the call is expensive:

>>> result = [{'firstname': name[0], 'lastname': name[1]}
...           for fullname in DATA
...           if (name := fullname.split())]
>>>
>>> print(result)  
[{'firstname': 'Mark', 'lastname': 'Watney'},
 {'firstname': 'Melissa', 'lastname': 'Lewis'},
 {'firstname': 'Rick', 'lastname': 'Martinez'}]
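
One thing to keep in mind: the if clause is still a filter, so an element is dropped whenever the assigned value is falsy (0, an empty string, an empty list, None). The sketch below uses made-up numeric data to show the effect, and one possible way around it, which is to compare the assigned value with None:

```python
DATA = ['3', '0', '7']

# The assigned value doubles as the filter condition, so the falsy 0 is dropped.
dropped = [number for text in DATA if (number := int(text))]
print(dropped)
# [3, 7]

# Comparing with None keeps every value, whatever its truthiness.
kept = [number for text in DATA if (number := int(text)) is not None]
print(kept)
# [3, 0, 7]
```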

2.6.7. Assignment vs Assignment Expression

>>> (x := 1)
1
>>>
>>> print(x)
1
>>> x = 1, 2
>>>
>>> print(x)
(1, 2)
>>> (x := 1, 2)
(1, 2)
>>>
>>> print(x)
1
>>> result = (x := 1, 2)
>>>
>>> print(result)
(1, 2)
>>> x = 0
>>> x += 1
>>>
>>> print(x)
1
>>> x = 0
>>> x +:= 1
Traceback (most recent call last):
SyntaxError: invalid syntax
>>> data = {}
>>> data['commander'] = 'Mark Watney'
>>>
>>> data = {}
>>> data['commander'] := 'Mark Watney'
Traceback (most recent call last):
SyntaxError: cannot use assignment expressions with subscript
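
The differences above can be summarized in a short script. It is only a sketch: compile() is used so that the rejected forms can be checked without stopping the script, and the names first, second and rejected are made up for illustration:

```python
# Without inner parentheses the walrus binds only the first element.
first = (x := 1, 2)
print(first, x)
# (1, 2) 1

# Parenthesize the tuple to bind the whole tuple.
second = (y := (1, 2))
print(second, y)
# (1, 2) (1, 2)

# The target must be a plain name: no augmented form, subscript or attribute.
rejected = []
for source in ('x +:= 1', '(data["a"] := 1)', '(obj.attr := 1)'):
    try:
        compile(source, '<example>', 'exec')
    except SyntaxError:
        rejected.append(source)

print(rejected)
# ['x +:= 1', '(data["a"] := 1)', '(obj.attr := 1)']
```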

2.6.8. Use Case - 0x01

  • Reusing Results

>>> def run(x):
...     return 1
>>>
>>>
>>> result = [run(x), run(x)+1, run(x)+2]
>>>
>>> result = [res := run(x), res+1, res+2]
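
To see what is actually saved, the calls can be counted. In the sketch below run() is replaced with a hypothetical version that records each call and returns x * 10; the list calls exists only for this demonstration:

```python
calls = []


def run(x):
    # Stand-in for an expensive function; records every call.
    calls.append(x)
    return x * 10


x = 1

result = [run(x), run(x) + 1, run(x) + 2]
print(result, len(calls))
# [10, 11, 12] 3

calls.clear()

result = [res := run(x), res + 1, res + 2]
print(result, len(calls))
# [10, 11, 12] 1
```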

2.6.9. Use Case - 0x02

>>> DATA = """5.8,2.7,5.1,1.9,virginica
... 5.1,3.5,1.4,0.2,setosa
... 5.7,2.8,4.1,1.3,versicolor"""
>>>
>>>
>>> result = [tuple(features + [species])
...           for row in DATA.splitlines()
...           if (line := row.split(','))
...           and (features := [float(x) for x in line[0:4]])
...           and (species := line[4])]
>>>
>>> print(result)  
[(5.8, 2.7, 5.1, 1.9, 'virginica'),
 (5.1, 3.5, 1.4, 0.2, 'setosa'),
 (5.7, 2.8, 4.1, 1.3, 'versicolor')]

2.6.10. Use Case - 0x03

>>> DATA = """5.8,2.7,5.1,1.9,virginica
... 5.1,3.5,1.4,0.2,setosa
... 5.7,2.8,4.1,1.3,versicolor"""
>>> %%timeit -r 1000 -n 1000  
... result = []
... for line in DATA.splitlines():
...     *values, species = line.split(',')
...     values = map(float, values)
...     row = tuple(values) + (species,)
...     result.append(row)
3.18 µs ± 394 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
>>> %%timeit -r 1000 -n 1000  
... result = [tuple(values) + (species,)
...           for line in DATA.splitlines()
...           if (row := line.split(','))
...           and (values := [float(x) for x in row[:-1]])
...           and (species := row[-1])]
3.36 µs ± 423 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
>>> %%timeit -r 1000 -n 1000  
... result = [tuple(values) + (species,)
...           for line in DATA.splitlines()
...           if (row := line.split(','))
...           and (values := map(float, row[:-1]))
...           and (species := row[-1])]
2.97 µs ± 386 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
>>> %%timeit -r 1000 -n 1000  
... result = (tuple(values) + (species,)
...           for line in DATA.splitlines()
...           if (row := line.split(','))
...           and (values := map(float, row[:-1]))
...           and (species := row[-1]))
577 ns ± 53.3 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)

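Note that a generator expression does not compute its values when it is created. It only creates a generator object, which produces the values lazily during iteration. This is why the last measurement is so much faster: it times only the creation of the generator, not the processing of the data.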
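
The laziness can be observed directly by recording when the per-line work happens. A minimal sketch; the helper parse() and the list calls are hypothetical and exist only to make the evaluation order visible:

```python
calls = []


def parse(line):
    # Hypothetical helper that records each line it processes.
    calls.append(line)
    return line.split(',')


DATA = """5.8,2.7,5.1,1.9,virginica
5.1,3.5,1.4,0.2,setosa"""

# Creating the generator does not call parse() at all.
lazy = (parse(line) for line in DATA.splitlines())
print(len(calls))
# 0

# The work happens only when the generator is consumed.
rows = list(lazy)
print(len(calls))
# 2
```

A fair comparison with the list comprehensions would therefore have to include consuming the generator, for example with list() or tuple().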

2.6.11. Use Case - 0x04

>>> DATA = """5.8,2.7,5.1,1.9,virginica
... 5.1,3.5,1.4,0.2,setosa
... 5.7,2.8,4.1,1.3,versicolor"""
>>> result = [tuple(values) + (species,)
...           for line in DATA.splitlines()
...           if (row := line.split(','))
...           and (values := map(float, row[:-1]))
...           and (species := row[-1])]
>>>
>>> result   
[(5.8, 2.7, 5.1, 1.9, 'virginica'),
 (5.1, 3.5, 1.4, 0.2, 'setosa'),
 (5.7, 2.8, 4.1, 1.3, 'versicolor')]
>>> result = (tuple(values) + (species,)
...           for line in DATA.splitlines()
...           if (row := line.split(','))
...           and (values := map(float, row[:-1]))
...           and (species := row[-1]))
>>>
>>> result  
<generator object <genexpr> at 0x...>
>>>
>>> next(result)
(5.8, 2.7, 5.1, 1.9, 'virginica')
>>>
>>> next(result)
(5.1, 3.5, 1.4, 0.2, 'setosa')
>>>
>>> next(result)
(5.7, 2.8, 4.1, 1.3, 'versicolor')
>>>
>>> next(result)
Traceback (most recent call last):
StopIteration

2.6.12. Use Case - 0x05

>>> DATA = [{'is_astronaut': True,  'name': 'Mark Jim WaTNey'},
...         {'is_astronaut': True,  'name': 'Melissa LewiS'},
...         {'is_astronaut': False, 'name': 'José Maria Jiménez'},
...         {'is_astronaut': True,  'name': 'RiCK MarTineZ'},
...         {'is_astronaut': False, 'name': 'Alex Vogel'}]

Comprehension:

>>> result = [{'firstname': person['name'].title().split()[0],
...            'lastname': person['name'].title().split()[-1]}
...           for person in DATA
...           if person['is_astronaut']]

Assignment expressions:

>>> result = [{'firstname': name[0],
...            'lastname': name[-1]}
...           for person in DATA
...           if person['is_astronaut']
...           and (name := person['name'].title().split())]

In both cases the result is the same:

>>> print(result)  
[{'firstname': 'Mark', 'lastname': 'Watney'},
 {'firstname': 'Melissa', 'lastname': 'Lewis'},
 {'firstname': 'Rick', 'lastname': 'Martinez'}]

2.6.13. Use Case - 0x06

>>> DATA = [{'is_astronaut': True,  'name': 'Mark Watney'},
...         {'is_astronaut': True,  'name': 'Melissa Lewis'},
...         {'is_astronaut': False, 'name': 'José Jiménez'},
...         {'is_astronaut': True,  'name': 'Rick Martinez'},
...         {'is_astronaut': False, 'name': 'Alex Vogel'}]
>>>
>>>
>>> astronauts = [{'firstname': fname, 'lastname': lname}
...                for person in DATA
...                if person['is_astronaut']
...                and (name := person['name'].split())
...                and (fname := name[0].capitalize())
...                and (lname := f'{name[1][0]}.')]
>>>
>>> print(astronauts)  
[{'firstname': 'Mark', 'lastname': 'W.'},
 {'firstname': 'Melissa', 'lastname': 'L.'},
 {'firstname': 'Rick', 'lastname': 'M.'}]

2.6.14. Use Case - 0x07

>>> DATA = [{'is_astronaut': True,  'name': 'Mark Watney'},
...         {'is_astronaut': True,  'name': 'Melissa Lewis'},
...         {'is_astronaut': False, 'name': 'José Jiménez'},
...         {'is_astronaut': True,  'name': 'Rick Martinez'},
...         {'is_astronaut': False, 'name': 'Alex Vogel'}]
>>>
>>>
>>> astronauts = [f'{fname} {lname[0]}.'
...               for person in DATA
...               if person['is_astronaut']
...               and (fullname := person['name'].split())
...               and (fname := fullname[0].capitalize())
...               and (lname := fullname[1].upper())]
>>>
>>>
>>> print(astronauts)
['Mark W.', 'Melissa L.', 'Rick M.']

2.6.15. Use Case - 0x08

In the following example, dataclasses are used to automatically generate the __init__() method from the class attributes:

>>> from dataclasses import dataclass
>>>
>>>
>>> @dataclass
... class Iris:
...     sepal_length: float
...     sepal_width: float
...     petal_length: float
...     petal_width: float
>>>
>>>
>>> class Versicolor(Iris):
...     pass
>>>
>>> class Virginica(Iris):
...     pass
>>>
>>> class Setosa(Iris):
...     pass
>>>
>>>
>>> DATA = [
...    ('SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species'),
...    (5.8, 2.7, 5.1, 1.9, 'virginica'),
...    (5.1, 3.5, 1.4, 0.2, 'setosa'),
...    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
...    (6.3, 2.9, 5.6, 1.8, 'virginica'),
...    (6.4, 3.2, 4.5, 1.5, 'versicolor'),
...    (4.7, 3.2, 1.3, 0.2, 'setosa'),
...    (7.0, 3.2, 4.7, 1.4, 'versicolor')]
>>>
>>>
>>> result = [iris(*values)
...           for *values, species in DATA[1:]
...           if (clsname := species.capitalize())
...           and (iris := globals()[clsname])]
>>>
>>> print(result)  
[Virginica(sepal_length=5.8, sepal_width=2.7, petal_length=5.1, petal_width=1.9),
 Setosa(sepal_length=5.1, sepal_width=3.5, petal_length=1.4, petal_width=0.2),
 Versicolor(sepal_length=5.7, sepal_width=2.8, petal_length=4.1, petal_width=1.3),
 Virginica(sepal_length=6.3, sepal_width=2.9, petal_length=5.6, petal_width=1.8),
 Versicolor(sepal_length=6.4, sepal_width=3.2, petal_length=4.5, petal_width=1.5),
 Setosa(sepal_length=4.7, sepal_width=3.2, petal_length=1.3, petal_width=0.2),
 Versicolor(sepal_length=7.0, sepal_width=3.2, petal_length=4.7, petal_width=1.4)]
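
Looking up classes through globals() raises KeyError for a species that has no matching class. As a possible alternative, the sketch below uses an explicit mapping (the dict SPECIES is an assumption, not part of the example above) together with dict.get(), so that the assignment expression also filters out unknown species:

```python
from dataclasses import dataclass


@dataclass
class Iris:
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float


class Setosa(Iris):
    pass


class Virginica(Iris):
    pass


# Hypothetical explicit mapping from species name to class.
SPECIES = {'setosa': Setosa, 'virginica': Virginica}

DATA = [(5.8, 2.7, 5.1, 1.9, 'virginica'),
        (5.1, 3.5, 1.4, 0.2, 'setosa'),
        (1.0, 1.0, 1.0, 1.0, 'unknown')]

# dict.get() returns None for an unknown species, so that row is skipped.
result = [iris(*values)
          for *values, species in DATA
          if (iris := SPECIES.get(species))]

print([type(obj).__name__ for obj in result])
# ['Virginica', 'Setosa']
```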

2.6.16. Use Case - 0x09

>>> import re
>>>
>>>
>>> data = 'mark.watney@nasa.gov'
>>> pattern = r'([a-z]+)\.([a-z]+)@nasa.gov'

Procedural approach:

>>> match = re.match(pattern, data)
>>> result = match.groups() if match else None

A conditional expression has to perform the match twice in order to get the result:

>>> result = re.match(pattern, data).groups() if re.match(pattern, data) else None

An assignment expression makes it possible to define a variable and reuse it:

>>> result = x.groups() if (x := re.match(pattern, data)) else None

In all cases the result is the same:

>>> print(result)
('mark', 'watney')

2.6.17. References

[1] Angelico, C., Peters, T., and van Rossum, G. PEP 572 -- Assignment Expressions. Python Software Foundation. Year: 2018. Retrieved: 2020-12-04. URL: https://www.python.org/dev/peps/pep-0572/#abstract

2.6.18. Assignments

Code 2.25. Solution
"""
* Assignment: Unpack Assignment Expression
* Complexity: medium
* Lines of code: 6 lines
* Time: 13 min

English:
    1. Split `DATA` by lines and then by colon `:`
    2. Extract system accounts
       (users whose UID [third field] is less than 1000)
    3. Return list of system account logins
    4. Solve using list comprehension and assignment expression
    5. Mind the `root` user who has `uid == 0`
       (make sure it is not filtered out by the `if` clause)
    6. Run doctests - all must succeed

Polish:
    1. Podziel `DATA` po liniach a następnie po dwukropku `:`
    2. Wyciągnij konta systemowe
       (użytkownicy z UID [trzecie pole] mniejszym niż 1000)
    3. Zwróć listę loginów użytkowników systemowych
    4. Rozwiąż wykorzystując list comprehension i assignment expression
    5. Zwróć uwagę na użytkownika `root`, który ma `uid == 0`
       (czy nie jest odfiltrowany w instrukcji if)
    6. Uruchom doctesty - wszystkie muszą się powieść

Hint:
    * `str.splitlines()`
    * `str.strip()`
    * `str.split()`
    * `int()`
    * `bool(0) == False`
    * `bool('0') == True`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert len(result) > 0, \
    'Variable `result` cannot be empty'
    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'
    >>> assert all(type(x) is str for x in result), \
    'All rows in `result` should be str'

    >>> result
    ['root', 'bin', 'daemon', 'adm', 'shutdown', 'halt', 'nobody', 'sshd']
"""

DATA = """root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
nobody:x:99:99:Nobody:/:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
watney:x:1000:1000:Mark Watney:/home/watney:/bin/bash
lewis:x:1001:1001:Melissa Lewis:/home/lewis:/bin/bash
martinez:x:1002:1002:Rick Martinez:/home/martinez:/bin/bash"""

# system account usernames (UID [third field] is less than 1000)
# type: list[str]
result = ...