
The Python Language Summit 2023: Pattern Matching, __match__, and View Patterns

May 29, 2023 – Alex Waygood
One of the most exciting new features in Python 3.10 was the introduction of pattern matching (introduced in PEPs 634, 635 and 636). Pattern matching has a wide variety of uses, but really shines in situations where you need to perform complex destructuring of tree-like data structures.

That’s a lot of words which may or may not mean very much to you – but consider, for example, using the ast module to parse Python source code. If you’re unfamiliar with the ast module: the module provides tools that enable you to compile Python source code into an “abstract syntax tree” (AST) representing the code’s structure. The Python interpreter itself converts Python source code into an AST in order to understand how to run that code – but parsing Python source code using ASTs is also a common task for linters, such as plugins for flake8 or pylint. In the following example, ast.parse() is used to parse the source code x = 42 into an ast.Module node, and ast.dump() is then used to reveal the tree-like structure of that node in a human-readable form:


>>> import ast
>>> source = "x = 42"
>>> node = ast.parse(source)
>>> node
<ast.Module object at 0x000002A70F928D80>
>>> print(ast.dump(node, indent=2))
Module(
  body=[
    Assign(
      targets=[
        Name(id='x', ctx=Store())],
      value=Constant(value=42))],
  type_ignores=[])

How does working with ASTs relate to pattern-matching? Well, a function to determine whether (to a reasonable approximation) an arbitrary AST node represents the symbol collections.deque might have looked something like this, before pattern matching…


import ast

# This obviously won't work if the symbol is imported with an alias
# in the source code we're inspecting
# (e.g. "from collections import deque as d").
# But let's not worry about that here :-)

def node_represents_collections_dot_deque(node: ast.AST) -> bool:
    """Determine if *node* represents 'deque' or 'collections.deque'"""
    return (
        isinstance(node, ast.Name) and node.id == "deque"
    ) or (
        isinstance(node, ast.Attribute)
        and isinstance(node.value, ast.Name)
        and node.value.id == "collections"
        and node.attr == "deque"
    )

But in Python 3.10, pattern matching allows an elegant destructuring syntax:


import ast

def node_represents_collections_dot_deque(node: ast.AST) -> bool:
    """Determine if *node* represents 'deque' or 'collections.deque'"""
    match node:
        case ast.Name("deque"):
            return True
        case ast.Attribute(ast.Name("collections"), "deque"):
            return True
        case _:
            return False

I know which one I prefer.
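
Either implementation behaves the same way. As a quick illustrative check, we can parse some expressions with mode="eval" (so that the .body attribute of the result is the expression node itself) and feed them to the function:


>>> import ast
>>> node_represents_collections_dot_deque(ast.parse("collections.deque", mode="eval").body)
True
>>> node_represents_collections_dot_deque(ast.parse("deque", mode="eval").body)
True
>>> node_represents_collections_dot_deque(ast.parse("dict", mode="eval").body)
False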

For some, though, this still isn’t enough – and Michael “Sully” Sullivan is one of them. At the Python Language Summit 2023, Sullivan shared ideas for where pattern matching could go next.


Playing with matches (without getting burned)


Sullivan’s contention is that, while pattern matching provides elegant syntactic sugar in simple cases such as the one above, our ability to chain destructurings using pattern matching is currently fairly limited. For example, say we want to write a function that inspects a Python AST: it takes an ast.FunctionDef node and identifies whether the node represents a synchronous function with exactly two parameters, both of them annotated as accepting integers. The function would behave so that the following holds true:


>>> import ast
>>> source = "def add_2(number1: int, number2: int): pass"
>>> node = ast.parse(source).body[0]
>>> type(node)
<class 'ast.FunctionDef'>
>>> is_function_taking_two_ints(node)
True

With pre-pattern-matching syntax, we might have written such a function like this:


def is_int(node: ast.AST | None) -> bool:
    """Determine if *node* represents 'int' or 'builtins.int'"""
    return (
        isinstance(node, ast.Name) and node.id == "int"
    ) or (
        isinstance(node, ast.Attribute)
        and isinstance(node.value, ast.Name)
        and node.value.id == "builtins"
        and node.attr == "int"
    )

def is_function_taking_two_ints(node: ast.FunctionDef) -> bool:
    """Determine if *node* represents a function that accepts two ints"""
    args = node.args.posonlyargs + node.args.args
    return len(args) == 2 and all(is_int(arg.annotation) for arg in args)

If we wanted to rewrite this using pattern matching, we could possibly do something like this:


def is_int(node: ast.AST | None) -> bool:
    """Determine if *node* represents 'int' or 'builtins.int'"""
    match node:
        case ast.Name("int"):
            return True
        case ast.Attribute(ast.Name("builtins"), "int"):
            return True
        case _:
            return False

def is_function_taking_two_ints(node: ast.FunctionDef) -> bool:
    """Determine if *node* represents a function that accepts two ints"""
    match node.args.posonlyargs + node.args.args:
        case [ast.arg(), ast.arg()] as arglist:
            return all(is_int(arg.annotation) for arg in arglist)
        case _:
            return False

That leaves a lot to be desired, however! The is_int() helper function can be rewritten in a much cleaner way, but integrating it into the is_function_taking_two_ints() function is… somewhat icky! The code feels harder to understand than before, whereas the goal of pattern matching is to improve readability.
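
You can get part of the way there today with a guard clause: keep the cleaner match-based is_int() helper, and apply it in an if guard on the case. A minimal sketch, reusing is_int() from above:


def is_function_taking_two_ints(node: ast.FunctionDef) -> bool:
    """Determine if *node* represents a function that accepts two ints"""
    match node.args.posonlyargs + node.args.args:
        # Capture both annotations, then test them in the guard
        case [ast.arg(annotation=a1), ast.arg(annotation=a2)] if is_int(a1) and is_int(a2):
            return True
        case _:
            return False

This works, but the integer check now lives outside the pattern itself, which is precisely the disconnect Sullivan would like to eliminate.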

Something like this, (ab)using metaclasses, gets us a lot closer to what it feels like pattern matching should be. By using one of Python’s hooks for customising isinstance() logic, it’s possible to rewrite our is_int() helper function as a class, meaning we can seamlessly integrate it into our is_function_taking_two_ints() function in a very expressive way:


import abc
import ast

class PatternMeta(abc.ABCMeta):
    def __instancecheck__(cls, inst: object) -> bool:
        return cls.match(inst)

class Pattern(metaclass=PatternMeta):
    """Abstract base class for types representing 'abstract patterns'"""
    @staticmethod
    @abc.abstractmethod
    def match(node) -> bool:
        """Subclasses must override this method"""
        raise NotImplementedError

class int_node(Pattern):
    """Class representing AST patterns signifying `int` or `builtins.int`"""
    @staticmethod
    def match(node) -> bool:
        match node:
            case ast.Name("int"):
                return True
            case ast.Attribute(ast.Name("builtins"), "int"):
                return True
            case _:
                return False

def is_function_taking_two_ints(node: ast.FunctionDef) -> bool:
    """Determine if *node* represents a function that accepts two ints"""
    match node.args.posonlyargs + node.args.args:
        case [
            ast.arg(annotation=int_node()), 
            ast.arg(annotation=int_node()),
        ]:
            return True
        case _:
            return False

This is still hardly ideal, however – that’s a lot of boilerplate we’ve had to introduce to our helper function for identifying int annotations! And who wants to muck about with metaclasses?
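
The per-pattern boilerplate can at least be amortised. Reusing the PatternMeta/Pattern machinery above, a small factory (hypothetically named make_pattern() here) can turn any boolean predicate, such as the function version of is_int() we wrote earlier, into a class usable in class patterns:


def make_pattern(predicate) -> type:
    """Build a Pattern subclass whose match() defers to *predicate*."""
    class P(Pattern):
        # staticmethod() so that calling P.match(node) doesn't
        # implicitly pass the class or an instance
        match = staticmethod(predicate)
    return P

# Roughly equivalent to the hand-written int_node class above:
int_node = make_pattern(is_int)

Still, this only hides the metaclass trickery; it doesn’t remove it.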


A slide from Sullivan’s talk




A __match__ made in heaven?


Sullivan proposes that we make it easier to write helper functions for pattern matching, such as the example above, without having to resort to custom metaclasses. Two competing approaches were brought for discussion.

The first idea – a __match__ special method – is perhaps the easier of the two to immediately grasp, and appeared in early drafts of the pattern matching PEPs. (It was eventually removed from the PEPs in order to reduce the scope of the proposed changes to Python.) The proposal is that any class could define a __match__ method that could be used to customise how match statements apply to the class. Our is_function_taking_two_ints() case could be rewritten like so:


class int_node:
    """Class representing AST patterns signifying `int` or `builtins.int`"""
    # The __match__ method is understood by Python to be a static method,
    # even without the @staticmethod decorator,
    # similar to __new__ and __init_subclass__
    def __match__(node) -> ast.Name | ast.Attribute | None:
        match node:
            case ast.Name("int"):
                # Successful matches can return custom objects
                # that can be bound to new variables by the caller
                return node
            case ast.Attribute(ast.Name("builtins"), "int"):
                return node
            case _:
                # Return `None` to indicate that there was no match
                return None

def is_function_taking_two_ints(node: ast.FunctionDef) -> bool:
    """Determine if *node* represents a function that accepts two ints"""
    match node.args.posonlyargs + node.args.args:
        case [
            ast.arg(annotation=int_node()), 
            ast.arg(annotation=int_node()),
        ]:
            return True
        case _:
            return False
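
To make the proposed semantics concrete: a class pattern such as int_node() would consult int_node.__match__() rather than isinstance(). Because __match__ above is an ordinary function accessed on the class, the protocol can even be exercised by hand today. Very roughly, a clause like "case int_node():" would reduce to something along these lines (reusing the int_node class defined above):


import ast

subject = ast.parse("int", mode="eval").body  # the value being matched

# Hypothetical expansion of `case int_node():` under the __match__ proposal
# (not how Python behaves today; shown purely for intuition):
proxy = int_node.__match__(subject)
if proxy is not None:
    # Match succeeded: sub-patterns and captures would be checked
    # against (and bound from) the returned object.
    print("matched:", ast.dump(proxy))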

The second idea is more radical: the introduction of some kind of new syntax (perhaps reusing Python’s -> operator) that would allow Python coders to “apply” functions during pattern matching. With this proposal, we could rewrite is_function_taking_two_ints() like so:


def is_int(node: ast.AST | None) -> bool:
    """Determine if *node* represents 'int' or 'builtins.int'"""
    match node:
        case ast.Name("int"):
            return True
        case ast.Attribute(ast.Name("builtins"), "int"):
            return True
        case _:
            return False

def is_function_taking_two_ints(node: ast.FunctionDef) -> bool:
    """Determine if *node* represents a function that accepts two ints"""
    match node.args.posonlyargs + node.args.args:
        case [
            ast.arg(annotation=is_int -> True),
            ast.arg(annotation=is_int -> True),
        ]:
            return True
        case _:
            return False




Match-maker, match-maker, make me a __match__



A slide from Sullivan’s talk

The reception in the room to Sullivan’s ideas was positive; the consensus seemed to be that there was clearly room for improvement in this area. Brandt Bucher, author of the original pattern matching implementation in Python 3.10, concurred that this kind of enhancement was needed. Łukasz Langa, meanwhile, said he’d received many queries from users of other programming languages such as C#, asking how to tackle this kind of problem.

The proposal for a __match__ special method follows a pattern common in Python’s data model, where double-underscore “dunder” methods are overridden to provide a class with special behaviour. As such, it will likely be less jarring, at first glance, to those new to the idea. Attendees of Sullivan’s talk seemed, broadly, to slightly prefer the __match__ proposal, and Sullivan himself said he thought it “looked prettier”.

Jelle Zijlstra argued that the __match__ dunder would provide an elegant symmetry between the construction and destruction of objects. Brandt Bucher, meanwhile, said he thought the usability improvements weren’t significant enough to merit new syntax.

Nonetheless, the alternative proposal for new syntax also has much to recommend it. Sullivan argued that having dedicated syntax to express the idea of “applying” a function during pattern matching was more explicit. Mark Shannon agreed, noting the similarity between this idea and features in the Haskell programming language. “This is functional programming,” Shannon argued. “It feels weird to apply OOP models to this.”




Addendum: pattern-matching resources and recipes


In the meantime, while we wait for a PEP, there are plenty of innovative uses of pattern matching springing up in the ecosystem. For further reading/watching/listening, I recommend:


Python 3.12.0 beta 1 released

May 22, 2023 – Thomas Wouters

I’m pleased to announce the release of Python 3.12 beta 1 (and feature freeze for Python 3.12). https://www.python.org/downloads/release/python-3120b1/ This is a beta preview of Python 3.12. Python 3.12 is still in development. This release, 3.12.0b1, is t…


PSF Board Election Dates for 2023

May 11, 2023 – Deb Nicholson

Board elections are a chance for the community to choose representatives to help the PSF create a vision for and build the future of the Python community. This year there are 4 seats open on the PSF board. You can see who is on the board currently here…


How to create execution environments using ansible-builder

May 8, 2023 – rh-ee-tpaul


The execution environment builder (aka Ansible Builder) is a part of Red Hat Ansible Automation Platform. It is a command-line interface (CLI) tool for building and creating custom execution environments. The Ansible Builder project enables users to automate and accelerate the process of creating execution environments. This article will show you how to install and use the execution environment builder CLI tool.

Installing the execution environment builder

The execution environment builder makes it easier for Ansible Automation Platform content creators and administrators to build custom execution environments. It can pull dependency information from various Ansible Content Collections, as well as dependencies supplied directly by the user.

Step 1: Install the execution environment builder tool

Install the execution environment builder tool from the Python Package Index (PyPI) by using the following command:

pip install ansible-builder

Step 2: Access the ansible-builder subcommands

ansible-builder has two main subcommands, build and create; run each with --help to see its usage.

The build subcommand will build the execution environment using the definition file.

ansible-builder build --help

It populates the build context and then uses Podman or Docker to create the execution environment image. The help output appears as follows:

usage: ansible-builder build [-h] [-t TAG] [--container-runtime {podman,docker}] [--build-arg BUILD_ARGS] [-f FILENAME] [-c BUILD_CONTEXT]
                             [--output-filename {Containerfile,Dockerfile}] [-v {0,1,2,3}]

Creates a build context (including a Containerfile) from an execution environment spec. The build context will be populated from the execution environment spec. After that, the specified container runtime podman/docker will be invoked to build an image from that definition. After building the image, it can be used locally or published using the
supplied tag.

optional arguments:
  -h, --help            show this help message and exit
  -t TAG, --tag TAG     The name for the container image being built (default: ansible-execution-env:latest)
  --container-runtime {podman,docker}
                        Specifies which container runtime to use (default: podman)
  --build-arg BUILD_ARGS
                        Build-time variables to pass to any podman or docker calls. Internally ansible-builder makes use of ANSIBLE_GALAXY_CLI_COLLECTION_OPTS, EE_BASE_IMAGE,
                        EE_BUILDER_IMAGE.
  -f FILENAME, --file FILENAME
                        The definition of the execution environment (default: execution-environment.yml)
  -c BUILD_CONTEXT, --context BUILD_CONTEXT
                        The directory to use for the build context (default: context)
  --output-filename {Containerfile,Dockerfile}
                        Name of file to write image definition to (default depends on --container-runtime, Containerfile for podman and Dockerfile for docker)
  -v {0,1,2,3}, --verbosity {0,1,2,3}
                        Increase the output verbosity, for up to three levels of verbosity (invoked via "--verbosity" or "-v" followed by an integer ranging in value from 0 to
                        3) (default: 2)

The create subcommand works similarly to the build subcommand.

ansible-builder create --help

However, it will not build the execution environment image itself; it stops after generating the build context (including the Containerfile), which podman or docker can later use to build the image. Its help output otherwise mirrors that of the build subcommand shown above.

Step 3: Populate the ansible-builder spec

To hold the execution environment definition, first create a project directory by running the following command:

mkdir project_directory && cd project_directory

Populate the execution-environment.yml file:

cat > execution-environment.yml <<EOT
---
version: 1
dependencies:
  galaxy: requirements.yml
EOT

Create a requirements.yml file and populate the contents with the following:

cat > requirements.yml <<EOT
---
collections:
  - name: servicenow.itsm
EOT

Through the spec and requirements file, we ensure that the execution environment builder will download the servicenow.itsm collection while building the execution environment. The default download location is galaxy.ansible.com. You can also point to an automation hub or your own private hub instance in the spec file.

Step 4: Build the execution environment

Build the execution environment using the previously created files. Run the following command to create a new custom execution environment called custom-ee:

ansible-builder build -v3 -t custom-ee

The -v3 flag adds verbosity to the CLI run, and -t custom-ee will tag your image with the name you provided.

The output appears as follows:

Ansible Builder is building your execution environment image, "custom-ee".
File context/_build/requirements.yml will be created.
Rewriting Containerfile to capture collection requirements
Running command:
  podman build -f context/Containerfile -t custom-ee context
[1/3] STEP 1/7: FROM registry.redhat.io/ansible-automation-platform-21/ee-minimal-rhel8:latest AS galaxy
[1/3] STEP 2/7: ARG ANSIBLE_GALAXY_CLI_COLLECTION_OPTS=
--> 88d9ea223d0
[1/3] STEP 3/7: USER root
--> 549f29055c2
[1/3] STEP 4/7: ADD _build /build
--> 0d3e9515b12
[1/3] STEP 5/7: WORKDIR /build
--> 3b290acf78c
[1/3] STEP 6/7: RUN ansible-galaxy role install -r requirements.yml --roles-path /usr/share/ansible/roles
Skipping install, no requirements found
--> 8af36370e78
[1/3] STEP 7/7: RUN ansible-galaxy collection install $ANSIBLE_GALAXY_CLI_COLLECTION_OPTS -r requirements.yml --collections-path /usr/share/ansible/collections
Starting galaxy collection install process
Process install dependency map
…

Run the following commands to check the image list:

podman images

The output appears as follows:

REPOSITORY            TAG      IMAGE ID       CREATED          SIZE
localhost/custom-ee   latest   bfe6c40bad52   21 seconds ago   626 MB

Step 5: Build a complex execution environment

To build a complex execution environment, go back into the project directory with the following command:

cd project_directory

Edit the execution-environment.yml file so that it contains the following content:

cat > execution-environment.yml <<EOT
---
version: 1
dependencies:
  galaxy: requirements.yml
  python: requirements.txt
  system: bindep.txt
additional_build_steps:
  prepend: |
    RUN whoami
    RUN cat /etc/os-release
  append:
    - RUN echo This is a post-install command!
    - RUN ls -la /etc
EOT

We can see the following:

  • Python requirements were added through the requirements.txt file, which will hold the pip dependencies.
  • We added a bindep.txt, which will hold the rpm installs.
  • Additional build steps were added that will run before (prepend) and after (append) the main build steps.

Now update the requirements.yml file so that it contains the following:

cat > requirements.yml <<EOT
---
collections:
  - name: servicenow.itsm
  - name: ansible.utils
EOT

We added a new collection called ansible.utils alongside the servicenow.itsm collection.

Create a new file called requirements.txt and then append the following:

cat > requirements.txt <<EOT
gcp-cli
ncclient
netaddr
paramiko
EOT

This contains the Python requirements that need to be installed via pip.

Create a new file called bindep.txt and then append the following:

cat > bindep.txt <<EOT
findutils [compile platform:centos-8 platform:rhel-8]
gcc [compile platform:centos-8 platform:rhel-8]
make [compile platform:centos-8 platform:rhel-8]
python38-devel [compile platform:centos-8 platform:rhel-8]
python38-cffi [platform:centos-8 platform:rhel-8]
python38-cryptography [platform:centos-8 platform:rhel-8]
python38-pycparser [platform:centos-8 platform:rhel-8]
EOT

This file contains the rpm requirements that need to be installed using dnf.

Run the following build:

ansible-builder build -v3 -t custom-ee

The output is as follows:

Ansible Builder is building your execution environment image, "custom-ee".
File context/_build/requirements.yml will be created.
File context/_build/requirements.txt will be created.
File context/_build/bindep.txt will be created.
Rewriting Containerfile to capture collection requirements
Running command:
  podman build -f context/Containerfile -t custom-ee context
[1/3] STEP 1/7: FROM registry.redhat.io/ansible-automation-platform-21/ee-minimal-rhel8:latest AS galaxy
[1/3] STEP 2/7: ARG ANSIBLE_GALAXY_CLI_COLLECTION_OPTS=
--> Using cache 88d9ea223d01bec0d53eb7efcf0e76b5f7da0285a411f2ce0116fe9641cbc3a0
--> 88d9ea223d0
[1/3] STEP 3/7: USER root
--> Using cache 549f29055c2f1ba0ef3f7c5dfdc67a40302ff0330af927adb94fbcd7b0b1e7b4
--> 549f29055c2
[1/3] STEP 4/7: ADD _build /build
--> 6b9ee91e773
[1/3] STEP 5/7: WORKDIR /build
--> 5518e019f2d
[1/3] STEP 6/7: RUN ansible-galaxy role install -r requirements.yml --roles-path /usr/share/ansible/roles
Skipping install, no requirements found
--> 60c1605d66c
[1/3] STEP 7/7: RUN ansible-galaxy collection install $ANSIBLE_GALAXY_CLI_COLLECTION_OPTS -r requirements.yml --collections-path /usr/share/ansible/collections
Starting galaxy collection install process
Process install dependency map

You can inspect the context directory, or the Containerfile inside it, to see all the steps that went into building the execution environment. You can also transfer the context directory to a different server and replicate the image creation via docker or podman commands.

Pushing the execution environment to a private automation hub

Log in to the private automation hub by using the podman command:

podman login 

Then tag the image before pushing it to the hub as follows:

podman tag localhost/custom-ee /developers-bu-aap-builder

Finally, push it to the private automation hub as follows:

podman push /developers-bu-aap-builder

We can see the image pushed to the private automation hub in Figure 1:

Figure 1: The private automation hub page showing multiple pushed execution environment images.

Continue your automation journey with Ansible Automation Platform

Get started with Ansible Automation Platform by exploring interactive labs. Check out Red Hat's hands-on labs for all skill levels to learn more. The wide range of labs includes "Useful Linux commands," "Install software using package managers," and "Deploying containers using container tools (Podman)." Try these labs to see your favorite products in action. Ansible Automation Platform is also available as a managed offering on Microsoft Azure and as a self-managed offering on AWS.

Tathagata Paul


Thank You for Many Years of Service, Van!

April 13, 2023 – Deb Nicholson

We are wishing farewell to Van Lindberg after 16 years of service to the PSF, who has decided to step down from his Board Director and General Counsel roles. He helped us grow from a small volunteer organization with no staff, to an organization with 9…