Skip to content

collections.defaultdict's __repr__ can lead to infinite recursion #145492

@KowalskiThomas

Description

@KowalskiThomas

I have spotted what I believe is a bug in the implementation of defaultdict's __repr__ method (code).

Long story short, the current implementation does not properly reset the recursion guard after delegating to the factory's __repr__ (when the factory's __repr__ calls back into defaultdict's __repr__). As far as I can tell, this comes from Py_ReprLeave not being called under the right conditions.

I've made this reproducer, I think it's as simple as it can get:

from collections import defaultdict

class ProblematicFactory:
    def __call__(self) -> dict[str, str]:
        return {}

    def __repr__(self):
        print(f"dd: {repr(dd)}")

        return f"ProblematicFactory for {repr(dd)}"

dd = defaultdict[str, dict[str, str]](ProblematicFactory())
print(repr(dd.default_factory))

Running this gives...

% python3 repro.py
dd: defaultdict(..., {})
... LOTS OF THIS
dd: defaultdict(..., {})
Traceback (most recent call last):
  File "/Users/thomas.kowalski/Documents/cpython/repro.py", line 13, in <module>
    print(repr(dd.default_factory))
          ~~~~^^^^^^^^^^^^^^^^^^^^
  File "/Users/thomas.kowalski/Documents/cpython/repro.py", line 8, in __repr__
    print(f"dd: {repr(dd)}")
                 ~~~~^^^^
  File "/Users/thomas.kowalski/Documents/cpython/repro.py", line 10, in __repr__
    return f"ProblematicFactory for {repr(dd)}"
                                     ~~~~^^^^
  File "/Users/thomas.kowalski/Documents/cpython/repro.py", line 10, in __repr__
    return f"ProblematicFactory for {repr(dd)}"
                                     ~~~~^^^^
  File "/Users/thomas.kowalski/Documents/cpython/repro.py", line 10, in __repr__
    return f"ProblematicFactory for {repr(dd)}"
                                     ~~~~^^^^
  [Previous line repeated 995 more times]

Going back to the current code:

int status = Py_ReprEnter(dd->default_factory);
if (status != 0) { // if we already are in a __repr__ call for default factory
    if (status < 0) {
        Py_DECREF(baserepr);
        return NULL; // we early-return without "leaving" which is correct
    }
    defrepr = PyUnicode_FromString("...");
}
else // we are NOT in a __repr__ call for default factory
    defrepr = PyObject_Repr(dd->default_factory); // no early return, we fall through

// we may or may not be in a __repr__ call for default factory
// resetting here is not OK, it's not our responsibility
Py_ReprLeave(dd->default_factory);

As far as I can tell, the fix would be to move the final Py_ReprLeave to the else body so that if we already are in the factory's __repr__ we don't reset it ourselves, we let whoever entered it reset it.

Note I have a branch ready with that fix and can open it here if/once the bug is confirmed.

Metadata

Metadata

Assignees

No one assigned

    Labels

    extension-modulesC modules in the Modules dirtype-bugAn unexpected behavior, bug, or error

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions