Skip to content

gh-145376: Fix various reference leaks objects and modules#145385

Open
eendebakpt wants to merge 7 commits intopython:mainfrom
eendebakpt:refleaks
Open

gh-145376: Fix various reference leaks objects and modules#145385
eendebakpt wants to merge 7 commits intopython:mainfrom
eendebakpt:refleaks

Conversation

@eendebakpt
Copy link
Contributor

@eendebakpt eendebakpt commented Mar 1, 2026

@eendebakpt eendebakpt marked this pull request as ready for review March 1, 2026 10:28
@StanFromIreland
Copy link
Member

Is this going to be a competition between Claude and Codex now 😆?

@eendebakpt
Copy link
Contributor Author

eendebakpt commented Mar 1, 2026

Is this going to be a competition between Claude and Codex now 😆?

As long as the competition is on quality and not quantity we should be fine :-)

Update: I checked import.c which was changed in the PR by @JelleZijlstra. Claude found 4 (potential) issues, including the one in the other PR.

@eendebakpt eendebakpt changed the title gh-145376: Fix various reference leaks gh-145376: Fix various reference leaks objects and modules Mar 1, 2026
@encukou
Copy link
Member

encukou commented Mar 3, 2026

It defintely shouldn't be on quantity of PRs. Could you combine these rather than send several PRs at once?

@eendebakpt
Copy link
Contributor Author

It defintely shouldn't be on quantity of PRs. Could you combine these rather than send several PRs at once?

I know some other core devs strongly prefer to keep PRs small and isolated. For that reason I created more PRs targeting only a single module at a time. To give an idea: analyzing the Modules folder resulted in 60 potential bugs in 30 files. I estimate that about 20 of them are false positives, or tricky cases. With more prompting to the LLM additional potential issues might be found.

Let me know how you would like to group the remaining cases.

@encukou
Copy link
Member

encukou commented Mar 4, 2026

IMO, you should limit the number of similar PRs you have “in flight”, so if a common issue is found in one of them1 you don't need to update all the others. And to focus the review in one place.
Queue up PRs on your side, or combine similar changes to multiple files, or even send drafts :)


Footnotes

  1. for example: What needs a news entry? Or even: how much should you combine the PRs?

Comment on lines 3531 to 3553
static PyObject *
count_nextlong(countobject *lz)
{
PyObject *long_cnt;
PyObject *stepped_up;

long_cnt = lz->long_cnt;
if (long_cnt == NULL) {
/* Switch to slow_mode */
long_cnt = PyLong_FromSsize_t(PY_SSIZE_T_MAX);
if (long_cnt == NULL)
if (long_cnt == NULL) {
return NULL;
}
lz->long_cnt = long_cnt;
}
assert(lz->cnt == PY_SSIZE_T_MAX && long_cnt != NULL);

stepped_up = PyNumber_Add(long_cnt, lz->long_step);
if (stepped_up == NULL)
return NULL;
lz->long_cnt = stepped_up;
return long_cnt;
}
Copy link
Member

@encukou encukou Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the way to make it less confusing?

static PyObject *
count_nextlong(countobject *lz)
{
    if (lz->long_cnt == NULL) {
        /* Switch to slow_mode */
        lz->long_cnt = PyLong_FromSsize_t(PY_SSIZE_T_MAX);
        if (lz->long_cnt == NULL) {
            return NULL;
        }
    }
    assert(lz->cnt == PY_SSIZE_T_MAX && lz->long_cnt != NULL);

    // We hold one reference to "result" (a.k.a. the old value of
    // lz->long_cnt); we'll either return it or keep it in lz->long_cnt.
    PyObject *result = lz->long_cnt;

    PyObject *stepped_up = PyNumber_Add(result, lz->long_step);
    if (stepped_up == NULL) {
        return NULL;
    }
    lz->long_cnt = stepped_up;

    return result;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this looks less confusing.

@@ -0,0 +1 @@
Fix reference leaks in various unusual error scenarios.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that a NEWS entry is needed since it's unlikely to hit the issue, users are not directly impacted.

@vstinner
Copy link
Member

vstinner commented Mar 4, 2026

The current count_nextlong() code doesn't leak at if (long_cnt == NULL) return NULL; but it does leak long_cnt at if (stepped_up == NULL) return NULL;.

I like @encukou's code. It's more readable and doesn't leak on error.

// increment en_longindex with lock held, return the next index to be used
// or NULL on error
static inline PyObject *
increment_longindex_lock_held(enumobject *en)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you refactor this function as you did for count_nextlong() to make it less confusing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants