[mypyc] feat: cache ids for fallback pythonic method lookups [1/1] #19870
BobTheBuidler wants to merge 9 commits into python:master
    return NULL;
    if (join_id_unicode == NULL) {
        _Py_IDENTIFIER(join);
        join_id_unicode = _PyUnicode_FromId(&PyId_join); /* borrowed */
This is not thread-safe on free-threaded builds. I'm not sure what the best workaround is, though. A relaxed memory order read could be sufficient. If we intern the string, this seems like a thread-safe approach (as long as we don't use multiple subinterpreters, which are currently not supported).
Also the _Py_IDENTIFIER API is no longer part of the public API: python/cpython#108593. It would be good to have a replacement that uses public API as much as feasible if we are going to change these.
Below I give some ideas.
Here's how the atomic read/store might work (didn't check, based on LLM output):

    #include <stdatomic.h>
    static _Atomic(PyObject *) join_id_unicode = ATOMIC_VAR_INIT(NULL);
    ...
    if (atomic_load_explicit(&join_id_unicode, memory_order_relaxed) == NULL) {
        ... atomic_store_explicit(...) ...
    }

Use PyUnicode_InternFromString to create a unicode object (once), instead of _PyUnicode_FromId.
Only update one use case first, and once we've agreed on a good approach, create a follow-up PR that migrates remaining use cases. This minimizes extra effort required to iteratively update based on review feedback.
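The atomic read/store idea above could be sketched roughly as follows. This is a minimal, standalone C11 model rather than mypyc code: `name_obj`, `intern_join`, `intern_calls`, `join_cache` and `get_join_name` are hypothetical names, and `intern_join` only stands in for a call such as `PyUnicode_InternFromString("join")`. The sketch also assumes acquire/release ordering instead of the relaxed ordering mentioned in the comment, as a more conservative choice that makes the object's initialization visible to a thread that observes the pointer.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a Python string object; in mypyc this would be
   a PyObject * returned by PyUnicode_InternFromString("join"). */
typedef struct {
    const char *text;
} name_obj;

static name_obj interned_join = { "join" };

/* Counts how often the slow path runs (for demonstration only). */
static atomic_int intern_calls = 0;

/* Hypothetical stand-in for interning: every caller gets the same object,
   so a duplicated first call yields the same pointer. */
static name_obj *intern_join(void) {
    atomic_fetch_add_explicit(&intern_calls, 1, memory_order_relaxed);
    return &interned_join;
}

/* The cache itself: NULL until the first successful lookup. */
static _Atomic(name_obj *) join_cache = NULL;

static name_obj *get_join_name(void) {
    /* Fast path: one atomic load once the cache is populated. */
    name_obj *cached = atomic_load_explicit(&join_cache, memory_order_acquire);
    if (cached == NULL) {
        cached = intern_join();
        if (cached == NULL) {
            return NULL;  /* propagate failure; the cache stays empty */
        }
        /* Publish the pointer; a racing thread would store the same value. */
        atomic_store_explicit(&join_cache, cached, memory_order_release);
    }
    return cached;
}
```

In this model, two threads that both reach the slow path store the same pointer, so the second store changes nothing. For an object that is not interned, each thread would create its own copy, and a compare-and-exchange would probably be needed so that the losing thread can release its copy.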
Uses of `_Py_IDENTIFIER` will trigger a deprecation warning on Python 3.15. Replace these with statically allocated `PyObject` unicode objects that are protected from garbage collection by interning them via `PyUnicode_InternFromString`. The implementation is adapted from a similar one in numpy:
https://github.com/numpy/numpy/blob/v2.4.0/numpy/_core/src/multiarray/npy_static_data.h
https://github.com/numpy/numpy/blob/v2.4.0/numpy/_core/src/multiarray/npy_static_data.c
https://docs.python.org/3/c-api/unicode.html#c.PyUnicode_InternFromString
Replaces #19870
#20460 replaces this.
This PR micro-optimizes the usage of `_Py_IDENTIFIER` and `_PyUnicode_FromId` in various C files.

Changes:
- For each looked-up method name (`setdefault`, `update`, `keys`, `values`, `items`, `clear`, `copy`), a static cache variable (e.g., `setdefault_id_unicode`) is used to store the result of `_PyUnicode_FromId`.
- The `_Py_IDENTIFIER(name)` macro is now declared only inside the conditional block that runs the first time a function is called (i.e., when the corresponding cache variable is `NULL`).
- This avoids repeated calls to `_PyUnicode_FromId` and repeated static identifier declarations.

Example pattern after refactor: