Skip to content

fs: add c++ fast path for writeFileSync string utf8#49884

Merged
nodejs-github-bot merged 1 commit intonodejs:mainfrom
CanadaHonk:fast-writefilesync-utf8
Nov 27, 2023
Merged

fs: add c++ fast path for writeFileSync string utf8#49884
nodejs-github-bot merged 1 commit intonodejs:mainfrom
CanadaHonk:fast-writefilesync-utf8

Conversation

@CanadaHonk
Copy link
Member

@CanadaHonk CanadaHonk commented Sep 26, 2023

Summary

Added fast path in almost entirely C++ for writeFileSync with UTF8 encoding and string data. Also improves appendFileSync as it just uses writeFileSync under the hood. Only string data as Buffer seems questionable in benchmarks for this so I'll just leave it for strings for now.

TL;DR: This makes writeFileSync(path, data: string) UTF8 up to ~2.5x faster depending on data size, especially when using file descriptors

Bench results

Benchmark in this PR

Note: running locally on Linux/i9/SSD

                                                                                                                 confidence improvement accuracy (*)   (**)  (***)
fs/bench-writeFileSync.js n=1000 func='writeFile' useBuffer='false' length=1024 useFd='false' encoding='utf8'                   -0.37 %       ±4.40% ±5.90% ±7.77%
fs/bench-writeFileSync.js n=1000 func='writeFile' useBuffer='false' length=1024 useFd='true' encoding='utf8'            ***     49.15 %       ±5.11% ±6.82% ±8.91%
fs/bench-writeFileSync.js n=1000 func='writeFile' useBuffer='false' length=102400 useFd='false' encoding='utf8'         ***      6.69 %       ±1.67% ±2.22% ±2.89%
fs/bench-writeFileSync.js n=1000 func='writeFile' useBuffer='false' length=102400 useFd='true' encoding='utf8'          ***     74.52 %       ±4.35% ±5.85% ±7.74%
fs/bench-writeFileSync.js n=1000 func='writeFile' useBuffer='false' length=1048576 useFd='false' encoding='utf8'        ***     16.05 %       ±1.60% ±2.14% ±2.81%
fs/bench-writeFileSync.js n=1000 func='writeFile' useBuffer='false' length=1048576 useFd='true' encoding='utf8'         ***     85.82 %       ±3.58% ±4.77% ±6.23%

Benchmark CI (old): https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/1421/

Alternative benchmark

Alternative benchmark using Bun's bench for fs.copyFileSync (string data, using paths)

Current (main)

benchmark           time (avg)             (min … max)       p75       p99      p995
------------------------------------------------------ -----------------------------
12 ascii          2.19 µs/iter     (2.07 µs … 2.67 µs)   2.21 µs   2.67 µs   2.67 µs
12 utf8           2.23 µs/iter     (2.08 µs … 2.32 µs)   2.26 µs   2.32 µs   2.32 µs
12288 ascii       6.14 µs/iter     (5.97 µs … 6.62 µs)   6.14 µs   6.62 µs   6.62 µs
18432 utf8       40.02 µs/iter  (36.13 µs … 204.16 µs)  40.05 µs   63.6 µs  69.01 µs

This PR

Long ascii: 6.14µs -> 3.46µs (~1.8x speedup)
Long utf8: 40.02µs -> 18.25µs (~2.2x speedup)

benchmark           time (avg)             (min … max)       p75       p99      p995
------------------------------------------------------ -----------------------------
12 ascii          1.97 µs/iter      (1.9 µs … 2.03 µs)   1.98 µs   2.03 µs   2.03 µs
12 utf8           1.95 µs/iter        (1.91 µs … 2 µs)   1.97 µs      2 µs      2 µs
12288 ascii       3.46 µs/iter      (3.39 µs … 3.6 µs)   3.47 µs    3.6 µs    3.6 µs
18432 utf8       18.25 µs/iter  (16.06 µs … 147.95 µs)  18.48 µs  21.77 µs  24.03 µs

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. fs Issues and PRs related to the fs subsystem / file system. needs-ci PRs that need a full CI run. labels Sep 26, 2023
@CanadaHonk CanadaHonk marked this pull request as draft September 26, 2023 18:43
@anonrig anonrig added performance Issues and PRs related to the performance of Node.js. notable-change PRs with changes that should be highlighted in changelogs. labels Sep 26, 2023
@github-actions
Copy link
Contributor

The notable-change PRs with changes that should be highlighted in changelogs. label has been added by @anonrig.

Please suggest a text for the release notes if you'd like to include a more detailed summary, then proceed to update the PR description with the text or a link to the notable change suggested text comment.

@CanadaHonk CanadaHonk marked this pull request as ready for review September 26, 2023 20:36
@CanadaHonk CanadaHonk force-pushed the fast-writefilesync-utf8 branch from 98c5cd5 to 7ab6586 Compare September 26, 2023 20:36
@CanadaHonk
Copy link
Member Author

Rebased on latest main and addressed various review comments. Should be good for review fully now.

@CanadaHonk CanadaHonk requested review from anonrig and ronag September 26, 2023 21:21
@anonrig anonrig added the request-ci Add this label to start a Jenkins CI on a PR. label Sep 26, 2023
@anonrig
Copy link
Member

anonrig commented Sep 26, 2023

@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Sep 26, 2023
@nodejs-github-bot
Copy link
Collaborator

@anonrig anonrig added the request-ci Add this label to start a Jenkins CI on a PR. label Sep 26, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Sep 26, 2023
@nodejs-github-bot
Copy link
Collaborator

@CanadaHonk
Copy link
Member Author

@santigimeno and/or @bnoordhuis, would you mind checking/reviewing the latest revision? Uses a write loop like original JS source now. Thanks!

@anonrig anonrig added the request-ci Add this label to start a Jenkins CI on a PR. label Nov 23, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Nov 23, 2023
@nodejs-github-bot
Copy link
Collaborator

@CanadaHonk CanadaHonk force-pushed the fast-writefilesync-utf8 branch 2 times, most recently from 67f5efd to 6704ac9 Compare November 23, 2023 14:28
@CanadaHonk CanadaHonk added the request-ci Add this label to start a Jenkins CI on a PR. label Nov 25, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Nov 25, 2023
@nodejs-github-bot
Copy link
Collaborator

@CanadaHonk CanadaHonk force-pushed the fast-writefilesync-utf8 branch from 6704ac9 to 6747d5d Compare November 26, 2023 12:10
@CanadaHonk
Copy link
Member Author

Rebased on latest main (+ poking flaky GH CI)

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

Landed in 4466dee

@RafaelGSS RafaelGSS mentioned this pull request Nov 28, 2023
@RafaelGSS
Copy link
Member

@CanadaHonk Do you have benchmark results from our CI? I don't we should mention local benchmarks in the changelog.

@CanadaHonk
Copy link
Member Author

@RafaelGSS Not any recent/existing :p

@jeremymeng
Copy link

Some of our tests using writeFileSync are now broken in 21.3.0 #50989

Is this related?

@ChALkeR
Copy link
Member

ChALkeR commented Nov 2, 2025

Was flush option considered here?
This landed after #50009

@ChALkeR
Copy link
Member

ChALkeR commented Nov 2, 2025

base64 writes are also a thing, as it covers binary writes when content is already in base64
and could also be optimized via a fast path (e.g. by adding an encoding parameter here)

7113870	selenium-webdriver-4.38.0.tgz/lib/webdriver.js:1742:    fs.writeFileSync(zipFilePath, Buffer.from(base64Content, 'base64'))
1563070	cucumber-html-reporter-7.2.0.tgz/lib/reporter.js:320:                  fs.writeFileSync(filename, embedding.data, 'base64');
1563070	cucumber-html-reporter-7.2.0.tgz/lib/reporter.js:334:                  fs.writeFileSync(filename, embedding.data, 'base64');
313200	@loki/diff-looks-same-0.35.0.tgz/src/create-looks-same-differ.spec.js:13:  fs.writeFileSync(outputPath, imageData, { encoding: 'base64' });
307699	@loki/diff-pixelmatch-0.35.0.tgz/src/create-pixelmatch-differ.spec.js:13:  fs.writeFileSync(outputPath, imageData, { encoding: 'base64' });
204239	backstopjs-6.3.25.tgz/core/util/compare/store-failed-diff-stub.js:10:  fs.writeFileSync(getFailedDiffFilename(testPath), BASE64_PNG_STUB, 'base64');
195127	vscode-extension-tester-8.19.0.tgz/out/browser.js:233:        fs.writeFileSync(path.join(dir, `${name}.png`), data, 'base64');
82614	base64-img-1.0.4.tgz/base64-img.js:113:  fs.writeFileSync(filepath, result.base64, { encoding: 'base64' });
74527	remove.bg-1.3.0.tgz/index.js:77:                fs.writeFileSync(options.outputFile, result.body.data.result_b64, { encoding: "base64" });
73121	pxt-core-12.2.3.tgz/built/cli.js:2936:            nodeutil.writeFileSync(p, contents, { encoding: "base64" });
73121	pxt-core-12.2.3.tgz/built/pxt.js:165454:            nodeutil.writeFileSync(p, contents, { encoding: "base64" });
...

Left column is downloads/month

selenium-webdriver needs a fix though: SeleniumHQ/selenium#16542

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

author ready PRs that have at least one approval, no pending requests for changes, and a CI started. c++ Issues and PRs that require attention from people who are familiar with C++. fs Issues and PRs related to the fs subsystem / file system. needs-ci PRs that need a full CI run. notable-change PRs with changes that should be highlighted in changelogs. performance Issues and PRs related to the performance of Node.js.

Projects

None yet

Development

Successfully merging this pull request may close these issues.