Skip to content

stream: construct#29656

Closed
ronag wants to merge 2 commits intonodejs:masterfrom
nxtedition:stream-construct
Closed

stream: construct#29656
ronag wants to merge 2 commits intonodejs:masterfrom
nxtedition:stream-construct

Conversation

@ronag
Copy link
Member

@ronag ronag commented Sep 22, 2019

Provide a standardized way of asynchronously creating and initializing resources before performing any work.

Some streams need to first asynchronously create resources before they can perform any work. Currently this is implemented in the different stream implementations which both makes things partly duplicated, more difficult, more error prone and often incorrect to some degree (e.g. 'open' and 'ready' are emitted after 'close').

This PR provides a standardized way of asynchronously constructing streams and handles the "pending" state that occurs until construction has either completed or failed.

This will allow further simplification and improved consistency for various stream implementations such as fs and net stream.

Passes the graceful-fs test suite.

This will make it possible to easily implement more complex stream such as e.g. fs streams:

const { Writable } = require('stream');
const fs = require('fs')

class WriteStream extends Writable {
  constructor (options) {
    options.autoDestroy = true;
    super(options);
  }
  _construct({ filename }, callback) {
    this.filename = filename;
    this.fd = null;
    fs.open(this.filename, (fd, err) => {
      if (err) {
        callback(err);
      } else {
        this.fd = fd;
        callback();
      }
    });
  }
  _write(chunk, encoding, callback) {
    fs.write(this.fd, chunk, callback);
  }
  _destroy(err, callback) {
    if (this.fd) {
      fs.close(this.fd, (er) => callback(er || err));
    } else {
      callback(err);
    }
  }
}
const { Readable } = require('stream');
const fs = require('fs')

class ReadStream extends Readable {
  constructor (options) {
    options.autoDestroy = true;
    super(options);
  }
  _construct({ filename }, callback) {
    this.filename = filename;
    this.fd = null;
    fs.open(this.filename, (fd, err) => {
      if (err) {
        callback(err);
      } else {
        this.fd = fd;
        callback();
      }
    });
  }
  _read(n) {
    const buf = Buffer.alloc(n);
    fs.read(this.fd, buf, 0, n, null, (err, bytesRead) => {
      if (err) {
        this.destroy(err);
      } else {
        this.push(bytesRead > 0 ? buf : null);
      }
    });
  }
  _destroy(err, callback) {
    if (this.fd) {
      fs.close(this.fd, (er) => callback(er || err));
    } else {
      callback(err);
    }
  }
}

Furthermore it makes it easier to e.g. add transforms into pipeline by inlining initialization:

const { Duplex } = require('stream');
const fs = require('fs')

stream.pipeline(
  fs.createReadStream('object.json')
    .setEncoding('utf-8'),
  new Duplex({
    construct (options, callback) {
      this.data = '';
      callback();
    },
    transform (chunk, encoding, callback) {
      this.data += chunk;
      callback();
    },
    flush(callback) {
      try {
        // Make sure is valid json.
        JSON.parse(this.data);
        this.push(this.data);
      } catch (err) {
        callback(err);
      }
    }
  }),
  fs.createWriteStream('valid-object.json')
);
Semver

I think this should be semver-major.

  • fs: Emit some previously synchronous errors asynchronously. See updated test. Bug fix.
  • fs: open() is removed but it still works if monkey-patched.
  • fs: Don't emit 'close' twice if emitClose: true.
  • fs: stream.fd is nulled during destroy instead of final. See updated test.

Otherwise, this should be pretty non breaking as far as I can tell. graceful-fs tests pass and there are tests for compat.

The changes are mostly opt-in through the construct method, except for the previously listed updates to fs streams.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

Refs: #29314, #23133

NOTE TO SELF: After merge look into:

@nodejs-github-bot nodejs-github-bot added the lib / src Issues and PRs related to general changes in the lib or src directory. label Sep 22, 2019
@ronag ronag mentioned this pull request Sep 22, 2019
4 tasks
@ronag ronag changed the title stream: provide a _construct endpoint for async stream initialization. stream: construct Sep 22, 2019
@ronag
Copy link
Member Author

ronag commented Sep 22, 2019

@Fishrock123 I just started looking a bit into Bob streams and I'm curious about your take on this.

@ronag ronag force-pushed the stream-construct branch 6 times, most recently from 10bf659 to 879c459 Compare September 22, 2019 21:24
@mscdex
Copy link
Contributor

mscdex commented Sep 22, 2019

Can we use a different event name? I recognize fs.ReadStream/fs.WriteStream already use it, but by using this at the stream.Readable/stream.Writable level it can cause disruption in userland modules that may already use this event name (I'm speaking from experience here as ssh2-streams has used 'ready' since 2016 -- long before even fs started using it).

@ronag
Copy link
Member Author

ronag commented Sep 23, 2019

Can we use a different event name?

Sure, though I'm not sure what that could be, https://www.thesaurus.com/browse/ready?s=t.

  • 'primed'?
  • 'init'?

@ronag
Copy link
Member Author

ronag commented Sep 23, 2019

rebased

@ronag
Copy link
Member Author

ronag commented Sep 23, 2019

@addaleax

@ronag
Copy link
Member Author

ronag commented Sep 23, 2019

@mcollina is it an option to use Symbols as event "names" to avoid collisions?

e.g.

const r = new Readable();
r
 .on(Readable.READY, () => {})
 .on(Readable.DATA, (buf) => {})
 .on(Readable.ERROR, (err) => {});

Probably a too big of a change... just a possibly interesting idea...

@ronag ronag force-pushed the stream-construct branch 3 times, most recently from 4188668 to 2a16612 Compare September 23, 2019 11:40
@mscdex
Copy link
Contributor

mscdex commented Sep 23, 2019

I'm not sure what would be a better name that reduces the risk of clashing with userland. Perhaps an added prefix of some kind might help with that, like 'stream-init' or similar. It might seem redundant but it would help differentiate from someone already emitting 'init', as I think that would be more common than 'stream-init'.

@mcollina mcollina added the semver-major PRs that contain breaking changes and should be released in the next major version. label Sep 24, 2019
@mcollina
Copy link
Member

I’ve tagged semver-major out of cautiousness because it’s a significant refactor.

@ronag
Copy link
Member Author

ronag commented Sep 24, 2019

@mcollina do you have any feedback or preference regarding @mscdex's concern?

@mcollina
Copy link
Member

I agree with @mscdex concern, we need to find a different name.

@ronag ronag mentioned this pull request Sep 24, 2019
4 tasks
@Trott
Copy link
Member

Trott commented Oct 1, 2019

This needs a rebase.

@ronag ronag force-pushed the stream-construct branch from 2a16612 to 250b78b Compare October 4, 2019 06:36
@ronag
Copy link
Member Author

ronag commented Apr 28, 2020

@nodejs/streams @nodejs/tsc This PR have been stale for a while. Maybe another round of reviews might be appropriate to ensure we are all comfortable with the changes?

I have one thing I wanted to point out. Given the current implementation _construct must succeed eventually and there is no way to abort it since _destroy will wait for it by design. I don't think this is a problem just wanted to point it out in case anyone else see a problem with this.

Also, for those returning to this PR. The biggest change since the last round of reviews is that _construct is invoked in the next tick after the constructor, instead of directly at the end of the constructor.

@mscdex
Copy link
Contributor

mscdex commented Apr 28, 2020

Why the addition of the empty file, 'does not exist' ?

@ronag
Copy link
Member Author

ronag commented Apr 28, 2020

Why the addition of the empty file, 'does not exist' ?

Mistake. Removed.

@mscdex
Copy link
Contributor

mscdex commented Apr 29, 2020

There seems to be a consistent perf regression after multiple runs for Duplex creation:

streams/creation.js kind='duplex' n=50000000               ***     -8.15 %       ±1.76% ±2.34%  ±3.04%

@ronag
Copy link
Member Author

ronag commented Apr 29, 2020

There seems to be a consistent perf regression after multiple runs for Duplex creation:

This duplex creation regression seems to be re-occuring. Anyway, I think this is an unfortunate but acceptable regression given what this PR does.

@nodejs-github-bot
Copy link
Collaborator

ronag added 2 commits May 25, 2020 10:21
Provide a standardized way of asynchronously creating and
initializing resources before performing any work.

Refs: nodejs#29314
@nodejs-github-bot
Copy link
Collaborator

@mafintosh
Copy link
Member

The async construction feature is def nice and tricky to impl atm.
Do I read this correctly that no event is added after construction now in the latest draft?
Name collisions with that event would be my main concern with userland streams so if no event then I'm soft +1

@ronag
Copy link
Member Author

ronag commented May 26, 2020

Do I read this correctly that no event is added after construction now in the latest draft?

We do add an event but it's a Symbol to avoid collisions. Only used internally to keep the implementation simple.

@nodejs-github-bot
Copy link
Collaborator

nodejs-github-bot commented May 26, 2020

@ronag
Copy link
Member Author

ronag commented May 26, 2020

@addaleax @lpinca I assume you are still LGTM?

@ronag
Copy link
Member Author

ronag commented May 26, 2020

Copy link
Member

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes 👍

@ronag
Copy link
Member Author

ronag commented May 26, 2020

@mcollina I'm getting a suspicious CITGM failures on pino. Any insight? https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/2386/nodes=rhel7-s390x/testReport/(root)/citgm/pino_v6_3_0/ Is this a pino issue or issue with this PR?

@mcollina
Copy link
Member

Pino is ok with the change. A test fail because of another deprecation.

@ronag
Copy link
Member Author

ronag commented May 27, 2020

Landed in 9949a2e...54b36e4

@rvagg
Copy link
Member

rvagg commented Jun 8, 2020

arm-fanned has been offline for a while now and it's back on and should be working again but this PR is causing a perma-fail that wasn't caught because it was merged while they were offline.

17:42:04 not ok 583 parallel/test-fs-stream-construct
17:42:04   ---
17:42:04   duration_ms: 240.87
17:42:04   severity: fail
17:42:04   exitcode: -15
17:42:04   stack: |-
17:42:04     timeout
17:42:04   ...

Sorry @ronag, would you mind taking a look?

I haven't had a look at what this test does but a brief scan suggests that a timeout is probably odd so maybe the slower platforms are catching a genuine bug? If not, either use the platformTimeout() to adjust timings in this test for slower machines or maybe even skip it entirely if it can't be avoided.

@ronag
Copy link
Member Author

ronag commented Jun 8, 2020

I haven't had a look at what this test does but a brief scan suggests that a timeout is probably odd so maybe the slower platforms are catching a genuine bug? If not, either use the platformTimeout() to adjust timings in this test for slower machines or maybe even skip it entirely if it can't be avoided.

There is no timeout. It's probably the test not completing for whatever reason.

@ronag
Copy link
Member Author

ronag commented Jun 8, 2020

@rvagg Do you have a machine I can access for reproducing?

@gabrielschulhof
Copy link
Contributor

@ronag you can request access to a machine in https://github.com/nodejs/build/issues/new

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

notable-change PRs with changes that should be highlighted in changelogs. semver-major PRs that contain breaking changes and should be released in the next major version. stream Issues and PRs related to the stream subsystem.

Projects

None yet

Development

Successfully merging this pull request may close these issues.