* description
create and run certain "tests" that live in comment forms -
"comment-hidden expression tests".
* rationale
  * want some meaningful "tests" / "usages" to co-evolve with
    development, from as early a point as desired, with little
    activation effort.
* easily record some of one's intent / actually tested expressions
and results for later testing and reference.
* non-goals
* not meant as a single-stop testing approach. meant to be usable
alongside other testing tools, frameworks, etc.
* random
* batch testing with VERBOSE is perhaps 4 times slower (> 85s vs
22s)
* for batch testing, provide option to randomize order of projects
to test?
  * keep vendored files from ending up as siblings of the source
    files one intends to edit, as a precaution against accidentally
    editing vendored files?
* pre-commit steps continue to not be followed regularly. would a
"do nothing" script help?
* batch-niche.janet is not shipped with the repository...should it
be a separate project?
* once niche is at some stage, start replacing jeat with it in
various repositories? can this be automated?
* should use of dynamic variables to access "settings" be applied
more widely? currently :test/color? is used.
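    a minimal sketch of the pattern, reusing :test/color? (the
    default value below is just for illustration):

    # read the setting, falling back to a default when it is unset
    (defn use-color?
      []
      (dyn :test/color? true))

    # callers (or a configuration step) can override the setting
    (setdyn :test/color? false)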
* limits
* no testing for errors (e.g. parse or test run error)
  * no way to indicate that expressions should only be executed up
    through the n-th `comment` form. slightly tricky because some
    `comment` forms will be ignored when they lack tests?
* documentation output does not adjust to terminal width
* documentation output is not in any specified format
* no option to pause when certain "issues" are encountered.
possibly could implement using `getline`? this kind of
semi-interactive mode could give the user control over whether to
halt processing or continue. "pause-at-issue"?
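    a rough sketch of what such a pause might look like (the
    function name and exit code below are made up):

    (defn pause-at-issue
      [msg]
      (eprint msg)
      # getline blocks until the user responds; trim drops the
      # trailing newline
      (def resp (string/trim (getline "continue? (y/n) ")))
      (when (= resp "n")
        (os/exit 1)))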
* no way to "resume" testing from last failure or a way to specify
to start testing from some path (in a list and continue) or nth
item.
* no way to "skip" a test that appears hung.
  * verbosity cannot really be controlled
  * no option to suppress the "<path>: no tests found" report.
    suppressing it might be nicer when batch testing?
  * stdout and stderr from test runs are printed if tests fail.
    there is no way to turn this off. for an example where this
    might matter, try `niche *` from
    janet-usages-tests/repos/janet-checksums. there is a lot of
    output. possibly consider a truncation option? truncation /
    silencing might be good in the context of batch testing.
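    a tiny sketch of what a truncation helper might look like (the
    name and the limit handling are made up):

    (defn truncate-output
      [output limit]
      # cap captured output at `limit` bytes and mark the cut
      (if (> (length output) limit)
        (string (string/slice output 0 limit) "\n...[truncated]")
        (string output)))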
  * processing time is always reported (except in raw jdn mode).
    might it be better to only report it when the verbosity level is
    increased beyond some point?
* some tests that involve unreadable values work fine:
(comment
printf
# =>
printf
)
    the limitations for tests involving unreadable values are a bit
    unclear. is it only when there are failures?
    unreadable values can lead to the runner halting. maybe this is
    fine. one example is data/unreadable-value/a.janet
* summary of test results might report more. total number of files
with tests that passed is reported if all tests passed.
no summarizing of:
* which files had test failures
* which files were skipped
* parse errors
* linting errors
* run-test errors
* number of files considered (some may not have tests)
* total number of tests
* total number of test failures
* timing info (apart from total time) - individual timing info
is not measured and hence cannot yet be collected
    a richer summary would be more useful when testing across
    multiple roots / repositories. would it be useful to have
    summary information as jdn? if so, should it be merged with
    other raw jdn output or kept separate?
* raw results are not output incrementally, i.e. output comes
all at once.
* raw output only covers test results. warnings about parse errors,
linting errors, and test run errors are not covered.
  * only test failure data (plus the total number of tests) is
    communicated back to the runner via jdn. captured stdout and
    stderr are sent via stdout and stderr and are also available.
    is there anything else that could / should be sent along?
  * testing can completely halt (?) when an individual source file
    stalls upon execution. see janet-ref/disabled/fiber[que].janet
    for an example. at first glance, it is not obvious what one
    might do about this? could restructure everything to have a loop
    that waits up to some amount of time per test? this becomes more
    relevant as batch processing of multiple repositories is done
    more. would fibers and channels be useful for this? an initial
    attempt was made toward this end using os/spawn, ev/gather,
    etc., but problems were encountered and not well understood, so
    progress on this front has halted.
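    a rough sketch of a per-test timeout using ev/with-deadline (a
    different approach from the ev/gather attempt mentioned above),
    assuming each generated test file can be run as a subprocess:

    (defn run-with-timeout
      [path timeout]
      (def proc (os/spawn ["janet" path] :p))
      (try
        (ev/with-deadline timeout
          # os/proc-wait parks the current task, so the deadline can
          # fire while waiting
          (os/proc-wait proc))
        ([_]
          (os/proc-kill proc true)
          :timed-out)))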
* some testing that involves the filesystem uses the tmp
subdirectory. this subdirectory is hard-wired.
* could ordinary output build on top of "structured data" output?
possibly, but this might have the drawback that "real-time"
reporting would be adversely affected if some portion of the
testing takes a long time (or is hanging)...at least waiting for
all results before reporting anything doesn't seem great from a ux
perspective.
is it practical and desirable to rework the code so that all of
the output code is "at the edges"? this might be related to the
previous paragraph.
  * no programmatic testing for cli invocation. revisit at some
    point? look at some of jackal's usage testing for a cli-ish
    example. it's not quite testing the invocation of a cli, but is
    it close enough? in what ways is it lacking?
* each test file is unconditionally removed after it is used to run
tests (at least that's the intention). there is currently no way
to retain generated test files even if there were problems during
execution.
* separators in output have fixed width of 60. could try to apply
janet-termsize.
* no distinction between error result strings and ordinary strings
in test output is made. the status is now available (:test-status
and :expected-status) so making a distinction may be possible in
    failure output (the distinction can be seen in raw mode output).
* how 3rd party abstract types print is decided by the implementor
of that abstract type and does not need to follow any particular
convention(?). for example, a "set" abstract type could print in
a way such that it looks like a tuple. this might be problematic,
because the printed representation cannot be relied upon as an
indicator of the "type" of value. however, perhaps the `type`
function could be used to make certain distinctions.
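    as a small illustration, `type` reports the name of an abstract
    type (a built-in one here) regardless of how instances print:

    (comment

      (type (peg/compile "a"))
      # =>
      :core/peg

      )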
* printing of tables with prototypes does not show prototype info.
consequently, checking such table values requires extra work. can
use table/getproto and/or struct/getproto to check for existence
of prototype table/struct.
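    for example (the prototype contents here are made up):

    (comment

      (def t (table/setproto @{:a 1} @{:kind :example}))

      (truthy? (table/getproto t))
      # =>
      true

      )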
  * source files are not linted before creating test files. however,
    an attempt is made to parse them. linting might issue warnings
    for legitimate files, so it might be better to make this
    optional?
* pretty printing
    * for replacing expected value expressions for the updating
      feature (no longer within niche, but it exists in the quiche
      project)
    * a pretty printer would also be nice for displaying failure
      information.
* test data in "raw" jdn form is not so pretty. it's slightly
better than using pp or printf with %m / %M, but this is another
place a nice pretty printer would be appreciated.
* there is no checking of unsupported / potentially problematic
constructs (e.g. forms that look like but are not tests, disabled
tests, etc.). might be nice to have a checker / linter one can
run to get reports.
* errors related to test file creation and running may be hard to
understand. try to watch out for and document specific cases that
are not already covered (e.g. linting and some(?) run-time errors
are detected and handled...are there things that are not covered
by these or has this limitation been addressed adequately?)
  * missing an explanation of the files in the repository, like:
    https://github.com/sogaiu/janet-ex-as-tests/blob/master/doc_notes.md
    this could be included in this notes.txt file
  * no stacktrace when computing the actual value results in an error
* possibly could apply the transformation used to create a file
for linting to produce a stack trace with proper line numbers?
      * not a clear win to have a stacktrace because the actual code
        being run is the transformed code and this might be
        confusing (could the output be tweaked appropriately?)
* current recommendation is to reproduce the error in one's
editor. the line number (and file) of the relevant test
indicator should be in the test run output. doing this should
yield appropriate stack traces.
* there is limited to no testing of:
* feedback from use of unreadable value - trickier?
* non-raw output
* other things?
* questions
  * what benefits would there be to using a separate channel of
    communication that:
    * is multiplatform
    * does not involve creating and cleaning up temp files
    * is not overly complex
    to communicate either test output or test result reporting?
* is it a flaw that file paths within usages seem to be relative to
the root of the project even though relative paths for imports are
relative to the source file itself? it does seem confusing...
however, it seems to be fundamental because the current working
directory of the janet process is the root of the project
directory. this seems inescapable...not sure though.
    one consequence seems to be that running such files from
    particular directories becomes a requirement for correct
    operation. if the current working directory is not appropriate,
    code won't work properly.
* is it possible to avoid the problem of `jpm test` executing all
.janet files within the test/ subdir? currently, placing
test-related files in test/ is awkward because of `jpm test`'s
behavior.
* change file extension to be something other than .janet so that
`jpm test` and friends don't trigger the code
* can the module loader be tweaked to load the files with a
different extension?
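      a minimal sketch regarding the module loader question, assuming
      a made-up .niche extension for the test-bearing files:

      # register .niche files with the ordinary janet source loader
      # so that `import` / `require` can find them
      (module/add-paths ".niche" :source)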
* on hold features / ideas
  * long-running tests (e.g. the sha one in janet-checksums) might
    appear indistinguishable from hung tests. could some kind of
    "progress indicator" and/or timeout be implemented? perhaps it
    is doable if fibers / channels and os/spawn are used?
    pieces that might be useful include os/spawn, try, ev/gather,
    ev/read, os/proc-wait, ev/sleep, error, and os/proc-kill.
    experimented a bit (see the timeout-test branch) but the code
    seems to be faulty because a number of usages time out when the
    expectation is that they should not:
usages/some-tests-fail.janet - test file execution was stopped
usages/single-update.janet - test file execution was stopped
usages/all-tests-pass.janet - test file execution was stopped
usages/multi-update.janet - test file execution was stopped
all involve launching main/main with {:raw true} {:no-color true}
{:no-exit true}. not really related?
* alternate style of test "packaging" might be to treat each
`comment` form with "metadata" as a test, e.g.:
(comment :isolated
(+ 1 1)
# =>
2
)
needless complexity?
otoh, metadata could indicate whether the tests depend on relative
file paths...this might be useful information to adjust how the
generated test file gets executed (e.g. by adjusting the current
    working directory...). maybe that's too fancy.
platform-specific tests might be gated using keywords as metadata.
comment blocks could be "named" and referred to.
* hidden features
* {:overwrite true} overwrites old test files (typically those that
have failures)
* {:includes ...} and {:excludes ...} can also be used to augment
paths specified on the command line or via the configuration file.
* {:raw true} causes program output to be jdn
* {:no-exit true} prevents main from calling `os/exit` -- helpful
for testing
* {:no-color true} and NO_COLOR environment variable support
* three exit codes: 0, 1, 2 (explained in main.janet)
* semi-recurring tasks
* just reading the existing code and this file on a tablet for
review purposes
* look for code that could do with more testing.
* pre-commit steps
0. check that vendored files have not been edited
1. run jell and examine results
* if problem detected, investigate, take action, and go to step 1
2. run tests and examine results
* if problem detected, investigate, take action, and go to step 1
3. batch test across all projects that use niche (do a VERBOSE=1 run
too?)
* if problem detected, investigate, take action, and go to step 1
4. ensure README is up-to-date
5. assemble and review staging for commit
* if it looks like changes are needed, make them and go to step 1
* if not, commit and push