Reusing the runners for multiple repeats of the test run gets in the
way of the progress report, which stores runners in an OrderedSet.
Instead, create a separate SingleTestRunner object for each repeat.
While at it, fix the "duplicate suite" assertion as it can fire
with TAP tests and --repeat=N.
Fixes: #8405
Some time between 0.56 and 0.57 the TAP parser broke when a test exits
with a nonzero status.
The TAP protocol does not specify this behaviour - giving latitude to
implementers, and meson's previous behaviour was to report the exit
status gracefully.
This patch restores the old behaviour and adds a regression test
os.path.relpath(f, wd) returns path with \ seperator on Windows, but ninja
targets always uses / separator.
See for example https://gitlab.freedesktop.org/ocrete/libnice/-/jobs/7348274.
Analyzed-by: Xavier Claessens <xavier.claessens@collabora.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It looks like GitLab ignores the suite name and actually uses
the classname. Adjust the output accordingly.
Fixes: #8316
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Avoid that the tasks linger and SingleTestRunner.run() never terminates.
In order to do this, we need read_decode() and read_decode_lines() to be
cancellable, and to handle the CancelledError gracefully while returning
the output they have collected so far.
For read_decode(), this means always operating on a line-by-line basis,
even if console_mode is not ConsoleUser.STDOUT. For read_decode_lines(),
instead, we cannot return an iterator. Rather, read_decode_lines()
returns the output directly (similar to read_decode) and communication
with the parser is mediated by an asyncio.Queue.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This makes non-parallel tests emit their output on the fly,
similar to ninja console jobs. It also cleans up the code
a bit, avoiding the repetition of "self.options.num_processes"
tests.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- Ensure the output is terminated with a \n even if the test does not
include one.
- Ensure that stdout is flushed for each reported result
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix "meson test --wrapper foo --setup bar", it should work just fine
if the setup does not define a wrapper.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Start the parsing of the output early; this avoids a deadlock
if the test writes to stdout but no one reads from it. It
also reports TAP or Rust subtest results as they happen,
which was the intention all along.
While at it, use a consistent naming conventions for coroutines
vs tasks.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Avoid printing something like "30/-1s" when tests are run without
a timeout or with --timeout-multiplier 0.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This new keyword argument makes it possible to run specific
test setups only on a subset of the tests. For example, to
mark some tests as slow and avoid running them by default:
add_test_setup('quick', exclude_suites: ['slow'], is_default: true)
add_test_setup('slow')
It will then be possible to run the slow tests with either
`meson test --setup slow` or `meson test --suite slow`.
This will be needed to exclude testsuites from test setups (which
are stored in the build data). While at it, since a chdir is
needed simplify a bit the loading of tests and benchmarks.
Print the (shortened) output of the failed tests as they happen.
If neither --verbose nor --print-errorlogs was specified, omit the
summary of failures, because it is pretty much the same as the earlier
output of "meson test".
In non-parallel verbose mode the output of the test/benchmark
is not buffered, therefore the command line is only printed by
ConsoleLogger for failing tests and only after the test has run.
Verbose mode is designed mostly for CI systems, where output must
be human readable but is generally consumed from a browser with "Find"
commands rather than from a terminal. With this usecase in mind, it
is better to provide as much detail as possible, so add more output
and just tell the user which tests have started. Do so, using the
recently introduced TestResult.RUNNING state.
Ensure that all the required modifications are included in the logs.
This makes it possible for users to cut-and-paste from the logs when
trying to reproduce failures outside Meson.
Right now the same code is used to print the logs for both the console
and the text log. Differentiating them lets the important bits of
the console output stand out, and makes the console output a bit more
readable.
Using verbose mode dropped stdout/stderr from the logs, because
it was not captured.
Now that we can easily stick code in the middle of the reading of
stdout/stderr, use that to print stdout and stderr on the fly
while also capturing them for the logs. The output is line-buffered.
As a side effect, this also fixes a possible deadlock due to
not using ensure_future around stdo_task and stde_task. In
particular:
- the stdo_task coroutine would not terminate until the test closed
stdo_task
- the stde_task coroutine would not start until the stdo_task
coroutine finished
Therefore, the test could get stuck waiting for its parent to
read the contents of stderr, but that would not happen because
Meson was still in the stdo_task coroutine.
Move the logic to start the read/decode
tasks to TestSubprocess and keep SingleTestRunner simple.
The lines() inner function is tweaked to produce stdout as a future.
This removes the nonlocal access (which is not possible anymore
when the code is moved out of _run_cmd), and also lets _run_cmd
use "await stdo_task" for both parsed and unparsed output.
After the next patch, we will need to complete parse_task before
stdo_task (because parse_task will not set the "stdo" variable
anymore but it will still collect stdout just like now). Do
the change now to isolate the more complicated changes.
The new behavior of interrupting the longest running test with Ctrl-C is useful
when tests hang, but not when the run is completely broken for some reason.
Psychology tells us that the user will compulsively spam Ctrl-C in this case,
so exit if three Ctrl-C's are detected within a second.
This correctly formats tests with CJK names or, well, emoji. It is not perfect
(for example it does not correctly format emoji that are variations of 1-wide
characters), but it is as good as most terminal emulators.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of slurping in the entire stream, build the TestResult along
the way. This allows reporting the results of TAP and Rust subtests as
they come in, either as part of the progress report or (in the future)
as individual lines of the output.