These functions constantly want the current asyncio loop, and we run a
function call each time to get it. And the function call is a deprecated
one. Python 3.7 brings the more explicit get_running_loop for use when
we know we're inside one, with the aim of getting rid of get_event_loop
once support for python <3.7 disappears. Meson no longer supports python
<3.7 either.
Switch to the new API, and save the reference for reuse instead of
constantly re-calculating it.
Fix a TODO comment about moving to asyncio.run, now that we use
sufficiently new python to do it.
Note that we create an event loop for Windows using the new python
defaults, but in a completely different part of the code from where we
need to use it. Since asyncio.run creates the loop on its own, we need
to set the default policy instead -- which we probably should have done
all along.
In https://bugs.python.org/issue42392 this stopped implicitly creating
an event loop if none exists. We created it before running _run_tests(),
so it would auto-create an event loop and set the default, which means
we cannot create one explicitly on our own schedule or we end up with
two of them.
Delay this until we actually start the logger. This happens inside the
actual testsuite loop, so it finds the running loop and doesn't create a
new one, even on python <3.10.
TAP 14 states:
> Harnesses may treat any TAP stream lacking a version as a failed test.
TAP 13 states:
> In the absence of any version line version 12 is assumed. It is an
> error to explicitly specify any version lower than 13.
So, modern TAP is saying that we should treat a missing version as a
test definition bug, it's no longer okay to use a missing version as
saying "let's use TAP 12". But, we can choose whether to treat it that
way or error out.
Let's do a diagnostic, as we do elsewhere. But allow TAP streams that
aren't well defined, if they used to be well defined (back in TAP 12).
In commit a7e458effa we stopped erroring
out on invalid TAP stream contents, with the rationale that "prove" has
become more lenient.
A close reading of the TAP spec indicates why, though:
> A TAP parser is required to not consider an unknown line as an error but
> may optionally choose to capture said line and hand it to the test
> harness, which may have custom behavior attached. This is to allow for
> forward compatability. Test::Harness silently ignores incorrect lines,
> but will become more stringent in the future. TAP::Harness reports TAP
> syntax errors at the end of a test run.
The goal of treating unknown lines as an error in the TAP parser is not
because unknown lines are fine and dandy. The goal is to allow
implementing future versions of TAP, and handling it via existing
parsers. Since Meson has both a parser and a harness, let's do exactly
that -- pass these lines as a distinctive status to the test harness,
then have the test harness complain.
Just like comment lines, blank lines do nothing. Before commit
a7e458effa we ended off the parser by
returning if the line was blank, because we needed to in order to catch
non-blank lines as errors. But really, we should have always returned
much earlier and not wasted time attempting to process anything.
This allows early exit of the project tests once a certain number of
failures are detected. For example `meson test --maxfail=1` will abort
as soon as a single test fails.
Currently running tests are marked as failed via INTERRUPT.
Resolves#9352
When the build definition has changed since the last ninja invocation meson
test operated on an outdated list of tests and their dependencies. That could
lead to some tests not being run / not all dependencies being built. This was
particularly confusing because the user would see the output of
reconfiguration and rebuilding, and the next mtest invocation would have the
updated configuration.
One issue with this is that that we will now output more useless ninja output
when nothing needs to be done (the "Entering directory" part is not repeated,
as we happen to be in the build directory already). It likely is worth
removing that output, perhaps by testing if anything needs to be done with
ninja -n, but that seems better addressed separately.
Fixes: #9852
A later commit will add a second invocation of ninja, no point in having the
detection code twice.
This changes the exit code in case ninja isn't found from 125 to 127, which
seems more appropriate given the justification for returning 125 (to make git
bisect run skip that commit). If we can't find ninja we'll not succeed in
other commits either.
A subsequent commit will do a bit more during test data loading, making a
dedicated function seem advisable.
The diff looks a bit odd, using git show --diff-algorithm=patience
will make it clearer.
googletest 1.12.1 generates new JUnit4 invalid attributes file and
line.
Maybe all gtest "invalid" attributes are actually valid JUnit5
attributes, and maybe schema should be upgraded to JUni5.
Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
Although the former accepts multiple awaitables, it is only ever called
with a single one, so just use `wait_for` instead.
Additionally, the `try_wait_one` fails in Python 3.11, as
`Process.wait()` returns a coroutine, and `asyncio.wait` only accepts
tasks, so it errors out.
- Remove duplicated code in mdevenv.py
- Change the limit to 1024 instead of 2048 which is what has been
tested.
- Skip shortening if it is already short enough.
- Skip shortening with wine >= 6.4 which does not seems to have that
limitation any more.
- Downgrade exception to warning in the case WINEPATH cannot be
shortened under 1024 chars, it is possible that it will still work.
AsyncIO.StreamReader.readuntil() occasionally raises IncompleteRead
exception before a byte of data has been read. Do not process the "read"
data in those cases.
Store a reference to the console logger instance in a test harness'
member variable to allow accessing it (and its logging utilities) from
any other functions in test harness.
This added functionality will be used in future commits.
Store return code, test result and additional error directly to the
relevant TestRun instance. This reduces the number of individual
arguments to other relevant functions that need to be passed around and
thus simplifies the code. The test output (and error) were earlier
similarly moved to be stored directly to the TestRun instance for the
same reason.
By storing test output directly to the TestRun instance we avoid the
need to pass the outputs around in individual function arguments thus
simplifying the code.
The amount of individual arguments will be further reduced in a
future commit.
Make --no-stdsplit option affect test log text files as well. This means
that if the option --no-stdsplit is used only "output" is seen not only
on the console but in the test log text file as well.
Since running only one test sort of implies --num-processes=1 the "live"
output of the test should be printed out when --verbose option has been
given and running only a single test.
The only time the argument would matter (console_mode ==
ConsoleUser.STDOUT) never happens as the only time the function is
ever called is when parsing of the output is needed which in turns
implies that console_mode != ConsoleUser.STDOUT.
As fetching the returned data is non-trivial (we e.g. iterate over all
subtest results) it is best not to hide that fact from the caller of the
property / function.
TAP version 14 introduced subtests, that are supposedly backward compatible
because "TAP13 specifies that non-TAP output should be ignored". Meson
reported TAP syntax errors based on behavior of "prove" at the time,
but it seems that now "prove" has become a lot more lenient; it even
accepts the following completely bogus input just fine:
---
ok 1
ok 2
x
1..1
---
So do the same and make Meson's parser accept invalid TAP input silently.
Fixes: #10032
This was added in commit 01be50fdd9 with
zero explanation as a side effect of moving code around. It seems like a
really bad idea and it causes people to view debugging Meson projects on
e.g. debuginfod systems as "painful".
In this case, the test fname might have an implicit extension and cannot
be found by `os.path.isfile()`.
We cannot use `shutil.which()` to handle platform differences, because
not all test fnames are executable -- for example Java jars.
The test representation does have an "is built" attribute which in
theory should work here, because all built targets definitely have their
full filename known to Meson, but it turns out to be misnamed. Rename it
correctly and add an actual "is built" attribute to check.
Tests which aren't built by Meson can be assumed to exist without
consulting their existence on the filesystem.
Fixes#10027
For a test to be displayed to stdout without buffering, it has to be
1) in verbose mode 2) not executed in parallel with others
3) not parsed, i.e. not TAP or Rust. Include these three
conditions in a new property of stdout and use it in the
three places where it matters: when printing the initial
and final delimiters of the output, and when deciding the
console mode to use.
Suggested-by: Jussi Pakkanen <jpakkane@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If --no-rebuild is used, the test program might not exist, spawning a
FileNotFoundError inside a long traceback rooted in subprocess.Popen
trying to run that test program. Current versions of Meson even say it's
an "unhandled python exception".
But we can do one better and actually tell the user what is wrong, why,
and what to do to fix it. And we can do so before getting waist deep in
partially running tests.
Fixes#10006
Verbose, non-parallel tests generally print their output as they run, rather than
after they finish. This however is not the case if stdout of the test is parsed
as is the case for TAP and Rust tests. In this case, the output during the run
is the list of the subtests, but stderr still has to be printed after the test
finishes.
Store in TestSerialisation whether a particular test must always be logged
verbosely. This is particularly useful for long-running tests or when a
single Meson test() is wrapping an external test harness. In this case,
TAP can be used by the external harness and Meson will log each subtest as
it runs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While the horizontal line and the other pictograms in mtest have an ASCII-only
version, the spinner does not. This causes mtest to fail with a
UnicodeEncodeError exception on non-Unicode locales.
Fixes: #9538
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Unless parsing TAP output, there is no strict requirement for
"meson test" to process test output one line at a time; it simply
looks nicer to not print a partial line if it can be avoided.
However, in the case of extremely long lines StreamReader.readline
can fail with a ValueError. Use readuntil('\n') instead and
just process whatever pieces of the line it returns.
Fixes: #8591
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If meson is not a process group leader, a SIGINT will be delivered also to
its parent process (and possibly other processes). The parent process then
will probably exit and mtest will continue running in the background, without
any way to interrupt the run completely.
To fix this, treat SIGINT and SIGTERM the same way unless mtest is a
process group leader.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(var,) is the correct way to pass values to a percent formatted string,
but not to .format so we would end up printing something like:
unexpected input at line (4,)
Upgrade to an f-string and insert the correct value correctly.
All changes were created by running
"pyupgrade --py3-only"
and committing the results. Although this has been performed in the
past, newer versions of pyupgrade can automatically catch more
opportunities, notably list comprehensions can use generators instead,
in the following cases:
- unpacking into function arguments as function(*generator)
- unpacking into assignments of the form x, y = generator
- as the argument to some builtin functions such as min/max/sorted
Also catch a few creeping cases of new code added using older styles.
This reverts commit 5fcb0e6525.
The commit is a massive change that should have been split in
separate pieces, and it also removes a few features:
* in verbose mode, subtests are not printed as they happen
* in non-verbose mode the progress report does not include the
number of subtests that have been run
* in non-parallel mode, output is batched rather than printed as
it happens
Furthermore, none of these changes are not documented in the release
notes. Revert so that each proposal can be tested, evaluated and
documented individually.
This change set aims to fix various "issues" seen with the current
implementation. The changes can be summarized with the following list:
* Replace emojis and spinners with multiline status displaying the name
and running time of each currently running test.
* The test output (especially in verbose mode or when multiple failing
tests' output gets printed out) can get confusing. Try to make the
output easier to read and grasp. Most notable change here is the
addition of the test number to the beginning of each printed line.
* Print exit details (i.e. exit code) of the test in verbose mode.
* Try to make the verbose "live" output from tests to match the look and
feel of otherwise produced (verbose) test output.
Currently, if the test fails to produce XML (or valid XML) then the test
fails with a backtrace. It's actually pretty easy to get into this
situation, a total failure of the test will result in no XML being
written (this can happen, for example, if rpaths to gtest are not
correctly set up). If we can't read the test, go ahead and complete
using `TestRunExitCode.complete()`, which will fail for the bad exit
code.