On 30 Dec 2001, Alexandre Julliard wrote: [...]
In fact here's a 10-minute hack to add a make test target. With that all you have to do is create a test script in dlls/xxx/tests/foo.test, put the expected output in tests/foo.test.ref (presumably generated by running the test under Windows), add your script to the makefile and run make test.
I think what we need with this is a couple of guidelines and some documentation for potential test writers, and maybe a couple of extensions. The following is half proposed documentation that we could put in the Wine Developer Guide, and half a proposed specification for some possible extensions. As usual, comments and suggestions are welcome.
What is a test
--------------
A test unit is an executable or script. You can name it any way you like (but please, no spaces in the names, they are always annoying). All test units should be non-interactive. A test unit called xxx generates two outputs:
 * its exit code
 * text output on either or both of stdout and stderr, both of which are normally redirected to a file called 'xxx.out'
A test succeeds if:
 * its exit code is 0
 * and its output, 'xxx.out', matches the reference output according to the rules described later
Conversely it fails if:
 * its exit code is non-zero, either because one aspect of the test failed and thus the test unit decided to return a non-zero code to indicate failure, or because it crashed and thus the parent got a >= 128 error code
 * or because its output differs from the reference output established on Windows
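For instance, a minimal console test unit could look like the sketch below (the file name and the exact messages are made up for illustration):

    /* cmdline.c -- a hypothetical test unit */
    #include <stdio.h>
    #include <windows.h>

    int main(void)
    {
        int failed = 0;
        const char *cmdline = GetCommandLineA();

        /* the output must be reproducible: print facts, not pointer values */
        if (!cmdline || !*cmdline)
        {
            printf("GetCommandLineA: returned an empty command line\n");
            failed = 1;
        }
        else
            printf("GetCommandLineA: returned a non-empty command line\n");

        /* everything printed above ends up in 'cmdline.out' and is compared
         * to the reference output; the exit code reports crashes and the
         * checks the test unit itself considers fatal */
        return failed;
    }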
Under this model each test unit may actually consist of more than one process (for instance to test CreateProcess, inter-process messaging, inter-process DDE, etc.). All that counts is that the original process does not finish until the testing is complete, so that the testing framework knows when to check the test output and move on. (There is no provision for hung tests yet. A timeout-based mechanism, with a large timeout such as 5 minutes, could do the trick.)
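For instance the original process of a CreateProcess test could be structured as in the sketch below; the child executable name and the 5 minute bound are just placeholders:

    #include <stdio.h>
    #include <windows.h>

    int main(void)
    {
        STARTUPINFOA si = { sizeof(si) };
        PROCESS_INFORMATION pi;
        char child[] = "ipc_child.exe";   /* hypothetical helper executable */
        DWORD rc;

        if (!CreateProcessA(NULL, child, NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi))
        {
            printf("parent: could not start %s (error %lu)\n", child, GetLastError());
            return 1;
        }

        /* do not exit until the child is done, so that the framework knows the
         * test is complete; bound the wait so a hung child cannot hang the run */
        rc = WaitForSingleObject(pi.hProcess, 5 * 60 * 1000);
        CloseHandle(pi.hThread);
        CloseHandle(pi.hProcess);

        if (rc != WAIT_OBJECT_0)
        {
            printf("parent: child did not terminate in time\n");
            return 1;
        }
        printf("parent: child terminated\n");
        return 0;
    }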
A test unit can also exercise more than one aspect of one or more APIs. But, as a rule of thumb, a specific test should not exercise more than a couple to a handful of related APIs (or up to a dozen in extreme cases). Also, more than one test could exercise different aspects of a given API. So when running the Wine regression tests, if we find that 3 tests out of 50 failed, it means that three processes out of fifty had an incorrect exit code or output. One should then analyze in more detail what went wrong during the execution of these processes to determine which specific API or aspect of an API misbehaved. This can be done either by looking at their output, by running them again with Wine traces, or even by running them in a debugger.
Test Output
-----------
Wine tests can write their output in any form they like. The only important criteria are:
 * it should be reproducible from one run to the next: don't print pointer values, they are most likely to change on the next run and thus cannot be checked
 * it should be the same on a wide range of systems: don't print things like the screen resolution!
 * it should be easy to correlate with the source of the test. For instance if a check fails, it is a good idea to print a message that can easily be grepped in the source code, or even the line number of that check. But don't print line numbers for success messages: they will change whenever someone changes the test and would require an update to the reference files.
 * the output should not be empty (just in case the process dies with a 0 return code or fails to start before writing anything to the output)
 * finally it should be easy to read for the people who are going to be debugging the test when something goes wrong
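One way to satisfy the reproducibility and grep-ability points is a small check helper along these lines (a sketch only; the names and messages are invented):

    #include <stdio.h>
    #include <stdarg.h>
    #include <string.h>

    static int failures = 0;

    /* Print a fixed, greppable tag; details are only added on failure so that
     * successful runs produce a stable reference output. */
    static void check(int ok, const char *tag, const char *fmt, ...)
    {
        va_list args;

        if (ok)
        {
            printf("%s: ok\n", tag);
            return;
        }
        failures++;
        printf("%s: FAILED: ", tag);
        va_start(args, fmt);
        vprintf(fmt, args);
        va_end(args);
        printf("\n");
    }

    int main(void)
    {
        check(strlen("hello") == 5, "strlen.basic", "got %d", (int)strlen("hello"));
        return failures != 0;
    }

The 'strlen.basic' tag can then be grepped for in the source when the check fails, while the success output stays identical from one revision of the test to the next.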
To each test we associate a file containing the reference output for that test. If the test's output consists of a single "Test Ok", then that file may be omitted. (I am not sure this shortcut is actually needed/useful.)
Otherwise this file is called either:
 * 'xxx.ref'
 * or 'xxx.win95', 'xxx.win98', ... if the output depends on the Windows version being emulated. The winver-specific file takes precedence over the '.ref' file, and the '.ref' file, which should always exist, serves as a fallback.
This second feature is probably best avoided as much as possible, as multiple reference files are harder to maintain than a single one. But they may be useful for some APIs (can anyone think of any?). In any case I propose not to implement it until we actually find a need for it.
One may also create a file called 'xxx.ref.diff' (resp. 'xxx.win95.diff', etc.) which contains a diff between the test output on Windows and the test output in Wine. The goal is to:
 * make it unnecessary to tweak tests so they don't report known Wine shortcomings/bugs, or to remove these tests altogether
 * but not have hundreds of tests that systematically fail due to these shortcomings either (I can think of at least one case related to command line passing), because then you would not know whether more aspects of these tests fail than usual
The criteria to determine success/failure of a test unit xxx then become:

    xxx >xxx.out 2>&1
    if the return code is != 0 then the test failed
    diff -u xxx.ref xxx.out >xxx.diff
    if there is no xxx.ref.diff && xxx.diff is not empty then the test failed
    if xxx.diff is different from xxx.ref.diff then the test failed
    otherwise the test is successful
The test framework can then return three numbers:
 * the number of failed tests
 * the number of tests with known bugs
 * the total number of tests
(and/or various other differences between these numbers)
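Just to make these criteria concrete, here is a sketch of the per-test logic written in C; a real framework would more likely implement this as a makefile rule or a perl function, and the helper name and return values are made up:

    #include <stdio.h>
    #include <stdlib.h>

    /* returns 0 for success, 1 for failure, 2 for "fails only with known bugs" */
    static int run_one_test(const char *xxx)
    {
        char cmd[512];
        long size;
        FILE *f;
        int have_ref_diff;

        /* run the test, capturing stdout and stderr in xxx.out */
        snprintf(cmd, sizeof(cmd), "./%s >%s.out 2>&1", xxx, xxx);
        if (system(cmd) != 0) return 1;          /* non-zero exit code */

        /* compare the output with the reference output */
        snprintf(cmd, sizeof(cmd), "diff -u %s.ref %s.out >%s.diff", xxx, xxx, xxx);
        system(cmd);

        snprintf(cmd, sizeof(cmd), "%s.ref.diff", xxx);
        f = fopen(cmd, "r");
        have_ref_diff = (f != NULL);
        if (f) fclose(f);

        snprintf(cmd, sizeof(cmd), "%s.diff", xxx);
        if (!(f = fopen(cmd, "r"))) return 1;
        fseek(f, 0, SEEK_END);
        size = ftell(f);
        fclose(f);

        if (!have_ref_diff) return size != 0;    /* any difference is a failure */

        /* with a known-bugs diff the difference must match it exactly
         * (in practice the diff headers with file names and timestamps
         * would have to be stripped first) */
        snprintf(cmd, sizeof(cmd), "cmp -s %s.diff %s.ref.diff", xxx, xxx);
        return system(cmd) == 0 ? 2 : 1;
    }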
Test coverage
-------------
Each test should contain a section that looks something like:
    # @START_TEST_COVERAGE@
    # kernel32.CreateProcess
    # kernel32.GetCommandLineA
    # kernel32.GetCommandLineW
    # __getmainargs
    # __p__argc
    # __p_argv
    # __p_wargv
    # @END_TEST_COVERAGE@
The goal of this section is to identify which APIs are being tested by a given test unit. Each API is specified as 'dll.api'. If the dll name is omitted, then the API is assumed to be in the dll in which the test unit is located (in the above example that would be msvcrt).
Note that we cannot simply extract the list of APIs being called by a given test. For instance most tests are likely to call APIs like printf. And yet, printf should not be recorded as being tested by a thousand tests.
We can then write a script (sketched below) that uses this information to:
 * list the tests that cover a given API
 * build a list of the APIs that have no associated tests
 * build all sorts of fancy and maybe useful statistics
(the above section would work just as well inside C-style comments, one way to handle this is to ignore leading non-alphanumeric characters)
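Here is a sketch of the extraction side, in C only to stay consistent with the other examples (a perl one-liner would do just as well); the marker handling follows the format above:

    #include <stdio.h>
    #include <string.h>
    #include <ctype.h>

    /* Print the 'dll.api' entries found between the coverage markers of a
     * test source file, skipping leading comment characters and whitespace. */
    int main(int argc, char **argv)
    {
        char line[256];
        int in_section = 0;
        FILE *f;

        if (argc != 2 || !(f = fopen(argv[1], "r")))
        {
            fprintf(stderr, "usage: %s test-source\n", argv[0]);
            return 1;
        }
        while (fgets(line, sizeof(line), f))
        {
            char *p = line;
            /* ignore leading non-alphanumeric characters ('#', '*', spaces...) */
            while (*p && *p != '@' && !isalnum((unsigned char)*p)) p++;
            if (!strncmp(p, "@START_TEST_COVERAGE@", 21)) { in_section = 1; continue; }
            if (!strncmp(p, "@END_TEST_COVERAGE@", 19)) break;
            if (in_section && *p) fputs(p, stdout);
        }
        fclose(f);
        return 0;
    }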
Test environment
----------------
A test may need to know things about its environment, although hopefully this will be relatively rare. So I propose to store some information in environment variables, as this seems the least intrusive way to provide it (a usage sketch follows the list):
* TEST_OUT The name of the test output file. This may be useful if a test needs to create new processes and to redirect their output to temporary files. If the child processes need to add information to the test output, they can use this environment variable to open the right file.
* TEST_WINVER This contains the value of the '-winver' Wine argument, or the Windows version if the test is being run on Windows. We should mandate the use of specific values of winver so that tests don't have to recognize all the synonyms of win2000 (nt2k...), etc. (Do we need to distinguish between Windows and Wine?)
* TEST_BATCH If true (equal to 1) or unset, the test should assume that it is being run from within the test framework and thus should be non-interactive. If TEST_BATCH is set to 0, the test can assume that it is being run in interactive mode, and thus ask the user questions. Of course most tests will simply behave identically in both cases, but in some cases an interactive mode may be useful. For instance the test concerning CommandLineToArgvW could have an interactive mode where it asks the user for a command line and prints the corresponding argc/argv. This would allow a developer to manually check how the API behaves for specific command lines.
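Here is the usage sketch mentioned above, showing how a test (or one of its child processes) could consume these variables; nothing about it is mandatory:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        const char *out = getenv("TEST_OUT");
        const char *winver = getenv("TEST_WINVER");
        const char *batch = getenv("TEST_BATCH");
        int interactive = (batch && strcmp(batch, "0") == 0);

        /* a child process can append its messages to the test output */
        if (out)
        {
            FILE *f = fopen(out, "a");
            if (f)
            {
                fprintf(f, "child: started\n");
                fclose(f);
            }
        }

        /* adjust expectations to the emulated Windows version */
        if (winver && !strcmp(winver, "win95"))
            printf("using the win95 code path\n");

        /* only ask the user questions in interactive mode */
        if (interactive)
            printf("enter a command line to parse:\n");
        return 0;
    }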
I thought about passing these arguments on the command line but:
 * it seems less practical, especially since 'main' would have to parse them and store them somewhere
 * it may interfere with some tests (command line related tests, although only child processes should care about that)
 * it seems less expandable and flexible
Running tests
-------------
In Wine:
'make tests' seems the best way to do things. But some tests may need to create windows. For instance I have a DIB test that creates a window, draws a number of DIBs in it, checks the bitmap bits of these DIBs and then exits, so it is a non-interactive test (a sketch of such a test follows below). I am not really sure whether the window actually needs to be made visible or not, but even if this particular example does not require it, I suspect that others, checking for message sequences for instance, may need to make the window visible. And if the Wine test suite contains many such tests, then there will be windows popping up and down all the time and it would be impossible to use the computer while the tests are running. So we may:
 * have two test lists in the Makefiles: cui-tests (those that don't pop up windows) and gui-tests (those that do)
 * add two corresponding targets: 'make cui-tests' runs only those tests that do not pop up windows, and 'make gui-tests' runs only those tests that do
 * and make 'make tests' simply be 'tests: cui-tests gui-tests'
Of course, it should be noted that one way to deal with tests that pop up windows is to run them inside a VNC X server, a second X server or some other similar X trick.
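To give an idea of what such a GUI test could look like, here is a sketch along the lines of the DIB test mentioned above (the drawing and bit-checking parts are elided, and the class name is made up):

    #include <stdio.h>
    #include <windows.h>

    static LRESULT CALLBACK wnd_proc(HWND hwnd, UINT msg, WPARAM wp, LPARAM lp)
    {
        return DefWindowProcA(hwnd, msg, wp, lp);
    }

    int main(void)
    {
        WNDCLASSA wc = { 0 };
        HWND hwnd;
        HDC hdc;

        wc.lpfnWndProc = wnd_proc;
        wc.hInstance = GetModuleHandleA(NULL);
        wc.lpszClassName = "dib_test";
        if (!RegisterClassA(&wc))
        {
            printf("dib.init: RegisterClass failed\n");
            return 1;
        }
        hwnd = CreateWindowA("dib_test", "dib test", WS_OVERLAPPEDWINDOW,
                             0, 0, 200, 200, NULL, NULL, wc.hInstance, NULL);
        if (!hwnd)
        {
            printf("dib.init: CreateWindow failed\n");
            return 1;
        }
        /* ShowWindow(hwnd, SW_SHOW); may or may not be needed; if it is,
         * this test belongs in the 'gui-tests' list */

        hdc = GetDC(hwnd);
        /* ... create DIB sections, draw into them, compare the bitmap bits
         * against known values, printing one line per check ... */
        printf("dib.draw: ok\n");
        ReleaseDC(hwnd, hdc);

        DestroyWindow(hwnd);
        return 0;
    }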
In Windows:
Hmmm, not sure how that is done. Run 'winetest.exe'?
Writing tests
-------------
This is where we describe which APIs are available to a test writer... if any. I believe that very little functionality is necessary/useful (a sketch of possible implementations follows the list).
* test_failed(message) Sets a global variable to 1 to indicate failure and prints the specified message
* test_result() Returns 0 if test_failed() was never called, and 1 otherwise.
* get_test_out() Returns the contents of $TEST_OUT
* get_test_batch() Returns true if the test is being run in batch mode and false otherwise. Of course this is based on $TEST_BATCH
* get_test_winver() Returns the current Windows version, or the version being emulated if in Wine. This could simply be based on $TEST_WINVER.
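Here is the sketch of possible implementations mentioned above; the perl versions would be one-liners, and none of this is meant to be definitive:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static int test_status = 0;

    /* record the failure and print the message so it shows up in the output */
    void test_failed(const char *message)
    {
        test_status = 1;
        printf("%s\n", message);
    }

    /* 0 if test_failed() was never called, 1 otherwise;
     * meant to be returned from main() */
    int test_result(void)
    {
        return test_status;
    }

    const char *get_test_out(void)
    {
        return getenv("TEST_OUT");
    }

    const char *get_test_winver(void)
    {
        return getenv("TEST_WINVER");
    }

    /* batch mode unless TEST_BATCH is explicitly set to 0 */
    int get_test_batch(void)
    {
        const char *batch = getenv("TEST_BATCH");
        return !batch || strcmp(batch, "0") != 0;
    }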
That's about all. As you can see this is pretty basic stuff and I am not sure more is needed. But if you have ideas...
What is needed most is two sample tests:
 * one simple console-based test
 * another one involving some GUI stuff
Then the documentation could use them as examples and discuss the interesting aspects.
I believe that all the above is fairly neutral as far as perl vs. C is concerned. Except for the compilation issues, and maybe the exact command to use to invoke a test, whether a test is written in C or in perl should not make any difference.
--
Francois Gouget  fgouget@free.fr  http://fgouget.free.fr/
1 + e ^ ( i * pi ) = 0