Vit Hrachovy wrote:
Stefan Dösinger wrote:
[Any suggestions for a good app test framework?]
I'm personally using AutoHotkey (http://www.autohotkey.com) as an automated testing framework backend. It's a Windows application, so it can be run through Wine itself, and it's sandboxed from X11.
I second Vit's recommendation. We're using Autohotkey scripts in our test framework http://code.google.com/p/yawt/ and I've been meaning to work with Vit to get his scripts incorporated into our framework. (We've been distracted getting the next release of Picasa out, sorry.) - Dan
I had a look at AutoHotkey, and after some fiddling I managed to automatically load a predefined 3DMark2000 test, run it and write the results to disk.
The whole thing has some rough edges, like hotkeys sometimes not working from AutoHotkey (though they do when entered manually), which I have to work around by hardcoding button click coordinates. I also found myself unable to send a backslash (for a path), but overall I think it's a usable solution.
Apart from controlling the apps, we'll need some programs to start the application scripts, extract the results and do something with them. That could be some bash or perl scripts.
We also need some database to collect the results. This could be something webserver-based with a simple SQL database, or just another set of perl scripts on a server. Here I'd like to look at how much cxtest infrastructure we can reuse. Any volunteers for writing such a thing?
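As a minimal sketch of the glue layer described here (in Python rather than bash/perl, purely for illustration): the binary name, script name, and "3DMarks:" results format below are all assumptions, not anything the real tools mandate.

```python
# Hypothetical driver: run an AutoHotkey-scripted benchmark under Wine and
# extract the score from the results file the script wrote to disk.
# Script name, results file, and results format are illustrative guesses.
import re
import subprocess

def parse_score(text):
    """Pull a score out of a line like '3DMarks: 2450' (assumed format)."""
    m = re.search(r"3DMarks:\s*(\d+)", text)
    return int(m.group(1)) if m else None

def run_benchmark(script="run_3dmark.ahk", results="results.txt"):
    # AutoHotkey is a Windows app, so it runs through Wine itself.
    subprocess.run(["wine", "autohotkey.exe", script], check=True)
    with open(results) as f:
        return parse_score(f.read())
```

A server-side collector would then only need to accept the parsed number plus the Wine version and configuration it was produced on.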
On Tue, Jun 26, 2007 at 01:06:02PM +0200, Stefan Dösinger wrote:
Apart from controlling the apps, we'll need some programs to start the application scripts, extract the results and do something with them. That could be some bash or perl scripts.
We also need some database to collect the results. This could be something webserver-based with a simple SQL database, or just another set of perl scripts on a server. Here I'd like to look at how much cxtest infrastructure we can reuse. Any volunteers for writing such a thing?
Since yawt may be used as part of the cxtest framework, how about that, Dan? It may be less work to set up the cxtest pool than to create a new testing framework.
If cxtest shows some great architectural weaknesses, Vincent Povirk and I can build the necessary infrastructure tailored to Direct3D-specific needs.
Webserver & PHP for frontend, perl/bash/SQL for backend.
We should discuss the requirements more exactly:
1. Define demo packages
2. Define testing environment configurations
3. Define tracked result data items (SQL db architecture)
The resulting infrastructure shall be human-independent in the following terms:
- a human will prepare a test 'package' (one application) and the test units (individual tests)
- the machine will run the test units per package, interpret the results if possible, and store the results and run logs (with screenshots)
This will allow for automated test runs at a given time. AFAIK cxtest meets all the requirements defined above.
I'll try to play with cxtest and report back, preferably with some testing setup.
Regards Vit
We should discuss the requirements more exactly.
- Define demo packages
I think for a start we don't need the ability for users to add new demo packages. A fixed set of test packages that is modifiable by raw database access would be enough. Otherwise we'd end up with everyone adding new apps, and we'd be flooded with loads of junk data.
- Define testing environment configurations
Yep, that's needed for sure.
The resulting infrastructure shall be human-independent: a human will prepare a test 'package' (one application) and the test units (individual tests)
I'm rather thinking about preparing a set of downloadable test packages. That should be modularized, though.
the machine will run the test units per package, interpret the results if possible, and store the results and run logs (with screenshots)
The machine running the tests and the machine storing and interpreting the results are different in my planning. Also, I think we don't want to take screenshots when doing performance tests, since taking screenshots is an expensive operation. Similarly for verifying the rendering (which IMHO we shouldn't do) - that would have to be a second pass.
I must admit that I never looked at cxtest in much detail, but I thought a bit about a possible data model. We would roughly need the following tables:
Users: A username, and whatever else is needed for authentication. We don't want anonymous access, I think.
Configurations: Testing environment (hardware, operating system, etc.). Has to be creatable/modifiable by the user. One user should be able to use different configurations (like 2 different computers running the tests). The configs have to be editable without invalidating previous results, but whenever a config is edited, a test has to be run before and after the modification with the same Wine version. A possible reason for a config modification is a driver update.
Tests: A set of test apps. Can be admin-editable only; I'm also happy with setting that up with raw database access. A test specifies the application and the settings of the app, like 3DMark2000 at 800x600, 16 bit color depth. The test also has to know what sort of results are returned. Maybe we split this table up into base apps (3DMark2000, Half-Life 2, ...) and configs (different resolutions, etc.).
Concrete Test (or some different name): A configuration + a test + a test configuration. It should store some reference result value upon which we decide whether the app should generate a regression warning. An optional Windows comparison value can also be stored.
Test result: The result of a test run. An application can provide more than one result, like the various tests performed by 3DMark or different HL2 timedemos. Test results are provided by the users. Test results have to know the Wine version used to run the tests, plus the test and the configuration they were run on (i.e. the concrete test).
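As a sketch, the tables described above could look roughly like this (SQLite syntax; every table and column name here is an illustrative assumption, not a finalized schema):

```python
# Rough sketch of the proposed data model as a SQLite schema.
# All names and types are assumptions made for illustration only.
import sqlite3

SCHEMA = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    password_hash TEXT NOT NULL           -- no anonymous access
);
CREATE TABLE configurations (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    hardware TEXT,                        -- e.g. CPU, RAM, GPU
    os TEXT
);
CREATE TABLE tests (
    id INTEGER PRIMARY KEY,
    app TEXT NOT NULL,                    -- e.g. '3DMark2000'
    settings TEXT                         -- e.g. '800x600, 16 bit'
);
CREATE TABLE concrete_tests (
    id INTEGER PRIMARY KEY,
    configuration_id INTEGER NOT NULL REFERENCES configurations(id),
    test_id INTEGER NOT NULL REFERENCES tests(id),
    reference_result REAL,                -- regression-warning baseline
    windows_result REAL                   -- optional Windows comparison
);
CREATE TABLE test_results (
    id INTEGER PRIMARY KEY,
    concrete_test_id INTEGER NOT NULL REFERENCES concrete_tests(id),
    wine_version TEXT NOT NULL,
    result_name TEXT NOT NULL,            -- one app can return several results
    value REAL NOT NULL
);
"""

def create_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

Splitting `tests` into base apps and per-app configs, as suggested above, would only add one more table and a foreign key.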
Ideally users would download the test packages and run the tests daily, thus providing a constant flow of information about our d3d performance, but we should also be able to handle manual runs.
That's just my thought about the requirements. If cxtest has something slightly different, I'm sure we can adjust the requirements a bit.
Hi Stefan,
CxTest has a simpler data model than the one described above. CxTest has long been a general framework for testing applications, not games.
We (the people around CxTest) think that extending CxTest is a better approach to achieving your goals than writing a new application or modifying an existing one. It could be done with less effort (as opposed to AutoHotkey, AutoIt, ...) because CxTest is specialised for testing Wine and CrossOver and can deal with Wine's specifics. One big advantage is that the results web page already exists and is collecting reports from nightly tests.
For a start we can set up a new evaluation page on the CxTest site at http://www.cxtest.org/evaluation. The system under test (SUT) will send a report which the server will interpret as game performance results. To achieve this, the SUT will also send an XML file with processing instructions, for example: attachment result1.raw contains data for 2D graphs, and so on.
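Such a processing-instructions file could be as simple as the following sketch; the element and attribute names here are invented for illustration and are not CxTest's actual format:

```python
# Hypothetical processing-instructions XML a SUT might attach to its report.
# Element and attribute names are invented, not CxTest's real format.
import xml.etree.ElementTree as ET

def build_instructions(attachments):
    """attachments: (filename, kind) pairs, e.g. ('result1.raw', '2d-graph')."""
    root = ET.Element("processing")
    for fname, kind in attachments:
        ET.SubElement(root, "attachment", name=fname, interpret_as=kind)
    return ET.tostring(root, encoding="unicode")
```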
The CxTest core needs some modifications, because it makes extensive use of screenshots to detect whether the tested application is responsive. Implementing new functions for test scripts is also possible.
Note: if you are experimenting with cxtest, I strongly recommend using the CVS version of CxTest, because the cxtest.sh script is not updated often. I have also checked your 3DMark package. It is OK, but for better test reliability use the --window "window name" parameter for send_keystroke. In the CVS version fvwm is used instead of metacity, and the --window parameter is essential for the best results. For the IDs of windows, buttons, etc. you can use wpickclick/pickclick from the utils directory.
-- Michal Okresa
Hello Stefan,
the following is a specification proposal for the first few iterations. It is focused more on functionality than on technical details. Feel free to comment.
1st iteration
=============
* We take your 3dMark test, adjust it a bit, and make it run and submit results regularly on one of our testing machines. This involves adding support for user-defined results interpretation to the CxTest framework; that means the test creator specifies in the test what data to put into the tables/graphs generated by the CxTest server. Also some other things, like the ability to turn off the taking of screenshots.
* We add Direct3D test evaluation here: http://www.cxtest.org/evaluation Basically, you select a time frame and get a table similar to the attached one. Please comment.
* After a few iterations, when we are satisfied with how the previous point is implemented, we help you to run the 3dMark test regularly on your machine.
2nd iteration
=============
* We add graph support to the Direct3D evaluation page
* We add 2 more benchmark tests
3rd iteration
=============
* We make it easy for end users to run our set of tests. They will probably be part of the cxtest.sh standard tests (Wine "make test", WordViewer, ExcelViewer, PptViewer, Picasa, Direct3D tests - the latter optional; the user will be asked whether he has 3D acceleration and wants to run them, since they make the computer unusable for the duration of the test)
Martin
On 24/07/07, martin pilka mpilka@codeweavers.com wrote:
Looks OK to me. We'll probably want a different date format in the table, though, and to list the Wine versions below each other rather than next to each other.
On Tuesday, 24 July 2007 12:41, martin pilka wrote:
The roadmap looks good to me. I am missing some clarification on how to deal with different hardware configurations.
Stefan Dösinger wrote:
The roadmap looks good to me. I am missing some clarification on how to deal with different hardware configurations.
This is how CxTest supports it now:
* There is a 1:1 association between a submitter and his configuration, i.e.:
  "Stefan - Desktop" --> "AMD 1.8 GHz, 1 GB RAM, NVidia 256 MB VRAM, Debian 4.0"
  "Stefan - Laptop" --> "Intel 2 GHz, 1 GB RAM, ATI 128 MB VRAM, Ubuntu 7.04"
See http://www.cxtest.org/raw-data?id_result=592220 (note we need to add Video Chip and VRAM info)
* There are two different rows in the evaluation table, one for the Desktop and another for the Laptop. Each row has quite different values because of the different HW configurations. A significant decrease in FPS in a particular row means something is broken.
I think this could be enough for the first few iterations.
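The per-row check described above could be sketched like this; the 10% threshold is an arbitrary assumption, not a value either framework prescribes:

```python
# Sketch of a per-row regression check: flag any run whose FPS dropped by
# more than `threshold` relative to the previous run in the same row.
def find_regressions(fps_by_version, threshold=0.10):
    """fps_by_version: list of (wine_version, fps) in chronological order.
    Returns the versions whose FPS dropped more than `threshold`."""
    flagged = []
    for (_, prev_fps), (cur_v, cur_fps) in zip(fps_by_version, fps_by_version[1:]):
        if prev_fps > 0 and (prev_fps - cur_fps) / prev_fps > threshold:
            flagged.append(cur_v)
    return flagged
```

Comparing only within one row keeps different hardware configurations from masking each other's regressions.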
Regarding the table attached to my previous post, is that OK, or do you want to swap the rows and columns as H. suggested? Note that having the product (Wine) version in the columns is consistent with the rest of the CxTest site and helps with regression detection, see e.g. http://www.cxtest.org/product-evaluation?id_product=3&id_failure=170&...
Martin
On Tue, 2007-06-26 at 13:06 +0200, Stefan Dösinger wrote:
Apart from controlling the apps, we'll need some programs to start the application scripts, extract the results and do something with them.
We also need some database to collect the results.
We can also reuse an existing framework: winetest.exe and http://test.winehq.org/data
(Some small updates might be needed, for example the crash-detection timeout in winetest.)
Detlef Riekenberg wrote:
We can also reuse an existing framework: winetest.exe and http://test.winehq.org/data
Ah, this was the site I saw a while ago and could not remember anymore. Thanks!
http://test.winehq.org/data/200707031000/
I am wondering: why is the Wine column always empty? And what is tested under, e.g., the 'NT 4' column? Are the same tests run against the same Windows version every day?
Thanks, Martin
On Thu, 2007-07-05 at 15:16 +0200, martin pilka wrote:
Detlef Riekenberg wrote:
We can also reuse an existing Framework: winetest.exe and http://test.winehq.org/data
http://test.winehq.org/data/200707031000/
I am wondering, why is the Wine column always empty?
Nobody sent in results from winetest.exe on Wine for this version of winetest.exe.
You can see actual results for wine here: http://www.astro.gla.ac.uk/users/paulm/WRT/wrt.php
And what is tested under, e.g., the 'NT 4' column?
winetest.exe on Windows NT 4.0
You can also see entries for NT3: I have an NT 3.51 server installed in qemu here.
Are the same tests run against the same Windows version every day?
The optimum would be separate installations for every Windows version, running winetest after every commit. We are far away from that optimal testing situation.
On Thursday 12 July 2007 16:19:05 Detlef Riekenberg wrote:
You can see actual results for wine here: http://www.astro.gla.ac.uk/users/paulm/WRT/wrt.php
Sadly, WRT is very much in the past tense. It was based on a Heath Robinson arrangement involving CVS email notifications bouncing around our internal mail servers, delivered to a script, which created a file, which triggered ...
It was awful. It should be possible to build a much cleaner solution using git.
Paul.
Dan,
on the YAWT homepage, I read: "YAWT can be used either standalone, or as part of the cxtest.org automated regression test suite for Wine"
Do you mean it can generate result emails which can be parsed by the CxTest server? Or how does it work?
Thanks, Martin