Test data files

Dear OCCT-members,

I've scanned the document "Automatic Test System" and have come across this section:

"Before running tests, make sure to define environment variable CSF_TestDataPath pointing to the directory containing test data files. (Publicly available data files can be downloaded from http://dev.opencascade.org separately from OCCT code.)"

Unfortunately, I cannot find the test data on this website.

Have I overlooked something or have the test data files not been uploaded yet?

Please advise!

Thank you
Pawel

Andrey BETENEV's picture

Hello Pawel,

Thank you for the question!
Sorry for not making this clear before: unfortunately the data files are not yet ready for publication and are waiting for analysis and validation by our legal department.
I hope this will be completed within the next few months.

Andrey

jordi's picture

Hello Andrey,
Is there any news about this issue? I'm also interested in running the tests in our environment...

Regards
Jordi

Andrey BETENEV's picture

Hello Jordi,

Sorry, publication of the data files is still waiting for validation by our legal department and we cannot commit to any definite date when this will happen.

Nevertheless, you can run the tests in their current state: about 1/3 of the tests do not require data files and should pass, and the remaining ones will be skipped.

Andrey

Sébastien Raymond's picture

Hello,

Any news about the release of this data?
Since the upgrade to OCCT 6.9.0, I'm having trouble on Mac OS X 10.10.
The Geom_BezierCurve constructor is crashing,
and I'm trying to understand what is going on.

Regards,

Sébastien

Kirill Gavrilov's picture

This is most likely an issue with the new Xcode compiler - see issue #0026042.
You may try the workaround for this issue in branch CR26042_2 of the OCCT git repository.

Sébastien Raymond's picture

Thanks, I'll have a look.

Dc Lu's picture

Hello kgv,

Is there any news about the test data files for the automatic testing system?

I would like to run the complete set of test cases. Thanks!

Bests
Dc

Kirill Gavrilov's picture

Hello hsldcen,

OCCT test data is currently spread across two main locations:
1. The "data" folder within the main OCCT repository.
This location is expected to be kept small to avoid bloating the source code repository, and mostly consists of sample files that are a historical part of the OCCT distribution.
2. A dedicated location (referred to by CSF_TestDataPath) with ever-growing test files accumulated from users and customers.
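
As a rough illustration of the two locations above, here is a minimal Python sketch that reports which of them actually exist on a given machine. The function name find_test_data_dirs and the occt_root argument are hypothetical, and the handling of several directories in CSF_TestDataPath separated by the platform path separator is an assumption, not something stated in this thread.

```python
import os
from pathlib import Path


def find_test_data_dirs(occt_root, env=None):
    """Return the test data directories that exist on disk.

    occt_root is the checkout of the OCCT sources (hypothetical argument);
    env defaults to the process environment.
    """
    env = os.environ if env is None else env
    found = []

    # Location 1: the small "data" folder inside the source repository.
    repo_data = Path(occt_root) / "data"
    if repo_data.is_dir():
        found.append(repo_data)

    # Location 2: whatever CSF_TestDataPath points to.
    # Assumption: several directories may be joined with os.pathsep.
    for entry in env.get("CSF_TestDataPath", "").split(os.pathsep):
        if entry and Path(entry).is_dir():
            found.append(Path(entry))

    return found
```

If the second location is missing, the list simply contains the repository "data" folder (or nothing), which matches the situation of an external contributor.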

Many OCCT tests do not require any test data at all - they generate geometry on the fly or perform unit-test-like checks - although the share of such tests is relatively small compared to the whole test system. The data shipped with OCCT itself (the "data" folder) is enough to run a few more regression tests. Currently, tests are not separated by test data availability, so the whole test grid can be run and tests in the "SKIPPED" state simply ignored.
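
To make "run everything and ignore SKIPPED" concrete, here is a small Python sketch that tallies results from a test log and lists only the cases that need attention. The line shape "CASE <group> <grid> <test>: <STATUS>" and the sample lines are assumptions made for illustration; adjust the pattern to whatever your test run actually writes.

```python
import re
from collections import Counter

# Assumed line shape: "CASE <group> <grid> <test>: <STATUS> optional text"
CASE_LINE = re.compile(r"^CASE\s+(\S+)\s+(\S+)\s+(\S+):\s+(\w+)")


def summarize(log_text, ignore=("SKIPPED",)):
    """Count statuses and collect cases that are neither OK nor ignored."""
    counts = Counter()
    attention = []
    for line in log_text.splitlines():
        match = CASE_LINE.match(line.strip())
        if not match:
            continue  # not a result line
        group, grid, name, status = match.groups()
        counts[status] += 1
        if status != "OK" and status not in ignore:
            attention.append((f"{group}/{grid}/{name}", status))
    return counts, attention


# Hypothetical sample, not real output.
SAMPLE = """\
CASE demo draw simple: OK
CASE bugs modalg bug1: SKIPPED (data file is missing)
CASE bugs step bug2: FAILED
"""

if __name__ == "__main__":
    counts, attention = summarize(SAMPLE)
    print(dict(counts))
    print(attention)
```

Skipped cases are still counted, so you can see how much of the grid was not exercised, but they do not show up in the list of cases to investigate.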

This is the way to go if you are an external contributor - just work with the available data, which should already help in testing a patch. The remaining tests might help to detect trickier issues.

The complete database consists of a lot of files (several gigabytes). Apart from testing the original problem reported with a file, this data is used extensively to detect possible collateral damage from a new patch by checking for deviations in shape statistics, performance statistics, screenshots and crashes compared to the previously tested state.

This strategy helps to increase overall test coverage and to check the robustness of algorithms on real-world data (as opposed to a limited number of unit tests and generated datasets).

Coming back to the database of files, it is easy to see that most users do not want their models to be publicly available for various reasons - confidentiality, copyright, etc. Sometimes users cannot even share the data files needed to reproduce a problem.

This is like testing video/audio decoding software - you wouldn't expect Disney, ABC or any other producer to make their entire movies and television series publicly available just for testing the open source FFmpeg project. But obviously, real-world data is necessary for continuous regression testing in such projects. It is not enough to write a specification-compliant decoder; you also have to cope with gray areas in specifications, handle the many broken implementations that write ill-formed files, process corrupted files, and deal with other problems.

In the case of the FFmpeg project, FATE collects various truncated video/audio samples contributed over the years by FFmpeg users and developers for testing purposes. Truncated data samples sound like a perfect solution, but this approach is barely applicable to CAD software, which suffers from numerical instabilities, combinatorial explosion, divergence between CAD kernels in shape validity criteria and other kinds of problems that are very different from the multimedia domain.

One small piece of geometry extracted from a large STEP file is not necessarily similar to the other parts (unlike a video stream, where the video sequence remains very similar for the entire duration). As a result, CAD software needs real-world data for regression testing, most of which is confidential or cannot be disclosed for other reasons.

The exception could be the models attached to public bugs on the Bugtracker by the OCCT Community, but the number of such contributions is far too small for comprehensive testing. This Forum thread raises the question of whether it is possible to collect such contributed data files, separate them from the non-public part and make them available to external developers for regular testing, but obviously there is no chance of confidential models ever becoming available for testing to external contributors.

Regards,
Kirill

Dc Lu's picture

I just saw this long answer. Thank you, got it. Please ignore the other message I sent you.

Best,
Dc

Kirill Gavrilov's picture

If you just want to understand what a particular test case or test grid does, you may ask on the Forum if that would help with your research. While the reference data might be confidential, what a test case does is usually open and can be described in general terms.

For historical reasons, many tests lack a description within the test itself and contain only a reference to the originating bug (either in text like "OCC26922" or in the file name), and the bug might not be publicly available. Note, however, that although the bug itself might be inaccessible on the Bugtracker, its description can normally be found in the Changelog coming with the Release Notes and/or within the git history, as each commit contains a bug number.
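
The lookup described above can be sketched in Python as follows. The helper names are hypothetical, and two details are assumptions: that test file names follow a "bug<number>" pattern, and that commit messages carry the bug number zero-padded to seven digits (as in "#0026042" earlier in this thread). Only the standard `git log --grep` option is used.

```python
import re

# "OCC26922" inside a test script, or "bug26922" in a file name (assumed pattern).
BUG_REF = re.compile(r"\b(?:OCC|bug)(\d+)", re.IGNORECASE)


def bug_ids(text):
    """Return the distinct bug numbers referenced in text, in order of appearance."""
    ids = []
    for match in BUG_REF.finditer(text):
        number = match.group(1)
        if number not in ids:
            ids.append(number)
    return ids


def git_log_command(bug_id, width=7):
    """Build a git command that searches commit messages for the bug number.

    Assumption: commit messages contain the id zero-padded to `width` digits.
    """
    return ["git", "log", "--oneline", f"--grep={bug_id.zfill(width)}"]


if __name__ == "__main__":
    for number in bug_ids('puts "OCC26922"'):
        print(" ".join(git_log_command(number)))
```

Running the printed command inside a clone of the OCCT repository should list the commits whose messages mention that bug, which is usually enough to learn what the test was written for.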

In some rare cases, a test case might even have lost the original bug scenario or test nothing useful - such cases can be difficult to identify without deep analysis of the scenario and the related bug descriptions.