https://bugs.winehq.org/show_bug.cgi?id=45417
Bug ID: 45417
Summary: Proprietary .NET 4.x program using Solid Framework .NET PDF OCR libraries generates different Word files in Wine compared to Windows
Product: Wine
Version: 3.11
Hardware: x86-64
OS: Linux
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: programs
Assignee: wine-bugs@winehq.org
Reporter: s.jegen@gmail.com
Distribution: ---
Created attachment 61734 --> https://bugs.winehq.org/attachment.cgi?id=61734 Word conversion result Linux
A proprietary program using the .NET SolidFramework for PDF processing with OCR produces different output when run on Wine compared to Windows (both result files attached).
The binaries in both cases differ only in the development license used. The dev licenses are machine specific, so we had to generate one for the Wine environment in order to get our program to run. The dev licenses do not differ in capability. All the DLL files are the exact same version (I compared the sha256 hashes).
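For anyone who wants to repeat the hash comparison, here is a minimal Python sketch. It assumes the DLLs from the Windows machine and from the Wine prefix have been copied into two local directories; the function names and the directory layout are illustrative and not part of the reported program.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_dlls(root: Path) -> dict[str, str]:
    """Map each DLL's lower-cased relative path under root to its digest."""
    return {
        p.relative_to(root).as_posix().lower(): sha256_of(p)
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".dll"
    }


def compare_dll_sets(windows_dir: Path, wine_dir: Path) -> dict[str, list[str]]:
    """Report DLLs that are missing on one side or whose contents differ."""
    win, wine = hash_dlls(windows_dir), hash_dlls(wine_dir)
    return {
        "only_in_windows": sorted(win.keys() - wine.keys()),
        "only_in_wine": sorted(wine.keys() - win.keys()),
        "different": sorted(k for k in win.keys() & wine.keys() if win[k] != wine[k]),
    }
```

Names are compared case-insensitively because Windows file names are case-insensitive while a Linux copy of the same tree may differ in case. Three empty lists mean the two DLL sets are byte-for-byte identical.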
The only difference between the Windows and the Wine runtime environments that I can think of is that on Windows we use the .NET 4.7 SDK and in Wine we use .NET 4.6.2. On Windows we most likely also have a different set of fonts installed, but I am not sure that is relevant in this case, since the OCR detection accuracy, which is what seems to differ, should not depend on the installed fonts.
I assume that the OCR detection works worse in Wine for some reason and because of that the background image does not get removed in the Wine result.
--- Comment #1 from Silvan s.jegen@gmail.com ---
Created attachment 61735 --> https://bugs.winehq.org/attachment.cgi?id=61735 Word conversion result Windows
--- Comment #2 from Silvan s.jegen@gmail.com ---
I wanted to attach the WINEDEBUG=+seh,+relay debug output when processing the file in Wine but the debug log is 315MB when gzipped...
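A log of that size is easier to share and compare once it is reduced to a summary. The Python sketch below streams a gzipped relay log and counts calls per module and function. It assumes the relay lines look like `0009:Call kernel32.CreateFileW(...) ret=...`; that pattern and the function names are assumptions for illustration, so adjust the regular expression to whatever the log actually contains.

```python
import gzip
import re
from collections import Counter
from typing import Iterable

# Assumed relay line shape: "<thread id>:Call <module>.<function>(<args>) ret=<addr>"
CALL_LINE = re.compile(r"^[0-9a-fA-F]+:Call\s+([^.\s(]+)\.([^\s(]+)\(")


def count_calls(lines: Iterable[str]) -> Counter:
    """Count relay 'Call' lines per (module, function) pair."""
    counts: Counter = Counter()
    for line in lines:
        match = CALL_LINE.match(line)
        if match:
            # Module names are lower-cased so KERNEL32 and kernel32 merge.
            counts[(match.group(1).lower(), match.group(2))] += 1
    return counts


def count_calls_in_gzip(path: str) -> Counter:
    """Stream a gzipped log line by line without loading it into memory."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return count_calls(handle)
```

Calling `count_calls_in_gzip("relay.log.gz").most_common(50)` (the file name is hypothetical) gives a list small enough to paste into a comment, and it shows at a glance which image or codec related DLLs the program touches at all.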
Zebediah Figura z.figura12@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |z.figura12@gmail.com
          Component|programs                    |-unknown
--- Comment #3 from Zebediah Figura z.figura12@gmail.com ---
This will admittedly probably be difficult to debug.
Still, at a guess, does native windowscodecs (e.g. `winetricks windowscodecs`) help at all?
--- Comment #4 from Silvan s.jegen@gmail.com ---
(In reply to Zebediah Figura from comment #3)
> This will admittedly probably be difficult to debug.
> Still, at a guess, does native windowscodecs (e.g. `winetricks windowscodecs`) help at all?
I tried to run it but it looks like it failed:
Using winetricks 20180603 - sha256sum: cad4e699f55c297afe5b177d68dccf1ef54e9dd23518a6f6343caa0ab7636615 with wine-3.11 and WINEARCH=win64
Executing w_do_call windowscodecs
Executing load_windowscodecs
Executing rm -f /home/silvan/.wine/dosdevices/c:/windows/syswow64/windowscodecs.dll /home/silvan/.wine/dosdevices/c:/windows/syswow64/windowscodecsext.dll /home/silvan/.wine/dosdevices/c:/windows/syswow64/wmphoto.dll /home/silvan/.wine/dosdevices/c:/windows/syswow64/photometadatahandler.dll
Executing rm -f /home/silvan/.wine/dosdevices/c:/windows/system32/windowscodecs.dll /home/silvan/.wine/dosdevices/c:/windows/system32/windowscodecsext.dll /home/silvan/.wine/dosdevices/c:/windows/system32/wmphoto.dll /home/silvan/.wine/dosdevices/c:/windows/system32/photometadatahandler.dll
Using native override for following DLLs: windowscodecs windowscodecsext
Executing wine regedit C:\windows\Temp_windowscodecs\override-dll.reg
Executing wine64 regedit C:\windows\Temp_windowscodecs\override-dll.reg
Setting Windows version to winxp
Executing wine regedit C:\windows\Temp_windowscodecs\set-winver.reg
Executing wine64 regedit C:\windows\Temp_windowscodecs\set-winver.reg
------------------------------------------------------
Running /usr/bin/wineserver -w. This will hang until all wine processes in prefix=/home/silvan/.wine terminate
------------------------------------------------------
Executing cd /home/silvan/.cache/winetricks/windowscodecs
------------------------------------------------------
Working around wine bug 32859 -- Working around possibly broken libX11
------------------------------------------------------
Executing taskset -c 0 wine wic_x64_enu.exe /passive
000d:err:module:import_dll Library windowscodecs.dll (which is needed by L"C:\windows\system32\winemenubuilder.exe") not found
000d:err:module:attach_dlls Importing dlls for L"C:\windows\system32\winemenubuilder.exe" failed, status c0000135
0012:fixme:wer:WerSetFlags (2) stub!
0012:fixme:heap:RtlSetHeapInformation (nil) 1 (nil) 0 stub
0019:fixme:heap:RtlSetHeapInformation 0x240000 0 0x23e830 4 stub
0019:fixme:wer:WerSetFlags (2) stub!
0019:fixme:heap:RtlSetHeapInformation (nil) 1 (nil) 0 stub
0033:err:winedevice:async_create_driver failed to create driver L"WineBus": c0000142
000f:fixme:service:scmdatabase_autostart_services Auto-start service L"WineBus" failed to start: 31
0009:fixme:clusapi:GetNodeClusterState ((null),0x32ec54) stub!
0009:fixme:advapi:DecryptFileA ("y:\376ca604fe9f503e67a964429c165189\", 00000000): stub
003b:fixme:setupapi:pSetupGetGlobalFlags stub
------------------------------------------------------
Note: command taskset -c 0 wine wic_x64_enu.exe /passive returned status 67. Aborting.
------------------------------------------------------
Seems to be this issue: https://github.com/Winetricks/winetricks/issues/954
Is there another way in which I can install these windowscodecs?
--- Comment #5 from Zebediah Figura z.figura12@gmail.com ---
Hmm, it seems like it doesn't work on 64-bit for some reason. If your program is 32-bit you could try in a 32-bit prefix.
--- Comment #6 from Silvan s.jegen@gmail.com ---
(In reply to Zebediah Figura from comment #5)
> Hmm, it seems like it doesn't work on 64-bit for some reason. If your program is 32-bit you could try in a 32-bit prefix.
Currently our program is only 64-bit but I have to compile a 32-bit version anyways. I will do that next week and check that one with windowscodecs in a 32-bit prefix.
--- Comment #7 from Zebediah Figura z.figura12@gmail.com ---
(In reply to Silvan from comment #6)
> (In reply to Zebediah Figura from comment #5)
> > Hmm, it seems like it doesn't work on 64-bit for some reason. If your program is 32-bit you could try in a 32-bit prefix.
> Currently our program is only 64-bit but I have to compile a 32-bit version anyways. I will do that next week and check that one with windowscodecs in a 32-bit prefix.
FWIW, it should also work if you manually edit the winetricks script to set the version to win2k3 instead of winxp (function "load_windowscodecs"). See https://github.com/Winetricks/winetricks/issues/970.
--- Comment #8 from Silvan s.jegen@gmail.com ---
(In reply to Zebediah Figura from comment #7)
> (In reply to Silvan from comment #6)
> > (In reply to Zebediah Figura from comment #5)
> > > Hmm, it seems like it doesn't work on 64-bit for some reason. If your program is 32-bit you could try in a 32-bit prefix.
> > Currently our program is only 64-bit but I have to compile a 32-bit version anyways. I will do that next week and check that one with windowscodecs in a 32-bit prefix.
> FWIW, it should also work if you manually edit the winetricks script to set the version to win2k3 instead of winxp (function "load_windowscodecs"). See https://github.com/Winetricks/winetricks/issues/970.
After I edited the script manually, installing the codecs worked. However, when running our app, the output was no different from before installing them.
I agree that this will be a difficult thing to debug. If you think it would help, I can upload the big trace somewhere for you to download?
--- Comment #9 from Silvan s.jegen@gmail.com ---
Ah, I forgot to mention that after installing windowscodecs and running our program again there were a lot of
000d:fixme:seh:RtlCaptureStackBackTrace (1, 3, 0x530cc2d0, (nil)) stub!
000d:err:menubuilder:convert_to_native_icon error 0x88982F81 initializing encoder
000d:fixme:seh:RtlCaptureStackBackTrace (1, 3, 0x530cc2f8, (nil)) stub!
000d:err:menubuilder:convert_to_native_icon error 0x88982F81 initializing encoder
errors showing up. Everything worked the same as before though.
--- Comment #10 from Zebediah Figura z.figura12@gmail.com ---
I suppose then windowscodecs isn't involved. I'm not sure what else would be used to read images (and I'm not sure what else could cause a discrepancy). I doubt I can get anything from a relay log, but I'll try taking a look at one anyway. If not perhaps someone else will know what to look for.
--- Comment #11 from Silvan s.jegen@gmail.com ---
(In reply to Zebediah Figura from comment #10)
> I suppose then windowscodecs isn't involved. I'm not sure what else would be used to read images (and I'm not sure what else could cause a discrepancy). I doubt I can get anything from a relay log, but I'll try taking a look at one anyway. If not perhaps someone else will know what to look for.
I uploaded the WINEDEBUG=+seh,+relay output here:
https://drive.google.com/file/d/1oKM97VFQ2zWSq1FY9IQcDSa2XNuCXC9X/view?usp=s...
Please let me know if you can think of anything else we could try.
I now tried another file containing not scans but jpgs resulting from converting an existing PDF. Those JPGs are very clean which means they shouldn't pose a big challenge for the OCR system. It turns out however that on Linux with Wine, while the text has been recognized correctly, some of the lines are missing the spaces between words for some reason. As far as I remember this is not an issue on Windows. I will attach the resulting .docx file and the original PDF file.
--- Comment #12 from Silvan s.jegen@gmail.com ---
Created attachment 61758 --> https://bugs.winehq.org/attachment.cgi?id=61758 Word conversion result for cleaner source file - Wine
--- Comment #13 from Silvan s.jegen@gmail.com ---
(In reply to Silvan from comment #12)
> Created attachment 61758 [details] Word conversion result for cleaner source file - Wine
The source PDF file is at the link below:
https://drive.google.com/file/d/1heqZLGOd4HE1cPEPX8VbJyYZismvqi7-/view?usp=s...
I wanted to attach it but it was too big.
--- Comment #14 from Silvan s.jegen@gmail.com ---
(In reply to Silvan from comment #11)
> I now tried another file containing not scans but jpgs resulting from converting an existing PDF. Those JPGs are very clean which means they shouldn't pose a big challenge for the OCR system. It turns out however that on Linux with Wine, while the text has been recognized correctly, some of the lines are missing the spaces between words for some reason. As far as I remember this is not an issue on Windows. I will attach the resulting .docx file and the original PDF file.
This is actually a false positive. It turns out that the output files seem to be identical from what I can tell. It's just that LibreOffice Writer, which I use on Linux to look at the .docx files, cannot properly display these files. The Windows and the Linux/Wine versions actually look identical when viewed in Microsoft Word.
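A viewer-independent check avoids this kind of rendering confusion. The Python sketch below extracts the paragraph text straight from `word/document.xml` inside each .docx file and diffs the two results. The function names are illustrative. It compares text only, so embedded images such as a background scan are not covered, and text inside nested structures such as text boxes may be reported more than once.

```python
import difflib
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML main namespace used by .docx documents.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(path) -> list[str]:
    """Return the text of every paragraph (w:p) in the main document part."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    return [
        "".join(node.text or "" for node in para.iter(W + "t"))
        for para in root.iter(W + "p")
    ]


def diff_docx_text(left, right) -> list[str]:
    """Unified diff of the paragraph text of two .docx files (empty if equal)."""
    return list(
        difflib.unified_diff(
            docx_paragraphs(left), docx_paragraphs(right),
            "left", "right", lineterm="",
        )
    )
```

An empty result from `diff_docx_text` means both conversions contain the same text, including the spaces between words, regardless of how a particular word processor renders it.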
--- Comment #15 from Zebediah Figura z.figura12@gmail.com ---
(In reply to Silvan from comment #14)
> This is actually a false positive. It turns out that the output files seem to be identical from what I can tell. It's just that LibreOffice Writer, which I use on Linux to look at the .docx files, cannot properly display these files. The Windows and the Linux/Wine versions actually look identical when viewed in Microsoft Word.
In that case I'm presuming this bug can be closed (and a new one probably filed with OpenOffice)? Or is there still a discrepancy somewhere?
--- Comment #16 from Silvan s.jegen@gmail.com ---
(In reply to Zebediah Figura from comment #15)
> (In reply to Silvan from comment #14)
> > This is actually a false positive. It turns out that the output files seem to be identical from what I can tell. It's just that LibreOffice Writer, which I use on Linux to look at the .docx files, cannot properly display these files. The Windows and the Linux/Wine versions actually look identical when viewed in Microsoft Word.
> In that case I'm presuming this bug can be closed (and a new one probably filed with OpenOffice)? Or is there still a discrepancy somewhere?
While the results of the cleaner source file do seem to be identical, the results for the less clean file (the first two attachments) are not (neither in LibreOffice nor in Word on Windows).
I am not sure what else to try and at the moment we work around the problem anyways by using a different approach...