Hi,
Jan Kratochvil <rcpt-ros-kernel.AT.reactos.com_at_jankratochvil.net> wrote:
As the GPLed Linux-NTFS still has no NTFS r/w capability, I completed the project for reliable r/w access the Wine way, by using the MS-Windows ntfs.sys driver.
The goal of the Linux-NTFS project is to provide completely open source code to access and manage NTFS filesystems, not just r/w access by whatever means. It's a big difference. There has been a full r/w driver for Linux from Paragon for a while; it costs money, but it doesn't come with legal threats.
Why hasn't open source write capability been achieved? In short, there has been no active development for over a year, and before that the code base was completely rewritten after the former developer(s) stopped working on it 3-4 years ago. Otherwise write access is pretty close: I'd say 2-4 months of hard work (plus another 1-3 months if one isn't familiar with NTFS).
There are rumours GPL may be a problem for linking with proprietary code (ntfs.sys & ntoskrnl.exe)
From the GPL point of view I don't think so. But I could imagine one isn't legally allowed to use ntfs.sys and ntoskrnl.exe the way he/she wants. This needs to be checked (and consider that what's legal today can be illegal tomorrow, so keep checking).
Or just ask Microsoft. At the same time you could also ask if they have patents on NTFS. Contrary to the rumours, I couldn't find any.
There was a fuss years ago about Microsoft wanting to sue Linux NTFS developers, but the fact seems to be that Microsoft threatened only Jeff V. Merkey, because he had signed an NDA with Microsoft and was nevertheless developing and distributing NTFS tools for Linux. Microsoft denied all of this publicly. You can google up the full story but IMHO it's better just to ask Microsoft.
Szaka
Hi,
On Tue, 21 Oct 2003 11:47:48 +0200, Szakacsits Szabolcs wrote: ...
The goal of the Linux-NTFS project is to provide completely open source code to access and manage NTFS filesystems, not just r/w access by whatever means. It's a big difference.
I respect your Free software approach, I'm a big Free software advocate. I consider NTFS problem a special case.
My solution gives people who grew up on MS Windows or are forced to use it (like employees in the office) a bridge to the GNU/Linux environment. These people run proprietary SW already and with my solution they've got a way out.
Using the original ntfs.sys can simply give better compatibility than any 3rd party GPLed reimplementation could ever have.
NTFS has no meaning as a standalone filesystem - there are better GPL high performance mature filesystems such as ext3. NTFS for GNU/Linux OS has its only meaning as a temporary compatibility hack.
There has been a full r/w driver for Linux from Paragon for a while; it costs money, but it doesn't come with legal threats.
Paragon http://www.ntfs-linux.com/ really supports r/w NTFS for GNU/Linux (thanks for reference) but it is a commercial closed-source (*) product and it is even an IMO dangerous way to modify your NTFS drives due to the reverse-engineering disadvantages described above.
(*) By running any closed-source software on your GNU/Linux OS you completely lose your system security. Although it is already lost on dual-booting machines, the security sandboxing feature of Captive NTFS may come in handy for GNU/Linux sysadmins needing to repair NTFS disk drives.
Besides using the original NTFS driver, Captive NTFS further ensures disk safety by committing any disk modifications only after the successful unmount of the modified disk by its emulated MS-Windows kernel subsystem.
...
But I could imagine one isn't allowed to use ntfs.sys, ntoskrnl.exe the way he/she wants legally.
Applicable laws vary between countries - legal analysis for major countries would be welcome. I have an affirmation from the professional IT lawyer JUDr. Jiri Cermak that Captive NTFS is legally valid at least in my home country.
Regards, Lace
On Tue, 21 Oct 2003, Jan Kratochvil wrote:
I consider NTFS problem a special case.
Probably because you consider NTFS the product of the "evil empire", not a mature technology designed/developed by experienced professionals and used on over a hundred million computers today ;)
NTFS has no meaning as a standalone filesystem - there are better GPL
Why not? NTFS has all the features that other Linux filesystems have and even more. Why couldn't it be used as a standalone filesystem?
Let me give an example: the (Linux) NTFS driver supports transparent compression. Today's CPUs are very fast and disk bandwidth is very slow by comparison. So when doing bulk data transfers, disk bandwidth is the bottleneck. Would you be faster using the filesystem's transparent compression? In theory yes: the CPU can [de]compress while transferring data, and you also have less data to transfer. No mainstream Linux fs supports this; hopefully Reiser4 will in the future ... (there was hope for e2compr as well for many years but it didn't work out after all).
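To make the bandwidth argument concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch: the function name and every number are made up for illustration, and it assumes disk reads and decompression overlap perfectly, which a real driver may not achieve.

```python
def effective_read_rate(disk_mb_s, compression_ratio, decompress_mb_s):
    """Logical MB/s delivered to the application (hypothetical model).

    disk_mb_s         -- raw disk bandwidth in MB/s
    compression_ratio -- uncompressed size / compressed size (>= 1.0)
    decompress_mb_s   -- rate at which the CPU emits uncompressed data
    """
    # The disk delivers compressed bytes; each one expands by the ratio.
    disk_limited = disk_mb_s * compression_ratio
    # With perfect overlap, the slower of the two stages sets the pace.
    return min(disk_limited, decompress_mb_s)


# Illustrative numbers only: a 40 MB/s disk, data that compresses 2:1.
plain = effective_read_rate(40.0, 1.0, float("inf"))
fast_cpu = effective_read_rate(40.0, 2.0, 200.0)
slow_cpu = effective_read_rate(40.0, 2.0, 50.0)
print(plain, fast_cpu, slow_cpu)
```

In this model compression only helps while the CPU can decompress faster than the disk can deliver; once the CPU becomes the slower stage, it caps the gain.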
high performance mature filesystems such as ext3. NTFS for GNU/Linux OS has its only meaning as a temporary compatibility hack.
You mean just like the fat driver, samba, wine, etc? They make things work together. You have the source, you can port it to other OSes, fix bugs, improve it, etc.
Paragon http://www.ntfs-linux.com/ really supports r/w NTFS for GNU/Linux (thanks for reference) but it is a commercial closed-source (*) product and it is even an IMO dangerous way to modify your NTFS drives due to the reverse-engineering disadvantages described above.
I don't think Paragon reverse engineered the NTFS driver; I think they licensed the technology from Microsoft. SDK, documents, whatever.
But I could imagine one isn't allowed to use ntfs.sys, ntoskrnl.exe the way he/she wants legally.
Applicable laws vary between countries - legal analysis for major countries would be welcome. I have an affirmation from the professional IT lawyer JUDr. Jiri Cermak that Captive NTFS is legally valid at least in my home country.
IMHO today definitely, but AFAIK your country will become part of the EU next year and the patent laws are just being discussed nowadays (it doesn't look too promising). This may or may not be related to using ntfs.sys legally outside of Windows. Also, is there actually any patent issue?
Szaka
Hi,
On Tue, 21 Oct 2003 17:41:19 +0200, Szakacsits Szabolcs wrote:
On Tue, 21 Oct 2003, Jan Kratochvil wrote:
...
NTFS has no meaning as a standalone filesystem - there are better GPL
Why not? NTFS has all the features that other Linux filesystems have and even more. Why couldn't it be used as a standalone filesystem?
As its data structures are undocumented. And it IMO does not make sense to fork its documented derivative out of it (there was already one with unhappy end).
Let me give an example: the (Linux) NTFS driver supports transparent compression.
OK, it is a feature missing in current GPLed (incl. specification) filesystems. It still makes more sense to me to implement the feature in an existing GPL filesystem instead of implementing the same feature in NTFS with its uncertain data structures. But we are talking about free software - do what you get paid for.
...
IMHO today definitely, but AFAIK your country will become part of the EU next year and the patent laws are just being discussed nowadays (it doesn't look too promising).
Compatibility reasons should permit the use case even by these illegal laws. Fortunately these laws do not apply now and the future is not known (at least for some people).
Regards, Lace
On Tue, 21 Oct 2003, Jan Kratochvil wrote:
On Tue, 21 Oct 2003 17:41:19 +0200, Szakacsits Szabolcs wrote:
Why not? NTFS has all the features that other Linux filesystems have and even more. Why couldn't it be used as a standalone filesystem?
As its data structures are undocumented.
Lots of people spent lots of time many years ago to figure them out. There are at least 4-5 different ntfs implementations based on that work. Most of them are read-only, but if you know how to read, surely you can write as well. But due to concurrent access, it's much more difficult to implement.
And it IMO does not make sense to fork its documented derivative out of it (there was already one with unhappy end).
Sorry, I don't get what you mean. The old NTFS driver? It didn't check the NTFS version, so when Microsoft improved NTFS slightly (Win2K) and thus updated its on-disk version number, the old driver tried to use it as an NT4 NTFS. Trivial driver bug. Unfortunately nobody fixed it for a very long time, so it ruined many people's filesystems. Thus, because of the Linux driver bug, Microsoft became even more evil and NTFS completely undocumented in people's eyes; later on the patent rumours were added, etc.
The rewritten drivers and ntfsprogs check the NTFS version and exit if it's unknown.
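The idea of that check can be sketched in a few lines of Python. This is not the driver's actual code: the function and exception names are hypothetical, and the accepted set (1.2 for NT4, 3.0 for Windows 2000, 3.1 for Windows XP) is an illustrative assumption about which on-disk versions a tool might choose to trust.

```python
# (major, minor) on-disk versions this hypothetical tool has been tested with.
KNOWN_NTFS_VERSIONS = {(1, 2), (3, 0), (3, 1)}


class UnsupportedVersion(Exception):
    """Raised instead of guessing at an unknown on-disk layout."""


def check_ntfs_version(major, minor):
    # Refuse anything outside the tested set rather than treating it
    # as an older, known layout (the bug described above).
    if (major, minor) not in KNOWN_NTFS_VERSIONS:
        raise UnsupportedVersion(
            "NTFS %d.%d is unknown to this tool, refusing to touch it"
            % (major, minor)
        )
    return (major, minor)
```

The design choice is to fail closed: an unrecognised version stops the tool before any read or write interprets the volume with the wrong assumptions.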
BTW, it would be interesting if someone could check what Longhorn's NTFS version number is (i.e. whether it changed or not). E.g. ntfsresize -i /device would tell it.
Let me give an example: the (Linux) NTFS driver supports transparent compression.
OK, it is a feature missing in current GPLed (incl. specification) filesystems. It still makes more sense to me to implement the feature in an existing GPL filesystem instead of implementing the same feature in NTFS with its uncertain data structures.
It's not uncertain, it has been known for a while. Moreover, NTFS isn't only a new filesystem. It's also an important interoperability issue. Think about FAT. Isn't it good that there are open source FAT drivers all over? It even made it possible to quickly adapt to X-BOX's FATX. Soon FAT will go away completely and only NTFS will stay.
But we are talking about free software - do what you get paid for.
I don't understand this. There are many reasons why people write open source software. Charity, ego boost, religion, fun, getting paid, etc.
Szaka
Hi,
On Tue, 21 Oct 2003 21:05:57 +0200, Szakacsits Szabolcs wrote:
On Tue, 21 Oct 2003, Jan Kratochvil wrote:
...
Most of them are read-only, but if you know how to read, surely you can write as well.
Block allocation bitmaps (or Btrees or whatever) do not need to be understood for writing. You can ignore a zillion of unknown data fields during read (such as checksums). You can read the whole hashtables while not understanding the hash function. You do not need to know maximum sizes of data structures. etc.
[ NTFS ]
And it IMO does not make sense to fork its documented derivative out of it (there was already one with unhappy end).
Sorry I don't get what you mean. The old NTFS driver?
Sorry, I meant OS/2 HPFS/NTFS. The old story of how the development of the NT kernel split between IBM and Microsoft at some point.
...
OK, it is a feature missing in current GPLed (incl. specification) filesystems. It still makes more sense to me to implement the feature in an existing GPL filesystem instead of implementing the same feature in NTFS with its uncertain data structures.
It's not uncertain, it has been known for a while.
A filesystem must be a rock solid data storage structure. You must know the meaning of each byte (*) for such a reliable and interoperable filesystem. NTFS is not documented at such a level - even in the case of the documented compression structures there are missing points in the underlying generic NTFS data structures.
(*) You do not need to know the journalling metadata as long as you do not support journalling and/or its recovery.
...
But we are talking about free software - do what you get paid for.
I don't understand this. There are many reasons why people write open source software. Charity, ego boost, religion, fun, getting paid, etc.
I have not seen much real software written for any other purpose, YMMV.
Lace
On Tue, 21 Oct 2003, Jan Kratochvil wrote:
... Most of them are read-only, but if you know how to read, surely you can write as well.
Block allocation bitmaps (or Btrees or whatever)
Yes, there are all the three :)
do not need to be understood for writing.
I guess you mean reading? Anyway it depends on the driver quality. All needs to be known for both read and write.
For example, the block allocation bitmap is used for the filesystem consistency check by ntfsresize. It has caught a lot of inconsistent NTFS volumes, hardware errors and an extremely rare NTFS case we weren't aware of before (but it's supported now).
You can ignore a zillion of unknown data fields during read (such as checksums).
Yes, one can, but we don't. The new NTFS driver has very rigorous checking, including checking of the several checksums.
You do not need to know maximum sizes of data structures. etc.
NTFS (and Microsoft) defines the maximum size. Moreover, even if it didn't, one could build in limits known to work, test bigger limits when the need arises, and release if the tests pass.
Sorry, I meant OS/2 HPFS/NTFS. The old story of how the development of the NT kernel split between IBM and Microsoft at some point.
OS/2 wasn't open source.
A filesystem must be a rock solid data storage structure. You must know the meaning of each byte (*) for such a reliable and interoperable filesystem.
Exactly. Every needed byte is known.
NTFS is not documented at such a level - even in the case of the documented compression structures there are missing points in the underlying generic NTFS data structures.
I don't know exactly what you mean. The public NTFS documentation is somewhat outdated. The real documentation is the source code. And as I wrote, the Linux NTFS driver can handle compressed files.
(*) You do not need to know the journalling metadata as long as you do not support journalling and/or its recovery.
One of the unknown issues is journaling :) If the volume is marked dirty, the driver refuses to mount it (unless forced).
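The mount policy just described can be sketched as follows. This is only an illustration in Python: the function name is hypothetical, and the flag value is an assumption about where the dirty bit sits in the volume flags rather than something taken from this thread.

```python
# Assumed bit for "volume is dirty" in the NTFS volume flags.
VOLUME_IS_DIRTY = 0x0001


def may_mount(volume_flags, force=False):
    """Return True if a cautious driver would agree to mount the volume."""
    # A dirty volume may have journal records that were never replayed.
    # A driver that cannot replay them refuses unless the user insists.
    if volume_flags & VOLUME_IS_DIRTY and not force:
        return False
    return True
```

For instance, `may_mount(0x0001)` refuses, while `may_mount(0x0001, force=True)` hands the risk to the user.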
Szaka
Hi,
On Tue, 21 Oct 2003 22:38:41 +0200, Szakacsits Szabolcs wrote:
On Tue, 21 Oct 2003, Jan Kratochvil wrote:
...
do not need to be understood for writing.
I guess you mean reading? Anyway it depends on the driver quality. All needs to be known for both read and write.
Not all information needs to be known for reading:
For example, the block allocation bitmap is used for the filesystem consistency check by ntfsresize.
Consistency checking is a useful add-on feature, although it is not required for successful reading.
Sorry, I meant OS/2 HPFS/NTFS. The old story of how the development of the NT kernel split between IBM and Microsoft at some point.
OS/2 wasn't open source.
I spoke about OS/2 as an example of a split NTFS development. As I hope you agree it is not appropriate to split the development of NTFS filesystem in a way incompatible with Microsoft Windows NT.
A filesystem must be a rock solid data storage structure. You must know the meaning of each byte (*) for such a reliable and interoperable filesystem.
Exactly. Every needed byte is known.
It may be true, but when I started coding my Captive NTFS project there was no free implementation of read/write NTFS. As I do not trust reverse engineered complex data structures for content as sensitive as user data, I implemented read/write NTFS in the way that is most reliable in my opinion.
BTW there is still no non-Captive free implementation of read/write NTFS.
"free" as my implementation is not "Free" as it is using proprietary drivers. "Free" as defined by: http://www.gnu.org/philosophy/free-sw.html
"free" as there exist commercial read/write NTFSes for GNU/Linux. VMware Workstation+W32 ($299+???), Paragon NTFS for Linux ($69.95)
...
One of the unknown issues is journaling :) If the volume is marked dirty, the driver refuses to mount it (unless forced).
BTW my implementation will recover the journalled data using the LFS (Log File System) contained in ntfs.sys, with the support of the Captive Cache Manager subsystem featuring LSNs (Linear Sequence Numbers). But I admit that compatible journalling is not much of a requirement for GNU/Linux read/write NTFS.
Regards, Lace
Hi,
On Tue, 21 Oct 2003, Jan Kratochvil wrote:
On Tue, 21 Oct 2003 22:38:41 +0200, Szakacsits Szabolcs wrote:
I guess you mean reading? Anyway it depends on the driver quality. All needs to be known for both read and write.
Not all information needs to be known for reading:
It depends on the driver quality. Everything needs to be known for both read and write. In other words, one must know how the system works if he/she wants _reliable_ reading, i.e. the data one reads is trustworthy, not some bogus value.
For example, the block allocation bitmap is used for the filesystem consistency check by ntfsresize.
Consistency checking is a useful add-on feature, although it is not required for successful reading.
Again, depends on quality. Specifically we are talking about NTFS, right? Below is one of the many rebuttals from practice.
Partimage supports NTFS (experimental). It was implemented based on the public NTFS documentation (it doesn't use the Linux-NTFS source base). For example, it doesn't check the filesystem consistency; it just saves the blocks marked as in use in the block allocation bitmap ($Bitmap). Thus if the filesystem is inconsistent (it does happen) then it will save the wrong blocks: it might save blocks that are unused and/or, what's really fatal, it might not save blocks that are in use. Lost data.
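The failure mode above can be illustrated with a small Python sketch that compares the allocation bitmap against the clusters actually referenced by files. It is a toy model, not ntfsresize's or Partimage's code: the function names are hypothetical, the bitmap is assumed to use one bit per cluster with the least significant bit first, and the set of referenced clusters is assumed to have been collected elsewhere by walking all file records.

```python
def bitmap_in_use(bitmap, cluster_count):
    """Cluster numbers flagged as in use (bit n of the bitmap = cluster n)."""
    return {n for n in range(cluster_count) if bitmap[n // 8] >> (n % 8) & 1}


def check_bitmap(bitmap, cluster_count, referenced):
    """Compare the bitmap with the clusters that files really own."""
    marked = bitmap_in_use(bitmap, cluster_count)
    leaked = marked - referenced    # flagged in use, owned by nothing
    unmarked = referenced - marked  # owned by a file, flagged free
    return leaked, unmarked


# Toy volume of 8 clusters: the bitmap flags clusters 0, 2 and 3,
# while the files actually own clusters 0, 2 and 5.
leaked, unmarked = check_bitmap(bytes([0b00001101]), 8, {0, 2, 5})
print(sorted(leaked), sorted(unmarked))
```

An imaging tool that trusts the bitmap alone would copy cluster 3 needlessly and, worse, skip cluster 5, which is the lost-data case described above.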
I spoke about OS/2 as an example of a split NTFS development. As I hope you agree it is not appropriate to split the development of NTFS filesystem in a way incompatible with Microsoft Windows NT.
If you have the source you can make transparent extensions or just fork it and do whatever is needed. Many open source projects do this; it's one of the strengths of open source.
But I can't see such "danger". NTFS is pretty modular, plugin-like, something like what Reiser4 promises (but Reiser4 looks even cooler). MS just plugged in what people need: compression, encryption, quota, link, WinFS support, etc.
It may be true but at the start of my Captive NTFS project coding there was no free implementation of read/write NTFS.
No one should expect the NTFS kernel driver to be finished in the near future unless active development starts. But because people think NTFS is undocumented (untrue), the chances are even smaller than they could be (and those who could do it don't have the time for it).
As I do not trust reverse engineered complex data structures for such sensitive content as the user data
The reverse engineered, complex data structures have been validated and in use for years.
I implemented the read/write NTFS in a way most reliable in my opinion.
OK, I won't repeat here what I wrote on the other list; if one is interested he/she can read it here:
http://reactos.com:8080/archives/public/ros-kernel/2003-October/000294.html
Just a new one, robustness. I didn't test Captive NTFS but looking at the system design it's a bit too complex. Error-prone. With a variable black box (there are a lot of different ntfs.sys) in the middle. Sorry but I wouldn't trust it for a minute. I sincerely hope I will be disproved by millions of happy users.
BTW there is still no non-Captive free implementation of read/write NTFS.
How long did it take to implement Captive NTFS? My estimation for a free (full GPL) r/w implementation is
needed kernel knowledge: 3-9 months if one doesn't have it yet
NTFS knowledge: 1-3 months if one doesn't have it yet
finish r/w NTFS v2 driver: 2-6 months
Szaka
Hi,
On Wed, 22 Oct 2003 10:32:31 +0200, Szakacsits Szabolcs wrote:
On Tue, 21 Oct 2003, Jan Kratochvil wrote:
...
Just a new one, robustness. I didn't test Captive NTFS but looking at the system design it's a bit too complex. Error-prone. With a variable black box (there are a lot of different ntfs.sys) in the middle. Sorry but I wouldn't trust it for a minute.
The project design separates all the bug-prone code into an inner isolated box (sandbox), leaving only a thin, fully open source UNIX layer around it: http://www.jankratochvil.net/project/captive/doc/Details.html.pl#sandbox
(The diagram does not appropriately show the real amount of codebase as 95% of the project effort is implemented in the center 'ntoskrnl.exe' box.)
As the disk data blocks are flushed only after the successful unmount operation, there is little chance the data structures would be corrupted. Any Captive bugs (*) during the project development cause a sanity crash where no disk blocks get modified. If any crash occurs you may lose your last-minute written files, although the disk remains intact. Any such crash is logged to /var/log/messages with all the filenames involved.
(*) There remains one sensitive part: the undocumented Cache Manager subsystem of Windows NT, as an incompatible emulation of it would directly affect the disk data blocks. The behaviour of the original Cache Manager was thoroughly traced with TraceFS.sys, which I wrote, and its output was afterwards sanity checked by my special error-checking Perl implementation of the Cache Manager, in order to fully understand it for a successful emulation: http://www.jankratochvil.net/project/captive/doc/CacheManager.html.pl http://www.jankratochvil.net/pipermail/captive-devel-list/2003-October/00000...
To ensure the maximum level of safety it is also recommended to use the Microsoft Checked Build (assert()s enabled) release of ntfs.sys/ntoskrnl.exe, although its main benefits were seen during the Captive development. More discussion of Captive NTFS paranoia error prevention can be found at: http://www.jankratochvil.net/project/captive/doc/Details.html.pl#paranoia http://www.jankratochvil.net/project/captive/doc/Details.html.pl#parent_conn...
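The deferred-commit behaviour described above (disk blocks flushed only after a successful unmount) can be sketched in Python as a write-back overlay over a block device. This is a simplified illustration and not Captive's implementation: the class and method names are hypothetical, the "disk" is just an in-memory bytearray, and the final flush itself is not made atomic here.

```python
class DeferredBlockDevice:
    """Buffers all writes in memory and commits them only on unmount."""

    def __init__(self, backing, block_size=512):
        self.backing = backing      # bytearray standing in for the disk
        self.block_size = block_size
        self.pending = {}           # block number -> new block contents

    def read_block(self, n):
        # Reads see pending writes first, so the filesystem code gets a
        # consistent view of its own uncommitted changes.
        if n in self.pending:
            return self.pending[n]
        off = n * self.block_size
        return bytes(self.backing[off:off + self.block_size])

    def write_block(self, n, data):
        if len(data) != self.block_size:
            raise ValueError("block must be exactly block_size bytes")
        self.pending[n] = bytes(data)   # nothing touches the disk yet

    def crash(self):
        # A crash simply drops the buffered writes; the disk is intact.
        self.pending.clear()

    def unmount(self):
        # Only a successful unmount copies the changes to the disk.
        for n, data in sorted(self.pending.items()):
            off = n * self.block_size
            self.backing[off:off + self.block_size] = data
        self.pending.clear()
```

The trade-off matches the description above: a crash costs the files written since the mount, but it never leaves a half-updated filesystem on the disk.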
...
How long did it take to implement Captive NTFS? My estimation for a free (full GPL) r/w implementation is
needed kernel knowledge: 3-9 months if one doesn't have it yet
NTFS knowledge: 1-3 months if one doesn't have it yet
finish r/w NTFS v2 driver: 2-6 months
It does not make sense to me to compare LinuxNTFS and Captive NTFS, as there is just a completely different level of reliability (no replies, please :-) ). The primary goal of Captive NTFS was to reach _reliable_ NTFS and, according to the betatesting phase of the project, it was successfully reached.
Also any further versions of NTFS filesystem require no or little effort for Captive NTFS while it may - or may not - require another major reverse engineering effort for the LinuxNTFS approach.
My implementation of the part of the Microsoft Windows Kernel API required by ntfs.sys (own 116 funcs, orig/modified ReactOS funcs 113, 83 of ntoskrnl.exe) took 14 months, without any prior Microsoft Windows experience. The time includes the final deployment preparation such as the packaging and installer. As the whole project is GPLed and no profit was made out of it, it could not be my primary task.
BTW I initially expected less effort for my project. The 14 months above is the real final duration, while work estimations are always very optimistic. Any NTFS implementation is mostly about debugging, where no real estimation is possible.
Lace
Reading data from a file system is much easier than writing. It's kind of like saying: I can extract the data from a database raw, but just because I can do that doesn't mean I can account for all the undocumented nuances that I am ignoring to get the data in the first place, and that the original system will consider corrupt if I miss even one bit in the proper place.
I can almost certainly extract data from most anything (not encrypted) in a reasonable time frame, but it takes hundreds of times longer to figure out how to write that data in a manner that won't trigger an issue.
Steven