Re: [PATCH 0/6] Extended file stat system call

28 Apr 2012


On Fri, Apr 27, 2012 at 01:31:07PM -0600, Andreas Dilger wrote:
...
On 2012-04-27, at 7:13 AM, Dave Chinner wrote:
...
Have a look at fs/xfs/xfs_dinode.h. There's a bunch of flags defined
at the bottom of the file.
Stuff like the "nodefrag", "nodump", and "prealloc" bits seem fairly
generic - they are for indicating that files are to be avoided for
defrag or backup purposes, the prealloc bit indicates that fallocate
has been used to reserve space on the inode (finding files that space
can be punched out of safely), and so on.
There is already the FS_NODUMP_FL in the standard FS_IOC_GETFLAGS ioctl
and I expect this to be in statxat() also.
I forgot that was one of the generic flags :/
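
For reference, the generic nodump flag discussed above can already be read from userspace through the existing ioctl. What follows is only a minimal sketch, assuming a Linux system that provides <linux/fs.h>. The helper name file_has_nodump() is made up for illustration, and the int-sized flags argument follows the ioctl_iflags(2) man page rather than anything stated in this thread.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fs.h>   /* FS_IOC_GETFLAGS, FS_NODUMP_FL */
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any patch in this thread).
 * Returns 1 if the nodump flag is set on the file, 0 if it is clear,
 * and -1 if the flags cannot be read (open failed, or the filesystem
 * does not implement FS_IOC_GETFLAGS).
 */
int file_has_nodump(const char *path)
{
    int flags = 0;
    int fd, rc;

    assert(path != NULL);

    /* O_NONBLOCK so that opening a FIFO or device does not block. */
    fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd < 0)
        return -1;

    rc = ioctl(fd, FS_IOC_GETFLAGS, &flags);
    close(fd);
    if (rc < 0)
        return -1;

    return (flags & FS_NODUMP_FL) ? 1 : 0;
}
```

A backup tool could call file_has_nodump(path) per file and skip those that return 1. Treating -1 as "dump it anyway" is one possible policy, not the only one.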
...
In ext4 there was also an
EXT4_EOFBLOCKS_FL added for inodes with fallocate'd data beyond EOF,
but Eric thought it was a pain to maintain and it has been deprecated
in ext4 and e2fsprogs recently.
I'd think that flag is more of a "filesystem implementation
specific" flag than a general "this file contained persistent
preallocation" flag, which is essentially what the XFS flag says.
XFS uses it in various ways to optimise extent management on the file
(e.g. don't truncate extents past EOF when closing the file), but it
is not specific to one particular aspect of the preallocation
implementation.
...
...
...
OTOH, there's plenty of uncommitted space, so if we can condense
the hints down to something small, we could perhaps add it later -
but from your paragraph above, it doesn't sound like it'll be small.
Allocation block size, minimum sane IO size (to avoid page cache RMW
cycles or DIO zeroing), minimum preferred IO size (e.g. stripe unit),
optimal IO size for bandwidth (e.g. stripe width). I don't think
there's much more than that which will be really usable by
applications.
I think this is a minimal set that makes sense, and is manageable for
both the interface and for users.  Even if it isn't 100% correct for
every file of every filesystem, it still makes sense for many systems.
That's the aim, isn't it? To expose what is useful to the majority
in a simple manner?
...
I'd suggest st_frsize (like BSD statvfs() f_frsize) would be the
minimum fragment or page size, st_iosize (BSD f_iosize) could be
the optimal IO size, and "st_stripesize" for the minimum preferred RAID/chunk size.
Personally, I think those names are, well, terribly lacking in
obviousness. Something more along the lines of:
    st_blksize		- file block size
    st_alloc_blksize	- allocation block size/alignment
    st_small_io_size	- IO size/alignment that avoids
    			  filesystem/page cache RMW
    st_preferred_io_size	- preferred IO size for general
    			  usage.
    st_large_io_size	- IO size/alignment for high
    			  bandwidth sequential IO
With the aim that applications tend to use st_preferred_io_size for
all general IO (i.e. the default), st_small_io_size for small IO,
IOPS intensive workloads, and st_large_io_size for writing large
chunks of sequential data.
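
To make the intended usage concrete, here is a rough sketch of how an application might choose between the three IO size hints. Everything in it is an assumption for illustration: struct io_size_hints only mirrors the field names proposed above (no such structure exists in the patch set), the enum and both function names are hypothetical, and treating a zero field as "hint not provided" is a guess at one reasonable fallback policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container mirroring the proposed st_* field names. */
struct io_size_hints {
    uint32_t blksize;            /* file block size */
    uint32_t alloc_blksize;      /* allocation block size/alignment */
    uint32_t small_io_size;      /* avoids filesystem/page cache RMW */
    uint32_t preferred_io_size;  /* default for general usage */
    uint32_t large_io_size;      /* high bandwidth sequential IO */
};

/* Hypothetical workload classes matching the paragraph above. */
enum io_workload { IO_GENERAL, IO_SMALL_RANDOM, IO_LARGE_SEQUENTIAL };

/*
 * Pick the hint for a workload. Assumption: a zero field means the
 * filesystem gave no hint, so fall back to the preferred size and
 * then to the plain block size.
 */
uint32_t pick_io_size(const struct io_size_hints *h, enum io_workload w)
{
    uint32_t size = 0;

    assert(h != NULL);

    switch (w) {
    case IO_SMALL_RANDOM:
        size = h->small_io_size;
        break;
    case IO_LARGE_SEQUENTIAL:
        size = h->large_io_size;
        break;
    case IO_GENERAL:
        size = h->preferred_io_size;
        break;
    }
    if (size == 0)
        size = h->preferred_io_size;
    if (size == 0)
        size = h->blksize;
    return size;
}

/* Round len up to the next multiple of align (no overflow check). */
uint64_t round_up_to(uint64_t len, uint32_t align)
{
    assert(align > 0);
    return (len + align - 1) / align * align;
}
```

An application would size its buffer with pick_io_size() and align request lengths with round_up_to(), so a streaming writer ends up issuing stripe-width sized IO without knowing anything about the underlying layout.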
...
One could argue that "st_blksize" is used for the "optimal IO size"
on Linux today, but this is an overloaded term.  It _appears_ to
represent the filesystem blocksize, which it usually is not, and on
BSD st_bsize means the minimum blocksize and has a confusingly
similar name.  Since any application using this API needs to do some
extra coding already, we may as well give the structure members good
names that are not ambiguous.
Well said - I couldn't have stated the case better myself. ;)
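
The overloading described above is easy to observe with the interfaces that exist today. This is a small sketch using only POSIX stat() and statvfs(). The struct size_report and get_size_report() names are hypothetical, and how the three values relate to each other depends on the filesystem.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/statvfs.h>

/* Hypothetical holder for the three similarly named size fields. */
struct size_report {
    long st_blksize;         /* stat(): preferred I/O block size */
    unsigned long f_bsize;   /* statvfs(): file system block size */
    unsigned long f_frsize;  /* statvfs(): fundamental block size */
};

/* Fill *out for path. Returns 0 on success, -1 if either call fails. */
int get_size_report(const char *path, struct size_report *out)
{
    struct stat st;
    struct statvfs vfs;

    assert(path != NULL && out != NULL);

    if (stat(path, &st) != 0)
        return -1;
    if (statvfs(path, &vfs) != 0)
        return -1;

    out->st_blksize = (long)st.st_blksize;
    out->f_bsize = vfs.f_bsize;
    out->f_frsize = vfs.f_frsize;
    return 0;
}
```

Printing the three fields for files on different filesystems shows why a reader cannot tell from the names alone which one is the "optimal IO size" and which is the allocation unit.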
Cheers,
Dave.
-- 
Dave Chinner
david@fromorbit.com
