safe_rename() and verifying the result of link(2)

Steffen Nurpmeso steffen at sdaoden.eu
Sun Sep 2 00:33:46 UTC 2018


Vincent Lefevre wrote in <20180831155541.GA30644 at cventin.lip.ens-lyon.fr>:
 |On 2018-08-24 18:54:16 +0200, Steffen Nurpmeso wrote:
 |> Oh, wait!  This was false remembrance, i referred to a message
 |> from Casper Dik of Oracle who wrote on 2015-12-31
 |> 
 |>>/* Create a unique file. O_EXCL does not really work over NFS so \
 |>>we follow
 |>> * the following trick (inspired by S.R. van den Berg):
 |>> * - make a mostly unique filename and try to create it
 |>> * - link the unique filename to our target
 |>> * - get the link count of the target
 |>> * - unlink the mostly unique filename
 |>> * - if the link count was 2, then we are ok; else we've failed */
 |> 
 |>   The problem of not being able to create a file with O_EXCL was, \
 |>   I think,
 |>   fixed in NFSv3 (if not, certainly in NFSv4)
 |> 
 |>   Casper
 |> 
 |> so this was not about link but about O_EXCL.
 |
 |Yes, a well-known issue.

Well.  RFC 1094 from 1989, section 2.2.10, Create File, says

   The file "name" is created in the directory given by "dir".  The
   initial attributes of the new file are given by "attributes".  A
   reply "status" of NFS_OK indicates that the file was created, and
   reply "file" and reply "attributes" are its file handle and
   attributes.  Any other reply "status" means that the operation failed
   and no file was created.

   Notes:  This routine should pass an exclusive create flag, meaning
   "create the file only if it is not already there".

So well-known, yes, but i lack the background to say why.  Maybe
i should ask, or look in some NFS sources.
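
For reference, the link-count trick from the quoted comment can be
sketched roughly as below.  This is only a minimal sketch under
assumptions: create_exclusive() and the ".tmp.<pid>.<time>" naming are
hypothetical, and it checks the link count of the temporary name
rather than of the target (both name the same inode if the link
succeeded), so that a target which already existed with two links is
not mistaken for success.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: create `target` (a path inside `dir`) only if it
 * does not exist yet, without relying on O_EXCL for the target itself.
 * Returns 0 on success, -1 on failure with errno set. */
int create_exclusive(const char *dir, const char *target)
{
    char tmp[4096];
    struct stat st;
    int fd, n, ok;

    /* A "mostly unique" name: pid plus time.  Real code would likely
     * add the host name as well (assumption, not from the quote). */
    n = snprintf(tmp, sizeof tmp, "%s/.tmp.%ld.%ld",
                 dir, (long)getpid(), (long)time(NULL));
    if (n < 0 || (size_t)n >= sizeof tmp) {
        errno = ENAMETOOLONG;
        return -1;
    }

    /* O_EXCL is harmless here, but the trick does not depend on it. */
    fd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    close(fd);

    /* The return value of link() is deliberately ignored: the reply may
     * get lost, so the link count is what decides. */
    (void)link(tmp, target);
    ok = (stat(tmp, &st) == 0 && st.st_nlink == 2);

    unlink(tmp);
    if (!ok) {
        errno = EEXIST; /* simplification: the real cause may differ */
        return -1;
    }
    return 0;
}
```

Calling it twice for the same target should succeed the first time and
fail the second time, leaving the target with a link count of 1.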

 |> About a year later
 |> (2016-11-02) there was a pair of messages between Stèphane and
 |> Jörg about links via NFS, as in "IIRC there were issues with ln on
 |> NFS for instance." and "Could you please explain what you have in
 |> mind?  I would like to understand whether there really is a NFS
 |> problem or whether there is just a NFS bug in Linux", but nothing
 |> more than that.
 |
 |Perhaps it could just be the one already mentioned:
 |
 |  On NFS filesystems, the return code may be wrong in case the NFS server
 |  performs the link creation and dies before it can say so.  Use  stat(2)
 |  to find out if the link got created.
 |
 |which, I suppose, is impossible to solve.

The network connection can break at any time.  RFC 1094 says

  Also, most server failures occur between operations, not
  between the receipt of an operation and the response.  Finally,

A nice encouragement.

  although actual server failures may be rare, in complex networks,
  failures of any network, router, or bridge may be indistinguishable
  from a server failure.

One could also go and look locally whether the file exists or not.
No, i mean, it is hard to say and depends on the context.  If you
are linking hundreds of files in sequence, an additional stat for
each one is a problem, except (maybe) for the last file; if you are
only doing one link it may very well provide confidence.  Of course
it is still racy: some other process may well have had the rights to
overwrite, remove or move away the target in the meantime.  Maybe
doing a stat on the target directory, in order to verify that the
server is still alive, would be enough.  But then: which software is
actually flexible enough to create a (real) batch, if batching as
above were possible?
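
For the single-link case, the "use stat(2) to find out" advice could
look roughly like the following.  Again only a sketch under
assumptions: link_verified() is a hypothetical name, not anything that
exists in mutt, and the check is of course as racy as said above.

```c
#include <errno.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: link `src` to `dst`, then do not trust the
 * return code alone but compare device and inode of both names.
 * Returns 0 if `dst` names the same file as `src` afterwards,
 * -1 otherwise (errno set). */
int link_verified(const char *src, const char *dst)
{
    struct stat ssrc, sdst;
    int rc = link(src, dst);
    int saved = errno;

    if (lstat(src, &ssrc) != 0 || lstat(dst, &sdst) != 0) {
        /* Prefer the error of link() if it failed, too. */
        if (rc != 0)
            errno = saved;
        return -1;
    }

    /* Same device and inode: the link exists, whatever link() said. */
    if (ssrc.st_dev == sdst.st_dev && ssrc.st_ino == sdst.st_ino)
        return 0;

    errno = (rc != 0) ? saved : EIO;
    return -1;
}
```

Note that this also reports success if dst had been a hard link to src
all along, so it cannot tell "we created it" from "it was already
there"; the link-count comparison is needed for that.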

A nice Sunday i wish.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

