mstpan 5 - Files

Fri Dec 5 22:00:00 2014

I can C: for files and files and files ...

This being perl, there's a wide range of things you might want to do and lots of ways to do them, of varying qualities.

File::Spec

Core, standard, method-based means of slicing, dicing and reassembling paths. If your code might ever need to run on Windows, you're going to be using this or something that uses it for you. For trivial scripts, File::Spec is usually quite sufficient.
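
As a rough sketch of what the method-based interface looks like, here is a path being assembled and taken apart again. The directory and file names are made up for illustration.

```perl
use strict;
use warnings;
use File::Spec;

# Build a path from its parts, using the separator of the current platform.
my $path = File::Spec->catfile('logs', '2014', 'app.log');

# Take it apart again: volume (empty on unix), directory part, filename.
my ($volume, $dir_part, $file) = File::Spec->splitpath($path);

# canonpath drops the trailing separator so splitdir returns no empty element.
my @dirs = File::Spec->splitdir(File::Spec->canonpath($dir_part));

print "path: $path\n";
print "file: $file\n";
print "dirs: @dirs\n";
```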

File::Spec::Functions

... but you probably wanted to use File::Spec::Functions instead, because if it's a script then the faffing involved in calling methods is entirely pointless. Actually, the same is probably true in most classes, just remember to apply namespace::autoclean or whatever so people subclassing don't end up terminally confused.
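
The same sort of thing with plain functions might look like the sketch below. catdir and catfile are exported by default; splitdir has to be asked for by name. The path components are hypothetical.

```perl
use strict;
use warnings;
use File::Spec::Functions qw(catdir catfile splitdir);

# No class name to repeat on every call, just functions.
my $dir   = catdir('var', 'cache');
my $file  = catfile($dir, 'index.db');
my @parts = splitdir($file);

print "file:  $file\n";
print "parts: @parts\n";
```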

File::stat

Can you remember the order of the return values of stat? Congratulations, you aren't going senile yet. Apparently I am, so I much prefer to load the (core) module that overrides the builtin to provide a named interface.
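
A minimal sketch of the named interface, run against a throwaway temp file so there is something to stat:

```perl
use strict;
use warnings;
use File::stat;
use File::Temp qw(tempfile);

# Create a small scratch file; it is removed when the program exits.
my ($fh, $filename) = tempfile(UNLINK => 1);
binmode $fh;                      # avoid newline translation on Windows
print {$fh} "hello\n";
close $fh or die "close failed: $!";

# With File::stat loaded, stat() returns an object instead of a 13 element list.
my $st = stat($filename) or die "stat failed: $!";

printf "size:  %d bytes\n", $st->size;
printf "perms: %04o\n",     $st->mode & 07777;
printf "mtime: %s\n",       scalar localtime($st->mtime);
```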

There's actually a bunch of such modules - Time::localtime and ::gmtime, plus User::grent and ::pwent - and they're disappointingly underused in my experience.
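
Time::localtime works the same way, for example. This sketch builds a date stamp without having to remember which list slot holds the month; note that the year and month still carry the usual offsets.

```perl
use strict;
use warnings;
use Time::localtime;

# With Time::localtime loaded, localtime() returns an object with named fields.
my $now = localtime();

# year is still years since 1900 and mon is still zero based.
my $stamp = sprintf '%04d-%02d-%02d',
    $now->year + 1900, $now->mon + 1, $now->mday;

print "today is $stamp\n";
```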

autodie

autodie is, basically, a giant bag of crack, balanced precariously atop Fatal, which is an even bigger bag of tainted crack.

Actually, the autodie implementation seems to be inside Fatal.pm now with the added fun of some splicing of @_ and goto on the way through (which probably segfaults 5.8.4 but never mind) so apparently somebody's mixed the crack and ... no, ok, this metaphor has broken down now. Start again.

autodie is awesome and if you're doing trivial scripty stuff and just want things like chdir() to throw an exception for you, it's a brilliant idea. Just try and avoid ever finding out how the sausage is made. The author already sacrificed his SAN so that we don't have to (thanks, PJF).
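
For the chdir() case, a sketch along these lines shows the exception turning up without any "or die" in sight. The directory name is hypothetical and assumed not to exist.

```perl
use strict;
use warnings;
use autodie;    # chdir, open, close and friends now throw on failure

my $missing = '/no/such/directory/hopefully';    # assumed not to exist

my $error;
eval {
    chdir $missing;    # throws an autodie::exception on failure
    1;
} or $error = $@;

if (ref $error && $error->isa('autodie::exception')) {
    printf "caught a failure in %s\n", $error->function;
}
```

In a real script you would normally just let the exception propagate and kill the program, which is rather the point.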

I would, in general, try and avoid needing it by using something slightly higher level but it's been in core since 5.10.1 so if you haven't yet read the entry that explains fatpacking (which, incidentally, I haven't yet written, so it's not actually your fault yet), it's going to seem like a great answer.

Be warned that it can't wrap system() without IPC::System::Simple installed.

Be aware that IPC::System::Simple is awesome in its own right and I use that regularly even when I don't have autodie enabled. Seriously. Go have a look.
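
A quick sketch of using it directly, assuming IPC::System::Simple is installed from CPAN. It runs the current perl ($^X) as the child process so that it doesn't depend on any particular external tool.

```perl
use strict;
use warnings;
use IPC::System::Simple qw(capturex systemx);

# capturex never invokes the shell and returns the command's output.
my $output = capturex($^X, '-e', 'print "hello"');
print "child said: $output\n";

# systemx throws if the command fails to start or exits non-zero.
my $ok = eval { systemx($^X, '-e', 'exit 3'); 1 };
print "child failed: $@\n" unless $ok;
```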

File::Open

Probably an even better answer than using autodie to make open() do the right thing - File::Open provides an fopen function that returns a filehandle on success and throws an exception on error, with minimal cleverness in the process. Written by Lukas Mai, whose pedantry and attention to detail know no bounds (see also Data::Munge, which may also make you happy).
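
Something like the following sketch, assuming File::Open is installed from CPAN. The file names are made up and everything lives in a temp directory.

```perl
use strict;
use warnings;
use File::Open qw(fopen);
use File::Spec::Functions qw(catfile);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = catfile($dir, 'notes.txt');

# Modes are the familiar C style strings; failure throws instead of returning false.
my $out = fopen($file, 'w');
print {$out} "first line\n";
close $out or die "close failed: $!";

# With no mode given, the file is opened for reading.
my $in   = fopen($file);
my $line = <$in>;
close $in;

# Opening something that isn't there dies, so eval (or a try module) catches it.
my $ok = eval { fopen(catfile($dir, 'does-not-exist.txt')); 1 };
print "open failed as expected: $@" unless $ok;
```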

File::Slurp

Obsolete. Avoid. Port where possible.

Written many many years ago with all sorts of attempts at optimisations that made total sense on the systems of the time - or so people who still remember that far back assure me - but really don't now.

The version number is 9999 because it used to use year.date release versions and then somebody accidentally did a release in the future. It'll probably never handle unicode quite right.

Look at File::Slurp::Tiny for a relatively direct replacement and basically everything else in this article for better ways to do it.

Path::Tiny

Small, tight, OO wrapper around File::Spec and the relevant builtins. A good minimalist default, and probably the obvious way to replace uses of legacy modules like File::Slurp and Path::Class.

I feel like this section should be longer because honestly this is probably your best default option for most tasks, but Path::Tiny is really unexciting.

In fact, its unexcitingness is exactly what makes it an excellent default. If you only look at one module from this article, look at this one.
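
To give a flavour of just how unexciting, here is a sketch assuming Path::Tiny is installed from CPAN. The file names and contents are invented for illustration.

```perl
use strict;
use warnings;
use Path::Tiny;

# A temp directory object that cleans up after itself.
my $dir  = Path::Tiny->tempdir;
my $file = $dir->child('notes', 'todo.txt');

# touchpath creates any missing parent directories, then the file itself.
$file->touchpath;
$file->spew_utf8("buy milk\n", "feed cat\n");

# Read it back as a list of chomped lines.
my @lines = $file->lines_utf8({ chomp => 1 });
my $name  = $file->basename;

print "$name has ", scalar(@lines), " lines\n";
```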

IO::All

The description "IO::All of it" is pretty much apt. It slices, it dices, it reads, it writes, it traverses, it searches, it slurps, it greps.

If you have the right extensions involved, it'll quite happily handle TCP and FTP and pretty much anything else you throw at it.

Be aware that lots of people like to declare it a terrible idea due to the uber-dwim io(...) interface provided for one-liners, as opposed to io->file(...) and the other explicit interfaces that do exactly what they're told to do. The docs have been updated to reflect the fact that that's a choice, but it'll presumably be another half decade or so before the whining stops.

IO::All is the nuclear powered swiss army chainsaw of file and path handling, perlish to the hilt both for good and for ill.

It's also, to my knowledge, the only thing that handles the concept of "I am on a unix machine, attempting to generate a path for a win32 machine, please use the right bits of File::Spec so that actually works".

This is what I use by default, because in my experience once you start messing around with directory structures you end up wanting to do a bunch of complicated stuff sooner or later, and the mix of procedural and OO involved in doing that with Path::Tiny tends to annoy me.
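
A small sketch of the explicit io->file(...) and io->dir(...) style, assuming IO::All is installed from CPAN. The file name is made up, and this barely scratches the surface of what the module does.

```perl
use strict;
use warnings;
use IO::All;
use File::Spec::Functions qw(catfile);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = catfile($dir, 'greeting.txt');

# Explicitly a file: no guessing about what the string might mean.
io->file($path)->print("hello\n");

# Read the whole thing back in one go.
my $contents = io->file($path)->slurp;

# Explicitly a directory: list the files directly inside it.
my @names = map { $_->filename } io->dir($dir)->all_files;

print "read back: $contents";
print "files: @names\n";
```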

Summary

If IO::All seems like overkill, I heartily recommend Path::Tiny.

If Path::Tiny seems like underkill, I heartily recommend IO::All.

If OO seems like a WOFTAM for the problem, remember that File::Open and autodie exist (and IPC::System::Simple).

If you find yourself using File::Slurp, I can always lend you a spare bag of razorblades to go with it.

-- mst, out.