NAME

RS::Handy - a grab-bag of useful routines

SYNOPSIS

    use RS::Handy qw(:stat xdie);

    my @st = stat $path
        or xdie "can't stat $path:";
    print "$path modified ", scalar localtime $st[ST_MTIME], "\n";

    # and many more, sorry for leaving them out of the synopsis

DESCRIPTION

This module provides some assorted functions I like to use in my programs. I've tossed many of my generic routines in here; I should really have more discipline about categorizing all these things and creating separate modules for them. I've split some modules out of here in the past (Proc::SyncExec, IPC::Signal, Proc::WaitStat, String::ShellQuote, plus some which made it into the core). If you find any of the remaining routines compellingly useful, let me know so I can prioritize splitting them out, too.

Nothing is automatically exported into your namespace.

Almost all of these functions die() if they encounter any sort of problem.

IMPORTABLE SYMBOLS

$Me

The basename of the currently running script.

fileline [level]

Return a string describing the caller's file and line number. If level is given it's an additional number of stack frames to go back.

chompexpr_fileline s

Remove a trailing newline from s and then try to remove a trailing "at $file line $num\n" or similar from the result. In a scalar context return just the initial part of the string; in a list context also include the $file and $num.

subname [level]

Return the name of the current subroutine (that is, the one which invokes subname). If level is given, go back that many additional stack frames and give the name of that sub instead.

subcall_info [level]

Return a text string which describes the invocation of the current subroutine. Like subname, current really means subcall_info's caller, and you can specify a level to go back to a different stack frame.

badinvo [level [message]]

Die with a message indicating that the current subroutine was invoked improperly. If level is given go back that many additional levels before the current one. If message is also given include it in the die message.

xwarn [message]...
xdie [message]...

Like warn and die but these functions prepend the message with the name of the script and a colon. Further, if the last message ends with a colon the current string value of $! and a newline are appended to it.
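The message convention can be sketched in core Perl. This is an illustration only, not the module's implementation: format_msg is a hypothetical helper, the script name comes from File::Basename rather than $Me, and the space placed before the $! text is an assumption.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper, not part of RS::Handy: build the string that an
# xwarn/xdie-style function would emit.
sub format_msg {
    my @msg = @_;
    my $errno_text = "$!";      # capture $! before anything can reset it
    my $me = basename $0;
    # A trailing colon on the last message asks for the $! text and a newline.
    # (The separating space is an assumption.)
    if (@msg && $msg[-1] =~ /:\z/) {
        $msg[-1] .= " $errno_text\n";
    }
    return "$me: " . join '', @msg;
}

$! = 2;     # set errno by hand for the demonstration
print format_msg("can't stat /no/such/file:");
```

A message that does not end in a colon passes through with only the script name prepended.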

xwarn_caller additional_frames, [message]...
xdie_caller additional_frames, [message]...

Like xwarn and xdie but the given additional number of stack frames are climbed to find the file/line from which to warn. An additional_frames argument of 0 is equivalent to the non-_caller form.

xcarp arg...
xcluck arg...
xcroak arg...
xconfess arg...

Like the corresponding function from the Carp module, but as with xwarn supply the script's name.

process_arg_pairs rargs [key => ref to var]...

This sub is used to translate argument lists of key/value pairs. The rargs is a reference to the list of arguments which were given to your sub. The key/ref to var pairs are your sub's arguments and the variables they set. If an invalid argument is present in the rargs list this sub will die.

dirents dir

Returns all the entries in dir except dot and dotdot.

dirents_qualified [dir]...

Returns fully qualified pathnames of all the entries in each listed dir except dot and dotdot. If no dirs are given then an empty list is returned.

fopen_mode mode

This function takes an fopen()-type mode string (like 'r+') and returns the Perl equivalent ('+<').
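A minimal sketch of such a translation, assuming a simple lookup table. Only the 'r+' to '+<' pair comes from the description above; the other pairs follow Perl's documented open() modes, and fopen_mode_sketch is a hypothetical name.

```perl
use strict;
use warnings;

# Assumed mapping from C fopen() modes to Perl open() modes.
my %perl_mode = (
    'r'  => '<',
    'w'  => '>',
    'a'  => '>>',
    'r+' => '+<',
    'w+' => '+>',
    'a+' => '+>>',
);

sub fopen_mode_sketch {
    my ($mode) = @_;
    # Drop the C "b" (binary) flag, which has no counterpart in this table.
    (my $plain = $mode) =~ tr/b//d;
    exists $perl_mode{$plain}
        or die "invalid fopen mode: $mode\n";
    return $perl_mode{$plain};
}

print fopen_mode_sketch('r+'), "\n";
```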

fdopen fd mode

Like the system fdopen(). I dislike that POSIX.pm tells you to use FileHandle->new_from_fd. Further, FileHandle's new_from_fd() is broken: it does not get the mode right if it's read/write.

fhbits filehandle...

This routine returns a bitmask which includes all the specified filehandles.

uncontrol [string]...

This function joins all the string arguments and expands control characters as ^A and meta characters as \200 and the like. It returns the concatenated result.

uncontrol_emacs [string]...

This function joins all the string arguments together and expands control characters to C-A and meta characters to M-A and whitespace to SPC and RET and so on in the emacs fashion. It returns the concatenated result.

f_getfl filehandle

This performs an F_GETFL on filehandle and returns the flags.

f_setfl filehandle flags

This performs an F_SETFL using the arguments.

getopt [-bundle | -bundling] [GetOptions-arg...]

This is basically Getopt::Long but it has the defaults set up the way I think they should be.

exclusive_create path

This is like a call to the system open() with flags O_CREAT, O_EXCL and O_RDWR set. A Perl filehandle is returned.
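The same flags are available in core Perl through sysopen and Fcntl. The following is a sketch under assumptions, not the module's code: exclusive_create_sketch is a hypothetical name, and the 0666 permission argument and the die-on-failure behaviour are choices made for the example.

```perl
use strict;
use warnings;
use Fcntl qw(O_CREAT O_EXCL O_RDWR);
use File::Temp qw(tempdir);

sub exclusive_create_sketch {
    my ($path) = @_;
    # O_EXCL makes the call fail if $path already exists.
    sysopen(my $fh, $path, O_CREAT | O_EXCL | O_RDWR, 0666)
        or die "can't create $path: $!\n";
    return $fh;
}

my $dir   = tempdir(CLEANUP => 1);
my $path  = "$dir/lock";
my $fh    = exclusive_create_sketch($path);             # file is new
my $again = eval { exclusive_create_sketch($path) };    # file now exists
print "second attempt refused\n" unless $again;
```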

tmpfile

This is just like the C function tmpfile(). I dislike that POSIX.pm tells you to use FileHandle->new_tmpfile (though I'm unsure why I don't just have this function call that one).

safe_tmp [arg]...

This routine safely creates temporary files and directories. The default is to create a file; specify mkdir as an arg to make a directory instead. In scalar context the return value is the name of the file or directory created. In array context the file-creating version also returns a filehandle opened in read/write mode on the file. Right now it's an error to call the mkdir version in array context.

Valid args are:

mkdir

This boolean option indicates that a directory rather than a file should be created.

fh => filehandle

Create the file using the given filehandle. If you don't specify this a filehandle is generated for you. You can retrieve it if you call safe_tmp in array context.

dir => directory

Place the created file in this directory. The default is the user's $TMPDIR or /tmp. Note that both the dir arg and $TMPDIR are ignored if the prefix contains a /.

prefix => string

Specify the part of the file name which precedes the random part. The default comes from $0 or (if that doesn't work out) the user's name. If this value contains a / the dir argument is ignored.

loginprefix

Use the user name, not the script name, as the preferred default prefix.

postfix => string

Specify the part of the file name which comes after the random part. There is no default, as you don't normally need one. It's useful for programs which require that files have a certain extension.

    $zip_file = safe_tmp postfix => '.zip'
                    or xdie "can't create temporary file:";

mode => number

Specify the file creation mode. The default is 0600 for files and 0700 for directories.

home

This function returns the current home directory, preferring $HOME to looking it up in the password file. It doesn't cache the result.

home_of user

This function returns the home directory for the given user, or dies.

:stat

This tag gives you subs named ST_DEV, ST_INO, ST_MODE and so on which return indices into a stat() or lstat() array.

:tm

This tag gives you subs named TM_SEC, TM_MIN, and so on which return indices into a gmtime() or localtime() array.

full_name_uid uid

Return a best guess at the full name for the user with uid uid, but always return something usable (rather than dying or returning undef).

full_name

Call full_name_uid with the real user id.

pwuid uid

Return the user name associated with uid.

grgid gid

Return the group name associated with gid.

cat [path]...

Return the contents of each path, one line per element.

catslurp [path]...

Return the contents of each path, in a list context with each file in a separate returned value, in a scalar context all joined together.

xsrand

A better srand().

shuffle [item]...

Return all the items in random order.

yorn default prompt [no_modify_prompt] [option => value]...

Prompt with prompt for a [YNyn] (or allow enter alone to mean the default, if one was given). If no_modify_prompt is not set then " (y/n) [$default] " is appended to the prompt. Options are:

    iosub       sub to print prompt ($_[0]) and return response
    localize    Locale::Maketext handle

prompt prompt ref_to_choices [option => value]...

Display prompt and prompt the user until she enters one of the choices in the list pointed to by ref_to_choices. Options are:

    default             set default response
    downcase            downcase the input
    iosub               sub to print prompt ($_[0]) and return response
    localize            Locale::Maketext handle
    no_modify_prompt    don't add the default to the prompt
    upcase              upcase the input
    wash                do: $input = $wash->($input);

A choice which is an RE object (qr//) accepts anything which matches.

A choice of undef matches EOF from the user. Without that this function will die if it reads EOF.

data_dump item...

Format some data for a human to read, for debugging.

data_dump_unsorted item

This is like data_dump(), but it doesn't try to sort hash keys. This is useful for large or Tie::IxHash hashes.

untaint [arg]...

Untaint and return the args.

tainted [arg]...

Return true if any of the args are tainted.

This is intended to be like rename(2) except that it will fail if dest exists. It returns true if it succeeds and false (with $! set appropriately) otherwise. Since it's implemented with a link()/unlink() pair it isn't atomic, alas. If something goes strangely wrong it can leave both the source and dest links on the disk, but it tries not to let that happen.

rename_unique source dest [max-tries]

This function renames source to dest without overwriting an existing dest. If dest exists rename_unique appends a numeric extension and tries again, up to max-tries (default 100) times. Returns the resulting file name if all goes well, undef otherwise. If it returns undef you can check $! for the errno and $@ for a more verbose description of the problem. It uses link() so usually source and dest have to be on the same filesystem, and further it isn't atomic (there is a period during which both source and dest exist on the disk). If something goes strangely wrong it can leave both the source and dest links on the disk, but it tries not to let that happen.

filter_string string command [arg]...

filter_string runs the command with string on its stdin and returns the stdout (as a string in a scalar context or a list of lines in an array context). If there is a system problem filter_string croaks. The exit status of the command is in $?.

mbox_read_head fh

This function reads an email header from fh and returns it. In an array context a content length indicator is also returned; you can use it when calling mbox_read_body().

mbox_read_body fh [skip-or-callback [clen]]

This function reads an email message body from the given file handle. The file handle is left pointing at the start of the next message, or at EOF. Normally the body is returned (with the mbox encodings undone).

skip-or-callback has two functions:

In either of these cases mbox_read_body returns the length of the body rather than the body itself. This length is the raw length from the start of the body to the start of the next message.

NB: The clen is the value from the Content-Length header, as returned by mbox_read_head. It is used for parsing messages which were saved using a Content-Length header instead of From_ line escaping. If you might have any such it's important that you include it.

mbox_read fh

This function reads a mail message from fh using mbox_read_head() and mbox_read_body() (which see). In an array context it returns the head and the body as two elements, in a scalar context it returns them as a single string joined with a blank line.

mbox_escape_body_part_in_place s

Perform mboxrd-style body escaping on s, modifying the argument in place.

mbox_escape msg

This function escapes a mail message into mboxrd format; the result can be written directly to a file.

daemonize keep_fh_hash_ref

Become a daemon. If keep_fh_hash_ref is specified, STDIN/STDOUT/STDERR are kept instead of being attached to /dev/null if the corresponding entry in the hash is true.

url_decode s

Decode URL-encoded string s.

url_encode_c s

Conservatively URL-encode s. The only characters which aren't encoded are alphanumerics, underscore, period and dash.
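The encoding rule can be sketched with a single substitution. This is a minimal sketch for byte strings, not the module's code: the function names are hypothetical, upper-case hex digits are an assumption, and the decoder handles only %XX escapes (it does not translate + to a space).

```perl
use strict;
use warnings;

# Encode everything except alphanumerics, underscore, period and dash.
sub url_encode_sketch {
    my ($s) = @_;
    $s =~ s/([^A-Za-z0-9_.-])/sprintf '%%%02X', ord $1/ge;
    return $s;
}

# Reverse %XX escapes only.
sub url_decode_sketch {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $s;
}

my $encoded = url_encode_sketch('a b&c_d.e-f');
print "$encoded\n";
print url_decode_sketch($encoded), "\n";
```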

html_attr_encode s

Encode and quote appropriately an HTML attribute value.

html_attrs [ name => value ]...

Take a list of HTML name, value pairs and encode them, returning a string to use as the attribute list. Use an undef value for a boolean attribute.

html_escape s

Escape s so that it can be included in HTML.

create_index_subs_pkg pkg, to-num-prefix, to-name-prefix, name...

This function sets up constant subs which are used to index into arrays. It fills some of the same niche that pseudo hashes do, but if you have strict subs turned on you get more typo protection. A list of sub names (without package) created is returned.

It is easiest to describe with an example. If you say

    create_index_subs_pkg __PACKAGE__, 'F', 'FNAME', qw(foo bar baz);

you get

    sub F_FOO           () { 0 }
    sub F_BAR           () { 1 }
    sub F_BAZ           () { 2 }

    sub FNAME_FOO       () { 'foo' }
    sub FNAME_BAR       () { 'bar' }
    sub FNAME_BAZ       () { 'baz' }

You can specify either the to-num-prefix or the to-name-prefix as undef in order to skip those subs. If either prefix is '' the underscore will also be skipped when creating the corresponding sub names.

Most people will want to use create_index_subs() instead, which calls this function with the package argument filled in.

create_index_subs to-num-prefix, to-name-prefix, name...

This calls create_index_subs_pkg but sets the package argument for you.

create_constant_subs_pkg pkg, prefix, { key => value }...

This function sets up constant subroutines. It returns the names of the subs (without package) created. A sub is named with the prefix, an underscore, and the upcased key. If the prefix is '' the underscore will also be skipped when creating the corresponding sub name. For example:

    create_constant_subs_pkg __PACKAGE__, 'PF', foo => 'F', bar => 'B';

will create

    sub PF_FOO () { 'F' }
    sub PF_BAR () { 'B' }

Most people will want to use create_constant_subs() instead, which calls this function with the package argument filled in.

create_constant_subs prefix, { key => value }...

This calls create_constant_subs_pkg but sets the package argument for you.

get_win_size

Return the kernel's idea of the current TTY's terminal size.

even_elements LIST
odd_elements LIST

Return only the even or odd elements of LIST, numbering from 0.
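A sketch of the idea using an array slice over filtered indices. The _sketch names are hypothetical and this is not necessarily how the module does it.

```perl
use strict;
use warnings;

# Elements at indices 0, 2, 4, ...
sub even_elements_sketch {
    my @list = @_;
    return @list[ grep { $_ % 2 == 0 } 0 .. $#list ];
}

# Elements at indices 1, 3, 5, ...
sub odd_elements_sketch {
    my @list = @_;
    return @list[ grep { $_ % 2 == 1 } 0 .. $#list ];
}

# Typical use: split a flat key/value list into keys and values.
my @pairs  = (a => 1, b => 2, c => 3);
my @keys   = even_elements_sketch(@pairs);
my @values = odd_elements_sketch(@pairs);
print "@keys / @values\n";
```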

define LIST

Map undefined elements to the empty string.

command_with_stdin s, cmd, arg...

Run cmd with args and pipe s to it on stdin. Croak if something unusual happens; otherwise the exit value is in $?.

If s is a sub ref it's called repeatedly to supply the input.

sendmail message, sendmail arg...

Run sendmail with message on its input and the given args, croaking if something unusual happens. Its return is in $?.

If message is a sub ref it's called repeatedly to supply the message.

ordinal number

Convert from a number to mixed numeric/English ordinal (1st, 2nd, etc.).
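The usual English rule (11, 12 and 13 take "th"; otherwise the last digit decides) can be sketched as follows. ordinal_sketch is a hypothetical name, and the sketch assumes a non-negative integer argument.

```perl
use strict;
use warnings;

my %special = (1 => 'st', 2 => 'nd', 3 => 'rd');

sub ordinal_sketch {
    my ($n) = @_;
    my $last_two = $n % 100;
    # 11th, 12th and 13th are exceptions to the last-digit rule.
    my $suffix = ($last_two >= 11 && $last_two <= 13)
        ? 'th'
        : ($special{$n % 10} || 'th');
    return $n . $suffix;
}

print join(' ', map { ordinal_sketch($_) } 1, 2, 3, 4, 11, 21), "\n";
```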

wrap initial tab, subsequent tab, text...

This is like Text::Wrap::wrap() (which it uses) but it won't die, it joins strings with $, rather than ' ', and it sets the width based on the terminal size.

max num...

Return the greatest num.

min num...

Return the least num.

max_not_undef num or undef...

Return the greatest arg, treating undef as smaller than any number.

min_not_undef num or undef...

Return the least arg, treating undef as larger than any number.

fuser file...

Return a list of PIDs which are using the given files.

valid_check_digit num [mod]

Return a boolean indicating whether the given num has a valid mod-mod (default 7) check digit.

replace_uids uid

Replace both the real and effective user ID of the current process, or die trying.

replace_ugids uid, gid...

Replace both the real and effective user and group IDs of the current process, or die trying.

cmp_strnum a, b

This is like cmp but it tries to do the Right Thing for scalars which contain both numbers and non-numbers, like software version numbers.

cmp_strnum_transform s

This is the helper function cmp_strnum uses to do that. The result is a scalar which can be cmped directly against similar scalars.
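For illustration, here is one common way to compare mixed strings: split each into runs of digits and non-digits and compare run by run. This is a generic "natural sort" sketch, not the transform cmp_strnum_transform actually performs, and natural_cmp is a hypothetical name.

```perl
use strict;
use warnings;

sub natural_cmp {
    my ($x, $y) = @_;
    my @x = $x =~ /(\d+|\D+)/g;     # alternating digit / non-digit runs
    my @y = $y =~ /(\d+|\D+)/g;
    while (@x && @y) {
        my ($p, $q) = (shift @x, shift @y);
        # Compare numerically only when both runs are digits.
        my $r = ($p =~ /^\d/ && $q =~ /^\d/) ? $p <=> $q : $p cmp $q;
        return $r if $r;
    }
    return @x <=> @y;               # the string with runs left over sorts later
}

my @sorted = sort { natural_cmp($a, $b) } qw(1.10 1.9 1.2);
print "@sorted\n";
```

A plain cmp would put 1.10 before 1.2; comparing the digit runs as numbers avoids that.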

have_prog program...

Return the first program which is an executable file in the $PATH (or which is an executable file given with absolute or relative path).

rfc822_dt [time [use_utc]]

Return an RFC 2822-style date string (like Mon, 21 Jan 2002 02:33:56 -0500) for the given time. If the time isn't specified the current local time is used. If use_utc is true the output is in UTC rather than the local time zone.

iso_date [time [use_utc]]

Return the date in ISO 8601 format.

iso_dt [time [use_utc]]

Return the date/time in ISO 8601 format.

interval_to_secs interval

Convert a string to a number of seconds. An interval is currently a decimal number followed by a suffix. Suffixes and examples:

    s = second          -30s = -30 seconds
    m = minute            5m = 5 minutes
    h = hour            1.5h = 1.5 hours
    d = day               1d = 1 day

If you give me invalid input I'll croak.
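The suffix table above translates directly into a multiplier lookup. The following is a sketch under assumptions rather than the module's parser: interval_to_secs_sketch is a hypothetical name, and the exact number syntax accepted (optional sign, optional fraction, optional surrounding whitespace) is a guess.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Seconds per unit, from the suffix table.
my %seconds_per = (s => 1, m => 60, h => 3600, d => 86400);

sub interval_to_secs_sketch {
    my ($interval) = @_;
    my ($num, $suffix) =
        $interval =~ /^\s*([-+]?(?:\d+(?:\.\d*)?|\.\d+))\s*([smhd])\s*$/
            or croak "invalid interval: $interval";
    return $num * $seconds_per{$suffix};
}

print interval_to_secs_sketch($_), "\n" for qw(-30s 5m 1.5h 1d);
```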

secs_to_dhms seconds

Convert a number of seconds into a number of days, hours, minutes, and seconds.

    secs_to_dhms 90061
        -> (1, 1, 1, 1)

secs_to_dhms_str seconds

Convert a number of seconds into a human-readable representation of the number of days, hours, minutes, and seconds.

    secs_to_dhms_str 90061
        -> '1d+1:01:01'

mtime [file]

Return the mtime of file, or undef with $! set. If file is omitted, returns the mtime from the current saved stat buffer _.

flush fh

Flush the output on the given filehandle, or return an error. This does not turn on autoflush for the filehandle.

commify [num]...

Add commas to each given number. E.g., 23456 is returned as 23,456.
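A widely used way to do this in Perl is a repeated substitution anchored at the start of the string. The sketch below uses that idiom; commify_sketch is a hypothetical name and the module may do it differently.

```perl
use strict;
use warnings;

sub commify_sketch {
    my @nums = @_;
    for (@nums) {
        # Peel three digits at a time off the end of the integer part.
        1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
    }
    return wantarray ? @nums : $nums[0];
}

print join(' ', commify_sketch(23456, 1234567, 123)), "\n";
```

Because the pattern is anchored at the start, digits after a decimal point are left alone.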

mime_token_quote s

Quote s as a MIME token and return the result.

unique s...

Return the args with duplicates removed (the first is left in place).
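The standard Perl idiom for this is a grep with a "seen" hash, sketched here under the hypothetical name unique_sketch. Items are compared as strings.

```perl
use strict;
use warnings;

sub unique_sketch {
    my %seen;
    # Keep an item only the first time its string value is seen.
    return grep { !$seen{$_}++ } @_;
}

my @u = unique_sketch(qw(b a b c a));
print "@u\n";
```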

cache_url url, secs

Fetch and return the data from url, and cache the result. Use the cached value unless it was cached more than secs seconds ago.

set_autoflush fh [value]

Set $| for the given fh to value (default on). The previous $| for the filehandle is returned.

remove_at_exit path...

Arrange for each path to be removed when the program exits (even if it's killed by a (normal, per sigtrap) signal). If any can't be removed an error will be printed and the program will exit with a non-zero status.

This doesn't support removing directories yet, because File::Path is unsafe and doesn't properly report errors.

struct_linger_pack linger_onoff, linger_time

Pack a struct linger as used by a SO_LINGER setsockopt().

(linger_onoff, linger_time) = struct_linger_unpack packed

Unpack a struct linger as used by a SO_LINGER setsockopt().

equal thing_1, thing_2

This predicate is true if thing_1 and thing_2 have similar structure and contents.

XXX The current implementation fails on code references, treating them all as equal.

dstr s

Return s suitably formatted for unambiguous, warning-free display, for use in debugging. This is prototyped to take a single scalar arg.

expand_field_list_specs num-fields, spec...

Given a field count and a string like cut's -f switch (except negative numbers count from the end), return a corresponding list of fields. E.g.,

    expand_field_list_specs 12, '5,8 - -3', '2-4', '10-'
        => (5, 8, 9, 10, 2, 3, 4, 10, 11, 12)

A too-negative number for the current num-fields is treated as 1. An invalid spec causes a die().

non_comments [comment-re,] [s]...

Return all the s args which aren't empty, blank, or comments. By default a comment is a line which matches /^\s*#/. The comment-re arg must be a Regexp object (as with qr//) so I can tell it's there.

discard_zombies

Get rid of current zombies (processes which have exited but haven't been reaped).

list_length [arg]...

Return the number of arguments. This is useful to count how many things another function returns.

decompressing_open_command file

If file is a compressed file, return a command which will decompress it. Otherwise return it as-is.

eq_undef a, b

This is like eq but it won't warn if either arg is undef, and undef is only equal to another undef.

inverse_hash [key => value]...

Reverse and return the key/value pairs. If there's a duplicate value, croak. The returned values will be in the same order you gave them to me.
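An order-preserving inversion can be sketched by walking the argument list two items at a time. inverse_hash_sketch is a hypothetical name, and the croak on an odd-length list is an added assumption.

```perl
use strict;
use warnings;
use Carp qw(croak);

sub inverse_hash_sketch {
    my @pairs = @_;
    @pairs % 2 == 0 or croak "odd number of arguments";
    my (%seen, @out);
    while (my ($key, $value) = splice @pairs, 0, 2) {
        croak "duplicate value: $value" if $seen{$value}++;
        push @out, $value, $key;    # swap each pair, keep the input order
    }
    return @out;
}

my @inv = inverse_hash_sketch(a => 1, b => 2);
print "@inv\n";
```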

G_PER_LB
KG_PER_LB
LB_PER_KG
lb_to_kg val
lb_to_g val
kg_to_lb val
g_to_lb val

These are simple weight conversion functions. Nothing special is done about the number of returned significant digits.

longest_common_prefix string...

Return the longest string which is a prefix for the given strings.
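One simple approach is to start with the first string and shorten it until it is a prefix of every other string. The sketch below does that; the function name is hypothetical, and returning the empty string for an empty argument list is an assumption.

```perl
use strict;
use warnings;

sub longest_common_prefix_sketch {
    return '' unless @_;
    my $prefix = shift;
    for my $s (@_) {
        # Trim the candidate until it matches the start of $s.
        chop $prefix
            while substr($s, 0, length $prefix) ne $prefix;
    }
    return $prefix;
}

print longest_common_prefix_sketch(qw(foobar foobaz foo)), "\n";
```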

longest_common_parent_directory path...

Return the longest directory which is common to the given paths.

create_lock_file XXX add args

Create a lock file with retry, stale lock warnings, and other customizations.

XXX document options

    debug
    die_sub
    retry_sleep
    retry_timeout
    warn_age
    warn_sub

prompt_readline prompt

Prompt for a string using Term::ReadLine, handling non-interactive invocations gracefully.

center_string size, string, [truncate]

XXX add description

crack_time time string

Parse a string as a time and return 0-padded hours, minutes, and seconds. Seconds are optional. An undefined or empty input returns nothing.

trim XXX add args

XXX add description

iso_date_fuzzy_re

Returns a regex which matches an ISO date liberally, but at word boundaries. If it matches, the regex leaves:

    $1=year
    $2=separator
    $3=month
    $4=day

AUTHOR

Roderick Schertler <roderick@argon.org>