[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH for-4.11 v2 2/2] Add new add_maintainers.pl script to optimise the workflow when using git format-patch with get_maintainer.pl

Thanks for this.  It looks like it will be very useful.  I have
reviewed your script in detail.

My review is quite picky, mainly about error handling.

Also I have a lot of style comments: where I say "so and so would be
more perlish" you should feel free to leave it as it is if you prefer.
(I would need some convincing that the continued use of `index' was

Lars Kurth writes ("[PATCH for-4.11 v2 2/2] Add new add_maintainers.pl script 
to optimise the workflow when using git format-patch with get_maintainer.pl"):
> +my $toolversion = "1.0";

I think in practice this version number will quickly become out of
date as it won't be updated.  I think it is harmless but I would drop

> +my @tags                = ("Acked-by:",
> +                           "Release-acked-by:",
> +                           "Reviewed-by:");
> +my $CC                  = "Cc:"; # Note: git-send-mail requires Cc:
> +my $TO                  = "To:";
> +my $AT                  = "@";

This is a very odd variable and needs to be abolished.

> +my $cc_insert_before    = "Signed-off-by:";
> +my $to_insert_before    = "Date:";
> +my $cover_letter        = "0000-cover-letter.patch";
> +my $get_maintainer      = "./scripts/get_maintainer.pl";
> +my $patch_ext           = ".patch";
> +my $mailing_lists       = $AT."lists.";

I notice that your usual quoting character for literal strings is "
rather than '.  You might find it easier to make more use of ', which
does much less interpolation.

> +my @lists; #Needed for <<EOT

Erk.  No, you need to write \@lists.  The warning you are suppressing
here was trying to tell you that
causes the value of the array variable @lists to be interpolated.
Which is not what you want.  I'm slightly surprised you didn't

> +my $usage = <<EOT;
> +USAGE: $tool [options] (--patchdir|-d) <patchdir>
> +VERSION: $toolversion
> +
> +--------
> +  [(--reroll-count|-v) <n>]

This syntax with the ( ) seems clumsy to me.  I would just write

> +  --reroll-count <n> | -v<n>

but this is a matter of taste, so no reason to withhold my ack.

> +  [--verbose]
> +    Show more output

These [ ] are clearly wrong.

> +  [--version]
> +    Show version
> +  --h|help|usage

This is clearly wrong because it's -h, not --h.  And spaces make this
easier to read.  I suggest

  +  -h | --help | --usage

(TBH I don't see the need to support --usage, but whatever.)

> +if ($help != 0) {

Conventional (idiomatic perl) style would be

   if ($help) {

(throughout).  But what you have is not wrong.

> +if (! -e $get_maintainer) {
> +    die "$tool: The tool requires $get_maintainer\n";
> +}

You may remember me saying I wanted a mode that simply transfers
maintainer information already provided in the patches.  That is
useful when the CCs have been manually specified.

I don't think such a mode is essential for me to ack this patch.

But in that mode get_maintainer is not essential.

In any case, if you do want to check it, you should stat, rather than
using a file test.  File tests on other than _ make it hard to produce
correct and useful messages on failure.  In this case, you fail to
print the errno, which you could do if you called stat.

Anyway, is it really worthwhile specifically testing that
get_maintainer exists ?  If it doesn't presumably you will try to run
it, and then get an error later which will print something sensible ?

> +if (! -e $patch_dir) {
> +    die "$tool: Directory $patch_dir does not exist\n";
> +}

Same comment as above about stat.  Also, again, it might be better not
to check and simply allow the later code to handle errors correctly.

> +# Get the list of patches
> +my $pattern = $patch_dir.'/'.$patch_prefix.'*'.$patch_ext;

This goes wrong if $patch_dir (or $patch_prefix) contains whitespace
or glob characters.  This will be fine in any reasonable Unix
environment, but there are corner cases where it may go wrong.  For
example, I am told that modern desktop environments mount removeable
storage media on a pathname containing the volume label (this seems
very unwise to me, but there you are).

I don't think this is a problem for this script, but I thought I would
mention it.

> +my @patches = glob($pattern);

You should set $!=0 first because that makes proper error handling
possible.  The error handling should come immediately after the call,
so dealing with that now:

> +if (!scalar @patches) {
> +    die "$tool: Directory $patch_dir contains no patches\n";
> +}

Here you should check $!.  If $! then there was a read error on the
directory, which should be reported (with the $! value).  Doing this
also obviates the need to check $patch_dir's existence, because if it
doesn't exist you get $!=ENOENT.

> +foreach my $file (@patches) {

It would be nice to exclude ~ and .bak files here.  That way manually
editing files won't require trickery to exclude them.

> +    if (index($file, $cover_letter) != -1) {

This is quite an un-perlish way to do things.  Also it goes wrong if
the patch *directory* is called 0000-cover-letter.patch (which would
be mad, but it would be better not to make things worse), or if there
is a 0000-cover-letter.patch~ but no 0000-cover-letter.patch.

I suggest something like
       if ($file =~ m{/\Q$cover_letter\E$}) {
> +        # Insert CC's and TO's at the top of the *.patch file
> +        if ($mode eq "top") {     
> +            my $insert = join("\n", uniq (@to_only, @cc_only))."\n";
> +            # Insert snippets into files
> +            # $insert before "Signed-off-by:"
> +            insert_before($file , $insert, $to_insert_before);
> +        }
> +        elsif ($mode eq "ccbody") {

Conventional style would be elsif on the same line as }.

> +print "Then perform:\n".
> +      "git send-email -to xen-devel".$AT."lists.xenproject.org ".
> +      $patch_dir.'/'.$patch_prefix."*.patch"."\n";
> +
> +exit 1;

Why do you exit 1 here rather than 0 ?  It seems that your script has

> +sub getmaintainers {

Should have a prototype.  See perlsub(1).  You will find that you
probably need to move it earlier in the file, or add an

This applies to several of your other functions too.

> +    my ($file, $rto, $rcc) = @_;
> +    my $cmd = $get_maintainer." ".$get_maintainer_args." < ".$file;

More perlish might be
        "$get_maintainer $get_maintainer_args <$file"

But, worse, this is quite dodgy if any of this contains shell
metacharacters.  IMO you should use
    open $fh, "-|", $get_maintainer, ...

get_maintainer_args should become an array and use the appropriate
GetOptions thing for a repeatable argument.

> +    while(<$fh>) {
> +        chomp;
> +        # Keep lists and CC's separately as we dont want them in
> +        # the commit message under a Cc: line
> +        if (index($_, $mailing_lists) != -1) {

This is really very strange.  Firstly, I had to look for the
definition of $mailing_lists.  It seems to be a variable for little
reason, as it is not configurable.

Secondly, what this is trying to do is look for the string '@lists.'
anywhere in the CC.  But that is not a reliable way of identifying a
mailing list.  I think in general it is not possible to do this
reliably, but this is rather a suboptimal heuristic.

Instead, I would additionally check to see if the address is mentioned
in any L: line in MAINTAINERS.

I would also allow the user to specify regexps for addresses to be
treated as lists.  If you did that the the regexp \@lists\. would be a
good default starting point.

> +        if ($rextra) {
> +            my $item;
> +            foreach $item (@tags) {
> +                if (hastag($line, $item, \$nline)) {
> +                    # Replace tag with CC, then push
> +                    $nline =~ s/$item/$CC/;

I think this is not a sensible way to identify the tag part of the
line.  Instead, why not use a regexp like ^[-A-Z0-9a-z]+: ?

Perl programs should nearly never use `index'.

> +                    push @{$rextra}, $nline;

Again, @{$var} can be spelled @$var and conventionally is.

> +sub hastag ()
> +{
> +    my ($line, $tag, $rline) = @_;
> +    my $index = index(lc $line, lc $tag);

       if ($line =~ m{^\Q$tag\E}i) {

> +    if ($index != -1) {
> +        if ($index == 0) {
> +            # If at first position, then just return
> +            ${$rline} = $line;
> +            return 1;
> +        } else {
> +            # If not at first position, then remove white
> +            # spaces before the tag and return a normalized
> +            # string

How do you ever get a tag not in the left hand column ?  I think such
a thing, if ever found, has probably been indented specifically to
protect it from automatic processing ?

> +sub normalize {
> +    # Note: you cannot pass two arrays to a subroutine without loosing the
> +    # information which entry belongs to which array. Thus, pass as 
> references.

I think this function needs a comment to say what it does!

So AFAICT normalize(\@x,\@y) returns the elements of @x that do not
appear in @y.

If you made normalize strip repeats from @x, which would be easy, then
it would do the work of uniq too.  That might be simpler to follow.

(The thing about arrays is standard perl but I don't mind you
mentioning it.)

> +    my ($ra, $rb) = @_;
> +    my @aonly = ();
> +    my %seen;
> +    my $item;
> +
> +    foreach $item (@{$rb}) {

Conventional idiomatic syntax would be

  +    foreach $item (@$rb) {

> +    foreach $item (@{$ra}) {
> +        unless ($seen{$item}) {
> +            # it's not in %seen, so add to @aonly
> +            push @aonly, $item;
> +        }
> +    }

This can be written much more perlishly with grep.  But if that is too
obtuse I don't mind this open-coding.

> +sub writefile {
> +    my ($content, $file) = @_;
> +    my $fh;
> +    open($fh, ">", $file)
> +         or die "Could not open file '$file' $!";
> +    print $fh $content;

                          or die $!

> +    close $fh;

                 or die $!
> +
> +    return 1;

Why this when none of your callers check the return value ?

Returning 1 for success is in any case not a very perlish error
handling pattern.

> +sub insert_before {
> +    my ($file, $ins, $before) = @_;
> +    my $content;
> +
> +    if ($ins eq "\n") {
> +        # Not inserting anything
> +        # I added this here such that I don't have to run the check
> +        # when calling the function.
> +        return 0;
> +    }

Instead of this extra code here, I would make $ins contain as many \n
as lines.  Something like (earlier):

+    my $insert = map { "$_\n" } uniq (@to_only, @cc_only);

Except of course subject to my other comments about uniq.

Also it would be nice if the variable name here ($ins) was the same as
the earlier one ($insert).

> +    # Read file
> +    $content = readfile($file);
> +
> +    # Split the string and generate new content
> +    my $i  = index($content, $before);

Again, do this with regexps.  Something like one of these (untested):

       $content =~ s{^(?=\Q$insert_before\E)}{$content}m;

       $content =~ s{^\Q$insert_before\E)}{$content$&}m;

(Your "index" is wrong because it should match only at start of line.)

> +    return 1;
> +}

Again, why the return ?

I hope this detailed critique is helpful...


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.