Xen project Mailing List

Re: [Xen-devel] [PATCH for-4.11 v2 2/2] Add new add_maintainers.pl script to optimise the workflow when using git format-patch with get_maintainer.pl

To: Lars Kurth <lars.kurth@xxxxxxxxxx>

From: Ian Jackson <ian.jackson@xxxxxxxxxx>

Date: Mon, 30 Apr 2018 15:35:58 +0100

Cc: Juergen Gross <jgross@xxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, Julien Grall <julien.grall@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx

Delivery-date: Mon, 30 Apr 2018 14:36:54 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Thanks for this. It looks like it will be very useful. I have reviewed your script in detail. My review is quite picky, mainly about error handling. Also I have a lot of style comments: where I say "so and so would be more perlish" you should feel free to leave it as it is if you prefer. (I would need some convincing that the continued use of `index' was appropriate.) Lars Kurth writes ("[PATCH for-4.11 v2 2/2] Add new add_maintainers.pl script to optimise the workflow when using git format-patch with get_maintainer.pl"): ... > +my $toolversion = "1.0"; I think in practice this version number will quickly become out of date as it won't be updated. I think it is harmless but I would drop it. > +my @tags = ("Acked-by:", > + "Release-acked-by:", > + "Reviewed-by:"); > +my $CC = "Cc:"; # Note: git-send-mail requires Cc: > +my $TO = "To:"; > +my $AT = "@"; This is a very odd variable and needs to be abolished. > +my $cc_insert_before = "Signed-off-by:"; > +my $to_insert_before = "Date:"; > +my $cover_letter = "0000-cover-letter.patch"; > +my $get_maintainer = "./scripts/get_maintainer.pl"; > +my $patch_ext = ".patch"; > +my $mailing_lists = $AT."lists."; I notice that your usual quoting character for literal strings is " rather than '. You might find it easier to make more use of ', which does much less interpolation. > +my @lists; #Needed for <<EOT Erk. No, you need to write \@lists. The warning you are suppressing here was trying to tell you that xen-devel@xxxxxxxxxxxxxxxxxxxx causes the value of the array variable @lists to be interpolated. Which is not what you want. I'm slightly surprised you didn't notice... > +my $usage = <<EOT; > +USAGE: $tool [options] (--patchdir|-d) <patchdir> > +VERSION: $toolversion > + > +OPTIONS: > +-------- > + [(--reroll-count|-v) <n>] This syntax with the ( ) seems clumsy to me. I would just write > + --reroll-count <n> | -v<n> but this is a matter of taste, so no reason to withhold my ack. > + [--verbose] > + Show more output These [ ] are clearly wrong. > + [--version] > + Show version > + --h|help|usage This is clearly wrong because it's -h, not --h. And spaces make this easier to read. I suggest + -h | --help | --usage (TBH I don't see the need to support --usage, but whatever.) > +if ($help != 0) { Conventional (idiomatic perl) style would be if ($help) { (throughout). But what you have is not wrong. > +if (! -e $get_maintainer) { > + die "$tool: The tool requires $get_maintainer\n"; > +} You may remember me saying I wanted a mode that simply transfers maintainer information already provided in the patches. That is useful when the CCs have been manually specified. I don't think such a mode is essential for me to ack this patch. But in that mode get_maintainer is not essential. In any case, if you do want to check it, you should stat, rather than using a file test. File tests on other than _ make it hard to produce correct and useful messages on failure. In this case, you fail to print the errno, which you could do if you called stat. Anyway, is it really worthwhile specifically testing that get_maintainer exists ? If it doesn't presumably you will try to run it, and then get an error later which will print something sensible ? > +if (! -e $patch_dir) { > + die "$tool: Directory $patch_dir does not exist\n"; > +} Same comment as above about stat. Also, again, it might be better not to check and simply allow the later code to handle errors correctly. > +# Get the list of patches > +my $pattern = $patch_dir.'/'.$patch_prefix.'*'.$patch_ext; This goes wrong if $patch_dir (or $patch_prefix) contains whitespace or glob characters. This will be fine in any reasonable Unix environment, but there are corner cases where it may go wrong. For example, I am told that modern desktop environments mount removeable storage media on a pathname containing the volume label (this seems very unwise to me, but there you are). I don't think this is a problem for this script, but I thought I would mention it. > +my @patches = glob($pattern); You should set $!=0 first because that makes proper error handling possible. The error handling should come immediately after the call, so dealing with that now: > +if (!scalar @patches) { > + die "$tool: Directory $patch_dir contains no patches\n"; > +} Here you should check $!. If $! then there was a read error on the directory, which should be reported (with the $! value). Doing this also obviates the need to check $patch_dir's existence, because if it doesn't exist you get $!=ENOENT. > +foreach my $file (@patches) { It would be nice to exclude ~ and .bak files here. That way manually editing files won't require trickery to exclude them. > + if (index($file, $cover_letter) != -1) { This is quite an un-perlish way to do things. Also it goes wrong if the patch *directory* is called 0000-cover-letter.patch (which would be mad, but it would be better not to make things worse), or if there is a 0000-cover-letter.patch~ but no 0000-cover-letter.patch. I suggest something like if ($file =~ m{/\Q$cover_letter\E$}) { > + # Insert CC's and TO's at the top of the *.patch file > + if ($mode eq "top") { > + my $insert = join("\n", uniq (@to_only, @cc_only))."\n"; > + # Insert snippets into files > + # $insert before "Signed-off-by:" > + insert_before($file , $insert, $to_insert_before); > + } > + elsif ($mode eq "ccbody") { Conventional style would be elsif on the same line as }. > +print "Then perform:\n". > + "git send-email -to xen-devel".$AT."lists.xenproject.org ". > + $patch_dir.'/'.$patch_prefix."*.patch"."\n"; > + > +exit 1; Why do you exit 1 here rather than 0 ? It seems that your script has succeeded. > +sub getmaintainers { Should have a prototype. See perlsub(1). You will find that you probably need to move it earlier in the file, or add an pre-declaration. This applies to several of your other functions too. > + my ($file, $rto, $rcc) = @_; > + my $cmd = $get_maintainer." ".$get_maintainer_args." < ".$file; More perlish might be "$get_maintainer $get_maintainer_args <$file" But, worse, this is quite dodgy if any of this contains shell metacharacters. IMO you should use open $fh, "-|", $get_maintainer, ... instead. get_maintainer_args should become an array and use the appropriate GetOptions thing for a repeatable argument. > + while(<$fh>) { > + chomp; > + # Keep lists and CC's separately as we dont want them in > + # the commit message under a Cc: line > + if (index($_, $mailing_lists) != -1) { This is really very strange. Firstly, I had to look for the definition of $mailing_lists. It seems to be a variable for little reason, as it is not configurable. Secondly, what this is trying to do is look for the string '@lists.' anywhere in the CC. But that is not a reliable way of identifying a mailing list. I think in general it is not possible to do this reliably, but this is rather a suboptimal heuristic. Instead, I would additionally check to see if the address is mentioned in any L: line in MAINTAINERS. I would also allow the user to specify regexps for addresses to be treated as lists. If you did that the the regexp \@lists\. would be a good default starting point. > + if ($rextra) { > + my $item; > + foreach $item (@tags) { > + if (hastag($line, $item, \$nline)) { > + # Replace tag with CC, then push > + $nline =~ s/$item/$CC/; I think this is not a sensible way to identify the tag part of the line. Instead, why not use a regexp like ^[-A-Z0-9a-z]+: ? Perl programs should nearly never use `index'. > + push @{$rextra}, $nline; Again, @{$var} can be spelled @$var and conventionally is. > +sub hastag () > +{ > + my ($line, $tag, $rline) = @_; > + my $index = index(lc $line, lc $tag); ITYM if ($line =~ m{^\Q$tag\E}i) { ? > + if ($index != -1) { > + if ($index == 0) { > + # If at first position, then just return > + ${$rline} = $line; > + return 1; > + } else { > + # If not at first position, then remove white > + # spaces before the tag and return a normalized > + # string How do you ever get a tag not in the left hand column ? I think such a thing, if ever found, has probably been indented specifically to protect it from automatic processing ? > +sub normalize { > + # Note: you cannot pass two arrays to a subroutine without loosing the > + # information which entry belongs to which array. Thus, pass as > references. I think this function needs a comment to say what it does! So AFAICT normalize(\@x,\@y) returns the elements of @x that do not appear in @y. If you made normalize strip repeats from @x, which would be easy, then it would do the work of uniq too. That might be simpler to follow. (The thing about arrays is standard perl but I don't mind you mentioning it.) > + my ($ra, $rb) = @_; > + my @aonly = (); > + my %seen; > + my $item; > + > + foreach $item (@{$rb}) { Conventional idiomatic syntax would be + foreach $item (@$rb) { > + foreach $item (@{$ra}) { > + unless ($seen{$item}) { > + # it's not in %seen, so add to @aonly > + push @aonly, $item; > + } > + } This can be written much more perlishly with grep. But if that is too obtuse I don't mind this open-coding. > +sub writefile { > + my ($content, $file) = @_; > + my $fh; > + open($fh, ">", $file) > + or die "Could not open file '$file' $!"; > + print $fh $content; or die $! > + close $fh; or die $! > + > + return 1; Why this when none of your callers check the return value ? Returning 1 for success is in any case not a very perlish error handling pattern. > +sub insert_before { > + my ($file, $ins, $before) = @_; > + my $content; > + > + if ($ins eq "\n") { > + # Not inserting anything > + # I added this here such that I don't have to run the check > + # when calling the function. > + return 0; > + } Instead of this extra code here, I would make $ins contain as many \n as lines. Something like (earlier): + my $insert = map { "$_\n" } uniq (@to_only, @cc_only); Except of course subject to my other comments about uniq. Also it would be nice if the variable name here ($ins) was the same as the earlier one ($insert). > + # Read file > + $content = readfile($file); > + > + # Split the string and generate new content > + my $i = index($content, $before); Again, do this with regexps. Something like one of these (untested): $content =~ s{^(?=\Q$insert_before\E)}{$content}m; $content =~ s{^\Q$insert_before\E)}{$content$&}m; (Your "index" is wrong because it should match only at start of line.) > + return 1; > +} Again, why the return ? I hope this detailed critique is helpful... Regards, Ian. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.