[GRLUG] Open a file based on Content [Perl]

Tue Mar 6 10:29:33 EST 2007

On 3/6/07, Justin Denick <justin.denick at gmail.com> wrote:
> This is a cross posted message, if that offends you than seek consoling.
>
> I have the need to open files and manipulate them into helping me take over
> the world. Muhahahah.
> I can do it in bash, but I really want to learn perl, so hopefully one of
> you guys or girls
> could help. (do we have any women on this list?)
>
> Here's how I gitrdun in bash.
>
> > x=0
> > pattern="pattern"
> > files=`grep -l $pattern list-of-files`
> >
> > for this in $files do;
> >    if [ $x -eq 0 ]
> >       sed s/\~/\\r\\n/g $files > name.txt
> >
>
>
> > x=`expr $x + 1`
> >
>
>
> >    else
> >       sed s/\~/\\r\\n/g  $files > name.txt$x
> >
> >    fi
> > done
>
>
>
> But I want learn how to do this in perl
> so it would be something like (and please pardon my ignorance)
>
> > my $x = 0;
> > my $pattern = "pattern";
> >
> > open DH, "path" or die;
> > my @files = (readdir DH);
> > my @found = grep($pattern, @files);
> > for my $source (@found) {
> > my destination = "$source.new"
> >    open INPUT, "<$source" or die;
> >    open OUTPUT ">$destination" or die;
> >    while (<INPUT>) {
> >       s/\176/\012\015/g;
> >       print OUTPUT $_;
> >       rename $destination, $source;
> >    }
> >
> > }
>
> Thanks in advance for your help. '

grep() in perl is quite different from grep(1) in Unix.   It takes a
statement and a list of scalars and runs the block across them
returning items for which the statement evaluated to true.  So,

grep /~/, @lines;

will return a list of lines that contain a tilde.  Check out "perldoc -f grep".

This is a quick-n-dirty, completely untested, and unverified version.
 It also tries to be efficient about opening files multiple times,
which the shell version can't do.   I highly recommend the Perl
Cookbook from O'Reilly, since it is geared towards "Git-R-Done" rather
than a more academic approach to learning Perl.

#!/usr/bin/perl

use File::Find;

# see perldoc -f qr
my $pattern = qr/pattern/;

sub wanted {
    # recent versions of perl allow/prefer "my $fh" over barewords
    open my $fh, "< $File::Find::name";
    return unless ( $fh ); # skip this file if open() failed

    while ( my $line = <$fh> ) {
        # only scan the file up to the first match, then immediately jump to
        # processing it then bail out of this loop
        if ( $line =~ /$pattern/ ) {
            process( $fh, $File::Find::name );
            last; # exit the loop now that the file is processed
        }
    }
}

sub process {
    my( $fh, $filename ) = @_;

    # move the file handle back to the top of the file -- it may be more
    # efficient for large files to only seek backwards a little bit, but this
    # is easier to follow for now
    seek $fh, 0, 0;

    open my $out, "> $filename.new"
        or die "Unable to open $filename.new for writing: $!";

    while ( my $line = <$fh> ) {
        $line =~ s/\176/\012\015/g;
        print $out $line;
    }

    close $out;
    close $fh;

    rename "$filename.new" $filename;
}

# this will recurse a directory tree
find({ wanted => \&wanted, no_chdir => 1 }, '.' );

-Al Tobey