A lot of the standard lib constructs behave completely differently in different contexts. Often subtly. For instance,

    @globvals = glob("*"); # returns all matches to @globvals

    $glob1 = glob("*"); # returns the first match and stores it in $glob1

    $glob2 = glob("*.c"); # returns the second match of "*" and stores it in $glob2



> $glob2 = glob("*.c"); # returns the second match of "*" and stores it in $glob2

Not true.

    $ touch 1 2 3 4.c 5.c 6.c 7 8
    $ ls
    1  2  3  4.c  5.c  6.c  7  8
    $ perl -E '
         @globvals = glob("*");
         say "globvals = @globvals";
         $glob1 = glob("*");
         $glob2 = glob("*.c");
         say "glob1 = $glob1\nglob2 = $glob2"
      '
    globvals = 1 2 3 4.c 5.c 6.c 7 8
    glob1 = 1
    glob2 = 4.c

As you can see, the third call to glob returns the expected value.

Here's how glob is documented in the perlfunc document[1].

    glob    In list context, returns a (possibly empty) list of filename
            expansions on the value of EXPR such as the standard Unix shell
            /bin/csh would do. In scalar context, glob iterates through such
            filename expansions, returning undef when the list is exhausted.
[1]: http://perldoc.perl.org/functions/glob.html
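
To make the two contexts in that perlfunc excerpt concrete, here is a minimal sketch. It assumes a scratch directory created with File::Temp and the same file names as the `touch` command above; the variable names are mine, not from the docs.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Work in a throwaway directory so the result is predictable.
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";
for my $name (qw(1 2 3 4.c 5.c 6.c 7 8)) {
    open my $fh, '>', $name or die "create $name: $!";
    close $fh or die "close $name: $!";
}

# List context: every expansion is returned at once.
my @list = glob("*");

# Scalar context at a single call site: one expansion per call,
# then undef once the list is exhausted.
my @seen;
while (defined(my $name = glob("*"))) {
    push @seen, $name;
}

print "list: @list\n";   # list: 1 2 3 4.c 5.c 6.c 7 8
print "iter: @seen\n";   # iter: 1 2 3 4.c 5.c 6.c 7 8

chdir $start or die "chdir $start: $!";
```

Both lines print the same names because the loop keeps hitting the same glob call until it returns undef.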


Context is a pretty important, pervasive concept in Perl. So much so that chromatic discusses it in the very first chapter[1] of his Modern Perl book[2].

    [1]: http://modernperlbooks.com/books/modern_perl_2014/01-perl-philosophy.html#Q29udGV4dA
    [2]: This very accessible, and free, book (updated very recently for 2014) is a must-read for any Perl developer.


Consistency is about expectations. A function can check (with wantarray) whether its caller wants a list or a scalar, and return either one, or a reference to either. So nobody expects a single way of returning things across all functions, and in that sense such behavior is consistent.
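
For anyone who hasn't seen it, a minimal sketch of a function that checks what its caller wants. The helper name `matches` and its arguments are made up for illustration; `wantarray` is the real built-in.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the matching names in list context
# and only their count in scalar context.
sub matches {
    my @found = grep { /\.c\z/ } @_;
    return wantarray ? @found : scalar @found;
}

my @all   = matches(qw(a.c b.h c.c));   # list context
my $count = matches(qw(a.c b.h c.c));   # scalar context

print "all = @all, count = $count\n";   # all = a.c c.c, count = 2
```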


Having implicit shared state dependent on the requested type is certainly not consistent. A reasonably experienced Perl programmer would expect the following:

    @globs = glob("*");    # all matches
    $glob1 = glob("*");    # the first match of "*"
    $glob2 = glob("*.c");  # the first match of "*.c"


I agree that's the behavior you would want for that example. That is almost the behavior you get when you use File::Glob's bsd_glob function. For some reason it appears to return the last alphanumerically sorted match instead of the first. I've submitted a bug about that, as well as about the general lack of documentation on the true behavior of the CORE::glob() function, as outlined in my other comment.


I disagree. You would expect what it says in documentation, and it says it should iterate over the list.

If you want the first element of the list, you should say exactly that to the compiler, i.e. force list context and take the first element:

    ($file1) = glob("*");
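
A slightly fuller sketch of that idea, wrapped in a helper. The name `first_match` and the scratch directory are my assumptions for illustration. Because glob is called in list context here, the pattern is expanded in full on every call, so calling the helper twice with different patterns gives the first match of each.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Throwaway directory with a known set of files.
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";
for my $name (qw(one.a one.b two.a two.b)) {
    open my $fh, '>', $name or die "create $name: $!";
    close $fh or die "close $name: $!";
}

# Hypothetical helper: the list assignment forces list context,
# and only the first element is kept.
sub first_match {
    my ($pattern) = @_;
    my ($first) = glob($pattern);
    return $first;
}

my $any = first_match("*");
my $two = first_match("two*");
print "$any\n$two\n";   # one.a, then two.a

chdir $start or die "chdir $start: $!";
```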


> I disagree

I assume because you misread the example given there. If you note that $glob1 and $glob2 come from what should be different lists (all files in working dir and all files ending in .c in the working dir), then $glob1 and $glob2 should contain the first item each of their respective lists. That's exactly what the documentation says should happen.

> You would expect what it says in documentation, and it says it should iterate over the list.

Unfortunately it doesn't even do what it says in the documentation consistently. I have another comment here that outlines that fairly thoroughly. The behavior is very weird and specific to glob, and is not documented accurately.


I guess I misread the example, sorry. I thought we were talking in the context of inconsistent behavior of Perl as a language and therefore its syntax, but not in the context of an undefined behavior of glob().


We are and we aren't. elektronjunge chose an example that definitely is inconsistent, but I'm not sure that it's really indicative of Perl in general. IMHO, it's a fairly specific kind of broken, where we can't really fix the behavior because of backwards compatibility, but the documentation is just plain inadequate in this case as well.


It does return the values as expected. See my other reply: https://news.ycombinator.com/item?id=8730798


Actually, it doesn't, and it's complicated. See my rather long-winded reply here: https://news.ycombinator.com/item?id=8730925



You picked an extremely good built-in as an example. It doesn't function exactly as you've shown it, but there is very weird stuff going on. On the plus side, this is the only time I've seen something quite like this in Perl, so I'm not sure it represents the state of using Perl as a whole very well.

glob appears to do something special, unlike most other Perl functions, where it's just a matter of knowing the context and the API the function exposes. In the versions I just tested (including 5.21.6), glob in fact doesn't quite do what its documentation says, in an apparent effort to "do what you mean". It does not simply iterate through the files in the result when used in scalar context; it iterates through them only when it is called in scalar context from the exact same call site in the program.

Here are all the files we'll test:

    # perl -E 'my @files = glob("*"); say for @files;'
    one.a
    one.b
    two.a
    two.b

Glob acts like an iterator when used in scalar context. Here, each call returns another file.

    # perl -E 'while ( my $file = glob("*") ) { say $file; } '
    one.a
    one.b
    two.a
    two.b

Glob gives the first file each time it's called in scalar context if the call is not at the exact same point in the source. This doesn't follow the docs, which say it will act like an iterator in scalar context.

    # perl -E 'my $file1 = glob("*"); my $file2 = glob("*"); say $file1; say $file2;'
    one.a
    one.a

Again, we see it's not acting like an iterator. Each call is generating its own list and returning the first file.

    # perl -E 'my $file1 = glob("*"); my $file2 = glob("two*"); say $file1; say $file2;'
    one.a
    two.a

Here, we can see that since the same point in the code is getting hit, glob is acting like an iterator.

    # perl -E 'sub myglob { my $mask = shift; glob($mask); } my $file1 = myglob("*"); my $file2 = myglob("*"); say $file1; say $file2;'
    one.a
    one.b

Here we see that since the same point in code is getting hit, glob is ignoring its input and just acting like an iterator on the first input, which seems to be what you were trying to show in your example.

    # perl -E 'sub myglob { my $mask = shift; glob($mask); } my $file1 = myglob("*"); my $file2 = myglob("two*"); say $file1; say $file2;'
    one.a
    one.b
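
As a self-contained script, the two wrapper runs above can be sketched like this. It also shows what happens once the list runs out. The helper name `next_match` and the scratch directory are assumptions for illustration; the expected values in the comments follow from the per-call-site behavior described above.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Throwaway directory with the same four files as above.
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";
for my $name (qw(one.a one.b two.a two.b)) {
    open my $fh, '>', $name or die "create $name: $!";
    close $fh or die "close $name: $!";
}

# Hypothetical wrapper: every call goes through the same glob call
# site, always in scalar context.
sub next_match {
    my ($pattern) = @_;
    return scalar glob($pattern);
}

my @got;
push @got, next_match("*");       # one.a: starts a list for "*"
push @got, next_match("two*");    # one.b: new pattern ignored, list in progress
push @got, next_match("two*");    # two.a: still the "*" list
push @got, next_match("two*");    # two.b: still the "*" list
my $end   = next_match("two*");   # undef: the "*" list is exhausted
my $fresh = next_match("two*");   # two.a: a new list, built from "two*"

print "got:   @got\n";
print "end:   ", (defined $end ? $end : "undef"), "\n";
print "fresh: $fresh\n";

chdir $start or die "chdir $start: $!";
```

So the pattern is only looked at when that call site starts a new list, which is why the second argument seemed to be ignored.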

That is very weird behavior. I agree it's not consistent with the rest of the language. I'm not sure it's indicative of the language as a whole though, as it appears to be due to weird historical implementation details that are kept for backwards compatibility. The perldoc for glob says it's implemented using the standard File::Glob module. The perldoc for File::Glob mentions that it implements the core glob in terms of bsd_glob (the FreeBSD glob(3) routine, a superset of POSIX glob), which is a function that can also be exported. In fact, the bsd_glob function, when used, acts as we would wish, without weird iterator behavior (without iterator behavior at all, in fact).

    # perl -E 'use File::Glob qw/:bsd_glob/; my $file1 = bsd_glob("*"); my $file2 = bsd_glob("two*"); say $file1; say $file2;'
    two.b
    two.b
    # perl -E 'use File::Glob qw/:bsd_glob/; sub myglob { my $mask = shift; bsd_glob($mask); } my $file1 = myglob("*"); my $file2 = myglob("one*"); say $file1; say $file2;'
    two.b
    one.b

Of course, this means the documentation that says the core glob routine is implemented using bsd_glob has some glaring omissions.

So, congratulations, you picked an extremely good example and unearthed some crazy Perl arcana, and possibly a bug (there are some open, longstanding tickets regarding glob bugs[1][2], which seem to boil down to "we're doing what we can to make it better, but we're hampered by backwards compatibility and weird semantics"). I think the documentation is woefully inadequate to explain what's going on in this case though.

Note that bsd_glob seems to return the last item when used in scalar context, not the first.
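
One way to sidestep that is to call bsd_glob in list context and keep only the first element. A minimal sketch, assuming a scratch directory with the same four files and relying on bsd_glob's default flags, which sort the results:

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Throwaway directory with the same four files as above.
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";
for my $name (qw(one.a one.b two.a two.b)) {
    open my $fh, '>', $name or die "create $name: $!";
    close $fh or die "close $name: $!";
}

# The parentheses on the left make these list assignments, so
# bsd_glob is called in list context and the first match is kept.
my ($first_any) = bsd_glob("*");
my ($first_two) = bsd_glob("two*");

print "$first_any\n$first_two\n";   # one.a, then two.a

chdir $start or die "chdir $start: $!";
```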

  [1]: https://rt.perl.org/Ticket/Display.html?id=2707

  [2]: https://rt.perl.org/Ticket/Display.html?id=2713



