Functionally I'll iterate
Fri, 18 Dec 2020 22:53:03 +0000
I'm going to attempt to explain Lua iterators, in the hope that by the time I've finished writing I'll understand them. So.
There's a construct in Lua called for which is used for iterating
over a collection or other sequencey-type thing. You might write e.g.
> for k, v in pairs({a=2, b=5, c=9}) do print(k,v) end
a 2
b 5
c 9
or
> b=io.open("/etc/hosts", "r"); for l in b:lines() do print(l) end
127.0.0.1 localhost
::1 localhost
127.0.0.2 noetbook
::1 noetbook
Per the Lua manual, the syntax for the for is as follows:
for <var-list> in <exp-list> do <body> end
where <exp-list> is (between one and) three values - or something
that produces three values when evaluated. The first value is an
"iterator function" f, the second is "invariant state" state and the
third is the "control variable" a.
The interpreter then runs the loop: repeatedly, it
- calls f(state, a)
- binds the return values to <var-list>
- updates the control variable a to be the first of the values returned by that call to f
- runs the loop body
until the first value returned by f is nil, which signals the end of
the iteration. That check happens straight after the call, so the
body never runs for the final nil.
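The loop described above can be sketched in plain Lua. This is a rough paraphrase of the equivalent code in the Lua reference manual, cut down to two loop variables; generic_for is a name made up for this illustration, not something Lua provides.

```lua
-- Hypothetical helper: behaves roughly like
--   for k, v in f, state, a do body(k, v) end
local function generic_for(f, state, a, body)
  while true do
    local k, v = f(state, a)    -- call the iterator function
    if k == nil then break end  -- a nil first value ends the loop
    a = k                       -- first value becomes the new control variable
    body(k, v)                  -- run the loop body
  end
end

-- For a plain table, pairs(t) returns next, t and nil: the three values needed
local seen = {}
generic_for(next, {a=2, b=5, c=9}, nil, function(k, v) seen[k] = v end)
```

After this runs, seen holds the same three key/value pairs as the table that was traversed.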
The builtin ipairs iterator, used for traversing "array" tables -
tables in which the indices are consecutive integers starting at 1 -
provides an iterator function which when provided with a table and an
index returns the next index, and the value at that index
> gen,_,_ = ipairs({5,6,7,8})
> =gen
function: 0x41c600
> gen({5,6,7,8}, 0) -- what's the first element?
1 5
> gen({5,6,7,8}, 3) -- what's after the third element?
4 8
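To make the three values concrete, here is a hand-rolled approximation of ipairs. The names array_iter and my_ipairs are invented for this sketch, and the real builtin is written in C, so treat this as an illustration of the protocol rather than the actual implementation.

```lua
-- Hypothetical stand-in for the iterator function that ipairs returns
local function array_iter(t, i)
  i = i + 1
  local v = t[i]
  if v ~= nil then
    return i, v
  end
  -- falling off the end returns nothing, so the loop sees nil and stops
end

-- Hypothetical stand-in for ipairs itself:
-- iterator function, invariant state, initial control value
local function my_ipairs(t)
  return array_iter, t, 0
end

local out = {}
for i, v in my_ipairs({5, 6, 7, 8}) do
  out[#out + 1] = i .. "=" .. v
end
print(table.concat(out, " "))   -- 1=5 2=6 3=7 4=8
```

Note that array_iter keeps no state of its own: everything it needs arrives as arguments on each call.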
The standard pairs iterator, used for traversing tables with
arbitrary keys, uses the builtin next function as an iterator
function. next works similarly to the iterator function in the
previous example: given a table and a key, it returns the next key
(for some value of "next" we aren't interested in the details of) and
the value at that key
> next({a=2, b=4}, "a")
b 4
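Since the for construct only wants an iterator function, a state and an initial control value, you can hand it next directly. A small sketch, assuming a plain table with no __pairs metamethod (in which case this should behave the same as using pairs):

```lua
local t = {a = 2, b = 4}

-- Same effect as: for k, v in pairs(t) do ... end
local count, sum = 0, 0
for k, v in next, t, nil do
  count = count + 1
  sum = sum + v
end
print(count, sum)   -- 2  6
```

The traversal order is whatever next decides, which is why the example only counts and sums instead of printing keys in sequence.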
These both depend on the for construct's behaviour of using the
first return value from each call to the iterator function as the
second argument to it on the subsequent call. Which is fine if we
want that value, but quite often we don't, so we end up doing this:
for _, v in ipairs(an_array) do print(v) end
If we want to write an iterator values_of that returns only the
value, not the index, so that it could be used more like this:
for v in values_of(an_array) do print(v) end
then the iterator function returned by values_of would be called each
time with only the previous value v as its control variable, which
would not be sufficient for it to work out how far through the array
it had got last time. Instead we have to make our iterator close over
the state it needs:
function values_of(an_array)
  local index = 0
  return function()
    index = index + 1
    return an_array[index]
  end
end

> for v in values_of({7,3,5}) do print(v) end
7
3
5
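One consequence of closing over the state is that every call to values_of gets its own private index, so two iterators over the same table don't tread on each other. A small sketch, repeating the definition so that it runs on its own:

```lua
local function values_of(an_array)
  local index = 0            -- private to each returned closure
  return function()
    index = index + 1
    return an_array[index]
  end
end

local t = {7, 3, 5}
local first = values_of(t)
local second = values_of(t)

local a = first()    -- 7
local b = first()    -- 3
local c = second()   -- 7: second has its own index
local d = first()    -- 5
print(a, b, c, d)
```

The flip side is that the iterator can't be rewound: once first has run off the end of the table, you need to call values_of again to start over.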
I note that the io.lines function has similar behaviour to
values_of in that it returns only the next data value and not also
the index of that value. Assuming it uses the standard C library
functions for file input, I am guessing that it does this by reading
from the current file position. So it also has internal state, but
that state is implicit in the depths of stdio.
Yes, but why?
Why did I start looking at this? Reasonable question. Because I wanted
to write an idiomatic find function in Fennel and thought it would
be a lot neater to have it work on any iterator, not just on
tables. So I could say something like
>> (find (fn [_ v] (= (% v 2) 0)) (pairs [ 1 2 3 4 5]))
or
>> (find #(even? $2) (pairs [ 1 2 3 4 5]))
but it does look messy having to accept and ignore that first parameter, especially if I want to pass bog standard predicate functions. I want to write something more like
>> (find even? (vals [ 1 2 3 4 5]))
and indeed this is possible, but only if I write the vals iterator
to close over its state instead of being able to use the parameters
that Lua passes into it when it's called.
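For comparison, here is roughly what that could look like in plain Lua rather than Fennel. Both find and vals are hypothetical here (a guess at the shape, not the code from the gist linked under "See also"), and is_even is a stand-in for even?. This version of find only looks at the first value the iterator produces on each step, which is exactly why it pairs nicely with a closure-based vals.

```lua
-- Hypothetical: closure-based iterator yielding only the values of an array
local function vals(t)
  local i = 0
  return function()
    i = i + 1
    return t[i]
  end
end

-- Hypothetical: return the first value produced by the iterator that
-- satisfies pred, or nil if none does
local function find(pred, f, state, control)
  for v in f, state, control do
    if pred(v) then return v end
  end
  return nil
end

-- Stand-in for the even? predicate used in the Fennel examples
local function is_even(n) return n % 2 == 0 end

print(find(is_even, vals({1, 2, 3, 4, 5})))   -- 2
```

Because vals returns a single function, state and control arrive in find as nil, and the closure simply ignores the arguments that the for construct passes to it.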
See also:
- leo60228 has a version of find (which he calls first) at https://gist.github.com/leo60228/ab68f365a10f6c218a31a3a7b7882280 which pretty much does this - but now I think I understand how it works :-)