zack$ osh gen 10 ^ f 'x: x**2' ^ out
Comments:
As a syntactic convenience, the token $ can be used instead of ^ out at the end of an osh command sequence. However, if this is done, then only default behavior for out is obtained. (The out command has options for formatting, and for writing to files.)
Input can be passed to an osh command sequence using a Unix pipe as follows:
zack$ cat /usr/share/dict/words | osh ^ select 'w: len(w) >= 20' $
This command begins by writing the contents of /usr/share/dict/words to stdout.
A Unix pipe is used to send this stream of data to osh. "osh ^" is a special syntactic
form which converts each line of input from stdin into a Python string (omitting the
line terminator \n). These strings are passed to the next command, select, which keeps
only those strings whose length is at least 20. The trailing $ then prints them to stdout.
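The same flow can be sketched in plain Python. This is only an illustration of the idea, not how osh is implemented; the helper names read_lines and select_items are hypothetical, and a short list of sample lines stands in for stdin.

```python
from typing import Callable, Iterable, Iterator


def read_lines(stream: Iterable[str]) -> Iterator[str]:
    """Yield each input line as a string, without its trailing newline."""
    for line in stream:
        yield line.rstrip("\n")


def select_items(items: Iterable[str], predicate: Callable[[str], bool]) -> Iterator[str]:
    """Pass along only the items for which predicate returns True."""
    for item in items:
        if predicate(item):
            yield item


# Sample input standing in for the lines arriving on stdin.
sample = ["cat\n", "counterrevolutionaries\n", "dog\n"]
long_words = list(select_items(read_lines(sample), lambda w: len(w) >= 20))
print(long_words)
```

Passing sys.stdin instead of the sample list would make the sketch read from a Unix pipe, since a file object yields one line per iteration.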
The other approach is to use the osh commands imp and py. imp imports modules for use by commands later in the command sequence. py runs arbitrary Python code; the intent is to define symbols that can be used by commands later in the command sequence. Both commands pass objects received on the input stream to the output stream unchanged. In both cases, the import or Python code is executed once, before objects start flowing on the streams connecting commands in the command sequence. For example, the following command line prints the areas of circles with radii 0, 1, 2, ..., 9:
zack$ osh py 'pi = 3.14159265358979' ^ gen 10 ^ f 'r: (r, pi * r**2)' $
(0, 0.0)
(1, 3.14159265359)
(2, 12.5663706144)
(3, 28.2743338823)
(4, 50.2654824574)
(5, 78.5398163397)
(6, 113.097335529)
(7, 153.938040026)
(8, 201.06192983)
(9, 254.469004941)
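One way to picture "executed once, before objects start flowing" is a shared namespace that is filled in first and then consulted by later functions. The sketch below uses Python's exec and eval for this; it is an assumption about the general idea, not a description of osh internals, and the names namespace and area are hypothetical.

```python
# A shared namespace, populated once before any data is processed.
namespace = {}

# Roughly what py 'pi = 3.14159265358979' does: define a symbol once.
exec("pi = 3.14159265358979", namespace)

# Roughly what f 'r: (r, pi * r**2)' needs: a function that can see that symbol.
area = eval("lambda r: (r, pi * r**2)", namespace)

# Roughly what gen 10 supplies: the integers 0 through 9.
results = [area(r) for r in range(10)]

print(results[0])  # (0, 0.0)
print(results[1])  # (1, 3.14159265358979)
```

The printed digits differ from the transcript above only because the Python version in use prints floats with a different number of digits.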
zack$ osh @3 [ sh 'sleep 5; date' ] $
(0, 'Mon Aug 6 23:03:26 EDT 2007')
(1, 'Mon Aug 6 23:03:26 EDT 2007')
(2, 'Mon Aug 6 23:03:26 EDT 2007')
@3 introduces three threads of execution. Each thread has state identifying the
thread, in this case the integers 0, 1, and 2.
The bracketed command sequence, sh 'sleep 5; date', is executed on each thread.
sh is an escape to a native shell, so each thread sleeps for five seconds and then prints the date.
Each line of output contains the thread state, identifying the thread, followed by the output
of the executed command.
Thread state can also be generated by evaluating a function returning a sequence, e.g.
zack$ osh @'range(3)' [ sh 'sleep 5; date' ] $
(0, 'Mon Aug 6 23:03:26 EDT 2007')
(1, 'Mon Aug 6 23:03:26 EDT 2007')
(2, 'Mon Aug 6 23:03:26 EDT 2007')
The function is specified by 'range(3)'. This is a function with no arguments generating
the integers 0, 1, 2.
The fact that all printed dates are the same shows that sleep 5 executed simultaneously
on all threads.
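The pairing of thread state with command output can be sketched with Python's standard thread pool. This is a minimal illustration under stated assumptions, not osh's implementation: run_on_threads and slow_command are hypothetical names, and a 0.2 second sleep stands in for sleep 5.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_on_threads(states, command):
    """Run command(state) on one thread per state; return (state, output) pairs."""
    states = list(states)
    with ThreadPoolExecutor(max_workers=max(1, len(states))) as pool:
        # pool.map returns outputs in the same order as the input states.
        outputs = list(pool.map(command, states))
    return list(zip(states, outputs))


def slow_command(state):
    """Stand-in for sh 'sleep 5; date': wait briefly, then report."""
    time.sleep(0.2)
    return "done"


start = time.monotonic()
results = run_on_threads(range(3), slow_command)
elapsed = time.monotonic() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Because the three sleeps overlap, the elapsed time should be close to one sleep rather than three, which is the same observation the identical dates support.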
Finally, parallel execution can also be initiated by naming a cluster, e.g.
zack$ osh @fred [ sh 'sleep 5; date' ] $
('101', 'Mon Aug 6 23:03:26 EDT 2007')
('102', 'Mon Aug 6 23:03:26 EDT 2007')
('103', 'Mon Aug 6 23:03:26 EDT 2007')
In this case, thread state contains an object
describing a node in the named cluster (configured in .oshrc),
and each thread runs the bracketed command on the indicated node.
A subset of the nodes in a cluster can be specified as follows:
zack$ osh @fred:102 [ sh 'sleep 5; date' ] $
('102', 'Mon Aug 6 23:13:19 EDT 2007')
@fred:102 specifies that the command should be run on nodes of fred
whose name contains 102 as a substring. Since the names of the nodes in cluster
fred are 101, 102, 103, only node 102 is selected.
@fred:10 would select all nodes of the cluster since all node names contain 10.
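The substring rule is simple enough to state as a one-line filter. The sketch below assumes node names are plain strings; select_nodes is a hypothetical helper, not part of osh.

```python
def select_nodes(node_names, pattern):
    """Keep the node names that contain pattern as a substring."""
    return [name for name in node_names if pattern in name]


fred = ["101", "102", "103"]

print(select_nodes(fred, "102"))  # ['102']
print(select_nodes(fred, "10"))   # ['101', '102', '103']
```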
zack$ osh @"('a', 'b', 'c')" [ gen 3 ] $
('a', 0)
('a', 1)
('a', 2)
('b', 0)
('b', 1)
('b', 2)
('c', 0)
('c', 1)
('c', 2)
The sequence produced by each thread is ordered. To produce a merged sequence, we provide a
merge function:
zack$ osh @"('a', 'b', 'c')" [ gen 3 // 'x: x' ] $
('a', 0)
('b', 0)
('c', 0)
('a', 1)
('b', 1)
('c', 1)
('a', 2)
('b', 2)
('c', 2)
// is the merge operator, indicating that the
sequences produced by each thread are to be merged. Each thread's input
to the merge operator is a sequence of integers (output
from gen). The function following the merge operator, x: x,
assumes that the inputs are ordered (raising an exception if this assumption
does not hold), and interleaves tuples from the threads
so that the output sequence is ordered by the values from gen.
If the merge function is x: x, then the function can be omitted, e.g.
zack$ osh @"('a', 'b', 'c')" [ gen 3 // ] $
('a', 0)
('b', 0)
('c', 0)
('a', 1)
('b', 1)
('c', 1)
('a', 2)
('b', 2)
('c', 2)
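The merge behavior described above resembles a k-way merge of sorted streams. The following sketch uses heapq.merge from the Python standard library to reproduce the interleaved output; it is an approximation for illustration, not osh's own code, and the names tag, ensure_ordered, and merge_threads are hypothetical. The default key plays the role of the omitted x: x.

```python
import heapq


def tag(state, values):
    """Attach the thread state to each value, giving (state, value) tuples."""
    for value in values:
        yield (state, value)


def ensure_ordered(items, key):
    """Pass items through unchanged, raising ValueError if key ever decreases."""
    previous = None
    first = True
    for item in items:
        current = key(item)
        if not first and current < previous:
            raise ValueError(f"input is not ordered at {item!r}")
        previous, first = current, False
        yield item


def merge_threads(per_thread, key=lambda x: x):
    """Merge ordered per-thread sequences into one sequence ordered by key(value)."""
    def pair_key(pair):
        return key(pair[1])

    streams = [
        ensure_ordered(tag(state, values), pair_key)
        for state, values in per_thread.items()
    ]
    # Ties are resolved in favor of the earlier stream, so 'a' precedes 'b'.
    return heapq.merge(*streams, key=pair_key)


merged = list(merge_threads({"a": range(3), "b": range(3), "c": range(3)}))
for pair in merged:
    print(pair)
```

The order check is done lazily, so an unordered input is only reported when the merge reaches the offending item.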