## American Binomial Model in q

American-style options may be exercised prior to expiry. To price them on a binomial tree, the full tree has to be constructed. The prices are then computed at each node through a backward reduction process starting from the prices at maturity. At each node, we take the maximum of the payoff at that node versus the price implied by the binomial model.

Constructing a binomial price tree is relatively easy in q:

```BPTree:{[n;S;u] n{(x*y 0),y%x}[u]\1#S}  / binomial price tree
```

where n is the depth of the tree, S is the current price and u is the scale of the up-move. We simply let the scale of the down-move be 1/u and hence the y%x in the expression.

The general binomial model can then be implemented as follow:

```GBM:{[R;P;S;T;r;b;v;n]                  / General Binomial Model (CRR)
t:T%n;                                 / time interval
u:exp v*sqrt t;                        / up; down is 1/u
p:(exp[b*t]-1%u)%(u-1%u);              / probability of up
ptree:reverse BPTree[n;S;u];           / reverse binomial price tree
first R[exp[neg r*t];p] over P ptree }
```

where R is a reduction function, P is the payoff function, S is the current price, T is the time to maturity, r is the risk-free rate, b is the cost of carry, v is the volatility and n is the depth of the tree. For American and European options, the reduction functions may be expressed as:

```American:{[D;p;a;b] max(b;D*(-1_a*p)+1_a*1-p)}
European:{[D;p;a;b] D*(-1_a*p)+1_a*1-p}
```

where D is the discount factor and p is the probability of an up-move. Consequently, we can express the American and European binomial models simply as:

```ABM:GBM[American]
EBM:GBM[European]
```

Testing the code on an American vanilla put option (strike = 102; price = 100; time to maturity = 1 year; risk-free rate = cost of carry = 8%; volatility = 20%; depth of tree = 1000):

```q) VP:{[S;K]max(K-S;0)}    / vanilla put: max(K-S,0)
q) \t show ABM[VP[;102];100;1;0.08;0.08;0.2;1000]
6.2215
31
```

It took 31 ms to compute. Pretty nice for so little code.

Note: It turns out that this implementation of EBM is faster than the one in the previous post. The reason is that I avoided using the expensive xexp function this time round. Otherwise, the previous implementation should be faster since it only computes the payoffs at maturity and not the intermediate nodes.

## European Binomial Model in q and Python

In the previous post, we created a binomial probability mass function (pmf). We can use that to easily evaluate European-style options:

```EBM:{[P;S;K;T;r;b;v;n]                 / European Binomial Model (CRR)
t:T%n;                                / time interval
u:exp v*sqrt t;                       / up
d:1%u;                                / down
p:(exp[b*t]-d)%(u-d);                 / probability of up
ns:til n+1;                           / 0, 1, 2, ..., n
us:u xexp ns;                         / u**0, u**1, ...
ds:d xexp ns;                         / d**0, d**1, ...
Ss:S*ds*reverse us;                   / prices at tree leaves
ps:pmf[n;p];                          / probabilities at tree leaves
exp[neg r*T]*sum P[Ss;K]*ps }
```

Note that P is the payoff, S is the current price, K is the strike price, T is the time to maturity, r is the risk-free rate, v is the volatility, b is the cost of carry and n is the depth of the binomial tree. The Python version using NumPy and SciPy actually looks quite similar:

```def EuropeanBinomialModel(P, S, K, T, r, b, v, n):
n = int(n)
t = float(T)/n                              # time interval
u = np.exp(v * np.sqrt(t))                  # up
d = 1/u                                     # down
p = (np.exp(b*t)-d)/(u-d)                   # probability of up
ns = np.arange(0, n+1, 1)                   # 0, 1, 2, ..., n
us = u**ns                                  # u**0, u**1, ...
ds = d**ns                                  # d**0, d**1, ...
Ss = S*us*ds[::-1]                          # prices at leaves
ps = binom_pmf(ns, n, p)                    # probabilities at leaves
return np.exp(-r*T) * np.sum(P(Ss,K) * ps)
```

As we can see, both code has no explicit loops. This is possible in Python as NumPy and SciPy are array-oriented. NumPy and SciPy’s idea of “broadcasting” has some similarity with k/q’s concept of “atomic functions” (definition: a function f of any number of arguments is atomic if f is identical to f’).

## Binomial distribution in q

Recently I was using SciPy’s `scipy.stats.binom.pmf(x,n,p)`. I though it would be great if I could have such a function in q. So a simple idea is to construct a binomial tree with probabilities attached. Recalling that a Pascal triangle is generated using `n{0+':x,0}\1`, I modified it to get:

```q)pmf:{[n;p]n{(0,y*1-x)+x*y,0}[p]/1#1f}
q)pmf[6;0.3]
0.000729 0.010206 0.059535 0.18522 0.324135 0.302526 0.117649
q)sum pmf[1000;0.3]
1f
```

What is great about this method is that it is stable. Compared to SciPy 0.7.0, it was more accurate too (it is a known issue that older SciPy has buggy binom.pmf):

```>>> scipy.stats.binom.pmf(range(0,41),40,0.3)[-5:]
>>> array([3.33066907e-15, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 1.11022302e-16])
q)-5#pmf[40;0.7]
3.293487e-15 1.52594e-16 5.162955e-18 1.134715e-19 1.215767e-21
```

Unfortunately this method is too slow for large n. For large n, we need more sophisticated methods. For the interested reader, take a look at Catherine Loader’s Fast and Accurate Computation of Binomial Probabilities paper and an implementation of a binomial distribution in Boost

Posted in coding, k&q, python. Tags: , , . 1 Comment »

## Gray code in q

It is easy to construct binary n-bit Gray code in q using the recursive reflection-prefixing technique:

```q)gc:{\$[x;(0b,/:a),1b,/:reverse a:.z.s x-1;1#()]}
q)show gc 4
0000b 0001b 0011b 0010b 0110b 0111b 0101b 0100b
1100b 1101b 1111b 1110b 1010b 1011b 1001b 1000b```

It is also possible to construct the above iteratively using the formula $n \oplus \lfloor n/2 \rfloor$:

```q).q.xor:{not x=y}
q)gc_iter:{(0b vs x) xor (0b vs x div 2)}
q)show (-4#gc_iter@) each til 16
0000b 0001b 0011b 0010b 0110b 0111b 0101b 0100b
1100b 1101b 1111b 1110b 1010b 1011b 1001b 1000b```

To check that indeed exactly one bit is flipped each time:

```q)check:{x[0] (sum@xor)': 1_x}
q)check gc 5
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1```

To identify the position of the bit that was flipped:

```q)pos:{raze x[0] (where@xor)': 1_x}
q)pos gc 5
4 3 4 2 4 3 4 1 4 3 4 2 4 3 4 0 4 3 4 2 4 3 4 1 4 3 4 2 4 3 4```

If we think about it, there is no reason why we have to prefix. We could do suffix as well:

```q)gc:{\$[x;(a,\:0b),(reverse a:.z.s x-1),\:1b;1#()]}
q)show gc 4
0000b 1000b 1100b 0100b 0110b 1110b 1010b 0010b
0011b 1011b 1111b 0111b 0101b 1101b 1001b 0001b
q)check gc 5
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
q)pos gc 5
0 1 0 2 0 1 0 3 0 1 0 2 0 1 0 4 0 1 0 2 0 1 0 3 0 1 0 2 0 1 0```

In fact, if we are only interested in the positions that need to be flipped, we can use this instead:

```q)gcpos:{\$[x;a,n,a:.z.s n:x-1;()]}
q)gcpos 5
0 1 0 2 0 1 0 3 0 1 0 2 0 1 0 4 0 1 0 2 0 1 0 3 0 1 0 2 0 1 0```

Such a sequence of positions is useful if we are using Gray code to efficiently enumerate the non-zero points spanned by a set of basis vectors:

```q)basis:(1100000b;0111001b;0000011b)
q){x xor y} scan basis gcpos count basis
1100000b
1011001b
0111001b
0111010b
1011010b
1100011b
0000011b```

Update (20090927): Once again, Attila has beaten me at q-golf :-) Here is his formulation:

`gc:{x{(0b,/:x),1b,/:reverse x}/1#()}`
Posted in k&q. Tags: , . Leave a Comment »

## Game of Life in (one line of) q

Recently on the KDB+ Personal Developers mailing list, there was a post of an YouTube video showing how Conway’s Game of Life can be implemented in one line of APL. That post generated some interest and gave rise to several one-liners in q:

```life:{any(1b;x)&'3 4=\:sum sum f(f:rotate'\:[1 0 -1])x} / by Aaron Davies
life:{any(1;x)&3 4=\:2 sum/2(1 0 -1 rotate'\:)/x}       / by Attila Vrabecz
life:{(3=a)|x&4=a:2 sum/2 rotate'\:[1 0 -1]/x}          / by Attila Vrabecz
life:{3=a-x*4=a:2 sum/2(1 0 -1 rotate'\:)/x}            / by Attila Vrabecz
life:{0<x+0 1@4-2 sum/2(1 0 -1 rotate'\:)/x}            / by Swee Heng (me!)```

They pretty much implement the algorithm shown in the YouTube video. The steps are as follow:

1. Take the input grid X and generate 3 new grids: the 1st is X with each row rotated left by one position; the 2nd is X itself; and the 3rd is X with each row rotated right by one position. We illustrate the transformation of the grid positions using ASCII art
```                 +-------+
| 12340 |
| 67895 |
| BCDEA |
+-------+      +-------+
| 01234 |      | 01234 |
| 56789 | ---> | 56789 |
| ABCDE |      | ABCDE |
+-------+      +-------+
| 40123 |
| 95678 |
| EABCD |
+-------+```
2. For each of the 3 resulting grids, perform step 1 again but this time rotate the columns (instead of rows) up and down. In ASCII art:
```                 +-------+      +-------+-------+-------+
| 12340 |      | 67895 | 12340 | BCDEA |
| 67895 | ---> | BCDEA | 67895 | 12340 |
| BCDEA |      | 12340 | BCDEA | 67895 |
+-------+      +-------+      +-------+-------+-------+
| 01234 |      | 01234 |      | 56789 | 01234 | ABCDE |
| 56789 | ---> | 56789 | ---> | ABCDE | 56789 | 01234 |
| ABCDE |      | ABCDE |      | 01234 | ABCDE | 56789 |
+-------+      +-------+      +-------+-------+-------+
| 40123 |      | 95678 | 40123 | EABCD |
| 95678 | ---> | EABCD | 95678 | 40123 |
| EABCD |      | 40123 | EABCD | 95678 |
+-------+      +-------+-------+-------+```
3. Sum up all 9 grids. Each cells of the resulting grid contains the number of occupied cells in its immediate neighbourhood (including itself).

In Aaron’s code, Step 1 is implemented by “f:rotate’\:[1 0 -1]”, Step 2 is performed by the second call to “f” and Step 3 is done by the double-call to “sum”. Since “f” and “sum” are called twice iteratively, Attila rewrote them as “2 sum/2(1 0 -1 rotate’\:)/x”. He also reduced the test of survivorship (occupied cells with 2 or 3 neighbours) and birth (empty cells with 3 neighbours) to a single expression “3=a-x*4=a”. As for me, I found an equivalent expression (i.e. “0<x+0 1@4-“) for the test of survivorship and birth.

The above algorithm is not so efficient as it creates additional grids and adds them. Chris Burke provided an alternative algorithm using table lookup. While it is not a one-liner, it is much faster. Here it is verbatim:

```t:0b vs ' til "i"\$2 xexp 9
ctr:t[;27]
cnt:sum each t
trans:(cnt=3) or (ctr=1) and cnt=4
lifeinit:{
r:count x;c:count first x;
index::raze each raze 2 (1 0 -1 rotate'\:)/ (r,c) # til r*c;
raze x}
life:{trans 2 sv x index}
liferun:{(life\) lifeinit x}```

The insight here is that a cell and its 8 neighbours can be treated as a “neighbourhood array” of 9 bits. E.g. cell 7 in the ASCII art has the corresponding neighbourhood array (1, 2, 3, 6, 7, 8, B, C, D). Such 9-bits array corresponds (using “2 sv”) to a number from 0 to 511 inclusive. By pre-computing “trans” (the test of survivorship and birth) and “index” (the neighourhood array of each cell), the “life” step becomes a quick table lookup.

## Finding words in k

Facebook has a game called Word Challenge (on Yahoo it is known as Text Twist). The idea of the game is to form as many words as possible (of minimum length 3) given a set of 6 characters. It is the type of problem that can be easily solved with k or q.

My first attempt is basically brute-force:

```w:w@=#:'w:?_0:`\$words                         / read words & group by length
p:{[n;r]r(,/{y,/:&~x in y}[!n]')/,()}         / nPr; permute r items from !n
f:{[r;s],/a@'(a:w r)(&:in)'s@p[#s]'r}         / find words; r:lengths s:chars```

The first line reads in a list of possible English words from a file and groups them by length. The second line is used to create the permutation (nPr) indices. The last line checks whether the words in the file appear in the list of permutations of the given characters.

While it works well for 6 or less characters, the performance is poor for longer strings since nPr is huge for large n and r. My second solution is better:

```v:v@={x@<x}'v:?_0:`\$words                     / read words, sort & group
c:{[n;r]r(,/{_[y-1+#z;!*z,x],\:z}[n;r]')/,()} / nCr; choose r items from !n
g:{[r;s]a@&0<#:'a:,/v@?,/(s@<s)@c[#s]'r}      / find words;```

The first line reads in the words, sorts each word lexicographically (think of it as reducing each word to a canonical form) and groups them according to that canonical form. The second line creates the combination (nCr) indices. The last line takes the characters, creates a list of all canonical forms obtainable from combinations of the characters and retrieves all the words associated with each canonical form. The performance is much faster because nCr is smaller and the pre-processing of v helps a lot.

Here’s a sample output for the characters “qwertyzxcvbi” and a list of 98,569 English words (/etc/dictionaries-common/words in Ubuntu). It took only 49 ms to compute.

```brevity cervix verity zyrtec bizet brice
bryce   civet  evict  ritzy  rivet tiber
tribe   trice  twice  write  bert  bevy
bier    bite   bret   brew   brie  brit
byte    cite   city   crew   crib  eric
exit    ibex   rice   rite   ritz  rive
teri    tier   tire   trey   tyre  verb
very    vibe   vice   view   weir  wire
wiry    wite   wive   writ   yeti  bet
bic     bit    bye    cry    ice   icy
ire     ivy    rev    rex    rib   rte
rye     tex    tic    tie    try   vet
vex     vic    vie    web    wei   wet
wit     wry    yet    yew    zit```

The code is here: fbwc.k.

Posted in k&q. Tags: , , . Leave a Comment »

## More Sudoku solvers in k and q

I was informed by Attila that the Sudoku solver I mentioned in my previous post wasn’t the only version developed by Arthur. There’s a shorter one here: http://nsl.com/k/sudoku/aw3.k. In fact, in the same directory there are several other solvers by Arthur, Attila and Roger Hui.

So here is all 72-bytes (including the 2 trailing \n) of the shortest Sudoku solver I’ve seen to date:

```p,:3/:_(p:9\:!81)%3
s:{*(,x)(,/{@[x;y;:;]'&21=x[&|/p[;y]=p]?!10}')/&~x}```

and its translation to q:

```P,:3 sv floor(P:9 vs til 81)%3
S:{first(enlist x)(raze@{@[x;y;:;]each where 21=x[where(or)over P[;y]=P]?til 10}')/where not x}```

How does it work? Unlike the previous solver, there is considerable simplification.

1. The P array now comprises of 3 lists: the row, column and sub-square. So P[;i] returns a 3 element vector containing the row, column and sub-square of position i.
2. The S function is no longer recursive. Instead it iterates over the unfilled positions of the board (/where not x). At each iteration, “where(or)over P[;y]=P is used to identify those positions that lies on the same row, column and sub-square. The “?til 10” returns a list of those positions where the numbers 0 to 9 first appeared. As there are 21 positions, a value of 21 indicates that the corresponding number is missing. So “where 21=” identifies those missing numbers which are then exhaustively substituted using @[x;y;:;]. The “raze” is required to flatten the resulting nested list.