Grep binary string

A co-worker asked me if it is possible to grep arbitrary binary strings, e.g. sequences of non-printable ASCII characters. It turns out that GNU grep does understand binary strings if we use Perl-regex via the -P option.

[sh@pc ~]$ grep -slrP '\x05\x00\xc0' /boot
/boot/grub/ffs_stage1_5
/boot/grub/ufs2_stage1_5
/boot/grub/stage2
/boot/efi/EFI/redhat/grub.efi
/boot/vmlinuz-2.6.29.6-213.fc11.x86_64

I couldn’t find this when Googling for “grep binary” so I thought I should pen it down here.

Advertisements
Posted in hacks. Tags: , . Leave a Comment »

gdata.finance

My contributions (gdata.finance module) to the Google Data APIs Python Client Library has been accepted into version 1.3.3 of the library.

A Conversation with Arthur Whitney

Arthur Whitney was interviewed on ACM’s Queue magazine. Here’s a link to the article and a link to a local PDF copy.

Game of Life in (one line of) k

In my previous post I mentioned several q implementation of Conway’s Game of Life. There are also some k implementations at nsl.com (see this page). The shortest one I know of comes from Arthur Whitney:

life:{3=a-x*4=a:2{+(0+':x)+1_x,0}/x}

The key insight here is that for each cell in the grid we can compute the number of occupied cells in its neighbourhood by (i.) summing its column-wise neighbours; and then (ii.) summing the row-wise neighbours of the first sum. To visualize the components of the sum in ASCII art:

  +---+---+---+---+       +---+---+---+---+       +---+---+---+---+
  |   |   |   |   |       |   |   |   |   |       |   |   |   |   |
  | 0 | 1 | 2 | 3 |       | 0 | 1 | 2 | 3 |       | 01|012|123|23 |
  |   |   |   |   |       | 4 | 5 | 6 | 7 |       | 45|456|567|67 |
  +---+---+---+---+       +---+---+---+---+       +---+---+---+---+
  |   |   |   |   |  col  | 0 | 1 | 2 | 3 |  row  | 01|012|123|23 |
  | 4 | 5 | 6 | 7 |  sum  | 4 | 5 | 6 | 7 |  sum  | 45|456|567|67 |
  |   |   |   |   | ----> | 8 | 9 | A | B | ----> | 89|89A|9AB|AB |
  +---+---+---+---+       +---+---+---+---+       +---+---+---+---+
  |   |   |   |   |       | 4 | 5 | 6 | 7 |       | 45|456|567|67 |
  | 8 | 9 | A | B |       | 8 | 9 | A | B |       | 89|89A|9AB|AB |
  |   |   |   |   |       |   |   |   |   |       |   |   |   |   |
  +---+---+---+---+       +---+---+---+---+       +---+---+---+---+

The code {+(0+’:x)+1_x,0} simply arranges the rows and computes the column-wise sums.

      x                (0+':x)            1_x,0
  +-------+       +---------------+     +-------+
  | 01234 |       | 01234         |     | 56789 |
  | 56789 | ----> | 56789 + 01234 |  +  | ABCDE |
  | ABCDE |       | ABCDE + 56789 |     | FGHIJ |
  | FGHIJ |       | FGHIJ + ABCDE |     |       |
  +-------+       +---------------+     +-------+

The flip (+) operation changes columns to rows so that the same {+(0+’:x)+1_x,0} code can be used to compute the row-wise sums. The 2nd flip restores the orientation of the grid.

The above Game of Life is not exactly the same as what I posted earlier since the boundary cells do not wrap around. To get a wrap-around version, we can use this one-liner:

life:{3=a-x*4=a:2{+x-2{|1_0+':x,1#x}/x}/x}

I will leave it to the reader to figure out how this works. :-)

Game of Life in (one line of) q

Recently on the KDB+ Personal Developers mailing list, there was a post of an YouTube video showing how Conway’s Game of Life can be implemented in one line of APL. That post generated some interest and gave rise to several one-liners in q:

life:{any(1b;x)&'3 4=\:sum sum f(f:rotate'\:[1 0 -1])x} / by Aaron Davies
life:{any(1;x)&3 4=\:2 sum/2(1 0 -1 rotate'\:)/x}       / by Attila Vrabecz
life:{(3=a)|x&4=a:2 sum/2 rotate'\:[1 0 -1]/x}          / by Attila Vrabecz
life:{3=a-x*4=a:2 sum/2(1 0 -1 rotate'\:)/x}            / by Attila Vrabecz
life:{0<x+0 1@4-2 sum/2(1 0 -1 rotate'\:)/x}            / by Swee Heng (me!)

They pretty much implement the algorithm shown in the YouTube video. The steps are as follow:

  1. Take the input grid X and generate 3 new grids: the 1st is X with each row rotated left by one position; the 2nd is X itself; and the 3rd is X with each row rotated right by one position. We illustrate the transformation of the grid positions using ASCII art
                     +-------+
                     | 12340 |
                     | 67895 |
                     | BCDEA |
      +-------+      +-------+
      | 01234 |      | 01234 |
      | 56789 | ---> | 56789 |
      | ABCDE |      | ABCDE |
      +-------+      +-------+
                     | 40123 |
                     | 95678 |
                     | EABCD |
                     +-------+
  2. For each of the 3 resulting grids, perform step 1 again but this time rotate the columns (instead of rows) up and down. In ASCII art:
                     +-------+      +-------+-------+-------+
                     | 12340 |      | 67895 | 12340 | BCDEA |
                     | 67895 | ---> | BCDEA | 67895 | 12340 |
                     | BCDEA |      | 12340 | BCDEA | 67895 |
      +-------+      +-------+      +-------+-------+-------+
      | 01234 |      | 01234 |      | 56789 | 01234 | ABCDE |
      | 56789 | ---> | 56789 | ---> | ABCDE | 56789 | 01234 |
      | ABCDE |      | ABCDE |      | 01234 | ABCDE | 56789 |
      +-------+      +-------+      +-------+-------+-------+
                     | 40123 |      | 95678 | 40123 | EABCD |
                     | 95678 | ---> | EABCD | 95678 | 40123 |
                     | EABCD |      | 40123 | EABCD | 95678 |
                     +-------+      +-------+-------+-------+
  3. Sum up all 9 grids. Each cells of the resulting grid contains the number of occupied cells in its immediate neighbourhood (including itself).

In Aaron’s code, Step 1 is implemented by “f:rotate’\:[1 0 -1]”, Step 2 is performed by the second call to “f” and Step 3 is done by the double-call to “sum”. Since “f” and “sum” are called twice iteratively, Attila rewrote them as “2 sum/2(1 0 -1 rotate’\:)/x”. He also reduced the test of survivorship (occupied cells with 2 or 3 neighbours) and birth (empty cells with 3 neighbours) to a single expression “3=a-x*4=a”. As for me, I found an equivalent expression (i.e. “0<x+0 1@4-“) for the test of survivorship and birth.

The above algorithm is not so efficient as it creates additional grids and adds them. Chris Burke provided an alternative algorithm using table lookup. While it is not a one-liner, it is much faster. Here it is verbatim:

t:0b vs ' til "i"$2 xexp 9
ctr:t[;27]
cnt:sum each t
trans:(cnt=3) or (ctr=1) and cnt=4
lifeinit:{
 r:count x;c:count first x;
 index::raze each raze 2 (1 0 -1 rotate'\:)/ (r,c) # til r*c;
 raze x}
life:{trans 2 sv x index}
liferun:{(life\) lifeinit x}

The insight here is that a cell and its 8 neighbours can be treated as a “neighbourhood array” of 9 bits. E.g. cell 7 in the ASCII art has the corresponding neighbourhood array (1, 2, 3, 6, 7, 8, B, C, D). Such 9-bits array corresponds (using “2 sv”) to a number from 0 to 511 inclusive. By pre-computing “trans” (the test of survivorship and birth) and “index” (the neighourhood array of each cell), the “life” step becomes a quick table lookup.

Faster solver for (1,3,4,6) Puzzle in k

In the previous post, an exhaustive evaluation of RPN expressions was used to solve the (1,3,4,6)-Puzzle. It wasn’t a good solution as the generation of those expressions was inefficient.

A better method might be to construct binary trees with the numbers as leaves and operators as roots of subtrees. Here is one such solution in k:

p:{y(,/{y,/:&~x in y}[!x]')/,()}       / permutation of y items out of !x
u:{{(,y),&~x in y}[!x]'p[x]y}          / append unselected items of p[x]y
r:{,/.[a;(!#a:y@u[#y]2;0);{y,x};]'x}   / add operators(x) to all head items
t:{,/(-1+#y)(,/r[x]')/,y}              / all binary trees (with repetition)
e:{$[0>@x;x;(*x). e'1_x]}              / evaluate each binary tree recursively
s:{?a@&z=e'a:t[x]y}                    / x:operators y:numbers z:target

Explanation:

  • p() generates permutations of y indices taken from !x.
  • u() appends the unused indices to each permutation generated by  p().
  • r() is slightly harder to explain. Effectively it takes 2 elements from a list, combines them with an operator and puts the result back to the head of the list.
  • t() uses r() to iteratively build up binary trees. Note that some trees are repeated.
  • e() is a recursive evaluator of each tree.
  • s() is the solver. Due to the repeated binary trees, it is necessary to call distinct (?) before returning the results.

This is much faster than the previous solution. But there probably is a more efficient method. Another post for another day. :-)

The code is here: 1346b_k

Posted in k&q. Tags: . Leave a Comment »

Solver for (1, 3, 4, 6) Puzzle in k

The puzzle is this: given the numbers (1,3,4, 6) and the binary operators (+, -, *, /) form the integer 24 using all the numbers (once each) and some of the operators (may be repeated). Parenthesis are allowed, of course. E.g. the expression 3*(1+4*6) is a valid attempt but clearly the wrong solution.

I’ve previously solved this in Python: 1346.py. The basic idea is to exhaustively construct Reverse Polish Notation (RPN) expressions using all the numbers and 3 operators (chosen with replacement). It is much shorter when implemented in k:

c:{x@y(,/{_[*y,0;!x],\:y}[#x]')/,()}
p:{?x(#x)(,/{y,/:&~x in y}[!#x]')/,()}
e:{$[0>@f:*x;x;f[*g;*h],1_h:e 1_g:e 1_x]}
t:{1e-8>abs x-y}
s:{a@&t[z;*:'b]&1=#:'b:e'a:,/p'y,/:c[x;-1+#y]}

The Python version actually ran faster in this case because the algorithm to generate all RPNs is more optimized in the Python version.

Explanation:

  • The function c() returns all possible y-tuples combinations of items in x, with replacement allowed. It is used to generate all possible 3-tuples of the 4 binary operators.
  • The function p() returns all distinct permutations of the items in x. Since x may have repeated items, we apply ? to make the result distinct.
  • The function e() is a recursive RPN evaluator which stores the result of its computation in the first element of the list returned. If the the entire expression is a valid RPN, the returned list will be a singleton.
  • The function t() takes care of floating point trickiness (24 is not exactly the same as 24f sometimes).
  • The function s() is the main solver. It starts by creating all ((#y)-1)-tuples combinations of the operators (x) and appending them to the numbers (y). The result is permuted to get a list of expressions and all permutations are then evaluated by e(). If the expression is a valid RPN (i.e. result is a singleton) and its evaluated value is the target (z), it would be selected.

The result for 24, 29 and -18 are as follows:

solution for 24
"(6%(1-(3%4)))"

solution for 29
"((3*(4+6))-1)"
"((3*(6+4))-1)"
"(((4+6)*3)-1)"
"(((6+4)*3)-1)"

solution for -18
"(6%(1-(4%3)))"

The code is here: 1346_k

Posted in k&q. Tags: , . Leave a Comment »