Sunday, April 07, 2019

Google Code Jam 2019 in Lisp (again) - Qualification Round

Like most years, I registered for Google Code Jam again, because it's fun. The qualification round is usually easy for me to survive, but presents an opportunity to de-rust the brain, get used to coding and interacting with the competition platform.
Google introduced a new platform last year with a big change: Previously you got input and submitted output for checking, now you submit code, and Google runs it against input to check whether it generates the correct results. I suspect that a major driver for this change was the desire to curb cheating. The new platform also allows "interactive" problems, where your code interacts (via stdout/stdin) with a "judge" process, which is cool.
One sad effect of last year's move to the new platform was that I could no longer use Lisp—only a couple programming languages were supported. Thus I mostly used C last year.
This year I thought it was the same, and coded the first two problems, Foregone Solution and You Can Go Your Own Way, in C. As I started the third, I somehow noticed that more programming languages are supported this year, including Lisp (SBCL) and Clojure. So I changed to Common Lisp for problems three and four, Cryptopangrams and Dat Bae.
After fighting a little with the way the platform runs Lisp programs (I hadn't even known about SBCL's --script option), I managed to submit working versions for all problems.
When the ratings came out, it turned out that I had gotten the two difficult problems right including the (hidden) large datasets, but for the two easy problems, my results for the large inputs were bad: Wrong Answer for Foregone Solution and Runtime Error for You Can Go Your Own Way.
For the first, I had used strncpy() with a limit of MAXDIGITS, and forgot that if there are actually MAXDIGITS characters to copy, then strncpy() won't copy the terminating NUL, so I had to write that NUL byte to the end of the string, which I had failed to do.
For the second, I didn't read the problem description well, and thought that the problem size was limited to a 1000x1000 maze, where in fact it was 50000x50000 for the large set. Because I statically assigned vectors based on the maze edge length, this caused overruns. Easy to fix by changing a #define from 1000 to 50000.
Both bugs would have been unlikely in Lisp, where strings don't have to be NUL-terminated and dynamic allocation of data structures is more natural than fixed allocation.
Anyway, I lost 1+10 points to these stupid bugs, which gave me an overall score of 89. Of the more than 35500 participants, only one other person and I had this particular score. My rank was 1147, which is somewhat encouraging. But I was lucky to find good and easy-to-program approaches for each of the problems, and that will become (much) harder in the coming rounds.
For Foregone Solution, I noticed quickly that xx4x44x can be trivially split into xx3x33x and 0010110. For You Can Go Your Own Way, it took me a bit longer to come up with the "complement" solution, i.e. when Lydia has walked SSESEESE, then I can simply walk EESESSES and be sure that we both arrive at the same destination (because the maze is square) and never make the same move. With Cryptopangrams, I quickly had the correct intuition that I should compute the GCD of adjacent cryptosymbols to extract the common prime, and hacked together a version that would solve the two simple sample cases. But as I repeatedly failed the smaller input sets, I noticed that there are complications when symbols are repeated. This was when I resorted to pencil and paper—it would probably have been a good idea to start this earlier!—, and it took me a while to turn this into code. But when I thought I had it right, the system still wouldn't accept my solutions! That's when I found out that I have to actually call the (solve) function at the end of my program—doh!
For Dat Bae, I implemented the interaction with the Python-written sample judge using a subprocess, which allowed me to run local tests. The easier set (F=10 guesses for broken bits in up to 1024 total bits) was solved by numbering all 1024 bits. It took me a break to find that I could easily bring the number of guesses down to F=5 due to the constraint that not more than 15 bits are broken: This allowed me to use a smaller pattern (e.g. 0...31) and repeat it, without introducing any ambiguities.