Thursday, June 10, 2021

AI/ML Hype: If even Google doesn't get it right, then...?

 I'm a happy and frequent user of Google Photos, which I use to back up, share, and edit the many photos that I take on my smartphone.

This morning, the mobile app greeted me with a new feature: It suggested that six of my photos were incorrectly oriented, and offered me to fix that by rotating them.

I checked, and out of those six photos, five were totally fine as they were. (I'm a bit pedantic and usually fix any wrong orientation quickly by hand, which Google Photos makes quite easy.)

The sixth was indeed misoriented, but the rotation suggested by the tool was also wrong.

First I laughed. If Google—of all people!—sends me twelve predictions via a highly popular app, and gets eleven of those wrong, how can anybody really expect that Artificial Intelligence/Machine Learning will solve real problems and eventually pay back all the investments being made in it? And note that the problem class here is basically image classification, which is one of the few narrow domains where ML has been particularly effective.

And then I dutifully corrected Google's mispredictions by rotating the proposed images until they were correct (correct again, in five of the six cases). So maybe if a few million other nice/gullible people help poor Google out, then maybe one day, it will actually become helpful even to pedants like me.

This suddenly recalled a question Prof. Mireille Hildebrandt asked at a large EU event last week ("Leading the Digital Decade", video):

What is a digitally skilled population? Is it a population capable of using AI and other digital systems? That's the usual way to talk about citizens: users. Or is it, perhaps, a population that is open to be used by these systems as data engines?

Sunday, May 03, 2020

Google Code Jam 2020 in Lisp—Round 1C

Well, that was slightly embarrassing and frustrating...

After a relatively promising attempt at round 1B, I entered round 1C yesterday as the last chance to advance to round 2. Again, three nice problems, with the third one looking too hard for me at first sight, at least for the general solution. My "plan" after reading the problem descriptions was roughly: Solve problems 1 and 2 and, time permitting, submit a solution for the easy first input set of problem 3. That would have been a good plan, except I took a bit too long to code problem 1, and I got completely stuck trying to debug my Lisp solution for problem 2.

Problem 1: Overexcited Fan

The problem turned out to be even easier than I first thought. I should have spent more time designing the algorithm before starting to code it. I had an OK solution, but it was more complex than necessary and took me 53 minutes. That's much longer than it should have, given the overall time limit of 2 hours 30 for the three tasks. After the round, I simplified this to a more elegant solution.

Problem 2: Overrandomized

Here I had the correct inspiration to work with letter frequencies, in particular first-digit letter frequencies. Unfortunately my Lisp solution would always result in "RE" (runtime error) when submitted, even though it ran fine both against the sample data set and against some additional data sets that I have generated. At this point I ran out of time.
After the round, I tried for some time to get my Lisp program working, but without any possibility to get diagnostic output, I didn't find where the problem lied. I even started from scratch using a different implementation technique, but the result was the same: My program would run fine against the sample and my self-generated tests, but throw "RE" when faced with the competition/training data. Very frustrating. If anyone finds the error(s), please let me know!
At one point I reimplemented my solution in C++, and it successfully solved all data sets as soon as I got it to compile. I then "backported" it to C, which simplified it further (in the C++ version I used both "traditional" arrays and "library" vectors—for sorting—which makes that version quite ugly.

Problem 3: Oversized Pancake Choppers

After reading the problem description, I assumed (probably correctly) that a full solution would beyond my capabilities for solving, at least under this time pressure. But the easy set seemed quite tractable: If there are only 2 or 3 diners, then the set of solution possibilities is quite bounded. Indeed it took me only (competitive coders can laugh now) about 15 minutes to write a stupid solution that worked for this limited case. But that's academic because I had run out of time in problem 2, so this happened after the round was over.

Conclusion

With only the first and easiest problem solved in time, I ranked in the 8000s, much below my rank in 1B—and hopelessly below the cut-off at 1500.
In problem 1 and (my constrained/stupid solution for) problem 3 I had no problems with Common Lisp as a coding language. But I failed to get my problem 2 Lisp solution working, while later attempts in C++ and C worked right away (and showed that I had analyzed the problem correctly).

Sunday, April 19, 2020

Google Code Jam 2020 in Lisp—Round 1B

Although I had quite a bit of trouble surviving this year's qualification round, I was allowed to participate in online round 1, which, as usual, is held as three different sub-rounds to accommodate participants from various time zones.  The first (1A) was in the middle of the night in my time zone, so I skipped it and didn't even get around to reading the problems.  But today I attempted  round 1B.  The problems were again very cute and interesting, and while I struggled with the limited time, I ended up being quite satisfied, even though I missed qualification this time.

Problem 1: Expogo

I wrote a relatively simple implementation using search with memoization (caching) and a bit of pruning of implausible search regions, and that was sufficient for all three data sets.  According to the after-round summary, there's also a constructive solution.  I had suspected so, but couldn't find it.

Problem 2: Blindfolded Bullseye

This is a nice search problem that involves some geometry.  I think I started with a reasonable search strategy, but then ran out of time when it came to doing the geometry... So I gave up trying to find a "good" solution, and coded the trivial brute-force solution for solving the easiest data set.  That was worth only 3 points, but those 3 points improved my final ranking by more than 1'000.

Problem 3: Join the Ranks

This didn't look too impossible, but given the high amounts of points awarded for this, I suspected that this would be beyond my skills and decided to focus on the other two problems.

Conclusion

With a full solution for problem 1, a half(?)-done good solution for problem 2, and the few extra points for the trivial data set of problem 2, I finally ranked #1996 [edited for final result].  As only the first 1500 contestants qualify for round 2, this was not sufficient.  But I'm quite happy with how it went, and am looking forward to round 1C, about two weeks from now.  And unlike during the qualification round, I didn't run into any issues with coding in Lisp—even though I haven't been using it professionally for several years now.

Tuesday, April 07, 2020

Google Code Jam 2020 in Lisp—Qualification Round

This year I registered for Google Code Jam again. The new platform supports Common Lisp (SBCL) again, so I started writing my solutions for the Qualification Round in SBCL. Although I managed to qualify for Online Round 1, it didn't work as well as I'd hoped.
The Qualification Round this year had five problems, which took me a while to notice—I had to scroll horizontally to see the fifth problem. I attempted all five, but only managed to write working solutions for three of them.

Problem 1: Vestigium

That was an easy one, but for some reason I didn't manage to code it correctly in Common Lisp. Even though my code looked totally straightforward, it only produced an RE (Runtime Error) when run against the data set. Finally I rewrote the solution in C++, and that worked on first try.

Problem 2: Nesting Depth

The task was to generate appropriate numbers of correctly nested parentheses. Obviously this type of problem is familiar to Lispers, and this time my straightforward code actually worked. My sloppiness resulted in an invalid attempt, as I initially left out the (solve) call at the end of the script.

Problem 3: Parenting Partnering Returns

It occurred to me relatively early that the problem maps to the problem of partitioning (two-coloring) the interval graph. However, all my attempts of coding this simple graph traversal algorithm in Common Lisp ended in REs (Runtime Exceptions) again. Very frustrating!

Problem 4: ESAb ATAd

This was my favorite problem, and I came up with a simple and near-optimal solution pretty quickly. It would have worked the first time, had I not (again) forgotten to add the correct (solve) call after the development phase.

Problem 5: Indicium

That one was definitely too hard for me! After the deadline, I read the analysis and took it up again. I managed to code a super-efficient method to generate latin squares with arbitrary possible traces at will, based on "circulating" matrices. The only cases that I cannot solve in this way are odd-sized with traces of is n+2 or n2-2.

Conclusion

Finally I advanced with 49 points (from 100, with 30 required), and ranked #6318 of about 44000 participants. My Lisp development and graph algorithm skills have become a bit rusty, and I'm not very optimistic about being able to survive Online Round 1.

Monday, July 22, 2019

Compiling Emacs on Mac OS against X11 libraries

Motivation

My place of work has been using Apple as the main, and now basically the only, supported platform for office use. As a long-time (GNU) Emacs user and occasional contributor, I usually compile GNU Emacs from the development (master) sources, which are now thankfully available over Git.
I already feel bad for using a proprietary OS rather than something based on GNU/Linux or another free or at least open source system. For this and other reasons, I try to at least use system libraries from the free/open source community rather than Apple's, for example for the GUI.  So rather than Apple's Carbon UI, I build Emacs against X11 libraries. Previously I could do this with the Gtk toolkit, but the maintainers of the Mac OS port of Gtk I was using stopped supporting its X11 drivers. So I reverted to the older "Lucid" toolkit with X11. (Yes, maybe I'm weird to cling to the X Window System in spite of the obstacles... that's probably because X was an important part of my socialization to Free Software and distributed systems.)

Howto

The prerequisites are
  • a bunch of packages from Homebrew (not all of which may be relevant):
    • autoconf autogen automake cairo d-bus dbus fontconfig freetype gcc gdk-pixbuf gettext git glib gmp gnutls gobject-introspection graphite2 harfbuzz imagemagick intltool ispell jansson jpeg json-glib libffi libpng librsvg libtasn1 libtiff libtool libunistring libxml2 little-cms2 netpbm nettle openjpeg openshift-cli osinfo-db osinfo-db-tools pcre pcre2 pkg-config readline shared-mime-info texinfo zlib
  • an X11 (XQuartz) installation with development headers
  • The Xcode development environment with its CLI tools installed.
I clone Emacs into /var/tmp/emacs/emacs according to the instructions on the Emacs from Git page of the Emacs Wiki. Then I mkdir /var/tmp/emacs/gbuild, cd there, and run
../emacs/configure --verbose \
  --with-x --with-x-toolkit=lucid --with-ns=no \
  --without-makeinfo \
  LIBXML2_CFLAGS=-I/usr/local/opt/libxml2/include/libxml2 \
  LIBXML2_LIBS='-L/usr/local/opt/libxml2/lib -lxml2' \
  --with-jpeg=no --with-gif=no --with-tiff=no \
  --x-libraries=/usr/local/opt/freetype/lib:/usr/X11/lib \
  --x-includes=/usr/local/opt/freetype/include:/usr/X11/include \
  --with-xpm=no \
  PKG_CONFIG_PATH=/usr/local/lib/pkgconfig:/usr/X11/lib/pkgconfig
This gives me the following configuration:
Configured for 'x86_64-apple-darwin18.6.0'.

  Where should the build process find the source code?    ../emacs
  What compiler should emacs be built with?               gcc -g3 -O2
  Should Emacs use the GNU version of malloc?             no
    (The GNU allocators don't work with this system configuration.)
  Should Emacs use a relocating allocator for buffers?    no
  Should Emacs use mmap(2) for buffer allocation?         no
  What window system should Emacs use?                    x11
  What toolkit should Emacs use?                          LUCID
  Where do we find X Windows header files?                /usr/local/opt/freetype/include:/usr/X11/include
  Where do we find X Windows libraries?                   /usr/local/opt/freetype/lib:/usr/X11/lib
  Does Emacs use -lXaw3d?                                 yes
  Does Emacs use -lXpm?                                   no
  Does Emacs use -ljpeg?                                  no
  Does Emacs use -ltiff?                                  no
  Does Emacs use a gif library?                           no
  Does Emacs use a png library?                           yes -L/usr/local/Cellar/libpng/1.6.37/lib -lpng16 -lz
  Does Emacs use -lrsvg-2?                                no
  Does Emacs use cairo?                                   no
  Does Emacs use -llcms2?                                 yes
  Does Emacs use imagemagick?                             no
  Does Emacs support sound?                               no
  Does Emacs use -lgpm?                                   no
  Does Emacs use -ldbus?                                  yes
  Does Emacs use -lgconf?                                 no
  Does Emacs use GSettings?                               no
  Does Emacs use a file notification library?             yes (kqueue)
  Does Emacs use access control lists?                    yes
  Does Emacs use -lselinux?                               no
  Does Emacs use -lgnutls?                                yes
  Does Emacs use -lxml2?                                  yes
  Does Emacs use -lfreetype?                              yes
  Does Emacs use HarfBuzz?                                yes
  Does Emacs use -lm17n-flt?                              no
  Does Emacs use -lotf?                                   no
  Does Emacs use -lxft?                                   yes
  Does Emacs use -lsystemd?                               no
  Does Emacs use -ljansson?                               yes
  Does Emacs use -lgmp?                                   yes
  Does Emacs directly use zlib?                           yes
  Does Emacs have dynamic modules support?                no
  Does Emacs use toolkit scroll bars?                     yes
  Does Emacs support Xwidgets (requires gtk3)?            no
  Does Emacs have threading support in lisp?              yes
  Does Emacs support the portable dumper?                 yes
  Does Emacs support legacy unexec dumping?               no
  Which dumping strategy does Emacs use?                  pdumper
Then I can build Emacs using
make bootstrap
This produces an intermediate "temacs" binary referencing many shared objects:
: 1leinen@macsl[leinen]; objdump -macho -dylibs-used /var/tmp/emacs/gbuild/src/temacs
/var/tmp/emacs/gbuild/src/temacs:
 /opt/X11/lib/libpng16.16.dylib (compatibility version 43.0.0, current version 43.0.0) /usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11) /opt/X11/lib/libXaw3d.8.dylib (compatibility version 9.0.0, current version 9.0.0) /opt/X11/lib/libXmu.6.dylib (compatibility version 9.0.0, current version 9.0.0) /opt/X11/lib/libXt.6.dylib (compatibility version 7.0.0, current version 7.0.0) /opt/X11/lib/libSM.6.dylib (compatibility version 7.0.0, current version 7.1.0) /opt/X11/lib/libICE.6.dylib (compatibility version 10.0.0, current version 10.0.0) /opt/X11/lib/libXext.6.dylib (compatibility version 11.0.0, current version 11.0.0) /opt/X11/lib/libX11.6.dylib (compatibility version 10.0.0, current version 10.0.0) /opt/X11/lib/libX11-xcb.1.dylib (compatibility version 2.0.0, current version 2.0.0) /opt/X11/lib/libxcb.1.dylib (compatibility version 3.0.0, current version 3.0.0) /opt/X11/lib/libXft.2.dylib (compatibility version 6.0.0, current version 6.2.0) /opt/X11/lib/libXrender.1.dylib (compatibility version 5.0.0, current version 5.0.0) /usr/local/opt/dbus/lib/libdbus-1.3.dylib (compatibility version 23.0.0, current version 23.11.0) /opt/X11/lib/libXrandr.2.dylib (compatibility version 5.0.0, current version 5.0.0) /opt/X11/lib/libXinerama.1.dylib (compatibility version 2.0.0, current version 2.0.0) /opt/X11/lib/libXfixes.3.dylib (compatibility version 5.0.0, current version 5.0.0) /usr/local/opt/libxml2/lib/libxml2.2.dylib (compatibility version 12.0.0, current version 12.9.0) /usr/lib/libncurses.5.4.dylib (compatibility version 5.4.0, current version 5.4.0) /usr/local/opt/freetype/lib/libfreetype.6.dylib (compatibility version 24.0.0, current version 24.1.0) /opt/X11/lib/libfontconfig.1.dylib (compatibility version 11.0.0, current version 11.2.0) /usr/local/opt/harfbuzz/lib/libharfbuzz.0.dylib (compatibility version 20504.0.0, current version 20504.0.0) /usr/local/opt/gnutls/lib/libgnutls.30.dylib (compatibility version 55.0.0, current version 55.0.0) /usr/local/opt/little-cms2/lib/liblcms2.2.dylib (compatibility version 3.0.0, current version 3.8.0) /usr/local/opt/jansson/lib/libjansson.4.dylib (compatibility version 16.0.0, current version 16.1.0) /usr/local/opt/gmp/lib/libgmp.10.dylib (compatibility version 14.0.0, current version 14.2.0) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.250.1)
...and an src/emacs that hopefully doesn't crash. Oops... why am I saying that?
Well, yesterday I used a simpler method that didn't specify /usr/local/opt/freetype explicitly anywhere, but relied on autoconf/pkgconfig to find it. Unfortunately this created a "DLL hell" situation where an older version of the freetype dynamic library from /opt/X11/lib was used instead of the newer on in /usr/local/opt/freetype/lib, leading to crashes when Emacs tried to use certain fonts. Thanks to YAMAMOTO Mitsuharu for helping me find the error in my old build process!