HAL9000

"It just isn't conceivable that you can design a program strong enough to beat players like me."

December 14, 2014

Strongest Android chess engine, again please?

You've got a handy Android mobile and you wanna know which engine is the strongest of all?

I answered this in an older post, stating that Stockfish DD was the king. But two newer Stockfish versions have been released since then, and Komodo 8 has arrived as an MP version to cause trouble too.

Still no Houdini and no Gull for Android on the horizon. Critter 1.6 is too old to compete with the updated fishes, and newcomers like Firenzina, Black Mamba, Texel and Senpai are still behind.

So there was one thing left for me to do. I decided to test SF5 vs Komodo 8 in a separate match, but I also wanted to push the latest Stockfish development version, 121014, onto the battlefield. A threesome then...

This 121014 build was bundled exclusively with Droidfish v1.55 and does not come from the official Stockfish site.

The results surprised me in two ways:
1) In contrast with what's happening on Windows, Komodo can't hold out against Stockfish on Android, at least without tablebases,
2) SF5 beats SF121014, which is totally illogical but true. Maybe the compiler matters, I donno.

The head-to-head matches were played at 15 sec/move on an Asus tablet with an Intel Z3745 quad-core CPU running at 1.86 GHz. All three engines ran as x86 compiles rather than ARM v7, as is usual on Intel hardware.

To be fair on openings, I used all 32 positions of the TCEC-6 superfinal, each played twice with colors reversed, for a total of 64 games per encounter.

Stockfish 121014 vs Komodo 8 v1.6:
37 - 27 (+22 -12 =30) gives 55 ELO gap
Stockfish 5 vs Komodo 8 v1.6:
39.5 - 24.5 (+24 -9 =31) gives 83 ELO gap
Stockfish 5 vs Stockfish 121014:
37 - 27 (+18 -8 =38) gives 55 ELO gap

Ok, 55 + 55 is not 83 but that's not the issue. The ranking is visible to the eye. The father fish beats its son, its son beats K8 and the father beats K8. Logical enuff.
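
For anyone who wants to redo the arithmetic, here is a minimal Python sketch of how a match result can be turned into an ELO gap. It assumes the usual logistic formula, gap = 400 * log10(score / (1 - score)); the function name `elo_gap` is only an illustration, and rating tools such as BayesElo or Elostat use more elaborate models, so treat it as a back-of-the-envelope check.

```python
import math

def elo_gap(wins, losses, draws):
    """Estimate the ELO gap implied by a match result (logistic model)."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games   # fraction of points scored
    if score <= 0.0 or score >= 1.0:
        raise ValueError("gap is unbounded for a 0% or 100% score")
    return 400.0 * math.log10(score / (1.0 - score))

# Stockfish 5 vs Stockfish 121014: +18 -8 =38
print(round(elo_gap(18, 8, 38)))   # prints 55
```

Because the formula is not linear in the score, gaps measured in separate matches do not have to add up, which is one reason 55 + 55 need not equal 83.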

In the Rapidroid event that I'm running under the same conditions on an Exynos 4412 CPU, the gap between SF5 and K8 is currently 56 ELO with BayesElo and 62 with Elostat.

Still not fully convinced, I filtered the SF5 vs K8 games from Rapidroid and found +13 -4 =13, which makes 107 ELO. Similar superiority.

It's clear now. Stockfish 5 is still the strongest engine available for Android. Uh, dot, finito, done, gone...

SSDF's rating list update of 11 Dec 2014

SSDF has just published an update to the world's only multiplatform computer rating list.

I noticed that popular top engines like Stockfish 5, Komodo 8, Houdini and Gull are still missing, but I repeat: what SSDF has been doing almost since the birth of computer chess is admirable work.

I simply disagree with the harsh criticism appearing in some forums and chatrooms.

To me, they deserve respect for maintaining a list which is:
> multiplatform, not only UCI engines but mobiles, DOS progs and tabletop machines,
> based on tournament time controls of 40 moves in 2 hours, which far outweighs bullet lists run on faster CPUs,
> using pondering by default thanks to connected computers, which is closer to real-life chess games.

Indeed, it's very difficult to finance such a big garage of PCs nowadays. Whether or not the money came from commercial software developers in the past, they still survive and still do the job for free, in days when nobody pays you for testing their programs.

Well enough said, here's SSDF Top-10:
## Name                                    elo   +  -  gam sco oppo
01 Komodo 7.0 MP x64 2GB Q6600 2,4 GHz     3295 41 36  476 78% 3073
02 Komodo 5.1 MP x64 2GB Q6600 2,4 GHz     3254 28 26  792 71% 3099
03 Deep Rybka 4 x64 2GB Q6600 2,4 GHz      3209 24 23 1048 73% 3038
04 Stockfish 3 MP x64 2GB Q6600 2,4 GHz    3207 25 23  958 70% 3060
05 Deep Hiarcs 14 2GB Q6600 2,4 GHz        3200 23 22 1020 68% 3071
06 Deep Rybka 3 x64 2GB Q6600 2,4 GHz      3194 22 21 1371 75% 3001
07 Naum 4.2 MP x64 2GB Q6600 2,4 GHz       3147 21 21 1083 61% 3071
08 Naum 4 x64 2GB Q6600 2,4 GHz            3119 21 20 1276 66% 3002
09 Deep Junior Yokoh x64 2GB Q6600 2,4 GHz 3119 30 30  530 55% 3082
10 Deep Junior 13.3 2GB x64 Q6600 2,4 GHz  3113 23 22  926 54% 3083

The full list can be downloaded from their page: http://ssdf.bosjo.net/list.htm

December 13, 2014

RAPIDROID: The unique Android chess engines rating list gets a big update


Wow, this one should rock indeed. It's been a while since the first release, which introduced rapid time controls and replaced the blitz ranking. Not a sound has come since the first two rounds placed Komodo on top, but the extended silence didn't mean the gears were not turning.

That makes two tortured devices working simultaneously at full speed for 3 months now, almost without interruption, to help build the only Android rapid chess rating list.

Working in a safe zone inside Android, staying away from Windows and from any competition, doesn't mean I can allow myself to give up the accuracy of the experiment. I do this not only to explore the strength of mobile chess programs but also to challenge the barriers of statistical science, in Don Quixote fashion! Who says I can't knock down a windmill?

True... in the beginning, my sceptical engineering mind pushed me hard toward an impossible mission: to obtain a reliable list with only 20 games per engine. I admit I simply lost that bet. No way!

The need for enough samples could not be avoided, despite my efforts to vary openings and opponents and to randomize things as much as possible to simulate a long run. Unfortunately, I had to extend the experiment to about 150 games per engine before the numbers started to say something. 100 games seems to be the minimum at which error margins fit within +/-60 ELO. After that, fluctuations stabilize significantly. If you look at the ELO trends by round shown in the image below, you can see what happens in the long run across a wide population of engines.

Nothing is clear before 100 games played.
Grrr! Why do they always keep moving up n down!?
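
To see why 20 games could never be enough, here is a rough Python sketch of how the error margin shrinks with the number of games. It is only a normal approximation under assumptions of my choosing (independent games, a 50% score, a 30% draw rate, a 95% interval); the name `elo_margin` and its defaults are illustrative, and BayesElo derives its own error bars from a different, Bayesian model.

```python
import math

def elo_margin(games, draw_ratio=0.30, score=0.5, z=1.96):
    """Approximate error margin (in ELO) after a given number of games."""
    win = score - draw_ratio / 2.0       # win probability
    loss = 1.0 - win - draw_ratio        # loss probability
    if win < 0.0 or loss < 0.0:
        raise ValueError("draw_ratio is too large for this score")
    # Variance of a single game's result (1, 0.5 or 0 points)
    var = (win * (1.0 - score) ** 2
           + draw_ratio * (0.5 - score) ** 2
           + loss * score ** 2)
    std_err = math.sqrt(var / games)     # standard error of the mean score
    # Slope of the logistic ELO curve at this score (ELO per unit of score)
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    return z * std_err * slope

for n in (20, 100, 150):
    print(n, round(elo_margin(n)))
```

Under these assumptions the margin comes out at roughly +/-127 ELO for 20 games, +/-57 for 100 and +/-47 for 150. Since it falls only with the square root of the game count, halving it again would take about four times as many games.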

This was another lesson learnt in statistics for me. As for the ranking, the rating spread has definitely shrunk compared to my previous 5 sec/move blitzoid list. I think that's quite reasonable, because more thinking time helps weaker engines resist stronger ones or, said differently, it makes life a little harder for the top engines.

I've also found that a bigger gap between an engine's blitz and rapid results often points to technical problems or bugs. For instance, IvanHoe performed worse in rapid, and after deeper analysis it became obvious that this engine can't use any hash memory, probably due to a bad compile. An engine needs more help from hash tables when it spends more time per move, but here none is used, so logically the performance goes down.

My final comment is about Komodo 8, which simply disappointed. I'm sure it can't be "statistical noise" (Oh! What a popular term nowadays!) anymore. Android looks different from Windows here. My guess is that it's somehow linked to how the engine binary is compiled. Komodo looks strong enough to threaten the crown of Stockfish in TCEC at present, on 16 cores of a double-Xeon monster PC, but here in the modest Android 32-bit arena, Stockfish 5, the already outdated May-2014 code, still clearly rules over all other engines.

If you ask me which is the strongest Android engine today, the confident answer is Stockfish 5!

I must also state that even the development version of 12-Oct-14, delivered with Droidfish 1.55, plays weaker than Stockfish 5 (details to come soon). On Windows, a 20-25 ELO increase over SF5 is confirmed. However, the Android side shows the opposite picture, maybe due to the different compiling tools used.

Now, time to stop the blah blah and let the list talk. You will notice that this time the number of cores and the operating system are included. Could that be an early warning about intruders from other operating systems? A wink of the eye from me here...

BAYES ELO RATINGS BASED ON 4036 GAMES BY 56 PROGRAMS
## Name                   c O/S   elo  +  -  gam sco oppo drw
01 Stockfish 5            4 And32 3139 51 49 142 76% 2959 36%
02 Komodo 8               4 And32 3083 48 46 142 67% 2971 44%
03 Critter 1.6a           4 And32 3004 45 45 142 55% 2972 50%
04 Firenzina 2.4.1        4 And32 2982 45 45 142 48% 2992 48%
05 BlackMamba 2.0         4 And32 2906 47 47 148 56% 2857 43%
06 RobboLito 0.085e4l     1 And32 2855 49 49 148 51% 2834 32%
07 Senpai 1.0             4 And32 2854 50 49 142 57% 2802 35%
08 Komodo32 3 AB          1 And32 2845 48 48 144 50% 2849 42%
09 Texel 1.05a8           1 And32 2794 49 49 148 56% 2745 28%
10 Gaviota v1.0-d         4 And32 2763 47 48 144 47% 2792 38%
11 Toga II 3.0            1 And32 2682 47 47 146 49% 2683 37%
12 Arasan 15.2 JA         4 And32 2664 47 47 152 53% 2646 34%
13 Deuterium v14.3.34.130 1 And32 2639 45 45 168 52% 2622 30%
14 DiscoCheck 4.3         1 And32 2623 47 47 156 47% 2649 28%
15 GNU Chess 5.50-32      1 And32 2613 47 47 148 51% 2610 36%
16 IvanHoe 9.46b          4 And32 2609 47 47 156 49% 2612 33%
17 Rhetoric 1.4.1         1 And32 2571 47 47 148 48% 2584 34%
18 RedQueen 1.1.3 TCEC JA 4 And32 2524 49 50 144 43% 2578 27%
19 Crafty_23.4.JA         1 And32 2508 49 49 144 48% 2520 25%
20 Rodent 1.00            1 And32 2488 49 48 148 51% 2472 26%
21 Alfil 12.10            1 And32 2485 49 48 140 51% 2474 29%
22 Daydreamer 1.75 JA     1 And32 2467 48 48 148 44% 2513 28%
23 Rotor 0.7a             1 And32 2448 48 48 140 50% 2450 31%
24 cheng3 1.07 JA         1 And32 2422 49 49 146 53% 2399 25%
25 GarboChess 3           1 And32 2393 48 48 144 46% 2426 28%
26 DanasahZ_0.4.JA_xb     1 And32 2392 50 50 144 48% 2405 26%
27 Sloppy_0.23.JA_xb      1 And32 2386 48 48 146 52% 2369 33%
28 Scorpio_2.7.JA_xb      1 And32 2381 48 48 146 51% 2374 25%
29 GNU Chess 6.0.2        1 And32 2373 49 48 144 55% 2338 24%
30 Tucano_1.04.AB_xb      1 And32 2324 51 51 142 54% 2297 17%
31 Pepito v1.59           1 And32 2295 47 47 150 48% 2317 29%
32 BetsabeII_1.30.JA_xb   1 And32 2289 51 50 150 57% 2234 15%
33 GreKo_9.0.JA_uci       1 And32 2286 47 48 150 47% 2308 28%
34 Typhoon_1.0.r358.JA_xb 1 And32 2275 48 49 146 49% 2281 26%
35 Diablo 0.5.1b JA       1 And32 2252 50 50 144 51% 2243 18%
36 Sungorus 1.4 JA        1 And32 2203 50 51 146 43% 2256 21%
37 Phalanx_XXIII.JA_xb    1 And32 2195 53 52 144 57% 2131 13%
38 Olithink_5.3.2.JA_xb   1 And32 2170 52 53 144 51% 2139 20%
39 TJchess 1.1U           1 And32 2134 50 51 140 46% 2158 21%
40 Natwarlal_0.14.JA_xb   1 And32 2133 52 52 142 52% 2106 19%
41 Myrddin_0.86.JA_xb     1 And32 2110 50 51 144 46% 2142 21%
42 Jazz 6.40 JA           1 And32 2091 50 51 144 48% 2107 23%
43 Scidlet_2.61b2.JA_xb   1 And32 2067 53 53 140 55% 2013 18%
44 KmtChess_1.21.JA_xb    1 And32 2049 52 52 140 50% 2044 21%
45 AdroitChess0.4 JA      1 And32 1952 54 54 138 52% 1926 17%
46 Sjeng_1.12.JA_xb       1 And32 1877 57 58 138 47% 1899 12%
47 BikJump v1.8           1 And32 1860 55 55 138 53% 1812 20%
48 ZCT-0.3.2500           1 And32 1780 60 61 138 51% 1753  9%
49 Leonidas_r83.JA_xb     1 And32 1771 57 57 138 56% 1696 14%
50 Sjaak_4.68.JA_xb       1 And32 1735 58 58 138 54% 1669 13%
51 Zzzzzz_3.5.1.JA_xb     1 And32 1566 58 58 138 51% 1569 22%
52 Tscp_1.8.1.AB_xb       1 And32 1536 60 60 138 45% 1577 12%
53 Rocinante 2.0 JA       1 And32 1518 62 62 138 49% 1535  7%
54 VIRUTOR CHESS 1.1.1    1 And32 1372 59 60 138 42% 1447  9%
55 Chess for Android      1 And32 1283 58 61 138 33% 1448 13%
56 Simplex 0.9.8          1 And32 1062 73 85 138 11% 1496  5%


Rapidroid test platform specification:
* Samsung Galaxy Note II @ 1.6 GHz x 4 cores + 256MB hash for SP & MP Android programs,
* Polypad 1010IPS tablet @ 1.6 GHz x 2 cores + 128MB hash for SP Android programs,
* HTC Diamond @ 528 MHz to be used for Windows Mobile programs, with 16MB hash size,
* i7 M620 @ 2.67 GHz + Arena 3.5 + 2GB hash tables for Windows X64 programs,
* iPod Touch 64G @ 600 MHz to be used for iOS programs,
* DOSBox 0.74 used to run DOS programs,
* WinVICE used to run Commodore-64 programs,
* Messtiny UCI adapters or CB-Emu2014 used to emulate Mephisto programs,
* Own books disabled and replaced by 20 ply openings taken from Adam Hair's 10 move book, whenever possible.
* Openings selected for max variety: queens on board, no check or capture at last ply, preferably rated between +0.15 and +0.39 by Stockfish and Komodo,
* Opening positions played twice with different colors, whenever possible,
* Repeating openings and twin games avoided between two programs,
* Tablebases and pondering off,
* Time control: 15 to 30 sec/move or closest possible, identical for both programs.
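
The opening rules in the list above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Opening` record, its field names and the `is_suitable` and `schedule` helpers are hypothetical, the engine evaluations are taken as already computed, and the "preferably" in the evaluation range is treated as a hard filter for simplicity.

```python
from dataclasses import dataclass

@dataclass
class Opening:
    fen: str                 # position reached at the end of the book line
    eval_stockfish: float    # evaluation in pawns (assumed precomputed)
    eval_komodo: float       # evaluation in pawns (assumed precomputed)
    last_ply_capture: bool   # True if the last book move was a capture
    last_ply_check: bool     # True if the last book move gave check

def queens_on_board(fen):
    placement = fen.split()[0]   # first FEN field: piece placement
    return "Q" in placement and "q" in placement

def is_suitable(op, lo=0.15, hi=0.39):
    """Apply the selection rules: queens on, quiet last ply, evals in range."""
    evals_ok = lo <= op.eval_stockfish <= hi and lo <= op.eval_komodo <= hi
    return (queens_on_board(op.fen) and evals_ok
            and not op.last_ply_capture and not op.last_ply_check)

def schedule(openings, engine_a, engine_b):
    """Play each opening twice with colors swapped: (white, black, fen)."""
    games = []
    for op in openings:
        games.append((engine_a, engine_b, op.fen))
        games.append((engine_b, engine_a, op.fen))
    return games
```

With N suitable openings, `schedule` returns 2N games for a pairing, so neither engine gets the better side of an opening more often than the other.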

Deuterium 14.3.34.130 Android: 2633 ELO after 168 games!

Here are the results obtained by the most recent Android chess engine, Deuterium. Close to Arasan and Toga, the ELO outcome is in line with PC engine rankings, bearing in mind the gap between Android and reference Windows hardware, usually 120 to 180 ELO. We may conclude the code was ported very well, without loss, and that deserves congratulations.
By the way, I hope this promising engine adopts multiprocessing too. It's not included yet, not even under Windows.
The gauntlet was conducted according to the specs of my RAPIDROID ranking. Thus, Deuterium joins the competition starting from round 15, to be released soon with an update.


December 2, 2014

New Android engine: Deuterium 14.3.34.130

Ferdinand Mosca, the author of the Deuterium chess engine, kindly included an Android version of his engine in the latest release, numbered 14.3.34.130. It's really fun to meet another new engine on the battlefield.

Deuterium Android is a 32-bit compile like all others available. The interesting point is that this one is already verified to work without problem under Android Lollipop 5.0.

For those who plan to upgrade their devices and, with them, their Android versions: be aware that Lollipop, the first 64-bit Android release, brings deep changes. Beyond the move to 64 bits, various internal changes (e.g. security policies) will probably make existing 32-bit chess engine binaries fail to work.

At present I'm happy with my portable chess environment and I'm not really willing to move forward yet. With Lollipop, I'd say caution and patience first.

Regarding Deuterium, it's rated about 2800 ELO on popular rankings. It's not an MP engine, so on our Android devices it will use only one core as well. After first quick tests, I expect it to perform at 2600-2650 on a Galaxy Note II at 1.6 GHz.
A 180-second test shows 139 Knps, but it doesn't look reliable as a benchmark because the reported nodes fluctuate from one depth to the next.

Those who want to download it may visit the homepage: HERE
In case of access problem, a mirror copy is uploaded to my engine collection repository: HERE
The complete archive folder with many other engines is: HERE