November 8, 2015

RAPIDROID RELOADED: October-2015

Another delayed release, to be blamed on the lack of time during weekdays.

The highlights are:
* ADDED Galjoen 0.30.2 as new Android engine
* ADDED Sayuri 2015.10.02 as new Android engine
* ADDED Mephisto Roma 68020 as new Windows engine
* ADDED Mephisto Roma Turbo as new Windows engine
* UPDATED Bitfoot 150707 to 150922
* UPDATED Clubfoot r2293 to 150907
* UPDATED Chenard 2014.07.11 to 2015.08.15
* UPDATED Gunborg 1.0 to 1.39
* CALIBRATED all engines according to CCRL 40/4
* EXCLUDED GreKo 13.1: Often hangs on move
* EXCLUDED Laser 0.2: Stays in RAM and hogs CPU
* EXCLUDED Mchess2: Doesn't start
* EXCLUDED Protector 1.9: Often hangs on move

ON A HOT ROOF:
Komodo 9.2 still leads. The error margins are wide, but the superiority over Stockfish 6 looks permanent. That's no surprise, as the major rating lists show the same on Windows. We're just witnessing that Android is no exception. I expect both Komodo and Stockfish to be updated just after TCEC-8, and maybe Rapidroid will include them in the December release.

CALIBRATION HEADACHE:
I've put a lot of effort into establishing a reliable correlation formula between CCRL and Rapidroid before this release, without success. Grrr!

Unfortunately, all of the methods below fail to cover the inconsistencies:
* Use only single thread 32 bit CCRL engines as anchors, both UCI and XB or only UCI
* Use only single thread 64 bit CCRL engines as anchors, both UCI and XB or only UCI
* Use all 32 bit CCRL engines as anchors, both UCI and XB or only UCI

I'm no expert in statistics at all, but I can understand why a perfect method can't be found:

1) 40/4 is a lower priority for CCRL than 40/40, and many Android engines are missing from it. However, Rapidroid must be calibrated against CCRL 40/4 and not 40/40, given the CPU speeds and time controls.

2) Exynos 4412 is about 1.9 times slower than CCRL's Athlon X2 4600+ according to the single-threaded integer test of Geekbench, the only multiplatform CPU benchmark that represents chess performance, because computer chess is mostly integer math and not like floating point at all! At least, cross-checking the results indirectly through different PC CPUs showed that the Exynos-Athlon ratio is not far from realistic. Some error remains there, though...

3) I've estimated Exynos to be 54 ELO weaker than Athlon, assuming that each NPS doubling is worth 60 ELO. However, we can't be sure whether it's 50, 60 or 70 without deep testing. Since I don't have time for that, the doubling estimate remains a mystery and a potential source of error.

4) There's still a distribution difference between Rapidroid and CCRL. We need to add many more samples to Rapidroid to see whether the whole list will shrink further or not. I believe my ranking is slightly inflated compared to reality for the moment. Thus the toppers, Komodo and Stockfish, may be slightly overrated.
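
The arithmetic behind points 2 and 3 can be sketched in a few lines of Python. This is only an illustration, assuming the usual rule of thumb that the rating gap grows with the base-2 logarithm of the speed ratio; the function name elo_gap is made up for this example. With a ratio of exactly 1.9 and 60 ELO per doubling the result lands within a couple of points of the 54 quoted above, and the small difference presumably comes from rounding of the measured ratio.

```python
import math


def elo_gap(speed_ratio, elo_per_doubling):
    """Estimated rating loss for hardware that is `speed_ratio` times slower.

    Assumption: every doubling of speed is worth a fixed number of ELO points,
    so the gap is elo_per_doubling * log2(speed_ratio).
    """
    return elo_per_doubling * math.log2(speed_ratio)


# Exynos 4412 vs Athlon X2 4600+: about 1.9 times slower (figure from the post).
# Try the three candidate values for the worth of one doubling.
for per_doubling in (50, 60, 70):
    gap = elo_gap(1.9, per_doubling)
    print(f"{per_doubling} ELO per doubling -> gap of about {gap:.1f} ELO")
```

Moving the doubling value from 50 to 70 shifts the estimated gap by close to 20 ELO, which shows how much this single unknown can contribute to the calibration error.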

There are indeed other sources of deviation, like multi-thread scaling differences, but let's not go into deep detail and come to the preliminary solution I've introduced. Bearing in mind that the error can't be completely avoided, I've decided to distribute it as widely as possible by using as many anchors as possible, both 32 bit and 64 bit, and estimating an additional gap from 32 to 64 bit. XB engines are left out, given that their implementation in Chess for Android is not at the same level as UCI.

Finally, the calibration of the ranking is now based on 39 UCI engines selected from CCRL 40/4. Rapidroid tries to place all Android engines 65 ELO lower than their 32 bit CCRL counterparts.

For 64 bit, the target is set to 110 ELO below CCRL. Simple and unproven! Arithmetically, the gaps would be 54 and 99, but a safety margin must be preserved.

I now estimate the calibration error at about +/- 30 ELO.
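
The targets above can be put together in a short Python sketch. It is not the actual calibration procedure (which fits the whole list against the 39 anchors); it only restates the quoted numbers. The names ARITHMETIC_GAP, TARGET_GAP and rapidroid_target are invented for the example, and the CCRL rating of 3000 is a made-up input, not a real engine.

```python
# Gaps below the CCRL 40/4 rating, in ELO, as quoted in the post.
ARITHMETIC_GAP = {"32bit": 54, "64bit": 99}   # plain arithmetic estimate
TARGET_GAP = {"32bit": 65, "64bit": 110}      # targets actually used


def rapidroid_target(ccrl_rating, build):
    """Rating an Android engine is expected to get, given its CCRL counterpart."""
    return ccrl_rating - TARGET_GAP[build]


for build in ("32bit", "64bit"):
    margin = TARGET_GAP[build] - ARITHMETIC_GAP[build]
    # 3000 is a hypothetical CCRL rating used only for illustration.
    print(f"{build}: safety margin {margin} ELO, "
          f"CCRL 3000 -> Rapidroid target {rapidroid_target(3000, build)}")

# Extra gap assumed between 32 bit and 64 bit CCRL builds.
print("32 to 64 bit gap:", TARGET_GAP["64bit"] - TARGET_GAP["32bit"], "ELO")
```

Read this way, the safety margin is 11 ELO in both cases and the assumed gap between 32 bit and 64 bit builds is 45 ELO.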

MEPHISTO SIMULATION DISASTER:
I don't own one of those Mephisto table tops from the 1980s, but I always dreamed of buying one when I was a child. Some dreams don't come true, and that one never did. Fortunately, decades later, we have simulation software like MESSUI which runs many older chess computers on PCs. They are really back!

More credit must be given to the UCI compatible versions, which can play automated games via the Arena GUI. Besides, they can play against Android engines via the remote network connection I've been using to rate Rybka and The King lately.

Therefore, two Mephistos are welcome in Rapidroid. I had high hopes indeed, but Roma failed to reach the expected level of 1930 at original speed. With the full speed of an i7 processor that didn't change much: only 1800 something. A great disappointment so far.

Should we blame the low performance on the slow network interface (wasn't it the same for Rybka anyway?) or the simulation interface MESSTINY managing things through the WB2UCI soft adapter?

Do the conversions cause a noticeable slowdown?

If none of these is the cause, Roma UCI must really be that weak in bookless play, starting from 20 ply openings.

There are other oldies like Mephisto Gideon Pro or Rebel working the same way as Mephisto Roma. If I can find spare time, I will test them too.

BAYES RATINGS AFTER 14303 GAMES PLAYED BY 125 PROGRAMS
Rnk Name                     O/S T  Elo   +   - gam sco oppo dra
001 Komodo 9.2               A32 4 3339  44  41 254 82% 3080 27%
002 Stockfish 6              A32 4 3322  42  40 254 81% 3082 29%
003 Critter 1.6a             A32 4 3150  35  35 258 57% 3099 46%
004 Sting SF 4.8.4 JA        A32 4 3122  38  37 258 58% 3053 30%
005 Firenzina 2.4.1 xTreme   A32 4 3117  36  36 256 53% 3090 44%
006 BlackMamba 2.0           A32 4 3096  36  35 256 51% 3092 45%
007 Texel 1.05               A32 4 3054  37  37 260 52% 3039 32%
008 Rybka 2.3.2a mp          A32 4 3052  90  92  40 45% 3094 35%
009 DeepSaros ver.2.3f       A32 4 3035  37  37 264 56% 2988 33%
010 Senpai 1.0               A32 4 3003  37  37 272 54% 2977 28%
011 RobboLito 0.085e4l       A32 1 2973  37  37 266 52% 2964 30%
012 Cheng 4.39               A32 4 2965  37  37 270 49% 2975 30%
013 Hiarcs 13.71             IOS 2 2956 115 111  28 61% 2878 21%
014 Shredder 1.7.0           IOS 2 2922 109 108  26 54% 2903 46%
015 Gaviota v1.0-d           A32 4 2901  36  36 274 50% 2909 29%
016 Hakkapeliitta 3.0        A32 1 2882  37  37 266 47% 2914 29%
017 Arasan 18.0              A32 4 2881  35  36 276 48% 2902 32%
018 Exchess_7.71.beta.JA_xb  A32 4 2863  37  37 262 47% 2901 32%
019 Grapefruit 1.0           A32 4 2850  34  34 286 48% 2865 40%
020 Cyclone 3.4              A32 4 2839  35  35 276 51% 2834 39%
021 DiscoCheck 5.2.1         A32 1 2816  35  35 282 49% 2823 34%
022 Deep Saros 0.9           A32 4 2803  34  34 280 49% 2811 34%
023 Toga II 3.0              A32 1 2801  35  35 268 48% 2813 35%
024 Deuterium v14.3.34.130   A32 1 2766  34  34 288 52% 2752 37%
025 Bobcat 6.4b              A32 1 2759  35  35 282 49% 2770 26%
026 Fruit Reloaded 2.1       A32 1 2751  35  35 252 52% 2737 41%
027 GNU Chess 5.60           A32 1 2749  36  36 266 53% 2727 29%
028 Doch32 1.3.4 JA          A32 1 2748  35  35 272 48% 2766 33%
029 Chess Pro 2016.02        IOS 2 2744 112 114  22 45% 2772 55%
030 Murka 3                  A32 1 2733  33  33 294 52% 2721 35%
031 Scorpio_2.7.7.JA_xb      A32 1 2714  39  39 230 47% 2745 29%
032 The King 3.50 x64        A32 1 2703  46  46 158 42% 2759 32%
033 IvanHoe 9.46b            A32 4 2700  36  36 268 45% 2739 29%
034 Strelka 5                A32 1 2699  36  36 266 54% 2668 30%
035 TheMadPrune 1.7.04       A32 4 2675  38  38 240 45% 2706 30%
036 Crafty_24.1.JA_xb        A32 1 2668  36  35 280 53% 2645 25%
037 Rodent 1.7 build 1       A32 1 2639  35  36 270 49% 2648 31%
038 Tucano_5.00.JA_xb        A32 1 2634  38  38 224 47% 2656 36%
039 CNVCS 1.2.0              IOS 2 2622  94  95  36 46% 2645 36%
040 RedQueen 1.1.97          A32 4 2593  36  36 268 46% 2625 25%
041 Bison 15.1               A32 1 2592  36  36 272 49% 2597 30%
042 Rhetoric 1.4.1           A32 1 2586  36  36 272 52% 2577 29%
043 Alfil 12.10              A32 1 2568  36  36 268 49% 2570 29%
044 Gull 1.2 JA              A32 1 2547  36  36 276 52% 2528 29%
045 Chess Genius 4.0.00      IOS 2 2546 180 198  10 30% 2716 20%
046 Cheese 1.7               A32 1 2527  35  35 278 45% 2564 31%
047 Rotor 0.8                A32 1 2516  36  36 272 46% 2551 27%
048 Daydreamer 1.75 JA       A32 1 2508  35  35 274 49% 2514 34%
049 GarboChess 3             A32 1 2506  36  36 272 50% 2505 25%
050 Fridolin 2.00            A32 4 2492  35  35 274 52% 2475 32%
051 Chess Genius 2.6.4       A32 1 2489 224 239   4 38% 2538 75%
052 Glaurung Mainz           A32 1 2486  42  43 188 44% 2523 29%
053 Danasah_5.07.JA_xb       A32 1 2463  37  37 256 52% 2445 26%
054 BBChess 1.3b JA          A32 4 2455  35  35 276 51% 2445 28%
055 Sloppy_0.23.JA_xb        A32 1 2435  36  36 260 51% 2425 30%
056 Dirty_030411.JA_xb       A32 1 2434  37  37 258 51% 2430 27%
057 Phalanx_XXIV.JA_xb       A32 1 2418  37  37 264 48% 2432 19%
058 GreKo_12.5.JA_xb         A32 1 2408  37  37 258 55% 2368 24%
059 Pepito v1.59             A32 1 2398  35  35 274 51% 2390 34%
060 Pawny_1.0.JA_uci2xb      A32 1 2391  37  36 262 53% 2369 28%
061 BetsabeII_1.47.JA_xb     A32 1 2366  38  38 258 50% 2365 19%
062 Maverick 1.0 JA          A32 1 2355  36  36 282 48% 2367 22%
063 Chess Wise 3.1.7         IOS 2 2337 311 508   2  0% 2546  0%
064 Ifrit_m1.8.JA_uci2xb     A32 1 2327  37  36 260 54% 2296 28%
065 Typhoon_1.0.r358.JA_xb   A32 1 2312  36  36 272 55% 2272 25%
066 Zurichess Fribourg       A32 1 2312  55  55 122 51% 2313 21%
067 Diablo 0.5.1b JA         A32 1 2306  37  37 272 50% 2306 22%
068 Olithink_5.3.2.JA_xb     A32 1 2286  37  37 266 51% 2270 21%
069 Amy_0.8.JA_xb            A32 1 2278  36  36 280 47% 2297 23%
070 Myrddin_0.86.JA_xb       A32 1 2268  38  39 264 50% 2258 22%
071 TJchess 1.1U             A32 1 2250  36  36 288 50% 2248 23%
072 Bitfoot 150922.JA        A32 1 2228  38  38 286 57% 2170 16%
073 Natwarlal_0.14.JA_xb     A32 1 2226  37  37 266 52% 2209 24%
074 MangoPaola_1.1.JA_xb     A32 1 2223  38  38 270 52% 2199 21%
075 Sungorus 1.4 JA          A32 1 2199  37  37 266 45% 2235 24%
076 KmtChess_1.21.JA_xb      A32 1 2143  38  38 270 51% 2140 20%
077 NGplay_9.86.JA_xb        A32 1 2138  38  38 264 51% 2128 21%
078 Rattate_Nosferatu.JA_xb  A32 1 2132  38  38 272 48% 2151 17%
079 Scidlet_2.61b2.JA_xb     A32 1 2122  39  39 270 51% 2114 11%
080 Resp_0.19.JA_xb          A32 1 2105  37  37 286 51% 2097 18%
081 Clubfoot 150907.JA       A32 1 2068  39  38 290 65% 1941 13%
082 DanChess_1.04.JA_xb      A32 1 2047  38  38 268 49% 2052 19%
083 Kurt 0.9.2.2 JA          A32 1 1992  37  37 282 48% 2014 18%
084 Robocide 28.12.14.JA     A32 1 1973  35  35 310 50% 1971 19%
085 Witz_Alpha21.JA_xb       A32 1 1951  38  38 264 46% 1985 21%
086 Knightcap_3.7F.JA_xb     A32 1 1936  38  38 260 53% 1915 18%
087 Woodpecker_2.11.JA_xb    A32 1 1929  39  39 254 51% 1925 15%
088 AdroitChess0.4 JA        A32 1 1925  38  38 278 48% 1935 18%
089 BikJump v1.8             A32 1 1919  35  35 312 47% 1941 21%
090 Leonidas_r83.JA_xb       A32 1 1887  38  39 260 46% 1920 18%
091 Sjeng_1.12.JA_xb         A32 1 1887  39  39 260 51% 1885 17%
092 Gunborg_1.39.JA_uci2xb   A32 1 1883  42  41 242 61% 1791 18%
093 Galjoen_v0.30.2          A32 1 1871  49  49 156 54% 1846 21%
094 ZCT-0.3.2500             A32 1 1858  37  38 292 41% 1931 12%
095 Faile_1.44.JA_xb         A32 1 1852  38  38 250 47% 1875 27%
096 Mephisto Roma Turbo      W64 1 1845  79  83  56 37% 1946 16%
097 Cilian_4.14.JA_xb        A32 1 1841  36  36 284 49% 1848 26%
098 Samchess_JA_xb           A32 1 1836  40  40 248 45% 1881 17%
099 Ecce rev. 508            A32 1 1817  37  37 278 47% 1847 19%
100 Sayuri 2015.10.01        A32 4 1799  39  38 280 55% 1744 13%
101 Claudia v. 0.5           A32 1 1786  55  55 140 53% 1761 16%
102 Surprise_4.3.b13.JA_xb   A32 1 1784  52  52 150 51% 1773 13%
103 Colchess_8.0.JA_xb       A32 1 1778  41  41 224 51% 1760 22%
104 Smash 1.03 JA            A32 1 1766  38  38 290 49% 1765 13%
105 Zzzzzz_3.5.1.JA_xb       A32 1 1690  40  40 216 51% 1666 31%
106 Hoichess_0.12.1.JA_xb    A32 1 1676  38  39 266 50% 1659 20%
107 Chenard_2015.08.15.JA_xb A32 1 1666  45  46 218 47% 1659  9%
108 Tscp_1.8.1.AB_xb         A32 1 1647  43  43 216 46% 1656 18%
109 Colossus 4.0 100X        C64 1 1646 236 208  10 80% 1272 20%
110 Rocinante 2.0 JA         A32 1 1641  39  39 270 49% 1611 19%
111 Kitteneitor_060513.JA_xb A32 1 1639  41  42 206 45% 1641 35%
112 Jester_0.84.JA_xb        A32 1 1632  46  47 200 49% 1596 12%
113 Pulse 1.5-cpp            A32 1 1605  39  39 268 52% 1535 28%
114 Mephisto Roma 68020 UCI  W64 1 1585 123 133  20 35% 1686 30%
115 VIRUTOR CHESS 1.1.1      A32 1 1448  45  45 264 52% 1384 10%
116 Chess for Android        A32 1 1412  45  45 259 55% 1321 14%
117 Superpawn b108 JA        A32 1 1411  52  52 170 51% 1397 18%
118 K2 v.075                 A32 1 1398  58  57 166 55% 1350  5%
119 Chess Titans             W64 1 1311 212 203   7 57% 1281 29%
120 Trappy_Beowulf_2.0.JA_xb A32 1 1182  48  49 254 38% 1284  9%
121 Colossus 4.0             C64 1 1147 170 187  14 36% 1232 14%
122 Byak 8.10.14.JA          A32 1 1095  52  55 186 25% 1340 17%
123 Xadreco_5.7.JA_xb        A32 1  992  60  65 180 15% 1360  8%
124 Novag Secondo            TTC 1  926 271  38   6 42%  950 17%
125 OliveChess 0.2.7         A32 1  573 357-310 150  0% 1471  0%

Rapidroid test platform:
* GT-N7100 1.6 * 4 + 256MB hash: All Android progs
* GT-N5105 1.6 * 4 + 256MB hash: All Android progs
* Codegen Novatab 1.4 * 4 + 256MB: Single thread Android progs
* Polypad 1010IPS tablet 1.61 * 2 + 128MB: Single thread Android progs
* HTC Diam 528 MHz, 16MB hash: Windows Mobile
* i7 M620 2.67 GHz dual + Arena 3.5 + 2GB hash: Windows 64
* iPhone5S A7 1.3 GHz * 2: iOS progs
* DosBox 1.74: DOS progs
* WinVICE 2.24: Commodore-64 progs
* Messtiny UCI adapters or CB-Emu2014: Mephisto progs
* Openings: 20 ply from Adam Hair or 16 ply from TCEC, no Q exchange, +0.15 to +0.40 evaluated by Stockfish and Komodo at depth 20 minimum, played twice both sides
* Repeating openings and twin games not allowed between two programs
* Tablebases and pondering off
* Time control: 10 to 30 sec/move or 600+0 to 1800+5 or closest known by both programs.
