HAL9000

"It just isn't conceivable that you can design a program strong enough to beat players like me."

March 5, 2016

RAPIDROID RELOADED: March-2016 featuring Fritz 14, Gull 3 & Ivanhoe 9.47c

Following plenty of gauntlets, here's another update with some strong, long-awaited newcomers.

Despite the high hopes, Fritz and Gull were not able to break into the top 5. Stockfish 7 still crushes everything that gets in its way. Fritz and Gull could bite Komodo from time to time, but they can't steal enough draws from Stockfish. That makes the gap between the fish and the reptile more visible now. Android needs a new Komodo version...

The highlights of this release:
* ADDED Fritz 14
* UPDATED Gull 1.2 to 3
* UPDATED Ivanhoe 9.46h to 9.47c beta
* DELETED Cyclone due to similarity with Grapefruit & Toga
* DELETED The Mad Prune due to similarity with Grapefruit & Toga

After 4 extra days of delay, I was able to finish another complete round and increase the number of games by 18, so this release is not only an introduction of updates but also adds more games for all engines. Narrower error margins matter, though they are still above 30 ELO.

The nasty rule is: Less error requires more games >> More games require more full rounds >> More full rounds require fewer gauntlets and fewer updates. Unfortunately, the latter happens very rarely.
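To put rough numbers on "less error requires more games", here is a minimal sketch. It is not how bayeselo computes its + and - columns; it assumes independent games, an opponent pool of equal average strength, and a plain normal approximation. The function names and the 30% draw ratio are illustrative assumptions only.

```python
import math

def elo_from_score(p):
    """Convert an expected score (0 < p < 1) into an ELO difference."""
    return -400.0 * math.log10(1.0 / p - 1.0)

def elo_margin(games, score=0.5, draw_ratio=0.3, z=1.96):
    """Approximate upper 95% ELO margin after `games` independent games.

    Assumption: each game is worth 1, 0.5 or 0 points with fixed
    probabilities, so the per-game variance is W + D/4 - score^2.
    """
    win = score - draw_ratio / 2.0
    if win < 0.0 or win + draw_ratio > 1.0:
        raise ValueError("score and draw_ratio are inconsistent")
    variance = max(win + draw_ratio / 4.0 - score ** 2, 0.0)
    std_error = math.sqrt(variance / games)
    upper = min(score + z * std_error, 0.999)  # keep the logarithm finite
    return elo_from_score(upper) - elo_from_score(score)

# Doubling the games does not halve the margin; quadrupling roughly does.
for n in (336, 672, 1344):
    print(n, "games -> +/-", round(elo_margin(n), 1), "ELO")
```

Under these assumptions the margin shrinks only with the square root of the number of games, which is why a few extra games per engine barely move it. A higher draw ratio also lowers the per-game variance, so it narrows the margin as well.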

One thing that becomes clear is the relationship between the average ELO difference between players and the accuracy of the ranking.

In short, the main targets of a perfect rating list are:
* Unlimited & equal number of games played by each engine
* 100% draw rate
* Average ELO difference between opponents of each game = 0
* For each engine, a perfect bell curve distribution of the number of games vs all opponents within the neighborhood of +/- 100 ELO

Sure, no list can reach all of the above targets at once. One can get close to the last one with extreme care, but the first three are utopian and impossible.
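The target of a zero average ELO difference can be illustrated with the standard Elo expected-score formula. The sketch below ignores draws, and `relative_information` is a hypothetical helper name: it uses the fact that, in a pure win/loss logistic model, the information one game carries about the rating difference is proportional to p(1 - p), which peaks when both sides are equally strong.

```python
def expected_score(gap):
    """Expected score of the stronger side for a given ELO gap."""
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))

def relative_information(gap):
    """Information per game relative to an even pairing (1.0 at gap 0).

    Assumption: win/loss-only logistic model, where the information
    about the rating difference is proportional to p * (1 - p).
    """
    p = expected_score(gap)
    return p * (1.0 - p) / 0.25

for gap in (0, 100, 200, 400):
    print(gap, round(expected_score(gap), 3), round(relative_information(gap), 2))
```

In this simplified model, a game between far-apart engines tells less about their rating difference than a game between neighbours, so the same number of games yields a less accurate ranking.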

No matter what, in order to improve the quality of Rapidroid, I've recently decided to continue with fewer promotions between divisions. Until now, 3 engines moved up and 3 engines moved down out of 10 engines after each round.

As that often leads to pairings with a gap beyond 200 ELO, 2 up and 2 down should be better.

Now I expect the draw ratio to increase, and the average ELO gap should decrease in parallel. At present, Rapidroid has a low 25.7% draw ratio and a ~100 ELO average gap, which both indicate that the overall pairing scenario has been too aggressive.
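The change from 3 to 2 promotions can be sketched as follows. This is only an illustration of a generic promotion/relegation step, not the actual Rapidroid bookkeeping: the function name, the engine labels and the three-division setup are assumptions, and each division is taken to be ordered best-first by the round just played.

```python
def promote_relegate(divisions, k):
    """Return new divisions after moving k engines up and k down.

    divisions: list of lists, strongest division first; each inner list
    is ordered best-first. Requires 2 * k <= size of every division.
    """
    last = len(divisions) - 1
    result = []
    for i, div in enumerate(divisions):
        lo = k if i > 0 else 0                       # top k leave upwards
        hi = len(div) - k if i < last else len(div)  # bottom k leave downwards
        above = divisions[i - 1] if i > 0 else []
        relegated = above[len(above) - k:] if i > 0 else []
        promoted = divisions[i + 1][:k] if i < last else []
        result.append(relegated + div[lo:hi] + promoted)
    return result

# Hypothetical labels: D1-1 is the best engine of the top division.
divisions = [[f"D{d}-{n}" for n in range(1, 11)] for d in range(1, 4)]
for k in (3, 2):
    new = promote_relegate(divisions, k)
    moved = sum(1 for old, cur in zip(divisions, new) for e in cur if e not in old)
    print(f"{k} up / {k} down: {moved} of 30 engines change division")
```

With fewer engines crossing division boundaries per round, fewer engines end up paired against opponents from a clearly stronger or weaker group.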

Targets for next release:
* UPDATE Arasan 18.2 to 18.3
* UPDATE Rodent 1.7 to II
* UPDATE Tucano 5.0 to 6.00

Have a nice checkmate!

BAYES RATINGS AFTER 17173 GAMES PLAYED BY 124 PROGRAMS
Rnk Name                     O/S T  Elo   +   - gam sco oppo dra
001 Stockfish 7              A32 4 3352  38  36 336 85% 3077 29%
002 Komodo 9.3               A32 4 3281  35  34 336 76% 3084 33%
003 Critter 1.6a             A32 4 3140  30  30 344 57% 3097 47%
004 Firenzina 2.4.1 xTreme   A32 4 3109  30  30 340 53% 3092 49%
005 Sting SF 4.8.4 JA        A32 4 3109  32  31 338 57% 3057 39%
006 BlackMamba 2.0           A32 4 3072  30  30 338 47% 3096 49%
007 Rybka 2.3.2a mp          W64 4 3070 109 107  26 54% 3044 38%
008 Fritz 14                 A32 4 3067  30  30 336 49% 3078 48%
009 Gull 3 x64 (syzygy)      A32 4 3052  31  31 332 50% 3054 41%
010 Texel 1.05               A32 4 3037  31  31 342 48% 3055 35%
011 Senpai 1.0               A32 4 3016  32  32 338 51% 3011 36%
012 DeepSaros ver.2.3f       A32 4 3010  32  32 334 51% 3003 36%
013 Hiarcs 13.71             IOS 2 2982 121 113  26 65% 2873 23%
014 RobboLito 0.085e4l       A32 1 2976  31  31 342 51% 2968 42%
015 IvanHoe 9.47c beta       A32 1 2970  32  32 326 54% 2948 39%
016 Cheng 4.39               A32 4 2938  32  32 334 44% 2987 38%
017 Shredder 1.7.0           IOS 2 2918 114 113  24 54% 2901 42%
018 Hakkapeliitta 3.0        A32 1 2905  32  32 328 49% 2913 34%
019 Scorpio_2.7.7.JA_xb.arm7 A32 4 2899  36  36 284 56% 2853 28%
020 ExChess_7.88b.JA_xb.arm7 A32 4 2896  32  32 322 44% 2945 38%
021 Gaviota v1.0-d           A32 4 2894  32  32 340 44% 2936 31%
022 Arasan 18.2              A32 4 2894  32  32 330 50% 2895 31%
023 Grapefruit 1.0           A32 4 2855  30  30 340 46% 2877 41%
024 Toga II 3.0              A32 1 2838  32  32 314 55% 2805 37%
025 Deep Saros 0.9           A32 4 2822  31  31 330 47% 2836 39%
026 DiscoCheck 5.2.1         A32 1 2821  32  32 340 43% 2875 32%
027 Deuterium v14.3.34.130   A32 1 2776  31  31 332 50% 2774 40%
028 Bobcat 6.4b              A32 1 2774  32  32 328 49% 2775 30%
029 Doch32 1.3.4 JA          A32 1 2771  32  32 324 48% 2787 37%
030 Crafty_25.0.JA_xb.arm7   A32 1 2769  32  32 314 54% 2742 36%
031 Fruit Reloaded 2.1       A32 1 2765  32  31 308 48% 2780 43%
032 Murka 3                  A32 1 2763  31  31 338 49% 2768 37%
033 Chess Pro 2016.02        IOS 2 2760 111 113  22 45% 2788 55%
034 GNU Chess 5.60           A32 1 2741  33  33 316 51% 2732 28%
035 The King 3.50 x64        W64 1 2733  51  52 122 44% 2772 34%
036 Strelka 5                A32 1 2722  33  33 314 52% 2708 30%
037 RedQueen 1.1.98          A32 4 2678  33  33 310 53% 2656 29%
038 CNVCS 1.2.0              IOS 2 2665 108 109  26 48% 2671 42%
039 Tucano_5.00.JA_xb        A32 1 2654  35  36 260 48% 2666 35%
040 Rodent 1.7 build 1       A32 1 2645  33  33 314 48% 2659 29%
041 Rhetoric 1.4.1           A32 1 2628  33  33 312 54% 2598 32%
042 Mini Rodent 1.0          A32 1 2624  35  35 268 46% 2649 33%
043 Bison 15.1               A32 1 2615  33  33 314 51% 2610 31%
044 Chess Genius 4.0.00      IOS 2 2580 194 254   8 13% 2831 25%
045 Alfil 12.10              A32 1 2577  33  33 314 49% 2583 29%
046 Rotor 0.8                A32 1 2547  33  33 320 47% 2570 31%
047 Daydreamer 1.75 JA       A32 1 2546  32  32 322 50% 2550 34%
048 Cheese 1.7               A32 1 2535  33  34 312 43% 2585 30%
049 Fridolin 2.00            A32 4 2515  32  32 314 53% 2492 33%
050 Chess Genius 2.6.4       A32 1 2514 224 239   4 38% 2562 75%
051 GarboChess 3             A32 1 2501  33  33 312 48% 2520 26%
052 Glaurung Mainz           A32 1 2497  41  41 208 43% 2545 26%
053 Danasah_5.07.JA_xb       A32 1 2487  35  35 296 55% 2446 27%
054 Sloppy_0.23.JA_xb        A32 1 2482  33  33 300 50% 2477 33%
055 BBChess 1.3b JA          A32 4 2475  32  32 324 51% 2465 29%
056 Maverick 1.5 arm         A32 1 2470  33  32 326 56% 2421 31%
057 Dirty_030411.JA_xb       A32 1 2467  35  35 294 52% 2454 28%
058 Phalanx_XXIV.JA_xb.arm7  A32 1 2455  35  35 302 49% 2464 20%
059 Pawny_1.0.JA_uci2xb      A32 1 2428  33  33 312 51% 2416 32%
060 GreKo_12.5.JA_xb         A32 1 2418  34  34 302 52% 2399 25%
061 Pepito v1.59             A32 1 2418  32  32 320 50% 2419 33%
062 BetsabeII_1.47.JA_xb     A32 1 2399  36  36 300 50% 2397 18%
063 Ifrit_m1.8.JA_uci2xb     A32 1 2381  34  34 306 55% 2344 28%
064 Diablo 0.5.1b JA         A32 1 2345  33  33 326 52% 2329 25%
065 zurichess geneva         A32 1 2342  51  51 140 49% 2347 26%
066 Typhoon_1.0.r358.JA_xb   A32 1 2341  34  34 318 54% 2306 24%
067 Olithink_5.3.2.JA_xb     A32 1 2324  34  34 316 51% 2312 21%
068 Amy_0.8.JA_xb            A32 1 2294  34  34 330 50% 2292 21%
069 Myrddin_0.86.JA_xb       A32 1 2278  35  35 314 48% 2283 21%
070 TJchess 1.1U             A32 1 2273  33  33 346 48% 2286 23%
071 Natwarlal_0.14.JA_xb     A32 1 2272  34  33 328 52% 2248 23%
072 Bitfoot 150922.JA        A32 1 2269  34  34 348 57% 2207 15%
073 MangoPaola_1.1.JA_xb     A32 1 2261  34  34 326 50% 2253 21%
074 Sungorus 1.4 JA          A32 1 2241  34  34 322 48% 2251 23%
075 KmtChess_1.21.JA_xb      A32 1 2197  34  34 332 52% 2183 22%
076 Rattate_Nosferatu.JA_xb  A32 1 2182  34  34 336 49% 2193 17%
077 NGplay_9.86.JA_xb        A32 1 2176  33  33 332 52% 2164 23%
078 Scidlet_2.61b2.JA_xb     A32 1 2168  35  34 336 52% 2149 13%
079 Resp_0.19.JA_xb          A32 1 2140  33  33 348 50% 2141 19%
080 Clubfoot 150907.JA       A32 1 2116  36  35 338 62% 2014 14%
081 DanChess_1.04.JA_xb      A32 1 2084  35  35 326 51% 2077 17%
082 Floyd 0.7 JA             A32 1 2082  35  35 344 56% 2029 12%
083 Kurt 0.9.2.2 JA          A32 1 2049  34  34 348 47% 2077 18%
084 Robocide 28.12.14.JA     A32 1 2033  32  32 372 51% 2027 19%
085 Witz_Alpha21.JA_xb       A32 1 2017  34  34 330 48% 2037 20%
086 Woodpecker_2.11.JA_xb    A32 1 1994  35  35 324 50% 1991 16%
087 Knightcap_3.7F.JA_xb     A32 1 1980  35  35 308 52% 1959 18%
088 AdroitChess0.4 JA        A32 1 1970  35  35 336 48% 1984 16%
089 BikJump v1.8             A32 1 1961  32  33 358 46% 1996 22%
090 Sjeng_1.12.JA_xb         A32 1 1944  35  35 314 50% 1943 16%
091 Gunborg_1.39.JA_uci2xb   A32 1 1942  38  37 294 60% 1861 20%
092 Leonidas_r83.JA_xb       A32 1 1931  35  35 316 47% 1954 19%
093 ZCT-0.3.2500             A32 1 1917  35  35 328 42% 1980 12%
094 Faile_1.44.JA_xb         A32 1 1909  34  34 304 47% 1929 28%
095 Samchess_JA_xb           A32 1 1897  36  36 314 42% 1966 17%
096 Mephisto Roma Turbo      W64 1 1896  79  83  56 37% 1997 16%
097 Cilian_4.14.JA_xb        A32 1 1894  34  34 328 53% 1870 26%
098 Ecce rev. 508            A32 1 1856  35  35 324 44% 1908 16%
099 Sayuri 2015.10.01        A32 4 1840  35  35 330 53% 1800 14%
100 Colchess_8.0.JA_xb       A32 1 1818  37  37 274 52% 1793 24%
101 Smash 1.03 JA            A32 1 1814  35  35 336 50% 1807 14%
102 Claudia v. 0.5           A32 1 1811  52  52 158 51% 1804 13%
103 Surprise_4.3.b13.JA_xb   A32 1 1805  49  49 166 48% 1816 16%
104 Zzzzzz_3.5.1.JA_xb       A32 1 1716  37  37 264 49% 1702 31%
105 Hoichess_0.12.1.JA_xb    A32 1 1713  35  36 318 48% 1716 19%
106 Chenard_2015.08.15.JA_xb A32 1 1702  42  42 262 47% 1694 10%
107 Kitteneitor_060513.JA_xb A32 1 1695  37  38 254 47% 1690 35%
108 Tscp_1.8.1.AB_xb         A32 1 1686  40  40 260 47% 1686 16%
109 Jester_0.84.JA_xb        A32 1 1681  42  42 248 49% 1644 13%
110 Colossus 4.0 100X        C64 1 1678 237 208  10 80% 1300 20%
111 Rocinante 2.0 JA         A32 1 1675  37  37 316 50% 1634 18%
112 Pulse 1.5-cpp            A32 1 1621  37  37 320 55% 1521 28%
113 Mephisto Roma 68020 UCI  W64 1 1616 124 134  20 35% 1718 30%
114 VIRUTOR CHESS 1.1.1      A32 1 1486  41  41 320 52% 1425 10%
115 Superpawn b108 JA        A32 1 1455  47  47 212 54% 1415 19%
116 K2 v.075                 A32 1 1439  52  51 206 57% 1372  6%
117 Chess for Android        A32 1 1434  41  42 313 52% 1371 14%
118 Chess Titans             W64 1 1336 212 203   7 57% 1306 29%
119 Trappy_Beowulf_2.0.JA_xb A32 1 1196  44  46 308 37% 1317  9%
120 Colossus 4.0             C64 1 1171 170 185  14 36% 1253 14%
121 Byak 8.10.14.JA          A32 1 1128  47  49 226 25% 1372 18%
122 Xadreco_5.7.JA_xb        A32 1 1018  54  59 220 14% 1393 10%
123 Novag Secondo            TTC 1  945 279  20   6 42%  964 17%
124 OliveChess 0.2.7         A32 1  568 390-352 180  0% 1503  0%

Rapidroid test platform:
* GT-N7100 1.6 * 4 + 256MB hash: All Android progs
* GT-N5105 1.6 * 4 + 256MB hash: All Android progs
* Codegen Novatab 1.4 * 4 + 256MB: Single thread Android progs
* Polypad 1010IPS tablet 1.61 * 2 + 128MB: Single thread Android progs
* HTC Diam 528 MHz, 16MB hash: Windows Mobile
* i7 M620 2.67 GHz dual + Arena 3.5 + 2GB hash: Windows 64
* iPhone5S A7 1.3 GHz * 2: iOS progs
* DosBox 1.74: DOS progs
* WinVICE 2.24: Commodore-64 progs
* Messtiny UCI adapters or CB-Emu2014: Mephisto progs
* Openings: 20 ply from Adam Hair or 16 ply from TCEC, no Q exchange, +0.15 to +0.40 eval by Stockfish and Komodo, depth 20 minimum, played twice both sides
* Repeating openings and twin games not allowed between two programs
* Tablebases and pondering off
* Time control: 10 to 30 sec/move or 600+0 to 1800+5 or closest known by both programs.

Calibration:
* Based on 42 engines rated in CCRL 40/4
* 32 bit engines: Exynos 4412 = (Athlon X2 4600+) - 65 ELO
* 64 bit engines: Exynos 4412 = (Athlon X2 4600+) - 110 ELO
* Bayeselo offset = 2309 (Mean of ELO error vs target: 37.74)

6 comments:

Alex said...

Hi Gurcan!
Do you know how to add opening book to SF 7 in C4a?

Sedar said...

Hello Gurcan,

Indeed, we need next release Komodo or "new" reptile/fish in rating list ;-)

Good job! Thank you :-)

Unknown said...

What program do you use to calculate/record your Bayes Ratings? Is it a program or spreadsheet that you can publish a link for on here? Thanks.

Unknown said...

I maintain all the rounds in a huge Excel file where all openings, promotions and updates are monitored. Is this what you wanna see? Other than the xls, there's nothing specific: some Arena help to verify, one big PGN per round and bayeselo to calculate.

Unknown said...

Alex, no. It doesn't work. Only Komodo and a few others which come with indirect methods like Critter.

Unknown said...

Thanks for the information. I've been manually entering and keeping tournaments and gauntlets on a spreadsheet too and manually calculating the elo, but wanted something more automatic. I found both the Bayes program and ELOstat that pulls all the data straight off the tour.pgn file that CfA creates. Really liking ELOstat as it is VERY easy to use. Thanks again.