Another delayed release, to be blamed on the lack of time on weekdays.
The highlights are:
* ADDED Galjoen 0.30.2 as new Android engine
* ADDED Sayuri 2015.10.02 as new Android engine
* ADDED Mephisto Roma 68020 as new Windows engine
* ADDED Mephisto Roma Turbo as new Windows engine
* UPDATED Bitfoot 150707 to 150922
* UPDATED Clubfoot r2293 to 150907
* UPDATED Chenard 2014.07.11 to 2015.08.15
* UPDATED Gunborg 1.0 to 1.39
* CALIBRATED all engines according to CCRL 40/4
* EXCLUDED GreKo 13.1: Often hangs on move
* EXCLUDED Laser 0.2: Stays in RAM and hogs CPU
* EXCLUDED Mchess2: Doesn't start
* EXCLUDED Protector 1.9: Often hangs on move
ON A HOT ROOF:
Komodo 9.2 still leads. The error margins are still wide enough to overlap, but the superiority over Stockfish 6 looks permanent. It's no surprise, as the major rating lists show the same on Windows. We're just witnessing that Android is no exception. I expect both Komodo and Stockfish to be updated just after TCEC-8, and maybe Rapidroid will include them in the December release.
CALIBRATION HEADACHE:
I put a lot of effort into establishing a trustworthy correlation formula between CCRL and Rapidroid before this release, without success. Grrr!
Unfortunately, all of the methods below fail to cover the inconsistencies:
* Use only single thread 32 bit CCRL engines as anchors, both UCI and XB or only UCI
* Use only single thread 64 bit CCRL engines as anchors, both UCI and XB or only UCI
* Use all 32 bit CCRL engines as anchors, both UCI and XB or only UCI
I'm not an expert in statistics at all, but I can understand why a perfect method can't be found:
1) 40/4 is not CCRL's first priority compared to 40/40, and many Android engines are missing from it. However, Rapidroid must be calibrated against CCRL 40/4 and not 40/40, given the CPU speeds and time controls.
2) The Exynos 4412 is about 1.9 times slower than CCRL's Athlon X2 4600+ according to the single-threaded integer test of Geekbench, the only multiplatform CPU benchmark suited to represent chess performance, because computer chess is much more like integer math than floating point! At least, cross-checking the results indirectly through different PC CPUs showed that the Exynos-Athlon equation is not far from realistic. Some error remains there, though...
3) I've estimated the Exynos to be 54 ELO weaker than the Athlon, assuming that each NPS doubling is worth 60 ELO. However, we can't be sure whether it's 50 or 60 or 70 without deep testing. Since I don't have time for that, the doubling estimate remains a mystery and a potential source of error.
4) There's still a distribution difference between Rapidroid and CCRL. We need to add many more samples to Rapidroid to see whether the whole list will shrink further or not. I believe my ranking is slightly inflated compared to reality for the moment. Thus the toppers Komodo and Stockfish may be slightly overrated.
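The arithmetic behind points 2 and 3 can be sketched in a few lines. This is only an illustration under an assumption: that the Elo loss grows with the base-2 logarithm of the speed ratio. The function name elo_gap is made up for this example; the 1.9 ratio and the 50/60/70 values per doubling come from the text above.

```python
import math

def elo_gap(speed_ratio, elo_per_doubling):
    """Elo lost when running `speed_ratio` times slower.

    Assumption (not stated in the post): every doubling of speed is worth
    the same number of Elo points, so the gap scales with log2 of the ratio.
    """
    return elo_per_doubling * math.log2(speed_ratio)

SPEED_RATIO = 1.9  # Athlon X2 4600+ vs Exynos 4412, Geekbench single-thread integer

# Show how sensitive the gap is to the unknown "Elo per doubling" value.
for per_doubling in (50, 60, 70):
    print(per_doubling, round(elo_gap(SPEED_RATIO, per_doubling), 1))
```

With 60 ELO per doubling this model gives roughly 56, close to the 54 used here. The value 54 equals 60 x 0.9, which is what a linear reading of "1.9 times" would produce, but the exact formula behind it is not stated. Moving the doubling value between 50 and 70 shifts the gap by about 9 ELO either way, which is the uncertainty point 3 refers to.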
There are indeed other sources of deviation, such as multi-thread scaling differences, but let's not go into deep detail and come to the preliminary solution I've introduced. Bearing in mind that the error can't be completely avoided, I've decided to distribute it as widely as possible by using as many anchors as possible, both 32 bit and 64 bit, and estimating an additional gap from 32 to 64 bit. XB engines are left out, given that their implementation in Chess for Android is not at the same level as UCI.
Finally, the calibration of the ranking is now based on 39 UCI engines selected from CCRL 40/4. Rapidroid tries to place all Android engines 65 ELO lower than their 32 bit CCRL counterparts.
For 64 bit, the target is set to 110 ELO below CCRL. Simple and unproven! Arithmetically the gaps would be 54 and 99, but a safety margin must be preserved.
Now I estimate the calibration error at about +/- 30 ELO.
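One way to apply such targets is a single constant shift of the whole list, chosen so that the anchors land on their targets on average. The sketch below is an assumption about the mechanics, not a description of the procedure actually used for Rapidroid. The function name calibration_shift and the three anchor ratings are hypothetical; only the 65 and 110 gaps come from the text.

```python
# Target gaps below CCRL 40/4, as stated in the post.
TARGET_GAP = {32: 65, 64: 110}

def calibration_shift(anchors):
    """Constant to add to every raw rating so anchors hit their targets on average.

    anchors: list of (raw_rating, ccrl_rating, ccrl_bits) tuples.
    Averaging the residuals is an assumption made for this sketch.
    """
    residuals = [(ccrl - TARGET_GAP[bits]) - raw for raw, ccrl, bits in anchors]
    return sum(residuals) / len(residuals)

# Hypothetical anchors, invented purely for illustration.
anchors = [(2900, 3000, 32), (2700, 2790, 64), (2500, 2560, 32)]
shift = calibration_shift(anchors)
print(round(shift, 1))
```

Using many anchors of both kinds means that no single engine's deviation dominates the shift, which is the idea of spreading the error as widely as possible.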
MEPHISTO SIMULATION DISASTER:
I don't own one of those Mephisto tabletops from the 1980s, but I always dreamed of buying one when I was a child. Some dreams don't come true, and that one never happened. Fortunately, decades later, we have simulation software like MESSUI which runs many older chess computers on PCs. They are really back!
More credit must be given to the UCI compatible versions, which can play automated games via the Arena GUI. Besides, they can play against Android engines via the network remote connection I've been using lately to rate Rybka and The King.
Therefore, two Mephistos are welcome in Rapidroid. I had high hopes indeed, but Roma failed to reach the expected level of 1930 at original speed. At the full speed of an i7 processor that didn't change much: only 1800 something. A great disappointment so far.
Should we blame the low performance on the slow network interface (wasn't it the same for Rybka anyway?) or on the simulation interface MESSTINY managing things through the WB2UCI soft adapter?
Do the conversions cause a noticeable slowdown?
If not, Roma UCI must really be that weak in bookless play, starting from 20 ply openings.
There are other oldies like Mephisto Gideon Pro or Rebel working the same way as Mephisto Roma. If I can find spare time, I will test them too.
BAYES RATINGS AFTER 14303 GAMES PLAYED BY 125 PROGRAMS
Rank Name O/S Threads Elo + - Games Score Oppo Draws
001 Komodo 9.2 A32 4 3339 44 41 254 82% 3080 27%
002 Stockfish 6 A32 4 3322 42 40 254 81% 3082 29%
003 Critter 1.6a A32 4 3150 35 35 258 57% 3099 46%
004 Sting SF 4.8.4 JA A32 4 3122 38 37 258 58% 3053 30%
005 Firenzina 2.4.1 xTreme A32 4 3117 36 36 256 53% 3090 44%
006 BlackMamba 2.0 A32 4 3096 36 35 256 51% 3092 45%
007 Texel 1.05 A32 4 3054 37 37 260 52% 3039 32%
008 Rybka 2.3.2a mp A32 4 3052 90 92 40 45% 3094 35%
009 DeepSaros ver.2.3f A32 4 3035 37 37 264 56% 2988 33%
010 Senpai 1.0 A32 4 3003 37 37 272 54% 2977 28%
011 RobboLito 0.085e4l A32 1 2973 37 37 266 52% 2964 30%
012 Cheng 4.39 A32 4 2965 37 37 270 49% 2975 30%
013 Hiarcs 13.71 IOS 2 2956 115 111 28 61% 2878 21%
014 Shredder 1.7.0 IOS 2 2922 109 108 26 54% 2903 46%
015 Gaviota v1.0-d A32 4 2901 36 36 274 50% 2909 29%
016 Hakkapeliitta 3.0 A32 1 2882 37 37 266 47% 2914 29%
017 Arasan 18.0 A32 4 2881 35 36 276 48% 2902 32%
018 Exchess_7.71.beta.JA_xb A32 4 2863 37 37 262 47% 2901 32%
019 Grapefruit 1.0 A32 4 2850 34 34 286 48% 2865 40%
020 Cyclone 3.4 A32 4 2839 35 35 276 51% 2834 39%
021 DiscoCheck 5.2.1 A32 1 2816 35 35 282 49% 2823 34%
022 Deep Saros 0.9 A32 4 2803 34 34 280 49% 2811 34%
023 Toga II 3.0 A32 1 2801 35 35 268 48% 2813 35%
024 Deuterium v14.3.34.130 A32 1 2766 34 34 288 52% 2752 37%
025 Bobcat 6.4b A32 1 2759 35 35 282 49% 2770 26%
026 Fruit Reloaded 2.1 A32 1 2751 35 35 252 52% 2737 41%
027 GNU Chess 5.60 A32 1 2749 36 36 266 53% 2727 29%
028 Doch32 1.3.4 JA A32 1 2748 35 35 272 48% 2766 33%
029 Chess Pro 2016.02 IOS 2 2744 112 114 22 45% 2772 55%
030 Murka 3 A32 1 2733 33 33 294 52% 2721 35%
031 Scorpio_2.7.7.JA_xb A32 1 2714 39 39 230 47% 2745 29%
032 The King 3.50 x64 A32 1 2703 46 46 158 42% 2759 32%
033 IvanHoe 9.46b A32 4 2700 36 36 268 45% 2739 29%
034 Strelka 5 A32 1 2699 36 36 266 54% 2668 30%
035 TheMadPrune 1.7.04 A32 4 2675 38 38 240 45% 2706 30%
036 Crafty_24.1.JA_xb A32 1 2668 36 35 280 53% 2645 25%
037 Rodent 1.7 build 1 A32 1 2639 35 36 270 49% 2648 31%
038 Tucano_5.00.JA_xb A32 1 2634 38 38 224 47% 2656 36%
039 CNVCS 1.2.0 IOS 2 2622 94 95 36 46% 2645 36%
040 RedQueen 1.1.97 A32 4 2593 36 36 268 46% 2625 25%
041 Bison 15.1 A32 1 2592 36 36 272 49% 2597 30%
042 Rhetoric 1.4.1 A32 1 2586 36 36 272 52% 2577 29%
043 Alfil 12.10 A32 1 2568 36 36 268 49% 2570 29%
044 Gull 1.2 JA A32 1 2547 36 36 276 52% 2528 29%
045 Chess Genius 4.0.00 IOS 2 2546 180 198 10 30% 2716 20%
046 Cheese 1.7 A32 1 2527 35 35 278 45% 2564 31%
047 Rotor 0.8 A32 1 2516 36 36 272 46% 2551 27%
048 Daydreamer 1.75 JA A32 1 2508 35 35 274 49% 2514 34%
049 GarboChess 3 A32 1 2506 36 36 272 50% 2505 25%
050 Fridolin 2.00 A32 4 2492 35 35 274 52% 2475 32%
051 Chess Genius 2.6.4 A32 1 2489 224 239 4 38% 2538 75%
052 Glaurung Mainz A32 1 2486 42 43 188 44% 2523 29%
053 Danasah_5.07.JA_xb A32 1 2463 37 37 256 52% 2445 26%
054 BBChess 1.3b JA A32 4 2455 35 35 276 51% 2445 28%
055 Sloppy_0.23.JA_xb A32 1 2435 36 36 260 51% 2425 30%
056 Dirty_030411.JA_xb A32 1 2434 37 37 258 51% 2430 27%
057 Phalanx_XXIV.JA_xb A32 1 2418 37 37 264 48% 2432 19%
058 GreKo_12.5.JA_xb A32 1 2408 37 37 258 55% 2368 24%
059 Pepito v1.59 A32 1 2398 35 35 274 51% 2390 34%
060 Pawny_1.0.JA_uci2xb A32 1 2391 37 36 262 53% 2369 28%
061 BetsabeII_1.47.JA_xb A32 1 2366 38 38 258 50% 2365 19%
062 Maverick 1.0 JA A32 1 2355 36 36 282 48% 2367 22%
063 Chess Wise 3.1.7 IOS 2 2337 311 508 2 0% 2546 0%
064 Ifrit_m1.8.JA_uci2xb A32 1 2327 37 36 260 54% 2296 28%
065 Typhoon_1.0.r358.JA_xb A32 1 2312 36 36 272 55% 2272 25%
066 Zurichess Fribourg A32 1 2312 55 55 122 51% 2313 21%
067 Diablo 0.5.1b JA A32 1 2306 37 37 272 50% 2306 22%
068 Olithink_5.3.2.JA_xb A32 1 2286 37 37 266 51% 2270 21%
069 Amy_0.8.JA_xb A32 1 2278 36 36 280 47% 2297 23%
070 Myrddin_0.86.JA_xb A32 1 2268 38 39 264 50% 2258 22%
071 TJchess 1.1U A32 1 2250 36 36 288 50% 2248 23%
072 Bitfoot 150922.JA A32 1 2228 38 38 286 57% 2170 16%
073 Natwarlal_0.14.JA_xb A32 1 2226 37 37 266 52% 2209 24%
074 MangoPaola_1.1.JA_xb A32 1 2223 38 38 270 52% 2199 21%
075 Sungorus 1.4 JA A32 1 2199 37 37 266 45% 2235 24%
076 KmtChess_1.21.JA_xb A32 1 2143 38 38 270 51% 2140 20%
077 NGplay_9.86.JA_xb A32 1 2138 38 38 264 51% 2128 21%
078 Rattate_Nosferatu.JA_xb A32 1 2132 38 38 272 48% 2151 17%
079 Scidlet_2.61b2.JA_xb A32 1 2122 39 39 270 51% 2114 11%
080 Resp_0.19.JA_xb A32 1 2105 37 37 286 51% 2097 18%
081 Clubfoot 150907.JA A32 1 2068 39 38 290 65% 1941 13%
082 DanChess_1.04.JA_xb A32 1 2047 38 38 268 49% 2052 19%
083 Kurt 0.9.2.2 JA A32 1 1992 37 37 282 48% 2014 18%
084 Robocide 28.12.14.JA A32 1 1973 35 35 310 50% 1971 19%
085 Witz_Alpha21.JA_xb A32 1 1951 38 38 264 46% 1985 21%
086 Knightcap_3.7F.JA_xb A32 1 1936 38 38 260 53% 1915 18%
087 Woodpecker_2.11.JA_xb A32 1 1929 39 39 254 51% 1925 15%
088 AdroitChess0.4 JA A32 1 1925 38 38 278 48% 1935 18%
089 BikJump v1.8 ğ A32 1 1919 35 35 312 47% 1941 21%
090 Leonidas_r83.JA_xb A32 1 1887 38 39 260 46% 1920 18%
091 Sjeng_1.12.JA_xb A32 1 1887 39 39 260 51% 1885 17%
092 Gunborg_1.39.JA_uci2xb A32 1 1883 42 41 242 61% 1791 18%
093 Galjoen_v0.30.2 A32 1 1871 49 49 156 54% 1846 21%
094 ZCT-0.3.2500 A32 1 1858 37 38 292 41% 1931 12%
095 Faile_1.44.JA_xb A32 1 1852 38 38 250 47% 1875 27%
096 Mephisto Roma Turbo W64 1 1845 79 83 56 37% 1946 16%
097 Cilian_4.14.JA_xb A32 1 1841 36 36 284 49% 1848 26%
098 Samchess_JA_xb A32 1 1836 40 40 248 45% 1881 17%
099 Ecce rev. 508 A32 1 1817 37 37 278 47% 1847 19%
100 Sayuri 2015.10.01 A32 4 1799 39 38 280 55% 1744 13%
101 Claudia v. 0.5 A32 1 1786 55 55 140 53% 1761 16%
102 Surprise_4.3.b13.JA_xb A32 1 1784 52 52 150 51% 1773 13%
103 Colchess_8.0.JA_xb A32 1 1778 41 41 224 51% 1760 22%
104 Smash 1.03 JA A32 1 1766 38 38 290 49% 1765 13%
105 Zzzzzz_3.5.1.JA_xb A32 1 1690 40 40 216 51% 1666 31%
106 Hoichess_0.12.1.JA_xb A32 1 1676 38 39 266 50% 1659 20%
107 Chenard_2015.08.15.JA_xb A32 1 1666 45 46 218 47% 1659 9%
108 Tscp_1.8.1.AB_xb A32 1 1647 43 43 216 46% 1656 18%
109 Colossus 4.0 100X C64 1 1646 236 208 10 80% 1272 20%
110 Rocinante 2.0 JA A32 1 1641 39 39 270 49% 1611 19%
111 Kitteneitor_060513.JA_xb A32 1 1639 41 42 206 45% 1641 35%
112 Jester_0.84.JA_xb A32 1 1632 46 47 200 49% 1596 12%
113 Pulse 1.5-cpp A32 1 1605 39 39 268 52% 1535 28%
114 Mephisto Roma 68020 UCI W64 1 1585 123 133 20 35% 1686 30%
115 VIRUTOR CHESS 1.1.1 A32 1 1448 45 45 264 52% 1384 10%
116 Chess for Android A32 1 1412 45 45 259 55% 1321 14%
117 Superpawn b108 JA A32 1 1411 52 52 170 51% 1397 18%
118 K2 v.075 A32 1 1398 58 57 166 55% 1350 5%
119 Chess Titans W64 1 1311 212 203 7 57% 1281 29%
120 Trappy_Beowulf_2.0.JA_xb A32 1 1182 48 49 254 38% 1284 9%
121 Colossus 4.0 C64 1 1147 170 187 14 36% 1232 14%
122 Byak 8.10.14.JA A32 1 1095 52 55 186 25% 1340 17%
123 Xadreco_5.7.JA_xb A32 1 992 60 65 180 15% 1360 8%
124 Novag Secondo TTC 1 926 271 38 6 42% 950 17%
125 OliveChess 0.2.7 A32 1 573 357 310 150 0% 1471 0%
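As a reading aid for the table, the score column can be roughly cross-checked against the Elo and average opponent columns with the standard logistic Elo formula. This is only a sketch: the rating tool behind the list may model draws and colour advantage differently, and the average opponent rating is just an approximation of the real opposition. The function name expected_score is chosen for this example.

```python
def expected_score(elo, opp_elo):
    """Expected score (0..1) under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10 ** ((opp_elo - elo) / 400.0))

# Komodo 9.2: Elo 3339 against an average opposition of 3080 (from the table).
print(round(expected_score(3339, 3080), 3))
```

For Komodo this gives about 0.82, in line with the 82% shown in the table.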
Rapidroid test platform:
* GT-N7100 1.6 * 4 + 256MB hash: All Android progs
* GT-N5105 1.6 * 4 + 256MB hash: All Android progs
* Codegen Novatab 1.4 * 4 + 256MB: Single thread Android progs
* Polypad 1010IPS tablet 1.61 * 2 + 128MB: Single thread Android progs
* HTC Diam 528 MHz, 16MB hash: Windows Mobile
* i7 M620 2.67 GHz dual + Arena 3.5 + 2GB hash: Windows 64
* iPhone5S A7 1.3 GHz * 2: iOS progs
* DosBox 1.74: DOS progs
* WinVICE 2.24: Commodore-64 progs
* Messtiny UCI adapters or CB-Emu2014: Mephisto progs
* Openings: 20 ply from Adam Hair or 16 ply from TCEC, no Q exchange, +0.15 to +0.40 evaluated by Stockfish and Komodo at depth 20 minimum, played twice both sides
* Repeating openings and twin games not allowed between two programs
* Tablebases and pondering off
* Time control: 10 to 30 sec/move or 600+0 to 1800+5 or closest known by both programs.