The list below is the result of ~16,000 games and seems accurate enough to me now. I don't see much point in going any further, up to all 50 Silver Opening Suite positions. It's better to stop here and switch to another experiment.
Reasons are:
> Starting from round 17, the list is neither inflating nor shrinking. It's arithmetically proven by the mean Elo of the 92 programs staying fixed at 475 Elo! The ranking is accurate and the whole field must have reached a balance.
> Elo changes per round have dropped to as low as +/- 1 to 5. Very few engines take two-digit updates.
> ...and a psychological reason: I wanna start something multi-platform with longer time controls, to cover a larger timeframe of computer chess history, something partly retroactive.
I have also tried to align the whole list with CCRL 40/4. A comparison of the CPU power of the Blitzoid and CCRL reference devices revealed that the Exynos 4412, running at 1.6 GHz x 4 cores, is not a toy at all. In terms of nodes per second, the Exynos delivers about 1/4 to 1/3 of the search power of the Athlon that CCRL uses as its reference CPU.
I knew I had been too harsh on the Android ratings from the beginning, because being safe is better than exaggerating numbers.
But now that I am concluding the project earlier than expected, a calibration remains a must. I needed to add 250 Elo to all the engines in order to reach a comparable level.
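To make the calibration concrete, here is a minimal Python sketch, assuming the shift is a plain constant added to every rating (which is what the offset change from 2117 to 2367 in the list header suggests). The `calibrate` helper and the "before" values are hypothetical: the pre-calibration numbers were simply back-derived by subtracting 250 from three published ratings.

```python
# Sketch only: a constant offset moves every rating by the same amount,
# so the differences between engines (and the ranking) do not change.

OLD_OFFSET = 2117   # BayesElo offset before calibration (from the list header)
CALIBRATION = 250   # Elo added to every engine


def calibrate(ratings, shift=CALIBRATION):
    """Return a new dict with every rating shifted by `shift` Elo (hypothetical helper)."""
    return {name: elo + shift for name, elo in ratings.items()}


# Back-derived illustration values: published rating minus 250.
before = {"Stockfish 5": 3014, "Critter 1.6a": 2865, "Simplex 0.9.8": 745}
after = calibrate(before)

new_offset = OLD_OFFSET + CALIBRATION
gap_before = before["Stockfish 5"] - before["Critter 1.6a"]
gap_after = after["Stockfish 5"] - after["Critter 1.6a"]

print(new_offset)              # offset used for the published list
print(after)                   # shifted ratings
print(gap_before, gap_after)   # identical: the offset does not touch differences
```

Since only the absolute level moves, the calibration cannot change who is ranked above whom.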
Even after this, Blitzoid is around 200 Elo below CCRL, which should be quite reasonable. If the Stockfish versions show slightly smaller gaps than other engines, this must be related to an Elo distortion of ~20 Elo at the top. If we had only one Stockfish version playing, let's say v5, it would collect less Elo. This is because Stockfish versions obviously score better against their predecessors than against other engines. This is a known and common behaviour in computer chess.
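The CCRL(gap) column in the list is just the CCRL rating minus the Blitzoid rating. A small Python sketch of that arithmetic, using four rows copied from the list as a sample (the `ccrl_gap` name is mine, and a four-engine average says nothing about the whole list):

```python
# Sketch only: recompute the CCRL(gap) column for a few sample rows.
from statistics import mean

# (name, Blitzoid Elo, CCRL Elo), copied from the published list
rows = [
    ("Stockfish 5", 3264, 3369),
    ("Critter 1.6a", 3115, 3230),
    ("Senpai 1.0", 2893, 3102),
    ("Texel 1.04", 2851, 2988),
]


def ccrl_gap(blitzoid_elo, ccrl_elo):
    """Positive result means CCRL rates the engine higher than Blitzoid does."""
    return ccrl_elo - blitzoid_elo


gaps = {name: ccrl_gap(b, c) for name, b, c in rows}
sample_mean = mean(gaps.values())

for name, gap in gaps.items():
    print(f"{name:15s} {gap:+d}")
print("sample mean gap:", sample_mean)
```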
Therefore, this is probably the final Blitzoid list to be published before I move on to the RAPIDROID project, mentioned in earlier posts. My guess is that Rapidroid will be much more fun.
BLITZOID RANKING / 03-SEP-2014
15968 GAMES PLAYED BY 92 PROGRAMS
COMPUTED WITH BAYESELO, OFFSET: 2367! (WAS 2117)
# Name elo + - gam sco oppo drw CCRL(gap)
01 Stockfish 5 3264 33 32 358 77% 3064 34% 3369(+105)
02 Stockfish DD 3227 31 31 358 73% 3066 41% 3310(+83)
03 Stockfish 4 3154 28 28 386 59% 3088 48% 3266(+112)
04 Stockfish 3 3147 29 29 386 58% 3088 38% 3231(+84)
05 Stockfish 2.3.1 3118 28 28 384 54% 3082 47% 3216(+98)
06 Critter 1.6a 3115 28 28 384 55% 3081 45% 3230(+115)
07 Critter 1.4 3112 29 29 384 55% 3070 43%
08 Critter 1.2 3051 31 31 380 58% 2973 36% 3204(+153)
09 Stockfish 2.0 3037 31 31 382 51% 3018 31% 3162(+125)
10 BlackMamba 2.0 3037 32 32 370 63% 2917 35%
11 RobboLito 0.085e4l 2972 31 31 372 48% 2977 35%
12 Komodo32 2.03 JA 2959 31 31 378 46% 2983 30% 3066(+107)
13 RobboLito 0.085g3l 2940 31 31 370 51% 2922 32%
14 Komodo32 3 AB 2920 32 32 370 48% 2925 27% 3104(+184)
15 Senpai 1.0 2893 31 31 372 44% 2936 31% 3102(+209)
16 Texel 1.04 2851 34 34 354 51% 2839 23% 2988(+137)
17 Komodo32 1.3 JA 2831 32 32 366 47% 2852 27% 2987(+156)
18 Gaviota v1.0 2830 32 32 366 56% 2785 29% 2961(+131)
19 Texel 1.03 2801 33 33 354 46% 2829 26% 2936(+135)
20 IvanHoe 9.46b 2748 33 33 358 54% 2717 23% 3082(+334)
21 Toga II 3.0 2744 33 33 366 50% 2754 23% 2878(+134)
22 Gaviota v0.86 2677 33 33 352 48% 2696 24%
23 Arasan 15.2 JA 2655 33 33 356 47% 2682 22%
24 Toga II 2.0 JA 2635 32 32 356 48% 2664 26%
25 Toga II 1.4.1SE 2635 32 32 358 44% 2692 28% 2822(+187)
26 DiscoCheck 3.7.1 2591 32 32 352 47% 2618 26% 2737(+146)
27 Texel 1.01 2590 32 32 360 45% 2636 24% 2795(+205)
28 Arasan 13.4 2572 32 32 350 48% 2593 26%
29 Arasan 14.0.1 2542 31 31 356 47% 2564 28%
30 GNU Chess 5.50 2539 31 32 352 47% 2562 28% 2770(+231)
31 DiscoCheck 4.0.1 2539 31 31 354 47% 2560 27%
32 Crafty_23.4.JA_xb 2533 32 32 352 51% 2535 23% 2779(+246)
33 gaviota v0.84 2520 32 32 348 51% 2522 22%
34 Crafty_23.5.JA_xb 2505 33 33 352 45% 2550 19% 2793(+288)
35 Rhetoric 1.4 2504 33 33 320 50% 2503 26% 2720(+216)
36 RedQueen 1.1.2 2502 32 33 354 45% 2549 19%
37 Alfil 12.10 w32 2498 32 32 346 51% 2491 24% 2639(+141)
38 RedQueen 1.1.3 TCEC 2494 33 34 350 46% 2536 19%
39 Rodent 1.00 2486 31 31 344 53% 2465 32% 2690(+204)
40 Rotor 0.7a 2448 31 31 350 49% 2455 27% 2621(+173)
41 Rodent 0.18.0 2442 31 31 344 55% 2406 31%
42 Daydreamer 1.75 JA 2428 32 32 348 48% 2439 26% 2675(+247)
43 cheng3 1.07 JA 2426 32 32 344 49% 2437 22% 2659(+233)
44 GarboChess 3 2424 32 32 346 53% 2402 24%
45 Scorpio_2.7.JA_xb 2419 32 32 344 50% 2422 23% 2774(+355)
46 Rotor 0.8 2417 32 32 342 49% 2425 27% 2613(+196)
47 gaviota v0.83 2400 33 33 342 48% 2415 21%
48 Sloppy_0.23.JA_xb 2376 31 31 344 48% 2386 28% 2621(+245)
49 Pepito v1.59 2349 33 33 340 49% 2356 22% 2516(+167)
50 Tucano_1.04.AB_xb 2336 33 33 340 51% 2326 21% 2547(+211)
51 Danasah_4.88.JA_xb 2330 32 32 342 48% 2339 26% 2541(+211)
52 DanasahZ_0.4.JA_xb 2319 32 32 338 49% 2328 28%
53 GNU Chess 6.0.2 2312 32 32 340 48% 2323 25%
54 DoubleCheck 2.6 JA 2311 33 33 340 51% 2301 19%
55 Danasah_5.06.JA_xb 2295 32 32 336 54% 2262 29%
56 DoubleCheck 2.7 2293 34 34 340 49% 2295 15%
57 BetsabeII_1.30.JA_xb 2269 33 33 340 52% 2247 17% 2367(+98)
58 Danasah_4.66.JA_xb 2265 33 33 338 53% 2240 24% 2532(+267)
59 Diablo 0.5.1b JA 2257 32 32 340 53% 2235 25% 2385(+128)
60 Typhoon_1.0.r358.JA_xb 2245 33 33 340 51% 2231 19% 2416(+171)
61 GreKo 9.0 JA 2227 33 33 340 53% 2203 19%
62 Greko 8.2 2216 32 32 336 55% 2181 26% 2525(+309)
63 Olithink_5.3.2.JA_xb 2216 34 34 340 52% 2201 17% 2407(+191)
64 GreKo 9.8 AB 2211 33 33 340 48% 2221 23% 2477(+266)
65 Phalanx_XXIII.JA_xb 2200 34 34 340 48% 2217 13% 2373(+173)
66 GreKo_10.0.JA_xb 2190 32 32 340 51% 2182 23% 2491(+301)
67 Sungorus 1.4 JA 2166 34 34 340 48% 2174 16% 2311(+145)
68 TJchess 1.1U 2100 33 34 340 51% 2082 23% 2334(+234)
69 BetsabeII_1.22.JA_xb 2094 35 35 336 56% 2041 16%
70 Natwarlal_0.14.JA_xb 2089 34 34 340 51% 2068 14% 2266(+177)
71 Myrddin_0.86.JA_xb 2080 35 34 340 55% 2030 15% 2366(+286)
72 DoubleCheck 2.3 2075 35 35 340 50% 2065 16%
73 KmtChess_1.21.JA_xb 2047 34 34 340 49% 2050 18% 2286(+239)
74 Jazz 6.40 JA 2047 34 34 340 47% 2064 20%
75 Scidlet_2.61b2.JA_xb 2003 35 35 340 50% 1997 17%
76 Jazz v444 JA 1984 35 35 340 49% 1989 17% 2213(+229)
77 Jazz v5.01 JA 1971 35 35 340 54% 1940 21% 2226(+255)
78 Sjeng_1.12.JA_xb 1830 37 37 338 51% 1808 10%
79 BikJump v1.8 1828 36 36 338 51% 1811 14%
80 AdroitChess0.4 JA 1802 38 38 334 49% 1803 11% 1978(+176)
81 AdroitChess 0.3 1749 38 38 330 50% 1728 14% 2001(+252)
82 Leonidas_r83.JA_xb 1744 37 38 334 53% 1702 16% 1956(+212)
83 ZCT-0.3.2500 1729 39 39 328 51% 1702 11% 2026(+297)
84 BikJump v2.1P 1698 38 38 330 49% 1689 14% 2102(+404)
85 Sjaak_4.68.JA_xb 1698 40 39 324 57% 1608 11%
86 Tscp_1.8.1.AB_xb 1612 40 40 324 50% 1598 10% 1704(+92)
87 Zzzzzz_3.5.1.JA_xb 1572 38 39 322 48% 1578 19%
88 Rocinante 2.0 JA 1512 40 40 318 49% 1519 12% 1602(+90)
89 VIRUTOR CHESS 1.1.4 1368 40 41 314 38% 1479 11%
90 VIRUTOR CHESS 1.1.1 1359 40 41 314 36% 1480 12%
91 Chess for Android 1220 45 47 314 22% 1498 8%
92 Simplex 0.9.8 995 65 14 314 7% 1527 3% 2413(+1418)
Blitzoid test platform:
* Samsung Galaxy Note II @ 1.6 GHz without downscaling
* 64MB hash tables where selectable
* 4 CPU threads where selectable
* Own books disabled and replaced by Silver Opening Suite positions (20 of 50 played)
* Opening positions played twice with different colors
* Tablebases and pondering off
* GUI: Aart Bik's Chess for Android
* Time control: 5 sec/move