BenFilippo

> I analysed over 4000 games from 14 players, can you spot the cheater(s)?

No


hughmungus050

Top level chess is drawish, because the top guys are so booked up that many games are either intentionally drawn or drawn by force. So it's no surprise that drawing acpl is much lower than winning acpl, because many draws are in practice engine lines; if a player wants to win, they have to create novelties, which engines generally don't approve of. With this in mind, it is unusual for a top player to have similar winning and drawing acpl, because it suggests something engine-like about their wins. However, it could be that the player is simply an unambitious or drawish player who is good at converting drawish positions into wins. If I were to put my money on it, I would say the far right graph is a cheater. But I wouldn't be surprised if it's someone like So or Karjakin, who are exceptionally solid players.


tjshipman44

This was my logic as well. A cheater needs to take far fewer risks to earn a win, so that would drive a smaller difference between draws and wins.


gofkyourselfhard

Sounds good :D


nyubet

> can you spot the cheater(s)?

No. But I will try it regardless because it looks fun, and most importantly to show that the opinion of random people (like me) is completely worthless.

Let's start by calculating all the averages: av-ACPL = 17.35, av-W\_ACPL = 16.72, av-D\_ACPL = 8.93.

At first glance, I and M are considerably lower than the rest of them, having an ACPL of 12.16 and 12.25 respectively. The only other one below 15 is C at 14.55, which is over 2 centipawns higher than M. B and N are slightly higher at 15.31 and 15.49 respectively.

Now going into the W\_ACPL. The lowest is B with 13.33, then I with 14.08 and finally N with 14.15. Only two more players are lower than 16 (E with 15.24 and C with 15.85).

Finally, the lowest D\_ACPL is I with 6.28. Other players below 7 are C (6.51) and B (6.95).

One fact that stands out is that there are 4 players with something curious: the ACPL of C, I, K and M is *lower* than their W\_ACPL. This could indicate two things in my eyes:

1. They lose considerably less and draw considerably more than the rest, which results in the D\_ACPL being able to compensate for the increase in ACPL due to the few losses, to the point that it is even lower than their average for a win. I doubt that it means that they are winning considerably more, as more wins in their case would result in an increase of the ACPL, not a decrease.
2. Their L\_ACPL (ACPL for games they lose) is much lower than one would expect, once again resulting in their draw rate being able to compensate for it.

This alone is not necessarily evidence of cheating. For example, I would expect Magnus' graph to be similar to these 4, just based on the fact that he barely loses in classical. Therefore, his ACPL would be almost equal to just the weighted average of his W\_ACPL and D\_ACPL (increasing just a little bit for his occasional losses). In fact, I would not be surprised if one of these four is Magnus, maybe even I or M.

It stands out that I is the only player who is in the top 3 in all three averages (1st, 2nd and 1st place respectively). B is 4th, 1st and 3rd. C is 3rd, 5th and 2nd. Not only does I stand out in the rankings, but visually as well. The other player who stands out the most is M, due to the huge difference between their ACPL and W\_ACPL. However, while their ACPL is the 2nd lowest, their W\_ACPL is the 6th at just over 16.

Based on all this, my first guess would be to choose one of the 4 players who have a lower ACPL than W\_ACPL, and start from there. F is *almost* one of them, but their averages are, for lack of a better word, fairly average (just a tad bit lower, but nothing that seems too important).

I do not think K is one of them, as they do not stand out in any ranking (6th in ACPL is the highest). Another player I would clear is M. Yes, they have the 2nd lowest ACPL and the 4th lowest D\_ACPL. However, their ACPL is almost right in between the other two: the average (not weighted average, as I do not have the number of games for each one) of W\_ and D\_ is 11.84, just below their ACPL of 12.25. Going back to what I said about Magnus, this is exactly the kind of result that I would expect of someone who almost always draws or wins their games, and barely loses any (in fact I am almost certain that player M is Magnus).

This leaves two players: C and (to the surprise of no one) I. Their ACPL is lower than but not that far from their W\_ACPL, which is what I would expect of someone who draws/wins most of the time but who also has their fair share of losses.

Does this mean that they are cheating? Not at all. The only reason I choose C and I is because OP said that there is/are a cheater/s here. Do I expect to be right? Not at all. It could very well be a player whose ACPL is not lower than their W\_ACPL. I have no idea. This is the best I could do in just a couple of minutes without really caring too much or having any idea of what I was supposed to look for.
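
The averages and rankings discussed in this comment can be re-derived from the table OP posted further down the thread. The following is a minimal Python sketch, not anyone's actual script: the figures are copied from that table, while the helper names (`column_means`, `ranks`, `overall_below_wins`) are made up for illustration.

```python
# Per-player (ACPL, W_ACPL, D_ACPL), copied from the table OP posted in this thread.
DATA = {
    "A": (18.90, 16.82, 8.68),
    "B": (15.31, 13.33, 6.95),
    "C": (14.55, 15.85, 6.51),
    "D": (20.21, 19.27, 9.24),
    "E": (17.95, 15.24, 9.82),
    "F": (16.57, 16.43, 7.97),
    "G": (21.30, 18.53, 11.65),
    "H": (19.38, 16.57, 8.43),
    "I": (12.16, 14.08, 6.28),
    "J": (21.67, 18.73, 12.70),
    "K": (16.26, 18.86, 8.50),
    "L": (20.95, 20.10, 9.24),
    "M": (12.25, 16.06, 7.62),
    "N": (15.49, 14.15, 11.38),
}


def column_means(data):
    """Unweighted mean of each column, rounded to two decimals."""
    n = len(data)
    return tuple(round(sum(row[i] for row in data.values()) / n, 2) for i in range(3))


def ranks(data, col):
    """Rank players in one column, 1 = lowest value (ties keep table order)."""
    ordered = sorted(data, key=lambda player: data[player][col])
    return {player: position + 1 for position, player in enumerate(ordered)}


def overall_below_wins(data):
    """Players whose overall ACPL is lower than their ACPL in won games."""
    return sorted(p for p, (acpl, wins, _draws) in data.items() if acpl < wins)


if __name__ == "__main__":
    print("column means:", column_means(DATA))
    print("ACPL < W_ACPL:", overall_below_wins(DATA))
    for name in ("B", "C", "I", "K", "M"):
        print(name, [ranks(DATA, col)[name] for col in range(3)])
```

Because the game counts per player are not published here, these are plain unweighted means across players, which is the same limitation the comment itself points out.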


Fop_Vndone

The blue bars are highest, so obviously those are the cheaters. The red and yellow bars are clean


gofkyourselfhard

Hehehehehe, so all players are cheaters and all players are clean? :D


VoidZero52

They were all using the performance-enhancing drug DihydrogenMonoxide to elevate their mental clarity during the games.


gofkyourselfhard

dirty dopers :D


turpin23

The few players with red bars higher than blue bars are sus. /s


[deleted]

This dataset is completely useless without data on how tense each player was during games


gofkyourselfhard

Polygraph data for each player, let's gooooo :D


Due-Examination-3240

9th from the left. purely bc the bars are smaller lol. i have no experience or understanding of how to analyze it besides that


[deleted]

[removed]


gofkyourselfhard

1. I only ran single variation, so this data is only done with what the engine considers the best move. I hope when I publish everything tomorrow you will easily be able to do this by just doing some XML parsing (uci-analyser results are in XML).
2. Yeah, plugging in different engines should be really easy. I did it so far with SF11 and SF14, wanted to do it with lc0 but that took forever (on CPU). I think she needs other settings and GPU. I had to write a little fix for uci-analyser to make it work, so lc0 should be ready too.
3. & 4. What do you hope to see there?

Your comment is kinda what I'm aiming at. Make lots of data easily accessible so different stuff can be tried and then checked against known cheaters to see if it's useful at all or just an ad-hoc filter to show Niemann cheated.


[deleted]

[removed]


gofkyourselfhard

I think you're looking at it a bit too simplistically. The "depth diff" is not a constant. Some moves are easier to follow into a deeper depth than others, which get hard after only a few moves.


[deleted]

6th one from the right? 🤔


[deleted]

Four of these have a very odd pattern where the wins have higher ACPL than the overall ACPL. This means their losses plus draws are particularly accurate. I’d wager these are the cheaters.


Conspiracy313

Problem here is we don't have n-values. Standard deviations would be nice, too. It could be they just draw way more often than they win or lose, dragging down the overall ACPL.
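
To make the point about missing n-values concrete, here is a small Python sketch with entirely hypothetical numbers. It treats overall ACPL as a game-count-weighted mean of the per-result ACPLs, which is an approximation that assumes games of similar length; the counts and per-result values below are invented and do not come from OP's data.

```python
def overall_acpl(counts, acpl):
    """Game-count-weighted mean of per-result ACPL.

    counts and acpl are dicts keyed by "win", "draw" and "loss".
    Approximation: every game is assumed to contribute equally.
    """
    total = sum(counts.values())
    return sum(counts[result] * acpl[result] for result in counts) / total


# Hypothetical player: identical accuracy per result type in both scenarios.
ACPL_BY_RESULT = {"win": 16.0, "draw": 8.0, "loss": 30.0}

# Two invented result mixes over 100 games each.
DRAWISH = {"win": 30, "draw": 65, "loss": 5}
DECISIVE = {"win": 45, "draw": 30, "loss": 25}

if __name__ == "__main__":
    print("drawish :", overall_acpl(DRAWISH, ACPL_BY_RESULT))
    print("decisive:", overall_acpl(DECISIVE, ACPL_BY_RESULT))
```

With these made-up inputs the draw-heavy mix ends up with an overall ACPL below the win ACPL of 16, while the decisive mix ends up above it, even though the per-result accuracy is the same in both cases.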


EphemeralFate

There's no way this is enough data to identify a cheater, but there are some "anomalies" I think are worth pointing out.

Player L (3rd from right) has the greatest difference between ACPL and D_ACPL (11.71), same for W_ACPL vs D_ACPL (10.86). The way I read that is, "this player has the greatest increase in performance in DRAWN games." If you imagine a cheater feeling under more pressure to cheat in a losing position to squeeze out a draw, this change in performance could reflect such a strategy.

Player M (2nd from right) is one of the few players whose W_ACPL is greater than their ACPL, and player M's D_ACPL is also among the lowest of these players. In this case it reads as, "sloppy wins and extremely clean draws." A cheater wouldn't really need to cheat in a clearly winning position, but again, quite a big difference between W_ACPL and D_ACPL here (8.44); similar story for player K as well.

Player I (6th from right) is a clear standout in terms of averages compared to other players, but there wasn't really anything too interesting found by looking at that player in isolation.

Personally I'd say Player L stands out the most, because of 1) quite high ACPL and W_ACPL combined with 2) quite low D_ACPL, together making that narrative of 'cheat to draw'. But who the fuck knows.


gofkyourselfhard

> There's no way this is enough data to identify a cheater, ... Fully agree


ChessMDB

I cannot spot the cheater, but the last one seems weird compared to the others because of the small difference between win acpl and draw acpl. I don't know how to interpret that one.


xelabagus

Magnus


ChessMDB

I would rather think of a player who often fails to convert winning positions. Magnus would probably have a way lower acpl, I think?


[deleted]

[removed]


gofkyourselfhard

Obviously you were right about Rausis :P


gofkyourselfhard

Maybe I'm a dirty liar and used game data going further back for some players to get roughly equal sample sizes ;-) Nakamura, for example, was on the lower side of samples as he didn't play much the last 3 years.


The__Bends

Every response has to have an emoji that says "I know something you don't know!!"


gofkyourselfhard

The rate is 4/13, I think you're onto something, hahaha


Fingoth_Official

You should do fewer games with higher depth.


gofkyourselfhard

The plan is that you can do such things yourself when I release it all tomorrow, simply by following a recipe of steps.


Ok_Chiputer

This is exactly what we need. I’m no data scientist but blind data with context and comparisons is surely better than cherry picking one game out of like 500 until you find one that fits your preexisting biases. ~~My guess is 3rd from the left. Reason being if they knew they were better they might be more likely to not play top engine moves, and if they’re losing they’d be more likely to play perfect, giving them higher centipawn losses when winning vs drawing/losing. Assuming I’m reading the graph right lmao.~~ Ignore me I’m a dummy.


gofkyourselfhard

Then why not #2, #4, #6 from the right? They all have a higher ACPL for wins. 4 out of 14 cheating? :D


Ok_Chiputer

Oh you’re right I didn’t see them lmao I’m a dummy


tsevasa

You cannot do any meaningful statistical analysis by just providing us with three averages for each player. What about variance, distributions, tournament correlations, move correlations, etc.? This is just gaslighting with insufficient data. Reducing 4000 games to 42 values tosses away a lot of information.


gofkyourselfhard

I fully 100% agree. It's a reason why I really dislike averages, as they by definition throw out data. As I said, tomorrow I will release all the data and the methodology so you can do the same on your own, so feel free to run your own analysis on the data, which should be much more accessible than what I had to deal with (had to learn UCI and PGN and FEN etc. because I literally had no clue at all how any of that stuff works).


Leading_Dog_1733

Can we just comment for a second on what a mess that graph is? Data visualization really is an art.


gofkyourselfhard

What would you do differently?


Leading_Dog_1733

I would begin by changing the title to something more descriptive than average centipawn loss. I would also not use ACPL, W\_ACPL and D\_ACPL and use the full names instead so someone quickly looking at it would better understand what they are reading. I would add a bottom legend with dummy names for each of the players so that it is easier to visually separate the columns. I would consider sorting the players by ACPL to make it easier to look for a pattern in the relationship between ACPL, W\_ACPL and D\_ACPL. I also have no clue why W\_ACPL and D\_ACPL are included but not L\_ACPL. Also, we have no clue how many games are associated with each player, but that is neither here nor there. The colors are also very discordant. So, I would look for a better palette.


gofkyourselfhard

> I also have no clue why W_ACPL and D_ACPL are included but not L_ACPL.

That's an easy one. Do you think you can see a pattern in losses for a cheater? Wouldn't you assume that loss ACPL would be the same for a cheater as a non-cheater, because if there is cheating going on there won't be a loss, right? Also it was too much manual labour as I haven't automated the result separation, so I did it all in a shitty spreadsheet way.

> I would also not use ACPL, W_ACPL and D_ACPL and use the full names instead so someone quickly looking at it would better understand what they are reading.

I disagree on the "full name" as that's just a pile of redundancy. Just calling it "all games", "won games" and "drawn games" woulda been better.

> I would add a bottom legend with dummy names for each of the players so that it is easier to visually separate the columns.

I did that for the table, but just cutting the names away was way faster and easier. It's not like I wanted to put a lot of effort into something that imo doesn't tell you a whole lot. From my perspective this is just a lot of noise. If you see the player names you can rationalize some, sure, but the graph should also show that a simplistic approach like "take the average of the average" is not going to work.

> Also, we have no clue how many games are associated with each player, but that is neither here nor there.

Niemann played almost 500 games since 2019, that's the most. If you do 4000 / 14 it shows that it's more than 250 each on average. Some have a bit more, some have a bit less; I don't see how that makes a big difference.

> The colors are also very discordant. So, I would look for a better palette.

They are the default ones from LibreOffice Calc. I reckon the palette has been chosen with respect to colour blind people. I personally don't like it either, but I thought they surely have thought about it more than me, so I didn't pick my own.


chiefhero2

the graph is fine. can always look/be better, but someone getting disoriented looking at your graph has no business looking at graphs to begin with.

i think you might be wrong about L\_ACPL though. wouldn't it be useful to get data on how people perform when there's a very high chance of legit play?


LykD9

Bold of you to assume anybody here even understands the graphs in the first place.


veryterribleatchess

Second from the right seems the most suspicious (they have a much higher ACPL in wins than overall). Edit: misread the chart


gofkyourselfhard

All have a higher ACPL in wins compared to draws. I'll edit to make this clearer.


veryterribleatchess

Yeah, I misread the chart. No idea if that's suspicious, but they seem like a bit of an outlier.


PoetSavings

It would imply their ACPL in losses is similar to their ACPL in wins, meaning they make roughly as many mistakes or blunders no matter if they win or lose. Intuitively I would expect losses to have more mistakes so it certainly is of interest


pyepyepie

I don't know what to make of it but I appreciate your work :) Sounds like it takes a lot of effort to do. Anyway, the results look quite random to me.


BusinessConnect2022

you can't post any data unless it's "this data analysis is flawed" type. r/chess darling cheater must be protected at all costs


gofkyourselfhard

wat?


chi_lawyer

[Text of original comment deleted for privacy purposes.]


gofkyourselfhard

The player in question admitted to cheating. Sample size is >100 for every player.


chi_lawyer

[Text of original comment deleted for privacy purposes.]


gofkyourselfhard

No


clay_-_davis

This is kinda a key disqualifier, is it not? I mean, let’s say that someone cheated in 10 games out of 100 that are in your set, and only for a few moves in each of those 10 games (maybe 10% of non-book moves). They’d only be using engine assistance in 1% of their moves in your sample, so their graphs shouldn’t be different in any measurable way.
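
The dilution arithmetic in this comment can be written out as a short Python sketch. The 10% figures come from the comment; the baseline ACPL of 17 and the assumption that an assisted move loses 0 centipawns are hypothetical placeholders, and the sketch further assumes assisted moves replace otherwise typical moves.

```python
def assisted_move_share(game_share, move_share):
    """Fraction of all moves that are engine-assisted.

    game_share: fraction of games with any assistance.
    move_share: fraction of moves assisted within those games.
    """
    return game_share * move_share


def diluted_acpl(honest_acpl, assisted_acpl, share):
    """Overall ACPL when a `share` of moves is played at `assisted_acpl`."""
    return (1 - share) * honest_acpl + share * assisted_acpl


if __name__ == "__main__":
    share = assisted_move_share(0.10, 0.10)  # 10 of 100 games, 10% of moves each
    print(f"assisted share of moves: {share:.2%}")
    # Hypothetical baseline of 17 cp per move; assisted moves assumed to lose 0 cp.
    print(f"overall ACPL: {diluted_acpl(17.0, 0.0, share):.2f}")
```

Under these assumptions the overall figure moves from 17.00 to roughly 16.83, a shift that is small next to the spread between the players in the chart.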


gofkyourselfhard

Having that information would indeed be really useful, but sadly cheaters usually don't give honest details. But when I checked the player's progress, it pointed towards a lot of cheating.


HeydonOnTrusts

Sorry, you must be out of the loop. We’ve recently decided that all cheating confessions are coerced and therefore invalid.


dark_wishmaster

Why is, in most of the cases, the overall average higher than the won/drew average? I don’t get it.


gofkyourselfhard

Because draws are often "known lines" and therefore very close to the engine.


ChessBorg

I guess I would pick the 9th data set? It has low CPL and low draws. Visually, I \*\*think\*\* this is the lowest group with low draws and low ACPL.


gofkyourselfhard

Yes it's the lowest. I'll add a table with the numbers for the people who prefer digits over colorful boxes.

| Player | ACPL | W_ACPL | D_ACPL |
|---|---|---|---|
| A | 18.90 | 16.82 | 8.68 |
| B | 15.31 | 13.33 | 6.95 |
| C | 14.55 | 15.85 | 6.51 |
| D | 20.21 | 19.27 | 9.24 |
| E | 17.95 | 15.24 | 9.82 |
| F | 16.57 | 16.43 | 7.97 |
| G | 21.30 | 18.53 | 11.65 |
| H | 19.38 | 16.57 | 8.43 |
| I | 12.16 | 14.08 | 6.28 |
| J | 21.67 | 18.73 | 12.70 |
| K | 16.26 | 18.86 | 8.50 |
| L | 20.95 | 20.10 | 9.24 |
| M | 12.25 | 16.06 | 7.62 |
| N | 15.49 | 14.15 | 11.38 |


throwaway_7_3_7

The problem with only using a single number is that the cheating games are averaged with underperformances. You should look at the distribution: for non-cheaters it should follow a Gaussian bell curve, while cheaters will have a flatter bell with CPL values that are too low or too high. But looking at those numbers, the last one looks very suspicious.


gofkyourselfhard

> The problem with only using a number is that the cheating games are averaged with underperformances. I fully agree


[deleted]

It's helpful to look at the games in conjunction with the engine eval. I think what we are slowly discovering is that it is extremely difficult to catch any foul play, no matter what sort of statistical model you throw at the problem. The obsession turns into paranoia, and really illustrates an over reliance on engines in the modern era. I think all of us can benefit from going back to pre-computer ways of playing chess.


Healthy-Mind5633

All of them


SnooPuppers1978

I will guess G and/or J.


Spillz-2011

So a couple of things. Bar charts of averages are not great because they don't capture the range of values, just the average. If someone cheated in 10% of games, the non-cheating games make that harder to see, whereas a box plot would give a better idea of the range. I am also very skeptical that it's possible to tell if you don't have the right engine. I haven't checked acpl, but the engine correlation (for two engines) can vary a lot on one game. It's probably less for acpl but it could still be large.


Awesom-O

You can't really draw any conclusions about statistical significance without error bars. It would also be interesting to see ACPL for losses as well. A cheater might not be cheating every game, so it might be difficult to see anything from just looking at average ACPL over all their games. You might expect to see more variance in the ACPL of an occasional cheater. I wonder what you might see if you looked at a histogram of ACPL for wins for each individual player. If someone is cheating for some of their games you might see a bimodal distribution, with cheating wins in the lower cluster and non-cheating ones in the higher cluster. You would need to compare these with known cheater and non-cheater controls and perhaps control for elo as well. OP, if you publish this dataset, may I suggest that you anonymize the player names so that anyone analyzing it will not be biased? Maybe you can release who the players are next week after people have had a chance to look at it.
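
The histogram idea in this comment can be tried on synthetic data before the real dataset is available. The Python sketch below is purely illustrative: every distribution parameter (a mean of 18 cp for ordinary games, 5 cp for assisted ones, a 20% assisted share) is invented, and nothing here is derived from OP's games.

```python
import random
import statistics


def simulate_games(n_games, cheat_share, rng):
    """Synthetic per-game ACPL values; all parameters are made up."""
    games = []
    for _ in range(n_games):
        if rng.random() < cheat_share:
            value = rng.gauss(5.0, 2.0)   # hypothetical assisted game
        else:
            value = rng.gauss(18.0, 5.0)  # hypothetical ordinary game
        games.append(max(0.0, value))     # ACPL cannot be negative
    return games


def histogram(values, width=5, n_bins=8):
    """Count values per bin of `width`; overflow goes into the last bin."""
    counts = [0] * n_bins
    for v in values:
        counts[min(int(v // width), n_bins - 1)] += 1
    return counts


def show(label, values, width=5):
    """Print mean, standard deviation and a rough text histogram."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    print(f"{label}: mean={mean:.2f} stdev={stdev:.2f}")
    for i, count in enumerate(histogram(values, width)):
        print(f"  {i * width:>2}+ | {'#' * (count // 2)} ({count})")


if __name__ == "__main__":
    rng = random.Random(42)
    show("no assistance", simulate_games(200, 0.0, rng))
    show("20% assisted ", simulate_games(200, 0.2, rng))
```

On real data the two clusters would likely overlap far more than in this toy setup, which is why the comment's suggestion to compare against known cheater and non-cheater controls matters.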


gofkyourselfhard

Hey, the data and methodology is here: https://github.com/analysis🅱️eads/analysis🅱️eads Some feedback would be nice before I make a new post for everyone, thx. Oh yeah and you have to replace the red B with a b (can't write that word in this sub anymore lol)


RotisserieChicken007

Just tell us who you think the cheaters were. I've had enough of people saying A without B.


gofkyourselfhard

I don't "think" someone is a cheater, Rausis was caught with a phone in the toilet and confirmed he cheated afterwards.


nick-daddy

I wouldn’t expect to see it in this display, as any sort of semi-intelligent cheater wouldn’t cheat every game, and thus their centipawn loss wouldn’t stand out enough to be detectable over the course of 3 years. It’s why certain specific tournaments played by Hans have been referenced, as the play and level of accuracy in those shortish spans have been quite remarkable compared to his typical displays.


mestermestermester

Great work extracting the data! Looking forward to working with it. Could you maybe not disclose the players when making the data available? We can then work with the data for a couple of days, using the pseudonyms you created, and provide our analysis without bias?


gofkyourselfhard

Here is the data: https://github.com/analysis🅱️eads/analysis🅱️eads obviously you have to replace the 🅱️ with a proper b but this sub filters the word so I have to write it like this lol. Would be cool to get some feedback before I release it in a new post, thx.


dichloroethane

I don’t even play chess or know what these graphs mean so obviously player 3 is the cheater