Brilliant work!!! Guidelines like these help keep science on the pathway of accuracy, productivity, and ethical clarity.
I would be interested to know what you make of a recent paper in Nature on AlphaDev, which claims to have developed a novel, faster sorting algorithm. It did not contain a new sorting algorithm: coming up with assembly-level tricks is not the same as finding a new sorting algorithm. The results in it are interesting; it just does not do what it says on the tin. Is AI given a free pass when it comes to reviews in that particular venue?
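For context, what the paper actually optimised were fixed-length routines like sort3 in LLVM's libc++, which are small sorting networks; AlphaDev found shorter assembly instruction sequences for them. A minimal Python sketch of the underlying 3-element network (illustrative, not the paper's actual code):

```python
def sort3(a, b, c):
    # Classic 3-element sorting network: a fixed sequence of
    # compare-exchanges. AlphaDev's contribution was a shorter
    # compiled form of routines like this, not a new network.
    if a > b:
        a, b = b, a  # compare-exchange (a, b)
    if b > c:
        b, c = c, b  # compare-exchange (b, c)
    if a > b:
        a, b = b, a  # compare-exchange (a, b) again
    return a, b, c

print(sort3(3, 1, 2))  # (1, 2, 3)
```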
Yes, exactly. The framing of that paper was really unfortunate and it is embarrassing that Nature reviewers/editors apparently did not push back.
It ends up being counterproductive for the authors. The lack of pushback means that published overstatements, once they become apparent, end up lowering the credit that would naturally be given to the genuine contribution. And it does not do Nature any favours, of course.
It would be good if the authors corrected the matter openly.
This seems like it applies well to avoiding ML screwups in a commercial setting also.
Have you requested feedback from Dr. Gelman? This sort of thing seems like it would be right up his alley, even if the subject matter isn’t his area of expertise.
It gets even worse with real-life research that aims to find causal relations between genAI and outcome variables that many organisations care about, such as productivity. Quality of texts was measured by three guys (who unwittingly fed the training data set, which is a very human form of data leakage) and a self-invented interrater reliability of ICC = .40, which is "small" at best (see the sketch after the link). Also a renowned journal:
https://www.science.org/doi/10.1126/science.adh2586
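Assuming the reliability statistic meant above is the intraclass correlation coefficient, here is a minimal numpy sketch of ICC(2,1) (two-way random effects, single rater, absolute agreement) on made-up ratings; the data, and the choice of ICC form, are illustrative, not the study's:

```python
import numpy as np

# Hypothetical scores: 6 texts rated by 3 raters (illustrative only).
ratings = np.array([
    [4.0, 3.5, 4.5],
    [2.0, 2.5, 3.0],
    [5.0, 4.0, 4.5],
    [3.0, 3.5, 2.5],
    [1.5, 2.0, 2.5],
    [4.5, 5.0, 4.0],
])

n, k = ratings.shape                # n targets (texts), k raters
grand = ratings.mean()
row_means = ratings.mean(axis=1)    # per-text means
col_means = ratings.mean(axis=0)    # per-rater means

# Two-way ANOVA decomposition.
ss_rows = k * ((row_means - grand) ** 2).sum()
ss_cols = n * ((col_means - grand) ** 2).sum()
ss_err = ((ratings - grand) ** 2).sum() - ss_rows - ss_cols

ms_rows = ss_rows / (n - 1)
ms_cols = ss_cols / (k - 1)
ms_err = ss_err / ((n - 1) * (k - 1))

# ICC(2,1): agreement of a single rater under a two-way random model.
icc = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)
print(f"ICC(2,1) = {icc:.2f}")  # values near .40 are conventionally read as weak
```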