Thank you for this! A lot of what you say really resonates for me and I especially appreciate the point about the "wishful mnemonic" (<= Drew McDermott's wonderful term) "predict".
Three quick bits of feedback:
For our paper "AI and the Everything in the Whole Wide World Benchmark" (Raji et al. 2021, NeurIPS Datasets and Benchmarks track) it would be better to point to the published version instead of arXiv as you are now doing. You can find that here: https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/084b6fbb10729ed4da8c3d3f5a3ae7c9-Abstract-round2.html
I think that the way that arXiv is used is also a factor in the culture of hype and supposedly "fast progress" in deep learning/AI, and it is always valuable to point to peer-reviewed venues for papers that have actually been peer reviewed.
Second, I object to the assertion that NLP is a "highly circumscribed domain" like chess or Go. There are tasks within NLP that are highly circumscribed, but that doesn't go for the domain as a whole. I have no particular expertise in computer vision, but at the very least it also seems extremely ill-defined compared to chess and Go. If it's "highly circumscribed" it isn't in the same way the games are. You kind of get to this in the next paragraph (for both NLP and CV), but I think it would be better to avoid the assertion. These domains only look "highly circumscribed" if you look at them without any domain expertise. (Though again, for CV, it's a little unclear what the domain of expertise even is...)
Finally, I'd like to respond to this: "Noted AI researcher Rich Sutton wrote an essay in which he forcefully argued that attempts to add domain knowledge to AI systems actually hold back progress."
That framing suggests that the progress we should care about is progress in AI. But that is entirely at odds with what you say above regarding domain experts:
"They are not seen as partners in designing the system. They are not seen as clients whom the AI system is meant to help by augmenting their abilities. They are seen as inefficiencies to be automated away, and standing in the way of progress."
I think it is a mistake to cede the framing to those who think the point of this all is to build AI, rather than to build tools that support human flourishing. (<= That lovely turn of phrase comes from Batya Friedman's keynote at NAACL 2022.)
Hi Emily! Thank you for your comments and corrections. We’ve changed the link to your paper.
Calling CV and NLP highly circumscribed domains was awkward wording and completely unintended. We’ll edit the sentence to better reflect what we meant, which I think is consistent with what you say.
We also completely agree with the importance of not ceding the framing! Just a quick clarification: the term progress in that sentence is essentially quoting Sutton's post; we are not adopting that framing. I agree that the sentence could have been worded better. Sorry for the confusion!
You're very welcome! I was inspired to share them because I understood that to be your purpose in sharing your work in this way (to get feedback from the community). I hope they are helpful in that way.
Thanks for this. The hype and hubris coming from some corners of the community is harmful in a lot of ways. Why would a student work hard on their education if they've been convinced human labour is about to become obsolete? Look at the steady flow of questions on Quora from people concerned that all sorts of careers are about to become obsolete. The popular narrative on AI has to change.
Thank you for saying this. There is too much hype and substance is trailing far behind everywhere.
This is amazing work and sorely needed when the majority of the discourse focuses on extrapolating successes in narrow domains to general success in a much wider field (think solved object detection = solved computer vision). This particular framing about domain experts "failing to recognize that professional expertise is more than the externally visible activity" is so crucial, yet so often overlooked.
One small point: The citation for "Problem Formulation and Fairness" is the arXiv version. Here's a citation to the FAT* '19 paper: https://dl.acm.org/doi/10.1145/3287560.3287567
Thank you for this! Just curious: When you say "Almost any engineering product encounters an unending variety of corner cases in the real world that weren’t anticipated during development," are you referring to NLP as "an engineering product"? Is NLP commonly understood to be "an engineering product" in the deep learning community?
Also, who is considered a "domain expert" in NLP other than linguists?
I'd argue that NLP, just like any advanced CV algorithm or other ML system, is an "engineering product". Assigning any more "meaning" to a deep-learning algorithm is part of the problem, IMO
Thanks, Shamik. I'm just trying to understand how these concepts are framed in the deep learning community. As someone from outside the community who has studied this stuff, I've found it difficult to penetrate the jargon and actually get a sense of what NLP is, how it's developed, and, yes, what that "means" for the people who interact with it. I'm a humanities person. Meaning is the gig, lol.
Maybe science can clarify what is NOT possible, like with perpetual motion. "Nearly impossible" isn't worth investing in. What specific harm is done by the hype? With some uses, too much trust in the output could be dangerous. With other uses, the only harm is overspending. In a specific area of use, what harm is done by the hype (and by putting too much trust in the output)? That could be divided into columns.
There are all sorts of harms. Look at Amazon trying to use ML to vet resumes and screening out women. Look at those systems for determining bail that basically reinforce invidious stereotypes despite studies showing that those decisions are not just socially bad but incorrect. (e.g. Having a job is a major determinant of whether someone is going to show up in court, but it is subsumed by other statistical effects so that skin color is more highly weighted.)
The problem with ML in those cases is that the decision is presumed to be fair because an impartial computer made it, even though the algorithm and training set were not particularly unbiased. (Hell, these systems are even influenced by the order of the training set, and I haven't heard what researchers are doing about this.)
AOC mentioned algorithmic bias in 2019.
“Algorithms are still made by human beings, and those algorithms are still pegged to basic human assumptions,” she told writer Ta-Nehisi Coates at the annual MLK Now event. “They’re just automated assumptions. And if you don’t fix the bias, then you are just automating the bias.”
https://www.vox.com/science-and-health/2019/1/23/18194717/alexandria-ocasio-cortez-ai-bias
Hype inevitably leads to overconfidence in the output, often in realms where overconfidence in the output can be tragic and irreversible.
For a more professional critique, try Cathy O'Neil, who worked in the finance sector but now does consulting on algorithmic bias. She wrote Weapons of Math Destruction describing it. Thanks for giving me a hook to mention her and her work.
Dying for you to tackle ML aspects of biology and genetics. AlphaFold, RoseTTAFold, and now ProteinMPNN, plus many more revolutionary papers in AI for biology. Please hit this!!
Rodney Brooks pointed out that even successful ML systems embed a lot of non-ML derived knowledge. For example, he notes that most visual recognition systems use a form of spatial convolution to drive the ML recognition. This embeds the knowledge that images are size and location invariant in non-ML code.
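To make the convolution point concrete, here is a minimal sketch in plain Python (not taken from Brooks or from any vision library; the names correlate_1d and shift are made up for illustration). It assumes a 1D signal with wrap-around edges and a fixed, hand-written kernel. It only illustrates the location part of the claim: sliding the same kernel everywhere means that moving the input moves the response by the same amount, and nothing about that was learned from data. It says nothing about size.

```python
def correlate_1d(signal, kernel):
    """Slide one fixed kernel over the signal (circular, i.e. wrap-around edges)."""
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


def shift(values, k):
    """Circularly shift a list k positions to the right."""
    k %= len(values)
    return values[-k:] + values[:-k] if k else list(values)


# Hypothetical toy data: a small "bump" and a smoothing kernel.
signal = [0, 0, 1, 3, 1, 0, 0, 0]
kernel = [1, 2, 1]

response = correlate_1d(signal, kernel)
shifted_response = correlate_1d(shift(signal, 3), kernel)

# Moving the input by 3 moves the response by 3; the kernel never changed.
assert shifted_response == shift(response, 3)
```

The property holds for any kernel values, which is the sense in which it is built into the architecture and not discovered by training.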
Writing an ML program to perform symbolic integration, as described in https://deepai.org/publication/the-use-of-deep-learning-for-symbolic-integration-a-review-of-lample-and-charton-2019, winds up with a pretty good system, but that paper winds up stone souping it. It would work much better with a simplification step. Why not add examples with Bessel and other special functions? Throw in a chicken, some carrots, an onion, some herbs and a box of noodles, and you've got a really good soup.
As best I can tell, ML is a sort of compression algorithm. If you give it a good overview of a space, it can map it into another space, but, like compression algorithms, it only works well on a particular space and that space is small compared to everything possible. That means it is hard to make ML work in the face of minor deviations that in fact result in big differences.
Except we do have self-driving cars, and those domain experts like doctors are probably wrong when they say things like AI can’t determine race by looking at X-rays. So this is a clickbait article just trying to ride the disillusionment trough of new tech adoption.
Don't you mean flying cars? We have those.