The definition of "agents" and "agentic" needs to expand, given the range of spaces these systems are deployed in.
Even human agents cannot be useful in, or deployed to, just any space, and there are no reliable general-purpose evals for them either.
Our general-purpose evals are things like IQ and reasoning tests, but to have agency in a domain we do not operate independently: we acquire specialized knowledge, and we collaborate with humans and tools.
You can have practically useful agents specialized to a domain that do well on domain-specific evals rather than general-purpose ones; think fine-tuning, tool know-how, collaboration.
IMO, before we jump down the evals well, we need to distinguish what we are evaluating for: practical outcomes or general reasoning.
Hey, I'm also doing research on AI agents. If you could advise me with a list of important papers I shouldn't miss, that would be very helpful. Thank you in advance!!
I have also written extensively about Agent applications in my own publications.
An AI Agent (Artificial Intelligence Agent) is an intelligent entity capable of perceiving its environment, making decisions, and executing actions. Unlike traditional artificial intelligence, AI Agents can think independently and use tools to work step by step toward a given objective.
Why can AI Agents automatically decompose tasks? They are built on what is referred to as a 'role framework,' a programming paradigm whose core is to give large language models a strategic thinking structure for problem-solving. This framework simulates the process humans use to tackle problems.
It can be said that the importance of developing AI Agents is comparable to that of apps and the Apple App Store in the internet era.
I took a slightly different approach, from an enterprise automation practitioner's perspective. Here is how a multi-agent framework will help build the Autonomous Enterprise: https://www.linkedin.com/posts/doug-shannon_iot-iiot-edgecomputing-activity-7213534350049521665-mJ4j?utm_source=share&utm_medium=member_ios
you sound like a management consultant
My main job is as a programmer, and I spend most of my time programming.