
© Indic Pacific Legal Research LLP.

For articles published in VISUAL LEGAL ANALYTICA, you may refer to the editorial guidelines for more information.

The Ethics of Advanced AI Assistants: Explained & Reviewed



Recently, Google DeepMind published a 200+ page paper on the "Ethics of Advanced AI Assistants". The paper is extensively researched and well-cited, and it merits a condensed review and feedback.



Hence, VLiGTA, Indic Pacific's research division, may develop an infographic report covering various aspects of this well-researched paper, if necessary.


This insight by Visual Legal Analytica features my review of this paper by Google DeepMind.


 

The paper is divided into six parts. For each, I have provided my review and an extractable insight on the key points of law, policy and technology it addresses.


Part I: Introduction to the Ethics of Advanced AI Assistants




To summarise the introduction, three points can be highlighted from this paper:


  1. The development of advanced AI assistants marks a technological paradigm shift, with potential profound impacts on society and individual lives.

  2. Advanced AI assistants are defined as agents with natural language interfaces that plan and execute actions across multiple domains in line with user expectations.

  3. The paper aims to systematically address the ethical and societal questions posed by advanced AI assistants.


The paper makes a good attempt at addressing 16 different questions on AI assistants and the ethical and legal-policy ramifications associated with them.


The 16 questions can be summarised in these points:

  • How are AI assistants, by definition, unique among the classes of AI technologies?

  • What could the capabilities of AI assistants be, and if value systems exist, what could be defined as a "good" AI assistant, with evidence across contexts? Are there any limits on these AI assistants?

  • What should an AI assistant be aligned with? What could be the real safety issues around the realm of AI assistants and what does safety mean for this class of AI technologies?

  • What new forms of persuasion might advanced AI assistants be capable of? How can appropriate user control of these assistants be ensured? How can end users (especially vulnerable ones) be protected from AI manipulation and unwanted disclosure of personal information?

  • Since AI assistants invite anthropomorphisation, is this morally problematic or not? Can such anthropomorphisation be permitted conditionally?

  • What could be the possible rules of engagement for human users and advanced AI assistants? What could be the possible rules of engagement among AI assistants then? What about the impact of introducing AI assistants to users on non-users?

  • How would AI assistants impact the information ecosystem and its economics, especially public fora (the digital public square of the internet as we know it)? What is the environmental impact of AI assistants?

  • How can we be confident about the safety of AI Assistants and what evaluations might be needed at the agent, user and system levels?


I must admit that these 16 questions are intriguing for the most part.


Let's also look at the methodology applied by the authors in that context.


  • The authors clearly admit that the facets of Responsible AI, like responsible development, deployment and use of AI assistants, depend on whether humans have the ethical foresight to keep pace with technological progress. The issues of risk and impact come later.

  • The authors also admit that there is ample uncertainty about future developments and interaction effects (a subset of network effects) due to two factors: (1) the nature, and (2) the trajectory of evolution, of AI assistants as a class of technology. The trajectory is exponential and uncertain.

  • For all privacy and ethical issues, the authors have rightly pointed out that AI Assistant technologies will be subject to rapid development.

  • The authors also admit that uncertainty arises from many factors, including the complementary and competitive dynamics among AI assistants, end users, developers and governments (which can be related to aspects of AI hype as well). It is thus humble and reasonable of the paper to admit that a purely reactive approach to Responsible AI ("responsible decision-making") is inadequate.

  • The authors have correctly argued in the methodology segment that AI-related "future-facing ethics" is best understood as a form of sociotechnical speculative ethics. Since a narrative of futuristic ethics is speculative about something that does not yet exist, regulatory narratives can never be based on such narratives. If narratives have to be sociotechnical, they have to make practical sense. I appreciate that the authors take a sociotechnical approach throughout this paper, based on interaction dynamics and not hype and speculation.



Part II: Advanced AI Assistants


Here is a key summary of this part in the paper:


  1. AI assistants are moving from simple tools to complex systems capable of operating across multiple domains.

  2. These assistants can significantly personalize user interactions, enhancing utility but also raising concerns about influence and dependence.


Conceptual Analysis vs Conceptual Engineering


There is an interesting comparison of conceptual analysis and conceptual engineering in the following excerpt:


In this paper, we opt for a conceptual engineering approach. This is because, first, there is no obvious reason to suppose that novel and undertheorised natural language terms like ‘AI assistant’ pick out stable concepts: language in this space may itself be evolving quickly. As such, there may be no unique concept to analyse, especially if people currently use the term loosely to describe a broad range of different technologies and applications. Second, having a practically useful definition that is sensitive to the context of ethical, social and political analysis has downstream advantages, including limiting the scope of the ethical discussion to a well-defined class of AI systems and bracketing potentially distracting concerns about whether the examples provided genuinely reflect the target phenomenon.

Here is a footnote which helps explain the approach taken by the authors of the paper:


Note that conceptually engineering a definition leaves room to build in explicitly normative criteria for AI assistants (e.g. that AI assistants enhance user well-being), but there is no requirement for conceptually engineered definitions to include normative content.

The authors are opting for a "conceptual engineering" approach to define the term "AI assistant" rather than a "conceptual analysis" approach.


Here's an illustration to explain what this means:


Imagine there is a new type of technology called "XYZ" that has just emerged. People are using the term loosely to describe various different systems and applications that may or may not be related.


There is no stable, widely agreed upon concept of what exactly "XYZ" refers to.

In this situation, taking a "conceptual analysis" approach would involve trying to analyse how the term "XYZ" is currently used in natural language, and attempting to distill the necessary and sufficient conditions that determine whether something counts as "XYZ" or not.


However, the authors argue that for a novel, undertheorized term like "AI assistant", this conceptual analysis approach may not be ideal for a couple of reasons:


  1. The term is so new that language usage around it is still rapidly evolving. There may not yet be a single stable concept that the term picks out.

  2. Trying to merely analyze the current loose usage may not yield a precise enough definition that is useful for rigorous ethical, social and political analysis of AI assistants.


Instead, they opt for "conceptual engineering" - deliberately constructing a definition of "AI assistant" that is precise and fits the practical needs of ethical/social/political discourse around this technology.

The footnote clarifies that with conceptual engineering, the definition can potentially include normative criteria (e.g. that AI assistants should enhance user well-being), but it doesn't have to.


The key is shaping the definition to be maximally useful for the intended analysis, rather than just describing current usage.

So in summary, conceptual engineering allows purposefully defining a term like "AI assistant" in a way that provides clarity and facilitates rigorous examination, rather than just describing how the fuzzy term happens to be used colloquially at this moment.


Non-moralised Definitions of AI


The authors have also opted for a non-moralised definition of AI assistants, which makes sense because the systematic investigation of ethical and social AI issues is still nascent. Moralised definitions require a well-developed conceptual framework, which does not exist right now. A non-moralised definition thus works and remains helpful despite reasonable disagreements about permissible development and deployment practices surrounding AI assistants.


This is a definition of an AI Assistant:

We define an AI assistant here as an artificial agent with a natural language interface, the function of which is to plan and execute sequences of actions on the user’s behalf across one or more domains and in line with the user’s expectations.

From Foundational Models to Assistants


The authors have correctly inferred that large language models (LLMs) must be transformed into AI assistants as a class of AI technology in a serviceable or productised fashion. There could be many ways to do this, such as creating a mere dialogue agent. This is why techniques like Reinforcement Learning from Human Feedback (RLHF) exist.


These assistants are based on the premise that humans train a reward model, and the model parameters are then updated via RLHF.
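The reward-modelling step behind RLHF can be made concrete with a toy sketch. The code below is a hypothetical, simplified illustration (not DeepMind's or any production implementation): each response is reduced to a hand-made feature vector, and a linear reward model is trained on human pairwise preferences using the Bradley-Terry loss commonly used in RLHF reward modelling.

```python
import math

# Toy sketch of RLHF reward modelling (illustrative only; real systems
# use neural reward models over text, not hand-made feature vectors).
# A human labeller marks which of two responses they prefer; the reward
# model is trained with the pairwise Bradley-Terry loss:
#   loss = -log(sigmoid(reward(chosen) - reward(rejected)))

def reward(w, x):
    """Linear reward: dot product of weights and response features."""
    return sum(wi * xi for wi, xi in zip(w, x))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_reward_model(pairs, dim, lr=0.1, epochs=200):
    """pairs: list of (chosen_features, rejected_features) comparisons."""
    w = [0.0] * dim
    for _ in range(epochs):
        for chosen, rejected in pairs:
            # gradient step on -log(sigmoid(r_chosen - r_rejected))
            p = sigmoid(reward(w, chosen) - reward(w, rejected))
            for i in range(dim):
                w[i] += lr * (1.0 - p) * (chosen[i] - rejected[i])
    return w

# Hypothetical labelled comparisons over two made-up features
# (helpfulness, verbosity); the concise, helpful answer is preferred.
pairs = [
    ([1.0, 0.2], [0.1, 0.9]),
    ([0.9, 0.1], [0.2, 0.8]),
]
w = train_reward_model(pairs, dim=2)
```

In a full RLHF pipeline, the trained reward model would then be used to update the assistant's policy (for example via reinforcement learning); the sketch stops at the reward model to keep the human-feedback loop visible.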


Potential Applications of AI Assistants


The authors have listed the following applications of AI Assistants by keeping a primary focus on the interaction dynamics between a user and an AI Assistant:


  • A thought assistant for discovery and understanding: AI assistants are capable of gathering, summarising and presenting information from many sources quickly. The variety of goals associated with a "thought assistant" makes it an aid for understanding.

  • A creative assistant for generating ideas and content: AI assistants can help shape ideas by giving random or specialised suggestions, and engagement can happen across multiple content formats. AI assistants can also optimise for constraints, design follow-up experiments with parameters and offer rationales on an experimental basis. This creates a creative loop.

  • A personal assistant for planning and action: This may be considered an advanced AI assistant that helps develop plans for an end user and may act on the user's behalf. This requires the assistant to utilise third-party systems and understand user contexts and preferences.

  • A personal AI to further life goals: This could be a natural extension of a personal assistant, based on the extraordinary level of trust that a user would have to place in their agent.


The use cases outlined are generalist and focused on the Business-to-Consumer (B2C) side of things. However, from Google's perspective, the listing of applications makes sense.




Part III: Value Alignment, Safety, and Misuse


This part can be summarised in the following ways:


  1. Value alignment is crucial, ensuring AI assistants act in ways that are beneficial and aligned with both user and societal values.

  2. Safety concerns include preventing AI assistants from executing harmful or unintended actions.

  3. Misuse of AI assistants, such as for malicious purposes, is a significant risk that requires robust safeguards.


AI Value Alignment: With What?


Value alignment in the case of artificial intelligence becomes important and necessary for several reasons. First, technology is inherently value-laden and becomes political through the power dynamics it can create or influence. In this paper, the authors ask questions about the nature of AI value alignment; for example, they ask what could be subject to a form of alignment, as far as AI is concerned. Here is an excerpt:


Should only the user be considered, or should developers find ways to factor in the preferences, goals and well-being of other actors as well? At the very least, there clearly need to be limits on what users can get AI systems to do to other users and non-users. Building on this observation, a number of commentators have implicitly appealed to John Stuart Mill’s harm principle to articulate bounds on permitted action.

Philosophically, however, the paper lacks diverse literary grounding, since its AI ethics narratives draw largely on concepts of ethics and power from Western European and North American countries.


Now, the authors have discussed varieties of misalignment to address potential aspects of alignment of values for AI Assistants by examining the state of every stakeholder in the AI-human relationship:

  • AI agents or assistants: These systems aim to achieve goals designed to assist users. Despite being designed for task completion, AI systems can become misaligned by behaving in ways that are not beneficial for users;

  • Users: Users as stakeholders can also try to manipulate the design loop of an AI assistant to get things done in ways that are inconsistent with the goals and expectations attributed to the AI system;

  • Developers: Even if developers try to align the AI technology with specific preferences, interests and values attributable to users, there are ideological, economic and other considerations attached to developers as well. These could affect the general purpose of an AI system and cause value misalignment;

  • Society: Both users and non-users may, as groups, cause AI value misalignment. At the same time, societies impose obligations on AI to benefit and serve all.


The paper outlines six instances of AI value misalignment, where misalignment can favour:


  1. The AI agent at the expense of the user (e.g. if the user is manipulated to serve the agent’s goals),

  2. The AI agent at the expense of society (e.g. if the user is manipulated in a way that creates a social cost, for example via misinformation),

  3. The user at the expense of society (e.g. if the technology allows the user to dominate others or creates negative externalities for society),

  4. The developer at the expense of the user (e.g. if the user is manipulated to serve the developer’s goals),

  5. The developer at the expense of society (e.g. if the technology benefits the developer but creates negative externalities for society by, for example, creating undue risk or undermining valuable institutions),

  6. Society at the expense of the user (e.g. if the technology unduly limits user freedom for the sake of a collective goal such as national security).


There could be other forms of misalignment as well, though their moral character could be ambiguous. Misalignment can also disfavour:


  • The user without favouring the agent, developer or society (e.g. if the technology breaks in a way that harms the user),

  • Society without favouring the agent, user or developer (e.g. if the technology is unfair or has destructive social consequences).


The authors then elucidate the HHH (triple H) framework of Helpful, Honest and Harmless AI assistants. They appreciate its human-centric nature while admitting its inconsistencies and limits.


Part IV: Human-Assistant Interaction


Here is a summary to explain the main points discussed in this part.


  1. The interaction between humans and AI assistants raises ethical issues around manipulation, trust, and privacy.

  2. Anthropomorphism in AI can lead to unrealistic expectations and potential emotional dependencies.


Before we get into Anthropomorphism, let's understand the mechanisms of influence by AI Assistants discussed by the authors.


Mechanisms of Influence by AI Assistants


The authors have discussed the following mechanisms: