Definitions - Generative AI Lens

  • Agent: An AI system that can perform tasks autonomously and interact with its environment to achieve specific goals.

  • Bias and fairness testing: Evaluating and mitigating potential biases or unfair outcomes from AI models, particularly in areas like gender, race, or age.

  • Continuous pre-training: The process of continuously updating a pre-trained model with new data to improve its performance and adapt to evolving domains or tasks.

  • Chunking: Breaking up large data files into small, discrete chunks so that the relevant data fits within a foundation model's context window.
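
A minimal sketch of fixed-size chunking with overlap; the chunk size, overlap, and character-based splitting are illustrative assumptions (production pipelines often split on tokens, sentences, or semantic boundaries instead):

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks, with each chunk
    overlapping the previous one so context is not lost at boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping an overlap
    return chunks

# A 500-character document yields 4 overlapping chunks of at most 200 chars.
pieces = chunk_text("x" * 500, chunk_size=200, overlap=50)
```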

  • Data management: The process of identifying, collecting, storing, aggregating, searching, tracking, governing, and using data.

  • Embedding: The process of transforming chunks of data into numerical vectors that represent semantic meaning.
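
Embedding vectors are typically compared with cosine similarity; the toy three-dimensional vectors below are hypothetical stand-ins (real embedding models emit hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors:
    values near 1.0 indicate similar semantic meaning."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: related terms point in similar directions.
cat = [0.9, 0.1, 0.0]
kitten = [0.85, 0.15, 0.05]
invoice = [0.0, 0.2, 0.95]
```

Semantic search over a vector store amounts to ranking stored embeddings by this similarity to a query embedding.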

  • Fine-tuning: The process of adapting a pre-trained model to a specific task or domain by training it on a smaller, task-specific dataset.

  • Foundation models: Large language models pre-trained on vast amounts of data, serving as a foundation for downstream tasks and fine-tuning.

  • Foundation model providers: Companies or organizations that develop and release foundation models for use by others.

  • Generative AI: AI systems capable of generating new content, such as text, images, or code, based on input data or prompts.

  • Hallucination: A phenomenon where a generative AI model produces outputs that are inconsistent, factually incorrect, or unrelated to the input prompt.

  • Human oversight: Mechanisms for human experts to review, validate, and control critical decisions or outputs from AI models.

  • Indexing: The process of inserting embedded chunks into a vector data store.

  • Knowledge graph: A structured representation of real-world entities and their relationships, used to enhance the contextual understanding and reasoning capabilities of AI systems.

  • LLMOps or GenAIOps: Operational practices and principles for managing the lifecycle of large language models (LLMs), including model selection, data preparation, deployment, monitoring, and governance.

  • Model card: A document that provides key information about a machine learning model, including its intended use, training data, performance characteristics, and potential limitations or biases.

  • Model customization: The process of modifying a foundation model using various techniques to control its behavior.

  • Model distillation: A technique for creating a smaller, more efficient model that mimics the behavior of a larger, more advanced model.

  • Model evaluation: The process of assessing the performance, robustness, and other characteristics of language models using various metrics and techniques.

  • Model gateway: An interaction layer offering secure access to the model hub through standardized APIs.

  • Model hub: A central repository providing access to enterprise foundation models from first-party, third-party, and open-source providers.

  • Model interpretability: The ability to understand and explain the reasoning behind a model's outputs, increasing transparency and trust.

  • Model orchestration: Coordination of the multistep workflows that are characteristic of generative AI applications.

  • Pre-training: The process of building a foundation model from scratch by training it on vast amounts of data; it typically requires GPU clusters running continuously for weeks.

  • Prompt catalog: A centralized repository for storing, managing, and versioning prompts used to interact with generative AI models.

  • Prompt engineering: The practice of carefully crafting prompts to guide language models to produce desired outputs.

  • Provisioned throughput: A feature of Amazon Bedrock that allows you to provision a higher level of throughput at a fixed cost for predictable, high-throughput workloads.

  • Quantization: Techniques for reducing the precision of model parameters, thereby decreasing the memory footprint and computational requirements.
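
A minimal sketch of symmetric int8 quantization; the scale computation and rounding scheme are illustrative assumptions (real frameworks quantize per-tensor or per-channel with calibrated ranges):

```python
def quantize_int8(values):
    """Map floats onto the int8 range [-127, 127], returning the
    integers plus the scale factor needed to dequantize them."""
    scale = max(abs(v) for v in values) / 127.0
    quantized = [round(v / scale) for v in values]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float values from the int8 representation."""
    return [v * scale for v in quantized]

weights = [0.12, -0.5, 0.33, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)  # close to the originals, within quantization error
```

Each parameter now needs one byte instead of four (for float32), at the cost of a small, bounded rounding error.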

  • Responsible AI: The practice of developing and deploying AI systems in a manner that prioritizes fairness, transparency, accountability, and adherence to ethical principles.

  • Retrieval-Augmented Generation (RAG): A technique/architectural style where a language model's output is augmented with relevant information retrieved from a corpus of documents. This technique grounds responses in the retrieved documents and reduces hallucination.
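
A minimal retrieve-then-prompt sketch; the keyword-overlap scoring and prompt wording are illustrative assumptions (production RAG systems retrieve by embedding similarity against a vector store and then call a foundation model with the grounded prompt):

```python
def retrieve(query, corpus, k=1):
    """Rank documents by word overlap with the query and return the top k.
    Stand-in for embedding-based retrieval from a vector store."""
    query_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query, passages):
    """Ground the model by prepending the retrieved passages to the prompt."""
    context = "\n".join(passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Amazon Bedrock offers provisioned throughput for high-volume workloads.",
    "Chunking splits documents so they fit in a context window.",
]
query = "What is provisioned throughput?"
prompt = build_prompt(query, retrieve(query, corpus))
```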

  • Self-hosted models: AI models that are deployed and managed by the organization using them, rather than relying on a third-party provider.

  • Serverless architecture: An architecture pattern where the cloud provider automatically manages the allocation and provisioning of computational resources, allowing for scalability and cost optimization.

  • Tokenization: The process of breaking down input text into smaller units called tokens, which can be words, subwords, or characters, as a preprocessing step for natural language processing tasks.
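
A naive word-level tokenizer as a sketch of the idea; the regex is an illustrative assumption, and production models instead use learned subword schemes such as byte-pair encoding:

```python
import re

def tokenize(text):
    """Split text into tokens: runs of word characters, or single
    punctuation marks. A stand-in for subword tokenization."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = tokenize("Tokenization splits text, doesn't it?")
# The apostrophe splits "doesn't" into three tokens, much as subword
# tokenizers split rare words into smaller known pieces.
```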

  • Vector store: A specialized data store for efficient storage and retrieval of high-dimensional vector embeddings, often used in semantic search and retrieval tasks. Vector stores such as Amazon OpenSearch Serverless support different search algorithms.

  • Zero-shot learning: The ability of a model to perform a task or make predictions on examples it has never seen before, without requiring task-specific training data.

For the latest AWS terminology, see the AWS glossary in the AWS Glossary Reference.