Definitions
- Agent: An AI system that can perform tasks autonomously and interact with its environment to achieve specific goals.
- Bias and fairness testing: Evaluating and mitigating potential biases or unfair outcomes from AI models, particularly in areas like gender, race, or age.
- Continuous pre-training: The process of continuously updating a pre-trained model with new data to improve its performance and adapt to evolving domains or tasks.
- Chunking: Breaking up large data files into small, discrete chunks so that the foundation model can fit the data into its context window.
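As an illustration, fixed-size chunking with overlap might be sketched as follows. This is a minimal character-based example; production pipelines often chunk by tokens, sentences, or document structure instead:

```python
def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into fixed-size character chunks with overlap so each
    chunk fits within a model's context window. Overlap preserves context
    that would otherwise be cut at chunk boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```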
- Data management: The process of identifying, collecting, storing, aggregating, searching, tracking, governing, and using data.
- Embedding: Transforms chunks of data into vectors that represent semantic meaning.
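For illustration only, the text-to-vector idea can be sketched with a toy hashing-based embedding. Real systems use a learned embedding model (the hashing below captures word counts, not semantic meaning):

```python
import hashlib

def toy_embed(text: str, dim: int = 8) -> list[float]:
    """Toy embedding: hash each word into a slot of a fixed-size vector,
    then L2-normalize. Illustrates the text -> vector mapping only; a real
    embedding model produces semantically meaningful dimensions."""
    vec = [0.0] * dim
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]
```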
- Fine-tuning: The process of adapting a pre-trained model to a specific task or domain by training it on a smaller, task-specific dataset.
- Foundation models: Large language models pre-trained on vast amounts of data, serving as a foundation for downstream tasks and fine-tuning.
- Foundation model providers: Companies or organizations that develop and release foundation models for use by others.
- Generative AI: AI systems capable of generating new content, such as text, images, or code, based on input data or prompts.
- Hallucination: A phenomenon where a generative AI model produces outputs that are inconsistent, factually incorrect, or unrelated to the input prompt.
- Human oversight: Mechanisms for human experts to review, validate, and control critical decisions or outputs from AI models.
- Indexing: Process of inserting embedded chunks into a vector data store.
- Knowledge graph: A structured representation of real-world entities and their relationships, used to enhance the contextual understanding and reasoning capabilities of AI systems.
- LLMOps or GenAIOps: Operational practices and principles for managing the lifecycle of large language models (LLMs), including model selection, data preparation, deployment, monitoring, and governance.
- Model card: A document that provides key information about a machine learning model, including its intended use, training data, performance characteristics, and potential limitations or biases.
- Model customization: The process of modifying a foundation model using various techniques to control its behavior.
- Model distillation: A technique for creating a smaller, more efficient model that mimics the behavior of a larger, more advanced model.
- Model evaluation: The process of assessing the performance, robustness, and other characteristics of language models using various metrics and techniques.
- Model gateway: An interaction layer offering secure access to the model hub through standardized APIs.
- Model hub: A central repository providing access to enterprise foundation models from first-party, third-party, and open-source providers.
- Model interpretability: The ability to understand and explain the reasoning behind a model's outputs, increasing transparency and trust.
- Model orchestration: The coordination of the multistep workflows that are characteristic of generative AI applications.
- Pre-training: The process of building a foundation model from scratch, which typically requires GPU clusters to run continuously for weeks.
- Prompt catalog: A centralized repository for storing, managing, and versioning prompts used to interact with generative AI models.
- Prompt engineering: The practice of carefully crafting prompts to guide language models to produce desired outputs.
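As an illustration, a few-shot prompt can be assembled from a role, a task instruction, and worked examples. The structure below is one common pattern, not a prescribed format:

```python
def build_prompt(role: str, task: str,
                 examples: list[tuple[str, str]], query: str) -> str:
    """Assemble a few-shot prompt: a role statement, a task instruction,
    worked input/output examples, then the new query for the model."""
    lines = [f"You are {role}.", task, ""]
    for inp, out in examples:
        lines.append(f"Input: {inp}")
        lines.append(f"Output: {out}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)
```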
- Provisioned throughput: Feature of Amazon Bedrock that allows you to provision a higher level of throughput at a fixed cost for predictable, high-throughput workloads.
- Quantization: Techniques for reducing the precision of model parameters, thereby decreasing the memory footprint and computational requirements.
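For intuition, symmetric int8 quantization of a list of floats might look like the sketch below. Real frameworks quantize whole tensors with per-channel scales and calibration; this only shows the precision-for-memory trade-off:

```python
def quantize_int8(values: list[float]) -> tuple[list[int], float]:
    """Symmetric int8 quantization: map floats to [-127, 127] with a
    single scale factor, shrinking storage from 32 bits to 8 per value."""
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / 127.0
    return [round(v / scale) for v in values], scale

def dequantize(quantized: list[int], scale: float) -> list[float]:
    """Recover approximate floats; small rounding error is the cost."""
    return [q * scale for q in quantized]
```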
- Responsible AI: The practice of developing and deploying AI systems in a manner that prioritizes fairness, transparency, accountability, and adherence to ethical principles.
- Retrieval-Augmented Generation (RAG): A technique and architectural style in which a language model's output is augmented with relevant information retrieved from a corpus of documents. It is used to ground responses in those documents and to reduce hallucination.
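The retrieve-then-augment flow can be sketched as follows. Word-overlap scoring stands in for a real vector-similarity search, and the returned string would be sent to the language model:

```python
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (a stand-in for
    vector-similarity search) and return the top k."""
    q = set(query.lower().split())
    ranked = sorted(corpus,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_rag_prompt(query: str, corpus: list[str]) -> str:
    """Augment the prompt with retrieved passages so the model's answer
    is grounded in the documents."""
    context = "\n".join(f"- {d}" for d in retrieve(query, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```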
- Self-hosted models: AI models that are deployed and managed by the organization using them, rather than relying on a third-party provider.
- Serverless architecture: An architecture pattern where the cloud provider automatically manages the allocation and provisioning of computational resources, allowing for scalability and cost optimization.
- Tokenization: The process of breaking down input text into smaller units called tokens, which can be words, subwords, or characters, as a preprocessing step for natural language processing tasks.
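As a simple illustration, a word-level tokenizer can be written with a regular expression. Production models use learned subword schemes such as byte-pair encoding, so this is an intuition aid only:

```python
import re

def tokenize(text: str) -> list[str]:
    """Word-level tokenizer: lowercase, then emit runs of letters/digits
    and individual punctuation marks as separate tokens."""
    return re.findall(r"[a-z0-9]+|[^\sa-z0-9]", text.lower())
```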
- Vector store: A specialized data store for efficient storage and retrieval of high-dimensional vector embeddings, often used in semantic search and retrieval tasks. Vector stores such as Amazon OpenSearch Serverless support different search algorithms.
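The core operations of a vector store, indexing embedded chunks and retrieving the nearest ones by cosine similarity, can be sketched in memory. A managed service such as Amazon OpenSearch Serverless provides the same interface at scale with approximate-nearest-neighbor algorithms:

```python
class InMemoryVectorStore:
    """Minimal vector store: index (chunk, embedding) pairs and search
    by cosine similarity. Illustrative only; not for production use."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def index(self, chunk: str, embedding: list[float]) -> None:
        """Insert an embedded chunk into the store."""
        self._items.append((chunk, embedding))

    @staticmethod
    def _cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0

    def search(self, query_embedding: list[float], k: int = 1) -> list[str]:
        """Return the k chunks most similar to the query embedding."""
        ranked = sorted(self._items,
                        key=lambda item: self._cosine(query_embedding, item[1]),
                        reverse=True)
        return [chunk for chunk, _ in ranked[:k]]
```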
- Zero-shot learning: The ability of a model to perform a task or make predictions on examples it has never seen before, without requiring task-specific training data.
For the latest AWS terminology, see the AWS glossary in the AWS Glossary Reference.