Technical standards and evaluations

Overview

Technical standards—shared expectations for definitions, performance of products and systems, and testing—underpin nearly all aspects of technology and manufacturing globally. Evaluations check whether those expectations are being met through tests, measurements, audits, and certifications. Together, standards and evaluations help ensure that technologies are safe, reliable, and compatible as they scale across sectors and borders.

Their impact often goes unnoticed but is ever-present: firehoses fit every hydrant in the country, credit cards work at ATMs worldwide, blood tests produce reliable results across different labs, and electrical outlets deliver consistent voltage. This is all because organizations defined shared technical specifications and verified compliance with them. The International Organization for Standardization alone has published over 25,000 international standards, covering everything from screw thread dimensions to AI risk management.

Standards and evaluations play a central role in emerging technology governance. For AI, standards help define terminology, set performance and safety expectations, and establish testing methodologies, while evaluations assess whether systems meet those expectations in practice. For biotechnology, standards govern everything from how labs handle dangerous pathogens to how experimental results are documented and reproduced, while evaluations verify that facilities and researchers meet those requirements. In both fields, foundational questions like what “safe” or “trustworthy” means in measurable terms are still being worked out.

This guide explains how the US government develops and uses technical standards and evaluations and how they intersect with emerging technology policy, outlines key institutions and processes, and discusses considerations and opportunities for working in this space.

Why does the government use technical standards and evaluations?

Government involvement in technical standards serves several overlapping purposes:

Reducing interoperability and coordination costs. Without shared technical language, government agencies, companies, and other organizations risk building incompatible systems. Standards provide common definitions, data formats, measurement methods, and interfaces that let organizations coordinate more effectively, preventing every program from having to invent bespoke solutions.
Making other policy tools enforceable. Regulations, procurement requirements, and grant conditions often need technical specificity, which standards can help supply. Regulators often use existing standards rather than developing their own technical requirements from scratch.
Building trust in products and systems. When technologies are novel and vendor claims are hard to verify, buyers, regulators, and the public need a basis for confidence. Evaluations (e.g. tests, audits, benchmarks) can verify performance and safety and provide a basis for comparing products and systems. The National Highway Traffic Safety Administration (NHTSA), for example, evaluates vehicle safety using standardized testing protocols that produce comparable evidence across manufacturers.
Filling gaps where regulation is premature or impractical. For emerging technologies, regulation may lag behind development. Standards and evaluations can shape industry practice in the interim, influencing behavior without the force of law. This is especially relevant for AI and biotechnology, where agencies like the National Institute of Standards and Technology (NIST) have published voluntary risk management frameworks that are widely adopted despite carrying no legal mandate. Although technically voluntary, such standards can become effectively binding when agencies reference them in procurement requirements, grant conditions, or regulatory guidance, meaning that organizations must comply to win contracts, receive funding, or satisfy regulators. When effective, these voluntary standards can also lay the groundwork for future regulation by establishing the technical foundations that rules eventually build on.

Technical standards and evaluations basics

What are technical standards?

A technical standard is a shared set of requirements, definitions, or specifications that guide how a product, process, or system should perform, communicate, or be measured. Standards let multiple organizations build compatible systems and let buyers and regulators compare like with like.

Standards come in several forms, including:

Terminology and definition standards align meaning for key terms so that different institutions are talking about the same thing.
Performance standards set measurable targets (e.g. accuracy, latency, reliability) and acceptable thresholds.
Safety and risk standards specify hazard controls, safety margins, and risk management practices.
Data and interoperability standards define data formats and interfaces so that systems can exchange information reliably.
Measurement and test standards define how evaluators should measure performance, safety, and uncertainty.

In practice, many standards blend several of these functions: a single standard might define terminology, set performance thresholds, and specify the test methods used to verify compliance.

While the term “standards” formally refers to consensus documents produced by accredited standards development organizations (SDOs) through structured, multi-year processes involving industry, government, and international partners, many use the term to describe a broader set of instruments, including agency-published frameworks, federal guidance, evaluation protocols, and best practices.

This guide covers both, but the distinction matters: formal consensus standards carry wide legitimacy and durability but can take years to develop, while best practices, frameworks, and evaluation approaches can be updated much more quickly. That speed is an important advantage in fast-moving fields like AI, where the technology may evolve far faster than any multi-year consensus process can track.

What are evaluations?

An evaluation is a structured process for generating evidence about whether a system, product, or practice meets a standard or other stated claim. Depending on the domain, evaluations can test performance, safety, robustness, security, compliance, or other characteristics.

Evaluations typically rely on some combination of:

Test protocols: Step-by-step procedures that define inputs, conditions, and scoring rules (e.g. crash test protocols used to rate vehicle safety).
Benchmarks: Curated tasks or scenarios designed to test a specific capability or risk (e.g. standardized test sets used to compare AI system performance).
Reference data and materials: Shared baselines that ensure measurements are comparable across different evaluators and labs.
Reporting requirements: Documentation standards that support reproducibility and allow others to verify results.
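As a rough illustration of how these pieces fit together, the Python sketch below runs a toy test protocol: a fixed benchmark supplies the inputs, a scoring rule counts passes, a threshold stands in for a performance standard, and a report object captures what a reporting requirement might ask for. All names (TestCase, EvaluationReport, run_protocol, "example-protocol-v1") and the exact-match scoring rule are assumptions made for this example, not part of any published standard or protocol.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class TestCase:
    """One benchmark item: a fixed input and the expected output."""
    prompt: str
    expected: str


@dataclass(frozen=True)
class EvaluationReport:
    """Record of a run, kept so that others can reproduce and verify it."""
    protocol_name: str
    cases_run: int
    cases_passed: int
    accuracy: float
    threshold: float
    meets_threshold: bool


def run_protocol(
    system: Callable[[str], str],
    benchmark: Sequence[TestCase],
    threshold: float,
    protocol_name: str = "example-protocol-v1",  # hypothetical name
) -> EvaluationReport:
    """Apply the same inputs and the same scoring rule to any system under test."""
    if not benchmark:
        raise ValueError("benchmark must contain at least one test case")
    # Scoring rule (an assumption for this sketch): exact match after trimming whitespace.
    passed = sum(
        1 for case in benchmark if system(case.prompt).strip() == case.expected
    )
    accuracy = passed / len(benchmark)
    return EvaluationReport(
        protocol_name=protocol_name,
        cases_run=len(benchmark),
        cases_passed=passed,
        accuracy=accuracy,
        threshold=threshold,
        meets_threshold=accuracy >= threshold,
    )


# Usage: evaluate a toy "system" (upper-casing text) against a three-item benchmark.
toy_benchmark = [TestCase("a", "A"), TestCase("b", "B"), TestCase("c", "x")]
toy_report = run_protocol(str.upper, toy_benchmark, threshold=0.6)
print(toy_report)
```

Because the inputs, scoring rule, and threshold are fixed in advance, two evaluators running this protocol on the same system should arrive at the same report, which is the property that makes results comparable across labs.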

In many government settings, evaluation results gain force through conformity assessment—the processes that make verification credible, repeatable, and usable for decision-making. Conformity assessment typically works in three layers:

Testing and audits generate evidence about whether a system meets requirements.
Certification packages that evidence into a formal signal that buyers and regulators can act on.
Accreditation verifies that the labs and certifiers performing these checks are themselves competent and consistent.
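The three layers can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not a description of any real certification scheme: TestResult, unmet_requirements, and certify are hypothetical names, and real programs involve far more than a pass/fail flag per requirement.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class TestResult:
    """Layer 1: evidence produced by a lab for a single requirement."""
    lab: str
    requirement: str
    passed: bool


def unmet_requirements(
    results: Iterable[TestResult],
    required: Iterable[str],
    accredited_labs: Set[str],
) -> List[str]:
    """List requirements that lack a passing result from an accredited lab."""
    # Layer 3: only evidence from accredited labs counts toward certification.
    satisfied = {
        r.requirement for r in results if r.passed and r.lab in accredited_labs
    }
    return sorted(set(required) - satisfied)


def certify(
    results: Iterable[TestResult],
    required: Iterable[str],
    accredited_labs: Set[str],
) -> bool:
    """Layer 2: package the evidence into a single yes/no signal."""
    return not unmet_requirements(results, required, accredited_labs)


# Usage: requirement R2 was only tested by a lab that is not accredited.
accredited = {"Lab A"}
evidence = [
    TestResult("Lab A", "R1", True),
    TestResult("Lab B", "R2", True),
]
print(unmet_requirements(evidence, ["R1", "R2"], accredited))
print(certify(evidence, ["R1", "R2"], accredited))
```

The design choice worth noticing is that accreditation acts as a filter on evidence rather than as a separate check on the product: a passing result from an unaccredited lab simply does not count.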

The lifecycle of standards and evals

Most standards and evals work follows roughly the same pathway. If you understand this chain, you can usually “place” any institution, document, or job in the ecosystem.

Problem identification: An agency, industry group, or standards committee identifies a coordination, safety, security, or quality problem that needs a shared solution.
Standard development: A working group defines requirements, interfaces, and measurement or test methods to address the problem.
Evaluation: A lab, assessor, or internal team tests whether systems meet the relevant requirements or claims.
Adoption: A decision-maker ties compliance or results to a consequential decision (e.g. procurement, regulation, certification, grant conditions, or licensing).
Revision: Adoption changes behavior over time; failures and edge cases trigger updates to the standard.

Who makes standards?

Standards can originate directly from government agencies, from non-governmental standards development organizations (SDOs), or, commonly, from some combination of the two, in which agencies develop technical foundations that SDOs then incorporate into formal consensus standards.

Below are some of the key institutions for emerging technology policy.

International and multilateral bodies:

International Organization for Standardization (ISO): A global, non-governmental organization that develops voluntary consensus standards across nearly every industrial and technological domain, with over 25,000 active standards. ISO standards shape global expectations for quality, safety, and interoperability.
International Electrotechnical Commission (IEC): Works alongside ISO on standards for electrical, electronic, and related technologies, including joint AI and cybersecurity standards.
International Telecommunication Union (ITU): A UN agency that develops standards for information and communication technologies, with growing work on AI and digital infrastructure.

US coordinating bodies:

American National Standards Institute (ANSI): A non-governmental organization that coordinates the US voluntary standards system. ANSI doesn’t usually write standards itself; rather, it accredits US standards developers and represents US interests in ISO and IEC.¹

US government:

National Institute of Standards and Technology (NIST): A federal agency under the Department of Commerce that develops measurement methods, reference materials, test protocols, and evaluation frameworks. NIST is not a regulatory agency, but its work frequently becomes the technical backbone for agency programs, procurement requirements, and regulatory guidance.
Other federal agencies: The Food and Drug Administration (FDA), Department of Defense (DOD), the Centers for Disease Control and Prevention (CDC), and others also create and adopt standards in their domains. Their roles are covered more in the sections below.

Industry and professional standards bodies:

IEEE Standards Association, HL7, and hundreds of other sector-specific organizations develop voluntary consensus standards through committees that include industry, government, and academic participants. Key bodies for AI and biosecurity are listed in the emerging technology section below.

How US agencies adopt and use standards

In US practice, agencies use standards in policymaking and procurement in several common ways:

Incorporating standards into a regulation, either by reference (using the standard as-is) or as a starting point when drafting new requirements (e.g. OSHA incorporates ANSI standards for protective equipment directly into its workplace safety rules, meaning employers must comply with those standards as a legal requirement).
Treating compliance with a standard as an accepted way to satisfy a broader regulatory requirement² (e.g. FDA allows medical device manufacturers to demonstrate safety by conforming to recognized consensus standards, rather than independently proving each requirement from scratch).
Relying on widespread industry adoption of a standard, while keeping formal regulation available if voluntary compliance proves insufficient (e.g. NIST’s Cybersecurity Framework was widely adopted by critical infrastructure operators voluntarily before elements of it were referenced in binding requirements).

Agencies also reference standards in contract language and vendor qualification requirements, meaning companies must meet referenced standards to win or keep federal contracts. This can effectively create mandatory requirements without new regulation. For example, DOD’s Cybersecurity Maturity Model Certification (CMMC) requires defense contractors to meet specific cybersecurity standards to be eligible for contracts.

Several commitments and directives reinforce government use of standards. The WTO Technical Barriers to Trade Agreement instructs countries to base technical regulations on international standards when those standards meet domestic needs. In the US, the National Technology Transfer and Advancement Act and OMB Circular A-119 direct agencies to use voluntary consensus standards whenever feasible, rather than writing unique government standards from scratch. According to the ISO, the US federal government is the largest single creator and user of specifications and standards, with more than 44,000 statutes, technical regulations, or purchasing specifications.

Beyond regulation and procurement, standards and evaluations also intersect with other major policy tools:

Federal R&D funding: Agencies can tie grants to shared performance metrics, standardized test protocols, or agreed reporting formats, shaping both what gets funded and how research is conducted and measured.
Trade and geopolitics: International standards can function as technical barriers to trade. When a country adopts a standard as a market requirement, foreign companies that don’t meet it face higher costs to enter that market, even without tariffs or explicit restrictions. This enables countries that lead in setting standards to shape global markets in their favor.

Together, these channels explain why standards work often produces outsized downstream effects without new laws or regulations.

Standards & evaluations for emerging technology

Emerging technologies create substantial uncertainty for standard-setting: rapid iteration, unclear failure modes, dual-use risks, and weak consensus on what constitutes “safe” or “effective”. Traditional standards processes, built for mature industries where the science is settled, often struggle to keep pace. In response, US agencies have leaned on a few practical approaches: publishing voluntary risk-management frameworks and guidance, developing measurement and test methods, building evaluation capacity within government, and using procurement and funding conditions to encourage adoption.

AI

AI standards work looks more fragmented than in mature engineering fields. No single body controls AI standard-setting end to end, and many “standards-like” documents take the form of risk frameworks, evaluation protocols and best practices, or benchmarks. In recent years, NIST has played a central coordinating role in the US by publishing risk-management frameworks and profiles, developing measurement and evaluation methods and best practices, and contributing technical expertise to broader standards efforts.

NIST’s Center for AI Standards and Innovation (CAISI) leads federal AI evaluations, establishes voluntary agreements with frontier AI developers for pre-deployment testing, and coordinates with DOD, DHS, and the intelligence community on national security-related AI assessments.³ In February 2026, CAISI launched the AI Agent Standards Initiative to develop standards for agentic AI systems. NIST’s Information Technology Laboratory (ITL) also contributes to domestic and international standards initiatives.

Beyond NIST, several non-governmental and international organizations play key roles in AI standards, evaluation, and governance (illustrative examples, not comprehensive):

Standards bodies

These organizations develop formal technical standards through structured, consensus-based processes:

ISO/IEC JTC 1/SC 42: The joint ISO and IEC committee that leads international AI standardization, covering terminology, governance, risk management, and lifecycle processes. Its standards are often referenced in national regulations and procurement requirements worldwide.
IEEE Standards Association (IEEE SA): Develops technical standards for AI system properties like transparency, robustness, and ethical design (e.g. the IEEE 7000 series on ethical concerns in system design).
International Telecommunication Union (ITU): ITU maintains the AI Standards Exchange Database, co-hosted the 2025 International AI Standards Summit in Seoul, and publishes standards relevant to AI in telecommunications, multimedia authenticity, and network infrastructure.

Third-party evaluation organizations

These organizations build evaluation tools or directly assess AI systems, generating evidence that can inform standards and policy:

METR (Model Evaluation & Threat Research): Evaluates frontier AI models for potentially dangerous capabilities, including autonomous replication, cyberoffense, and persuasion.
Apollo Research: Evaluates frontier AI models for deceptive and strategic behaviors, including scheming, sandbagging, and in-context reward hacking.
MLCommons: Maintains open benchmark suites (including MLPerf) widely used as reference points for AI model performance and hardware evaluation.
AI Evaluator Forum (AEF): A consortium of independent AI evaluation organizations focused on developing shared standards for third-party evaluations (founding members include METR and RAND).
Evals Consensus Initiative: A cross-sector effort to develop shared guidelines for how AI evaluations should be conducted and reported. The initiative convenes experts from industry, academia, government, and civil society to identify where agreement already exists on evaluation best practices, with the goal of producing concise, broadly endorsed guidance.

Governance and coordination bodies

These organizations develop principles, convene stakeholders, and shape the policy environment around AI standards without producing formal standards themselves:

OECD AI Policy Observatory: Curates policy-facing AI resources, supports cross-country comparison through frameworks and indicators, and houses the OECD AI Principles—the first intergovernmental standard on AI, adopted in 2019 and updated in 2024. The Global Partnership on AI (GPAI), which merged with the OECD in 2024, brings together 44 member countries and produced the 2025 General-Purpose AI Code of Practice alongside the EU AI Office. OECD expert groups (AIGO and ONE AI) convene government officials and outside experts to develop guidance that influences national AI strategies.
Frontier Model Forum (FMF): A coalition of frontier AI developers focused on advancing safety best practices and harmonizing approaches to frontier AI safety frameworks. FMF publishes technical reports on safety framework implementation, coordinates threat and vulnerability information-sharing among members, and funds independent safety research through the AI Safety Fund.
Partnership on AI (PAI): A multistakeholder forum that develops responsible AI guidance and supports shared evaluation practices across industry, civil society, and academic members.

Below is a timeline of key federal and international actions shaping how AI systems are standardized, evaluated, and governed, along with a compilation of major AI-related standards by issue area.

Major recent developments in AI standards and evaluations

February 2026: NIST’s Center for AI Standards and Innovation (CAISI) launches the AI Agent Standards Initiative to develop interoperability, security, and identity standards for agentic AI systems.
December 2025: NIST releases a preliminary draft of the Cybersecurity Framework Profile for AI, providing guidance on managing AI-specific cybersecurity risks.
September 2025: CAISI conducts evaluations of several leading DeepSeek models, assessing capabilities that may pose national security risks.
July 2025: The Trump administration releases the AI Action Plan, naming NIST in many recommended actions related to AI standards, evaluation, and measurement science.
June 2025: The US AI Safety Institute is reorganized as the Center for AI Standards and Innovation (CAISI), with a renewed focus on national security evaluations, international standards leadership, and voluntary industry agreements.
July 2024: NIST publishes the Generative AI Profile (AI 600-1), a companion to the AI RMF identifying 12 risks unique to or exacerbated by generative AI and mapping them to management actions.
November 2023: Following the Biden Administration’s Executive Order on AI, NIST establishes the US AI Safety Institute to develop science-based evaluation methods and coordinate with international counterparts. The Trump administration later rescinded this Executive Order in January 2025.
January 2023: NIST publishes the AI Risk Management Framework (AI RMF 1.0), completing a two-year public development process involving thousands of stakeholders. The framework becomes one of the most widely referenced AI governance documents globally.

Major AI-related standards & evaluations

Below is a non-exhaustive map from common AI policy concerns to standards, guidance, and evaluation frameworks that aim to address them.

AI concern	Relevant standards, guidance, or evaluation frameworks
Managing system risk and governance	NIST AI Risk Management Framework; NIST Generative AI Profile; ISO/IEC 42001 (AI management systems); ISO/IEC 23894 (AI risk management guidance); OECD AI Principles
Transparency, accountability, and responsible design	IEEE 7000 (ethical system design); IEEE 7001 (transparency); ISO/IEC/IEEE 24748-7000 (ethically aligned design processes)
Data quality and lifecycle controls	ISO/IEC 5259 series (data quality for analytics and ML); ISO/IEC 8183 (data life-cycle framework)
Performance, robustness, and safety	ISO/IEC 23282 (NLP evaluation methods); NIST ARIA evaluations and red-teaming guidance; MLCommons MLPerf benchmark suites; NIST AI 800-2 (Practices for Automated Benchmark Evaluations of Language Models)
Misuse and dual-use risk	NIST AI 800-1 (Managing Misuse Risk for Dual-Use Foundation Models); NIST Generative AI Profile misuse-assessment guidance
AI cybersecurity	NIST Cybersecurity Framework Profile for AI (draft, December 2025); NIST SP 800-53 Control Overlays for Securing AI Systems (in development)
Bias and discrimination	ISO/IEC TR 24027 (bias in AI-aided decision making); IEEE 7003 (algorithmic bias); NIST AI RMF fairness and trustworthiness guidance

Biosecurity and life sciences

Biosecurity relies on a patchwork of lab practices, clinical standards, safety engineering requirements, and sector-specific guidance rather than a single unified standard setter. In practice, biosafety and biosecurity work often turns on operational protocols, facility and equipment requirements, and verification methods that labs, hospitals, and manufacturers can implement and auditors can check.

The closest thing to a unifying baseline is the CDC and NIH’s Biosafety in Microbiological and Biomedical Laboratories (BMBL), which defines biosafety levels, risk assessment methods, and recommended practices for laboratories working with hazardous biological agents. The BMBL is advisory, not regulatory, but it functions as a de facto national standard: institutions incorporate its requirements into their own policies, and federal agencies reference it in grant conditions and oversight programs. When an institutional biosafety committee evaluates a proposed experiment or a federal inspector reviews a BSL-3 facility, the BMBL is typically the benchmark they’re working from.

Beyond lab safety, biosecurity standards also govern dual-use research oversight, synthetic biology screening, and diagnostic testing. The NIH Guidelines for Research Involving Recombinant or Synthetic Nucleic Acid Molecules set requirements for institutions receiving NIH funding, enforced through institutional biosafety committees. For clinical and diagnostic laboratories, CLSI standards and FDA-recognized consensus standards establish the technical requirements that labs must meet to ensure accurate, reproducible results. Biosecurity-related standards are also increasingly relevant to pandemic resilience in the built environment, where indoor air quality standards for infection prevention (such as ASHRAE Standard 241, the first standard specifically focused on control of infectious aerosols) set ventilation and filtration requirements for schools, hospitals, and other public spaces.

Unlike AI, where standards work is still defining what to measure, biosecurity standards often build on decades of established science, with the main policy challenges centering on keeping oversight current as capabilities advance (particularly in synthetic biology and gain-of-function research) and extending coverage to non-federally funded work.

Key non-governmental and international organizations in biosecurity standards include:

Laboratory and clinical standards organizations

These organizations develop standards for laboratory safety, diagnostics, and biomanufacturing quality:

ASTM International (e.g. E55, E35): Standards relevant to biopharmaceutical manufacturing, contamination control, disinfectants, sterilization, and antimicrobial testing.
Association for the Advancement of Medical Instrumentation (AAMI): Standards and guidance for sterilization, device safety, and related quality systems.
Clinical and Laboratory Standards Institute (CLSI): Clinical lab standards, methods, and guidance that support consistent diagnostics and biosafety practices.
ISO TC 212: Clinical laboratory testing and in vitro diagnostics.
ISO TC 276: Biotechnology standards, including bioprocessing and biobanking.
CDC and related US guidance publishers: Practice-defining guidance that often functions as a baseline in labs and regulated settings, especially when integrated into institutional requirements.
World Health Organization (WHO): Publishes the Laboratory Biosafety Manual, the foundational international reference for lab biosafety practices.

Biosecurity governance and dual-use risk organizations

Biosecurity standards respond to an evolving set of international treaties, norms, and governance mechanisms that define what risks need managing. Key elements of this landscape include:

Biological Weapons Convention (BWC): The foundational international treaty (1972) prohibiting biological weapons. The BWC establishes the legal norm but notably lacks a formal verification mechanism, which makes voluntary standards and screening tools especially important for ensuring compliance. Its review conferences have endorsed voluntary biorisk management standards, and ongoing working groups are negotiating strengthened compliance measures.
UN Security Council Resolution 1540 (2004): Requires all states to adopt domestic controls, including legislation, export controls, and physical protection measures, to prevent biological, chemical, and nuclear weapons from reaching non-state actors. These obligations drive much of the national-level standards infrastructure for pathogen security and biosafety.
Australia Group: An informal 42-country arrangement that coordinates national export controls on dual-use biological equipment, pathogens, and toxins, maintaining common control lists that function as de facto international standards for what materials require licensing.
WHO: Beyond its laboratory standards (listed above), WHO published the 2022 Global Guidance Framework for the Responsible Use of the Life Sciences, updating international biosafety and biosecurity norms to address dual-use research, gene editing, and synthetic biology.
International Gene Synthesis Consortium (IGSC) and International Biosecurity and Biosafety Initiative for Science (IBBIS): As gene synthesis becomes cheaper and more accessible, screening of nucleic acid orders is emerging as a critical biosecurity chokepoint. The IGSC developed the existing industry Harmonized Screening Protocol; IBBIS, launched in 2024, is building the Common Mechanism, a free, open-source international screening tool, and working toward enforceable international screening standards.

Below is a timeline of key federal and international actions shaping biosecurity and life sciences standards and oversight, along with a compilation of major bio-related standards by issue area.

Major bio-related standards

Below is a non-exhaustive map from common biosecurity concerns to standards, guidance, and evaluation frameworks that aim to address them.

Biosecurity concern	Relevant standards, guidance, or evaluation frameworks
Preventing lab-acquired infections	CDC Biosafety in Microbiological and Biomedical Laboratories (BMBL); WHO Laboratory Biosafety Manual; ISO 35001 (biorisk management); institutional biosafety committee (IBC) procedures
Managing BSL facility containment	BMBL facility and engineering requirements for BSL-1 through BSL-4; WHO laboratory biosecurity guidance; ASHRAE ventilation and indoor air quality standards; NFPA airflow and engineering control standards
Reducing occupational hazards	Lab-focused biosafety guidance in BMBL and WHO LBM; OSHA PPE standards; NFPA ventilation and fire-protection engineering standards; institutional biosafety and exposure-control practices
Preventing diversion or misuse of biological materials	NIH design requirements manual for biomedical laboratories and animal research facilities (DRM); Laboratory access-control and accountability procedures derived from international guidance
Dual use research of concern	NIH Guidelines for Research Involving Recombinant or Synthetic Nucleic Acid Molecules; USG Policy for Oversight of Dual Use Research of Concern and Pathogens with Enhanced Pandemic Potential (OSTP)⁴; Framework for Nucleic Acid Synthesis Screening (OSTP); Australia Group control list of dual use biological equipment
Protecting plant and animal health	International Plant Protection Convention (IPPC) standards; World Organisation for Animal Health (WOAH) codes and manuals
Reducing airborne disease transmission in buildings	ASHRAE Standard 241 (Control of Infectious Aerosols); ASHRAE ventilation and indoor air quality standards (Standard 62.1, 62.2); CDC ventilation guidance for infection prevention; WHO guidelines on ventilation and airborne precautions

Major recent developments in biosecurity standards and evaluations

August 2025: CDC and NIH publish the sixth edition of the Biosafety in Microbiological and Biomedical Laboratories (BMBL), adding new material on inactivation verification, sustainability, large-scale biosafety, and clinical lab practices.
May 2025: President Trump’s Executive Order on biological research safety directs OSTP and HHS to revise oversight of gain-of-function research and establish a strategy to govern non-federally funded risky life-science research.
April 2025: The congressionally established National Security Commission on Emerging Biotechnology releases its final report, recommending modernization of biotechnology oversight, expanded biodefense coordination, and stronger safeguards for responsible innovation, including development of international biosafety and biosecurity standards.
September 2024: OSTP publishes the revised Framework for Nucleic Acid Synthesis Screening, building on the 2023 HHS screening guidance to require that federally funded research procure synthetic nucleic acids only through providers that screen for sequences of concern and verify customer legitimacy. NIST is directed to develop supporting technical standards.
May 2024: OSTP releases updated government-wide policy for oversight of dual-use research of concern and pathogens with enhanced pandemic potential, expanding the scope of covered research and strengthening institutional review requirements.
June 2023: ASHRAE publishes Standard 241, the first US standard specifically focused on control of infectious aerosols in buildings, establishing minimum requirements for equivalent clean airflow in occupied spaces to reduce airborne disease transmission. The standard was developed with input from federal agencies and public health experts following heightened attention to airborne transmission during the COVID-19 pandemic.⁵

Why (not) work on technical standards & evaluations?

Standards and evaluations can shape technology faster than formal regulation. Here’s why this work matters for policy impact and career development.

The case for impact

When the US government creates, adopts, or references standards, it can steer technology development without passing new legislation or standing up large-scale funding programs. Standards and evaluations sit upstream of regulation, procurement, and funding, so small technical choices can propagate widely.

Early influence on technology design: When technologies move quickly, rulemaking often lags. Standards let agencies shape private-sector design choices before products scale, by publishing guidance, building evaluation programs, or referencing consensus standards in procurement. Because standards can specify what systems must achieve (e.g. accuracy thresholds or safety benchmarks) rather than dictate how they must work, they preserve room for innovation while advancing public objectives. Performance standards for vehicle fuel efficiency, for example, set targets without prescribing engine designs.
Reduced fragmentation: When multiple companies and standards bodies push incompatible definitions, metrics, or test methods, the result is confusion, duplicated compliance costs, and the risk that a single dominant player locks in its preferred approach as the de facto standard. Government-backed baselines can preempt this.
Trust through evidence: Public evaluations and shared test protocols make claims about safety, reliability, or transparency more verifiable. NHTSA’s vehicle crash-test ratings, for example, give consumers a credible, independent basis for comparing safety.
International reach: US-origin standards are often adopted or adapted by other governments, international buyers, and multilateral bodies. Countries that lead in developing standards for emerging technologies can shape global markets around their domestic industry’s approach, which is one reason US participation in international standards bodies is frequently framed as a national competitiveness issue.
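
The difference between specifying what a system must achieve and dictating how it must work can be made concrete with a small sketch. Everything in it is hypothetical: the `PerformanceRequirement` class, the 0.90 accuracy threshold, and the toy parity task are illustrative assumptions, not drawn from any published standard. The only point is that the check looks at outputs on a shared test set, so differently built systems are judged by the same rule.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class PerformanceRequirement:
    """A hypothetical performance-based requirement: what to achieve, not how."""
    name: str
    min_accuracy: float  # illustrative threshold, not from any real standard


def evaluate(system: Callable[[int], bool],
             test_cases: Sequence[Tuple[int, bool]],
             requirement: PerformanceRequirement) -> Tuple[float, bool]:
    """Run a system on the shared test set and compare accuracy to the threshold."""
    correct = sum(1 for x, expected in test_cases if system(x) == expected)
    accuracy = correct / len(test_cases)
    return accuracy, accuracy >= requirement.min_accuracy


# Shared test protocol for a toy task: decide whether a number is even.
TEST_CASES = [(n, n % 2 == 0) for n in range(10)]
REQUIREMENT = PerformanceRequirement(name="toy-parity-check", min_accuracy=0.90)


def system_a(x: int) -> bool:
    return x % 2 == 0            # arithmetic design


def system_b(x: int) -> bool:
    return (x & 1) == 0          # bitwise design, same behavior


def system_c(x: int) -> bool:
    return x in {0, 2, 4}        # incomplete lookup table; misses 6 and 8


for system in (system_a, system_b, system_c):
    accuracy, passed = evaluate(system, TEST_CASES, REQUIREMENT)
    print(f"{system.__name__}: accuracy={accuracy:.2f} passed={passed}")
```

In this sketch, `system_a` and `system_b` use different designs and both meet the requirement, while `system_c` falls short. The requirement never inspects how any of them is implemented.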

The case for professional growth

Standards and evaluation work builds technical depth and broad policy fluency simultaneously, with skills that transfer to regulatory roles, federal R&D program management, procurement, and private-sector compliance and safety teams.

Drafting the specifications, not just advising on them: Unlike most policy roles, standards work involves writing the actual measurement methods, thresholds, and test protocols that can later become binding requirements. This develops an unusually concrete form of technical policy expertise.
Early vantage point: Standards work is where key terms get defined, metrics get chosen, and “good enough” gets quantified, often before regulators or procurement offices lock in downstream requirements. This gives practitioners early visibility into how a technology area is likely to be governed.
Cross-sector relationships: Standards work often requires structured engagement with industry, academia, and other agencies through committees, workshops, and public comment cycles. The professional network this builds spans sectors in a way that few other government roles offer.

Limitations

Voluntary by default: Standards don’t bind unless agencies embed them in regulation, procurement requirements, grant conditions, or accreditation programs. This means adoption depends on stakeholders seeing enough value to comply voluntarily.
Political contestation: Standards development is shaped by the interests and power dynamics of its participants. Industry participants lobby for specifications that reduce their compliance costs, concentrated technical expertise can skew what looks “feasible” even without bad intent, countries compete for influence over international standards to advantage their domestic industries, and advocacy groups push for more stringent requirements.
Update lag: Fast-moving technologies can outpace consensus processes, leaving standards outdated shortly after publication. Detailed standards take longer to develop, which creates a persistent tension between precision and timeliness, especially for AI and other emerging technologies. Even in relatively mature fields like cybersecurity risk management, NIST standards can take five to ten years to develop or update.
Measurement gaps: Standards and evaluations tend to measure what is technically straightforward to test, not necessarily what matters most for safety or public welfare. When standards set measurable targets, organizations naturally optimize for those metrics, sometimes at the expense of harder-to-measure properties. An AI model might score well on a benchmark without being safe or reliable in deployment; a lab might pass an inspection checklist without addressing its most serious operational risks.
Consensus tradeoffs: Standards developed through broad consensus can reflect what is widely achievable rather than what is most ambitious. The need for universal applicability can push requirements toward a lowest-common-denominator baseline, though agencies like NIST also conduct measurement research that feeds into more demanding future standards.

How does the USG engage with standards? Who’s involved?

US government agencies engage with standards and evaluations in three main ways. The distinction matters because each mode involves different actors, carries different legal weight, and offers different leverage points for people working in the space.

Agencies creating standards themselves: Some agencies develop technical standards directly, such as NIST’s measurement methods or FDA’s device classification protocols. The resulting standards can be voluntary (e.g. the AI Risk Management Framework) or carry binding force (e.g. EPA emissions standards), depending on the agency’s statutory authority.
Agencies participating in external standard-setting bodies: Most global standards are developed not by governments but by organizations like ISO and ANSI-accredited technical committees. In these settings, agencies participate as one voice among industry, academic, and civil society representatives rather than as the decision-maker, which means influence depends on showing up consistently and contributing technical expertise.
Agencies or legislators adopting or referencing external standards: Rather than writing technical requirements from scratch, agencies and lawmakers frequently incorporate existing consensus standards into regulations, procurement rules, or grant conditions. This is the primary mechanism through which voluntary standards acquire legal force, and it means that work done in external standards bodies (the second mode above) can eventually become binding through government action.

Agencies creating standards themselves

The process for creating technical standards varies by agency, type of standard, and whether the result is mandatory or voluntary. At a conceptual level, most agency-led processes include these steps:

Identify the technical need: The agency identifies the need for a new or updated standard, whether triggered by a regular review cycle, a statutory mandate, new scientific evidence, or an incident that exposes gaps. This typically starts with program offices, technical divisions, and senior leadership.
Convene subject-matter experts: The agency assembles technical expertise to inform the standard’s scope and content, through workshops, requests for information (RFIs), advisory committee meetings, or interagency working groups. Participants typically include federal scientists and engineers, external experts from academia or industry, interagency partners, and advisory boards.
Draft technical content: Internal teams—agency technical staff, policy analysts, legal counsel, and sometimes contractors or federally funded research and development centers—draft the standard, including definitions, technical requirements, testing methods, and data expectations. Drafts typically draw on existing research, operational experience, and relevant external standards.
Seek broader stakeholder input (when applicable): Agencies may release draft standards for public comment or hold meetings to gather feedback. Some agencies, like NIST, do this routinely; others, like CDC, do so selectively. Input comes from industry groups, think tanks, laboratories, researchers, nonprofits, state and local partners, and standards development organizations.
Revise and finalize: The agency incorporates feedback, reconciles conflicting input, and verifies internal consistency. Final documents undergo legal review and leadership approval before publication.
Implement and update: The agency communicates the new standard to the relevant audience (e.g. regulated entities, federal programs) and revises it over time as science advances, new risks emerge, or operational experience reveals gaps. Program offices, compliance teams, and technical staff manage this ongoing cycle.

Examples of agency-created standards in emerging tech

Biosafety in Microbiological and Biomedical Laboratories (BMBL) – Sixth edition

What it is: BMBL is CDC and NIH’s primary biosafety reference for biomedical, clinical, and research laboratories. It provides recommended best practices, risk assessment methods, facility expectations, and agent-specific guidance for work at Biosafety Levels 1–4.

Why it matters: BMBL is advisory, not regulatory, but it functions as a de facto national biosafety standard. It is widely treated as the baseline for laboratory biosafety and is incorporated into many federal programs, institutional biosafety policies, and accreditation frameworks.

How it was produced:

Identify the technical need: CDC and NIH update BMBL periodically to reflect new science, evolving laboratory practices, and lessons from biosafety incidents.
Convene subject-matter experts: CDC and NIH led the process, convening scientific and professional experts through technical working groups, advisory committees, and external review.
Draft technical content: Federal biosafety experts prepared revised sections, agent summaries, and appendices. The sixth edition added new material on inactivation verification, laboratory sustainability, large-scale biosafety, and clinical laboratory practices.
Seek stakeholder input: More than 200 scientists, biosafety officers, and professional colleagues contributed as reviewers, guest editors, and subject-matter experts.
Revise and finalize: CDC and NIH incorporated feedback, harmonized recommendations with guidance from other organizations and federal agencies, and approved the final text.
Publish and implement: The sixth edition was published in August 2025. CDC and NIH provide online training and downloadable resources to support adoption, and future updates will continue the periodic revision cycle.

NIST AI Risk Management Framework (AI RMF)

What it is: The NIST AI RMF provides voluntary guidance for managing risks to individuals, organizations, and society associated with AI systems. It aims to help organizations integrate trustworthiness (e.g. transparency, safety, and accountability) into AI design, development, and deployment.

Why it matters: Although voluntary, the AI RMF has become a widely referenced framework for federal agencies, AI developers, and private companies. It serves as a common language for defining AI risks and recommended practices.

How it was produced:

Identify the technical need: Congress directed NIST to develop a framework to support trustworthy AI in the National Artificial Intelligence Initiative Act of 2020.
Convene subject-matter experts: NIST convened a series of public workshops with researchers, companies, civil society organizations, and government partners. NIST also issued an RFI in the Federal Register to collect input on the scope and structure of the framework, and the comments it received are publicly available.
Draft technical content: NIST issued several drafts, notably a second draft in August 2022, and invited written feedback by email and discussion at an open workshop.
Seek broader stakeholder input: Public comment periods accompanied each draft, drawing input from hundreds of organizations and individuals across sectors.
Revise and finalize: NIST iteratively revised the Framework based on feedback.
Publish and implement: NIST released AI RMF 1.0 in January 2023 and has continued to build on the framework with companion profiles, including the Generative AI Profile.⁶

Agencies participating in external (non-governmental) standard-setting bodies

Most technical standards are developed not by governments but by non-governmental standards development organizations. The National Technology Transfer and Advancement Act and OMB Circular A-119 direct federal agencies to rely on voluntary consensus standards whenever feasible, so agencies have strong incentives to shape those standards from within. Agencies can’t control the process, but they can participate by:

Nominating subject-matter experts to serve on technical committees
Voting on draft standards through formal ballot processes
Contributing research, test data, or technical evidence
Attending working groups or plenary meetings
Establishing or participating in ANSI-accredited US Technical Advisory Groups (TAGs), which develop consensus US positions for international standards negotiations

Examples of NIST engaging with ANSI and ISO

NIST ITL as an ANSI-accredited standards developer: NIST’s Information Technology Laboratory (ITL) has been accredited by ANSI as a standards developer since 1984. Today, NIST ITL maintains one major American National Standard: ANSI/NIST-ITL 1-2011, a globally used specification for the interchange of fingerprint, facial, and other biometric data. This standard is widely adopted across government and intergovernmental systems and demonstrates how federal agencies contribute to voluntary consensus standards without controlling the broader standards ecosystem.

NIST administers the US TAG to ISO/TC 276 (Biotechnology): NIST administers the US Technical Advisory Group (TAG) for ISO Technical Committee 276, which develops international biotechnology standards. The TAG is accredited by ANSI, and its membership includes representation from industry, academia, nonprofits, and federal agencies. As TAG administrator, NIST coordinates US input on ISO ballots, drafts US positions, and organizes participation in working groups on bioprocessing, analytical methods, biobanking, and data integration. This structure demonstrates the relationship between technical leadership (NIST), US consensus building and coordination (ANSI), and international standards development (ISO).

Agencies or legislators adopting or referencing external standards

External standards become part of federal policy when they are incorporated into regulation, procurement requirements, guidance documents, or evaluation and accreditation programs. Adoption allows the government to use well-established technical expectations without developing every standard internally.

Agencies adopt or reference external standards in several ways:

Regulation: Agencies write legally binding rules that require compliance with specific external standards. Incorporation by reference allows regulators to use detailed technical material without reprinting it in the regulation itself.
Procurement and contracting: Acquisition programs reference external standards in Requests for Proposals (RFPs), Statements of Work (SOWs), and contract clauses. Contractors must meet the specified standards for their products or services to be accepted.
Guidance or recommended practices: Agencies issue nonbinding guidance that recommends the use of external standards. While not legally enforceable, these recommendations influence practice across regulated industries and can shape expectations in advance of formal regulation.
Evaluation, certification, and accreditation programs: Agencies rely on external standards in conformity assessment processes. Many federal certification, accreditation, and testing programs require compliance with recognized consensus standards to demonstrate safety, quality, or performance.

Examples of agencies referencing voluntary standards

CDC use of HL7 for immunization data exchange

What it is: Health Level Seven (HL7) is a set of global voluntary consensus standards for exchanging clinical and administrative health data. The HL7 Version 2.5.1 Implementation Guide for Immunization Messaging is the consolidated standard used across the US for immunization information exchange.

Why it matters: CDC relies on HL7 standards to ensure consistent and interoperable reporting between electronic health records (EHRs), immunization information systems (IIS), pharmacies, clinics, and state or local public health departments. Because HL7 is an external, consensus-based standard, CDC can use it to define uniform data formats and message structures without developing a government-unique specification. This improves data quality and enables national-scale immunization surveillance.
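
To give a sense of what a shared message structure looks like in practice, here is a minimal sketch of reading a pipe-delimited HL7 Version 2 message. The sample message is simplified and invented for illustration: it is not a conformant message under the immunization implementation guide, and the sender names, patient, identifiers, and field values are hypothetical. The `parse_segments` and `field` helpers are likewise assumptions made for this example, not CDC or HL7 tooling, and they ignore the escape sequences and field repetitions that a production parser would need to handle.

```python
from typing import Dict, List

# Simplified, invented sample; NOT a conformant immunization message.
# HL7 v2 separates segments with carriage returns, fields with '|',
# and components within a field with '^'.
SAMPLE_MESSAGE = "\r".join([
    r"MSH|^~\&|EXAMPLE_EHR|EXAMPLE_CLINIC|EXAMPLE_IIS|EXAMPLE_STATE|20240101120000||VXU^V04^VXU_V04|MSG0001|P|2.5.1",
    "PID|1||12345^^^EXAMPLE_CLINIC^MR||DOE^JANE||20200115|F",
    "RXA|0|1|20240101||08^Hep B, adolescent or pediatric^CVX|999",
])


def parse_segments(message: str) -> Dict[str, List[List[str]]]:
    """Group segments by their three-letter ID; each segment is a list of fields."""
    segments: Dict[str, List[List[str]]] = {}
    for line in message.split("\r"):
        if not line:
            continue
        fields = line.split("|")
        segments.setdefault(fields[0], []).append(fields)
    return segments


def field(segment: List[str], number: int) -> str:
    """Return HL7 field `number` (1-based), or '' if the field is absent.

    MSH is special: MSH-1 is the '|' separator itself, so after splitting
    on '|', MSH-n sits at list index n-1 rather than n.
    """
    if segment[0] == "MSH":
        if number == 1:
            return "|"
        index = number - 1
    else:
        index = number
    return segment[index] if index < len(segment) else ""


segments = parse_segments(SAMPLE_MESSAGE)
msh = segments["MSH"][0]
rxa = segments["RXA"][0]

message_type = field(msh, 9).split("^")[0:2]   # message code and trigger event
version = field(msh, 12)                       # HL7 version declared by the sender
vaccine_code, vaccine_name, coding_system = field(rxa, 5).split("^")

print(message_type, version, vaccine_code, coding_system)
```

Because every sender and receiver agrees on where the message type, version, and vaccine code live, a receiving system can read records from many different vendors with one parser instead of one per data source.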

DOD adoption of ISO standards for procurement

What it is: The DOD adopts some ISO standards through formal adoption notices issued by the Defense Logistics Agency (DLA).

Why it matters: Adoption makes these international voluntary consensus standards usable as contract requirements across DOD acquisition programs. Rather than writing military-unique specifications, DOD can reference these adopted ISO standards directly in solicitations, SOWs, and quality or testing requirements.

Working on standards & evals: types of roles and career opportunities

Type of Role	Responsibilities	Typical background (for full-time roles)	Security clearance	Location	Career guides & opportunities
Federal standards & evaluations staff	Develop, maintain, or update technical standards; run public comment processes; coordinate with SDOs; conduct risk or performance assessments; translate science into technical guidance	Bachelor’s degree for junior roles; advanced technical training for specialized roles; experience in scientific, engineering, or policy analysis	Sometimes required (for national security-adjacent roles)	Washington, DC; Gaithersburg, MD (NIST campus); some roles at national labs or field offices	Executive Branch, NIST, FDA, CDC, National Labs and FFRDCs
Acquisition & procurement staff	Integrate standards into solicitations and contracts; evaluate vendor compliance; work with test and evaluation offices; ensure systems meet international or domestic standards requirements	BA/MA; engineering, procurement, or systems background helpful; program management experience	Sometimes required (for national security-adjacent roles)	Primarily Washington, DC; also military installations and agency field offices across the US	Executive Branch, DOD, DHS
Think tank researchers or advocates	Conduct policy research and analysis; submit comments on proposed standards and participate in standard-setting processes; develop recommendations; advocate for policy changes; engage with policymakers and media	BA or MA for junior roles; MA/JD/PhD for mid-career/senior; subject matter expertise; experience in policy analysis or communications	Rarely required	Primarily Washington, DC; some in major cities or remote	Working in think tanks (+ fellowships, think tanks working on emerging tech policy, & resources)
Congressional staff	Support members and committees in overseeing standards-related agencies (particularly NIST); shape legislation that affects standards policy, including authorization and appropriations for federal measurement and standards programs; prepare hearings on standards-related topics; engage with NIST, SDOs, industry, and other stakeholders	BA for junior roles; BA/MA/JD for mid-career/senior roles; strong communication skills. Prior Hill experience matters more than formal credentials for senior roles; fellowships can help bypass this requirement.	Rarely required (exceptions include some Armed Services or Intelligence committee staff)	Washington, DC	Working in Congress (+ internships, fellowships, & full-time roles)
Multilateral standards organizations	Support development of international consensus standards; coordinate technical committees; facilitate cross-country negotiations; align standards with global regulatory, safety, and interoperability needs; coordinate US participation through national standards bodies like ANSI	BA/MA for policy or coordination roles; STEM or engineering background for technical positions; experience with standards, international policy, or technical writing	Not required	Geneva (ISO, IEC, ITU); remote and US-based roles at ANSI and related institutions	Multilateral governance careers; career pages for ISO, ANSI, IEC, ITU
Industry and professional standards organizations⁷	Manage standards committees; coordinate voluntary consensus processes; draft and maintain technical specifications; engage with companies, researchers, and government liaisons; track emerging technology needs in sectors such as AI, health IT, aerospace, and biotech	BA/MA for policy, program, or coordination roles; STEM or engineering for technical committee support; experience with industry standards or applied research	Not required	Across the US, with hubs in Washington, DC, New York, and Boston	Career pages for IEEE, ASTM, HL7, SAE, AAMI; early-career standards coordination roles at major SDOs
Third-party evaluation organizations	Evaluate technologies against safety, performance, or risk criteria; develop evaluation methodologies, benchmarks, and test protocols; publish findings that inform standards development and policy decisions; collaborate with government agencies and standards bodies on evaluation design	Advanced degree (MA/PhD) in computer science, statistics, or a relevant technical field; research experience in machine learning, measurement science, or experimental design; familiarity with benchmark development and evaluation methodology	Sometimes required (for national security-related evaluations)	Washington, DC; San Francisco Bay Area; remote (varies by organization)	Career pages for METR, Apollo Research, MLCommons
Conformity assessment & testing staff	Conduct product testing, inspection, or certification against published standards; perform laboratory accreditation assessments; support organizations seeking ISO, industry, or government-recognized certification; evaluate whether products, systems, or processes meet specified technical requirements; document and report test results for regulatory or procurement purposes	BA/MA in engineering, science, or quality management; laboratory experience or auditing credentials (e.g. ISO lead auditor certification) helpful; familiarity with relevant testing standards and conformity assessment procedures	Rarely required; some defense or national security testing roles may require clearance	Across the US; testing and certification organizations operate nationally, with concentrations near manufacturing hubs and federal agency locations	Career pages for UL Solutions, Intertek, BSI, TÜV; ANSI National Accreditation Board (ANAB) lists accredited certification bodies

Preparing for standards & evals roles

Building technical depth in a relevant domain. Most standards roles require enough subject-matter expertise to evaluate technical claims, assess tradeoffs in standard design, and engage credibly with engineers, scientists, and industry representatives. Graduate training in a STEM field, law, or a related discipline is common at agencies like NIST, though not always required for policy-focused positions at think tanks or in Congress.
- For AI evaluation roles specifically, technical expertise in machine learning methods, statistical measurement, and experimental design is valuable, and many evaluation organizations draw heavily from candidates with graduate-level research experience (e.g. PhDs or master’s degrees in computer science, statistics, or related fields).
Learning how standards processes work. Familiarize yourself with how standards are developed, adopted, and referenced, including the roles of organizations like NIST, ANSI, ISO, and sector-specific standards development organizations. Some organizations offer in-depth public resources on standards and evals, including UPenn’s course series and ANSI’s overview of the US standards system (see more below). Reading recent Federal Register notices related to standards, such as requests for comment on NIST frameworks, can also help you understand how agencies engage with the public during standard development.
Participating in standards development. Many standards committees are open to individual participants, including early-career professionals. Joining or even observing a technical committee at an organization like IEEE, ASTM, or an ANSI-accredited body gives you firsthand exposure to consensus processes, stakeholder negotiation, and technical drafting.
Gaining experience in measurement, testing, or evaluation. Hands-on experience with test design, data collection, benchmarking, or conformity assessment is valuable, whether through lab work, quality assurance roles, or research assistantships. If you’re in a technical field, look for opportunities to work on evaluation protocols, contribute to benchmark development, or support compliance testing. NIST and national labs often hire students and postdocs for measurement-focused work, and MLCommons runs a Rising Stars initiative for recent PhD grads.
- For AI, contributing to open evaluation infrastructure—such as developing or maintaining benchmarks on platforms like Hugging Face, or participating in shared evaluation challenges run by organizations like MLCommons—can build practical experience and visibility in the evaluation community.
Completing a relevant fellowship or internship. Several programs place early-career professionals in standards-adjacent roles: the AAAS Science & Technology Policy Fellowship places scientists and engineers in federal agencies (including NIST and other standards-relevant offices), and NIST’s own internship and postdoctoral programs offer direct exposure to measurement science and standards development. Congressional fellowships and internships can also provide relevant experience for those interested in the legislative side of standards policy.
Engaging with the standards and evaluation community. Think tanks, professional societies, and advocacy organizations that work on standards policy regularly host public events, publish research, and accept public comments. Tracking organizations like ANSI, ASTM, IEEE, and NIST, as well as think tanks covering standards-related topics, can help you understand current debates and build professional relationships. Attending events like ANSI’s annual World Standards Day, NIST workshops, or SDO committee meetings can expand your network.
- The AI evaluation community has its own professional circuit, including conferences like NeurIPS, ICML, and FAccT, along with workshops hosted by organizations like METR and the AI Evaluator Forum.
Writing or publishing on standards-related topics. Demonstrating that you can analyze and communicate about standards and evals, whether through policy briefs, blog posts, academic papers, or public comments, signals both technical credibility and policy fluency. Submitting a comment on a NIST draft framework or writing an analysis of a proposed standard for a policy outlet are practical ways to build a portfolio.
- For evaluation roles, publishing in venues focused on measurement or benchmarking (e.g. NeurIPS, ICML, and FAccT) can signal technical credibility to both government and independent evaluation organizations.

Appendices: Day-in-the-life

Day in the life: Stories from people working on technical standards

In testimony to Congress, Dr. James Olthoff, then acting Director of NIST, described the agency’s role in standards setting as one of active coordination rather than passive observation:

It is important to appreciate that participation in international consensus standards development by Federal agencies does not equal a passive engagement or abdication of Federal responsibilities to the private sector. NIST recognizes that for certain sectors of exceptional national importance, self-organization may not produce a desirable outcome on its own in a timely manner. In such instances, where time is of essence to address national priorities, the Federal government can play the important role of an “effective convener” to catalyze standards development critical for these sectors. Current national priorities include the development and deployment of artificial intelligence (AI) systems, bioscience technologies, robust cryptography for a post-quantum world, and cybersecurity standards for securing Federal government IT systems and the interactions of these systems.

In testimony to Congress, Dr. Laurie Locascio, former NIST director and now president of ANSI, described how NIST collaborates with industry to build trust in emerging technologies and thereby facilitate their adoption:

NIST is the only Federal laboratory with a mission entirely focused on driving U.S. innovation and industrial competitiveness – and we do this through well-understood, verified measurements and standards that are critical for every step in the product development and commercialization cycle – from invention to refinement, manufacture to sales, and regulation to decommissioning. NIST collaborates with industry in this work every step of the way. By working together and asking, “What are the difficult measurement problems that are holding your industry back?”, NIST can help entire sectors overcome barriers to domestic and international competitiveness. NIST’s measurement and standards solutions form the basis for the Nation’s innovation to flourish – and not just for a month, or a year, but to set up U.S. businesses for decades of technological leadership to ensure our economic and national security…We know that building trust in the technology of the future is critical. If new technologies are not trusted, they are not adopted. If new technologies are not adopted, we lose out on establishing leadership in the next cutting-edge technology, and on the benefits to our economic and national security.