ONNX Origins

Laying the Groundwork for an Open Standard

Part 1 of a series on ONNX.

In this section, we cover:

  • How ONNX originated and why it was created
  • The roles Microsoft and other industry leaders played in its development
  • The evolution of ONNX as an open, widely adopted standard

1: Introduction to ONNX and ONNX Runtime

1.1: What ONNX Is and Why It Was Created

Open Neural Network Exchange (ONNX) is an open standard for representing machine learning models. It was initiated by leading AI companies to address the problem of framework fragmentation: each deep learning framework (TensorFlow, PyTorch, Caffe2, etc.) traditionally had its own model format, making it hard to move models between tools. ONNX was created in 2017 by Microsoft and Facebook (with later support from AWS and others) as a shared model representation for interoperability. By providing a common set of operators and data types for ML models, ONNX allows developers to train a model in one framework and then use or deploy it in another without needing to rewrite the model. This portability was designed to let people “choose the right framework for the task” during development and seamlessly transition to a different framework or runtime for production. In short, ONNX’s creation was motivated by the need for an open, framework-agnostic model format to make AI development more flexible and to avoid lock-in to any single ecosystem.

ONNX not only makes models portable across frameworks but also helps hardware interoperability. Hardware providers (CPU, GPU, TPU, NPU vendors) can optimize for ONNX’s standard ops and have those optimizations benefit multiple frameworks at once. Instead of writing separate low-level kernels for each ML library, a vendor can target the ONNX format and instantly make those speed-ups available to any framework that exports to ONNX. This “write once, run anywhere” approach is why ONNX has been embraced as a core piece for efficient deployment in diverse environments.

1.2: Model Portability and Interoperability with ONNX

A primary goal of ONNX is to enable model portability – the ability to train or define a model in one environment and then use it in another. With ONNX, a data scientist can train a model using PyTorch and export it to ONNX, and a developer can load that same ONNX model in TensorFlow, MXNet, or any other framework or runtime that supports ONNX. This interoperability yields several key benefits:

  • Framework flexibility – teams can pick the best tool for training without dictating the deployment stack.
  • No lock-in – models outlive any single framework or vendor ecosystem.
  • Deployment reach – one exported model can run on servers, desktops, mobile, and edge devices through ONNX-compatible runtimes.
  • Shared optimizations – hardware speed-ups that target ONNX benefit every framework that exports to it.
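To make the round trip concrete, here is a minimal sketch: export a small PyTorch model to ONNX, then run it with ONNX Runtime. The toy model, file name, and shapes are illustrative assumptions, not part of any particular workflow.

```python
import numpy as np
import torch
import torch.nn as nn
import onnxruntime as ort

# A toy network standing in for any trained model (illustrative only)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

# Export to ONNX; the dummy input fixes the graph's input signature
dummy = torch.randn(1, 4)
torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["output"],
    dynamic_axes={"input": {0: "batch"}},  # allow a variable batch size
)

# Load the exported model in ONNX Runtime and run inference
session = ort.InferenceSession("model.onnx")
batch = np.random.randn(3, 4).astype(np.float32)
(output,) = session.run(None, {"input": batch})
print(output.shape)  # (3, 2)
```

Any other ONNX-capable framework or runtime could consume the same model.onnx file in the same way.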

Because of these advantages, ONNX has evolved into a widely adopted standard for exchanging models. Many popular frameworks and tools (PyTorch, TensorFlow via TF-ONNX, scikit-learn via skl2onnx, Keras, MXNet, etc.) can export or import ONNX models. This broad adoption confirms the value of ONNX in enabling an ecosystem where models are not tied to a single framework. It empowers developers to mix and match the best training environment with the best inference environment without being blocked by incompatibilities. In summary, ONNX was created to be the lingua franca of ML models, emphasizing portability and interoperability across the AI landscape.
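As one example of that converter ecosystem, the sketch below uses skl2onnx to turn a fitted scikit-learn model into an ONNX file. The dataset, model choice, and file name are illustrative; the sample input tells the converter the expected input type and shape.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from skl2onnx import to_onnx

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(n_estimators=10).fit(X, y)

# Convert the fitted model; tree-based models are expressed with the
# ONNX-ML operator domain rather than the core neural-network opset
onx = to_onnx(clf, X[:1].astype("float32"))
with open("rf_iris.onnx", "wb") as f:
    f.write(onx.SerializeToString())
```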

1.3: The Role of ONNX Runtime in Efficiently Running ONNX Models

While ONNX defines how models are represented, it does not by itself execute models. This is where ONNX Runtime (ORT) comes in. ONNX Runtime is a high-performance inference engine developed by Microsoft to run ONNX models efficiently on different hardware and operating systems. In essence, if ONNX is the standardized model format, ONNX Runtime is the engine that actually loads and runs those models across various environments. It was open-sourced in 2018 to encourage broad use and contribution.

Key characteristics of ONNX Runtime include:

  • Cross-platform support – it runs on Windows, Linux, and macOS, with builds for server, desktop, mobile, and edge targets.
  • Graph optimizations – models are optimized at load time (e.g., operator fusion and constant folding) before execution.
  • Pluggable execution providers – hardware-specific backends such as CUDA, TensorRT, and OpenVINO can accelerate all or part of a model, with CPU execution as the universal fallback.
  • Multiple language APIs – bindings are available for Python, C, C++, C#, and Java, among others.
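A short sketch of how these pieces surface in the Python API, assuming a placeholder model.onnx; which providers appear depends on how the installed ONNX Runtime build was compiled:

```python
import onnxruntime as ort

# List the execution providers compiled into this ONNX Runtime build
print(ort.get_available_providers())

# Request full graph optimization (operator fusion, constant folding,
# and similar rewrites) when the model is loaded
opts = ort.SessionOptions()
opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

# Explicitly naming the CPU provider keeps this portable across builds
session = ort.InferenceSession(
    "model.onnx",
    sess_options=opts,
    providers=["CPUExecutionProvider"],
)
```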

In summary, ONNX Runtime serves as the universal engine for ONNX models, focusing on runtime efficiency and flexibility. By separating the model format (ONNX) from the execution (ORT), developers get a powerful combination: they can train with any framework that exports ONNX, then rely on ONNX Runtime to execute the model with high performance on the target platform of their choice. This separation of concerns (training vs. inference) and emphasis on optimization and hardware utilization is what makes ONNX Runtime particularly valuable for deploying AI models in practice.

2: The Origins of ONNX and Industry Involvement

2.1: Companies Behind ONNX: Microsoft, Facebook, and Others

ONNX’s inception was a collaborative industry effort. It was co-founded by Microsoft and Facebook in 2017 as an open-source project to facilitate AI model interchange. At the time, Facebook’s AI researchers were looking for an easy way to move models between the research-oriented PyTorch framework and production tools like Caffe2, and Microsoft had similar needs with its Cognitive Toolkit (CNTK) and other AI platforms. The two companies joined forces to create ONNX as a common format. Shortly after, Amazon Web Services (AWS) joined the effort and contributed to its development. These tech giants recognized that an open standard would benefit the entire industry, not just their own products.

As ONNX gained momentum, many others in the community and industry got involved. In 2018 and 2019, companies such as IBM, Intel, NVIDIA, Qualcomm, Huawei, and AMD expressed support for or contributed to ONNX. ONNX eventually became part of the Linux Foundation’s LF AI & Data initiative, cementing its status as an open standard with governance independent of any single company. The ONNX community now includes dozens of hardware vendors, cloud providers, and framework maintainers. This broad industry involvement has produced a rich ecosystem: most major deep learning frameworks can export ONNX models, and many hardware vendors provide ONNX support in their drivers or libraries. In short, while Microsoft and Facebook were the initial drivers, ONNX quickly grew into a community-driven standard embraced by a wide range of AI players.

2.2: Evolution of ONNX as an Open Standard for ML Models

Since its introduction, ONNX has evolved both technically and organizationally into the de facto open standard for machine learning model representation. Technically, ONNX started with a focus on neural network inference (operators for feed-forward execution) and has expanded to cover traditional ML (the ONNX-ML profile for tree models, linear models, and the like) and even some training capabilities. The ONNX operator set (standardized functions such as Conv, Relu, and Gemm) is versioned through opsets: each new opset adds operations needed by cutting-edge models while preserving backward compatibility for older ones. This means the format keeps up with new innovations in AI (new layers, new model types) in a controlled way.
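Opset versioning is visible directly in the model file. As a minimal sketch, assuming a model.onnx produced by any exporter, the onnx Python package can report the declared opsets and validate the graph against the specification:

```python
import onnx

model = onnx.load("model.onnx")

# Each model declares the opset version(s) it was exported against;
# an empty domain string means the core ai.onnx operator set
for opset in model.opset_import:
    print(opset.domain or "ai.onnx", opset.version)

# Verify the graph is well-formed under the declared opsets
onnx.checker.check_model(model)
```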

Organizationally, in 2019 ONNX was accepted into the LF AI Foundation (the Linux Foundation’s AI umbrella), reinforcing an open governance model. The ONNX project has an open technical steering committee with representatives from various companies, and this governance has encouraged even more collaboration. For example, Intel contributed its nGraph compiler integration, and others contributed converters and testing tools. The result is that ONNX is not owned by any one vendor; it is a true industry standard. The specification is open-source, with enhancements proposed and discussed in public. This evolution has solidified ONNX’s role as a stable, agreed-upon format that everyone in the AI community can rely on for interoperability.

Because of this open, collaborative development, ONNX has achieved widespread adoption. There is a “model zoo” of pretrained ONNX models shared by the community, and ONNX is integrated into many AI pipelines (from research to cloud deployment). Many of the leading AI frameworks and tools provide built-in ONNX export/import, making ONNX a common part of model development. All these factors highlight how ONNX grew from a Microsoft-Facebook project into an industry-wide standard for representing ML models.

2.3: Microsoft’s Connection to ONNX Runtime

Microsoft not only co-created ONNX; it also spearheaded the development of ONNX Runtime as a practical way to use ONNX models in production. Within Microsoft, there was a pressing need to support a variety of teams building ML models in different frameworks (TensorFlow, PyTorch, scikit-learn, etc.) and deploy them to products like Windows, Office, Bing, and Azure services. Maintaining a separate inference solution for each framework was inefficient. Thus, Microsoft built ONNX Runtime as a unified, high-performance inference engine that could take any ONNX model (regardless of which framework produced it) and execute it efficiently on the target platform. Essentially, ONNX Runtime was the answer to Microsoft’s internal question of how to operationalize models from anywhere in a consistent way.

Microsoft open-sourced ONNX Runtime in December 2018, signaling its commitment to making it a community project rather than a proprietary tool. However, Microsoft remains a key contributor and maintainer of ONNX Runtime. The close connection between the two is also evident in ONNX Runtime’s integration into Microsoft products. For instance, Windows ML, the machine learning inference component in Windows 10, uses ONNX Runtime under the hood to execute ONNX models on the local machine. Similarly, ML.NET, Microsoft’s machine learning framework for .NET, leverages ONNX Runtime to run ONNX models in .NET applications. Even Microsoft’s cloud services (such as Azure Cognitive Services and Azure Machine Learning) can serve ONNX models using ONNX Runtime behind the scenes.

Beyond Microsoft’s own use, the company worked with partners to extend ONNX Runtime’s capabilities. For example, Microsoft collaborated with Intel to integrate Intel’s nGraph and OpenVINO optimizations as execution providers in ONNX Runtime, and with NVIDIA to integrate TensorRT for high-speed NVIDIA GPU inference. This shows Microsoft’s strategy: make ONNX Runtime a hub that brings together many optimizations from different sources. Microsoft’s heavy investment in ONNX Runtime (researching improvements, integrating with new hardware accelerators, etc.) benefits the entire community, since those improvements land in the open-source project that anyone can use.
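In ONNX Runtime’s Python API, these partner integrations surface as execution providers that can be requested in priority order. A minimal sketch, assuming a TensorRT-enabled build and a placeholder model.onnx; ONNX Runtime assigns each part of the graph to the first listed provider that supports it and falls back to the next one (ultimately CPU) for the rest:

```python
import onnxruntime as ort

# Try TensorRT first, then plain CUDA, then CPU as the final fallback.
# On Intel hardware, "OpenVINOExecutionProvider" could be listed instead.
session = ort.InferenceSession(
    "model.onnx",
    providers=[
        "TensorrtExecutionProvider",
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ],
)
print(session.get_providers())  # the providers actually in use
```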

In summary, Microsoft’s connection to ONNX Runtime is that of both creator and consumer: they created it to solve an internal need for a unified inference engine, contributed it to open source, and continue to use and improve it as a primary inference runtime across a range of their products. This strong backing and real-world usage by a company of Microsoft’s scale give ONNX Runtime a high degree of maturity and credibility in the industry.
