Harden ORT FlatBuffer model loader against malformed buffers#28186

Open
tianleiwu wants to merge 3 commits into microsoft:main from tianleiwu:tlwu/harden-ort-loader

Conversation

@tianleiwu
Contributor

Description

Harden the .ort FlatBuffer model loader to reject malformed buffers early instead of dereferencing null pointers, allocating attacker-controlled sizes, or accessing out-of-bounds memory.

Changes

Null table offset validation (flatbuffers_utils.h/.cc)

  • Add ValidateRequiredTableOffsets<T>() template that scans a FlatBuffer vector of table offsets and rejects any null (zero) entries before the caller dereferences them.
  • Apply to opset imports, initializers, sparse initializers, node args, nodes, node edges, and metadata properties.
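
As a rough illustration of the zero-offset scan (the real helper is the `ValidateRequiredTableOffsets<T>()` template in `flatbuffers_utils.h/.cc`; the function and parameter names below are hypothetical, and the sketch operates on a plain array of `uint32_t` offset words rather than a real `flatbuffers::Vector`):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch: scan a FlatBuffer vector's element area (an array of
// little-endian uoffset_t words, i.e. uint32_t) for zero entries. A zero
// offset is malformed — Vector<Offset<T>>::Get(i) on it would yield a
// dangling non-null pointer — so it must be rejected before dereferencing.
bool AllTableOffsetsNonNull(const uint32_t* offsets, size_t count,
                            size_t* first_null_index) {
  for (size_t i = 0; i < count; ++i) {
    if (offsets[i] == 0) {
      if (first_null_index != nullptr) *first_null_index = i;
      return false;  // null (zero) table offset: reject the buffer
    }
  }
  return true;
}
```

In the actual loader this check runs once per required vector (opset imports, initializers, nodes, etc.) before any table in the vector is accessed.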

Initializer raw_data size validation (graph_flatbuffers_utils.cc)

  • After reading shape and type, compute expected byte count and compare against actual raw_data size. Reject mismatches with a descriptive error.
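
The size comparison can be sketched as follows (hypothetical helper name; the real check lives in `LoadInitializerOrtFormat` in `graph_flatbuffers_utils.cc` and also guards against overflow when multiplying dimensions):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: compute the byte count implied by the tensor shape
// and element size, rejecting negative dimensions and multiplication
// overflow, then compare it with the actual raw_data length.
bool RawDataSizeMatches(const std::vector<int64_t>& dims,
                        size_t element_size, size_t raw_data_size) {
  uint64_t expected = element_size;
  for (int64_t d : dims) {
    if (d < 0) return false;  // invalid dimension
    const auto ud = static_cast<uint64_t>(d);
    if (ud != 0 && expected > UINT64_MAX / ud) return false;  // overflow
    expected *= ud;
  }
  return expected == raw_data_size;  // mismatch => malformed buffer
}
```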

Node index hardening (graph.cc)

  • Compute required_node_slot_count by scanning actual node/edge indices instead of trusting the serialized max_node_index field, which could be attacker-controlled and cause oversized allocation.
  • Reject duplicate node indices, dangling NodeEdge references to missing nodes, duplicate NodeArg names, and unknown NodeArg references in graph inputs/outputs.
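
The scan-based sizing and the duplicate/dangling checks can be illustrated together (a simplified sketch with hypothetical names, standing in for the `graph.cc` logic that walks the serialized nodes and edges):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical sketch: derive the node-slot count from the indices that
// actually appear in the buffer instead of trusting a serialized
// max_node_index field. Rejects duplicate node indices and NodeEdge
// references to nodes that do not exist.
bool ComputeRequiredNodeSlotCount(const std::vector<uint32_t>& node_indices,
                                  const std::vector<uint32_t>& edge_node_indices,
                                  size_t* required_slot_count) {
  std::unordered_set<uint32_t> seen;
  uint32_t max_index = 0;
  for (uint32_t idx : node_indices) {
    if (!seen.insert(idx).second) return false;  // duplicate node index
    max_index = std::max(max_index, idx);
  }
  for (uint32_t idx : edge_node_indices) {
    if (seen.count(idx) == 0) return false;  // dangling NodeEdge reference
  }
  *required_slot_count =
      node_indices.empty() ? 0 : static_cast<size_t>(max_index) + 1;
  return true;
}
```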

Regression tests (ort_model_only_test.cc)

  • RejectsInitializerRawDataSizeMismatch: crafted buffer with wrong raw_data size
  • RejectsNullNodeArgTableEntry: buffer with zeroed-out node arg offset
  • RejectsDanglingNodeEdge: buffer with a NodeEdge pointing to a non-existent node

Motivation

These checks defend against crafted .ort files that could cause null-pointer dereferences, excessive memory allocation, or out-of-bounds access during model loading.

Testing

  • git diff --check passes (no whitespace issues)
  • Incremental build of touched core objects succeeds
  • New regression tests exercise each validation path

- Add ValidateRequiredTableOffsets<T>() to reject null table offsets
  in FlatBuffer vectors before dereferencing
- Validate initializer raw_data byte size matches shape/type expectation
  in LoadInitializerOrtFormat
- Compute required node slot count from actual node/edge indices instead
  of trusting the attacker-controlled max_node_index field
- Reject dangling NodeEdge references to missing nodes
- Reject duplicate NodeArg names and duplicate node indices
- Fail on unknown NodeArg references in graph inputs/outputs
- Add regression tests for malformed .ort buffers:
  raw_data size mismatch, null node arg entry, dangling node edge
Contributor Author

@tianleiwu tianleiwu left a comment


Review Summary

Well-targeted security hardening of the .ort FlatBuffer model loader. The changes add systematic null-offset detection for FlatBuffer vectors of tables, validate initializer raw data sizes, and replace the trust-the-metadata node sizing with a scan-based approach. Tests exercise the new error paths. Overall code quality is high and the defensive strategy is sound.

The ValidateRequiredTableOffsets template correctly uses ReadScalar<uoffset_t> + zero-check — the only reliable approach since Vector<Offset<T>>::Get(i) on a zero offset returns a non-null dangling pointer rather than nullptr. Good that it follows the existing pattern from LoadInitializerOrtFormat.

The key improvement in graph.cc — replacing nodes_.resize(fbs_graph.max_node_index()) with scan-based required_node_slot_count — eliminates a straightforward memory amplification attack via crafted max_node_index. The duplicate node index and dangling NodeEdge detection add integrity validation that was previously missing. The gsl::not_null → explicit nullptr check change for graph inputs/outputs is also good: returns a descriptive error instead of aborting.

See inline comments for suggestions.

Comment thread onnxruntime/core/graph/graph.cc
Comment thread onnxruntime/core/graph/graph_flatbuffers_utils.cc (Outdated)
Comment thread onnxruntime/test/framework/ort_model_only_test.cc
- Add sanity bound on required_node_slot_count to reject crafted buffers
  where a single node with a huge index could cause multi-GB allocation.
  Cap at max(1024, 2 * total_entries).
- Read tensor name from fbs_tensor.name() instead of the partially-populated
  protobuf initializer for the raw data size mismatch error message.
- Add RejectsAdversarialLargeNodeIndex test to exercise the new bound.
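
The sanity bound from this follow-up commit can be sketched as (hypothetical helper name; `total_entries` counts the serialized nodes and edges):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical sketch: even an index-derived slot count must be bounded,
// or a single node with a huge index could still force a multi-GB resize.
// The cap is max(1024, 2 * total_entries), per the commit message above.
bool WithinNodeSlotBound(size_t required_slot_count, size_t total_entries) {
  const size_t cap = std::max<size_t>(1024, 2 * total_entries);
  return required_slot_count <= cap;
}
```

A buffer containing three nodes but claiming a node index in the millions would fail this bound and be rejected before any allocation.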
