Harden ORT FlatBuffer model loader against malformed buffers#28186
tianleiwu wants to merge 3 commits into microsoft:main
Conversation
- Add ValidateRequiredTableOffsets<T>() to reject null table offsets in FlatBuffer vectors before dereferencing
- Validate initializer raw_data byte size matches shape/type expectation in LoadInitializerOrtFormat
- Compute required node slot count from actual node/edge indices instead of trusting the attacker-controlled max_node_index field
- Reject dangling NodeEdge references to missing nodes
- Reject duplicate NodeArg names and duplicate node indices
- Fail on unknown NodeArg references in graph inputs/outputs
- Add regression tests for malformed .ort buffers: raw_data size mismatch, null node arg entry, dangling node edge
tianleiwu
left a comment
Review Summary
Well-targeted security hardening of the .ort FlatBuffer model loader. The changes add systematic null-offset detection for FlatBuffer vectors of tables, validate initializer raw data sizes, and replace the trust-the-metadata node sizing with a scan-based approach. Tests exercise the new error paths. Overall code quality is high and the defensive strategy is sound.
The ValidateRequiredTableOffsets template correctly uses ReadScalar<uoffset_t> + zero-check — the only reliable approach since Vector<Offset<T>>::Get(i) on a zero offset returns a non-null dangling pointer rather than nullptr. Good that it follows the existing pattern from LoadInitializerOrtFormat.
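To make the zero-offset check concrete, here is a simplified sketch of the scan. It models a FlatBuffer Vector<Offset<T>> as raw uoffset_t slots rather than using the flatbuffers library, so the function name and signature are illustrative, not the actual ORT template:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

using uoffset_t = uint32_t;  // FlatBuffers' default offset width

// Simplified model of scanning a FlatBuffer Vector<Offset<T>>: each slot holds
// a uoffset_t relative offset, and a zero offset is a null entry. Reading the
// raw scalar and zero-checking is required because calling Vector::Get() on a
// zero offset yields a dangling pointer to the slot itself, not nullptr.
bool AllTableOffsetsNonNull(const uint8_t* vector_data, size_t num_entries) {
  for (size_t i = 0; i < num_entries; ++i) {
    uoffset_t offset;
    std::memcpy(&offset, vector_data + i * sizeof(uoffset_t), sizeof(offset));
    if (offset == 0) return false;  // null table offset: reject the buffer
  }
  return true;
}
```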
The key improvement in graph.cc — replacing nodes_.resize(fbs_graph.max_node_index()) with scan-based required_node_slot_count — eliminates a straightforward memory amplification attack via crafted max_node_index. The duplicate node index and dangling NodeEdge detection add integrity validation that was previously missing. The gsl::not_null → explicit nullptr check change for graph inputs/outputs is also good: returns a descriptive error instead of aborting.
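The scan-based sizing can be sketched as below. The FbsNode/FbsEdge structs and function name are hypothetical stand-ins for the serialized FlatBuffer tables the real loader reads:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Illustrative records; the real loader reads these from FlatBuffer tables.
struct FbsNode { uint32_t index; };
struct FbsEdge { uint32_t node_index; };

// Derive the number of node slots from the indices that actually appear,
// instead of trusting a serialized max_node_index field that an attacker
// could inflate to force an oversized allocation. (The real loader would
// also bounds-check indices so that index + 1 cannot wrap.)
uint32_t RequiredNodeSlotCount(const std::vector<FbsNode>& nodes,
                               const std::vector<FbsEdge>& edges) {
  uint32_t max_index_plus_one = 0;
  for (const auto& n : nodes)
    max_index_plus_one = std::max(max_index_plus_one, n.index + 1);
  for (const auto& e : edges)
    max_index_plus_one = std::max(max_index_plus_one, e.node_index + 1);
  return max_index_plus_one;
}
```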
See inline comments for suggestions.
- Add sanity bound on required_node_slot_count to reject crafted buffers where a single node with a huge index could cause multi-GB allocation. Cap at max(1024, 2 * total_entries).
- Read tensor name from fbs_tensor.name() instead of the partially-populated protobuf initializer for the raw data size mismatch error message.
- Add RejectsAdversarialLargeNodeIndex test to exercise the new bound.
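A minimal sketch of the sanity bound described in this commit, assuming the max(1024, 2 * total_entries) cap stated above; the function name is illustrative:

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical check mirroring the commit description: a slot count derived
// from node/edge indices is accepted only if it stays within
// max(1024, 2 * total_entries), so a single node carrying a huge index
// cannot force a multi-GB allocation even though the scan itself is honest.
bool NodeSlotCountWithinBound(uint64_t required_slots, uint64_t total_entries) {
  const uint64_t bound = std::max<uint64_t>(1024, 2 * total_entries);
  return required_slots <= bound;
}
```

The 2x headroom tolerates sparse but legitimate index gaps, while the 1024 floor avoids rejecting small graphs with a few deleted nodes.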
Description
Harden the .ort FlatBuffer model loader to reject malformed buffers early instead of dereferencing null pointers, allocating attacker-controlled sizes, or accessing out-of-bounds memory.

Changes
Null table offset validation (flatbuffers_utils.h/.cc)
- Add a ValidateRequiredTableOffsets<T>() template that scans a FlatBuffer vector of table offsets and rejects any null (zero) entries before the caller dereferences them.

Initializer raw_data size validation (graph_flatbuffers_utils.cc)
- Validate that an initializer's raw_data byte size matches the size implied by its shape and element type. Reject mismatches with a descriptive error.

Node index hardening (graph.cc)
- Compute required_node_slot_count by scanning actual node/edge indices instead of trusting the serialized max_node_index field, which could be attacker-controlled and cause oversized allocation.
- Reject dangling NodeEdge references to missing nodes, duplicate NodeArg names, and unknown NodeArg references in graph inputs/outputs.

Regression tests (ort_model_only_test.cc)
- RejectsInitializerRawDataSizeMismatch: crafted buffer with wrong raw_data size
- RejectsNullNodeArgTableEntry: buffer with zeroed-out node arg offset
- RejectsDanglingNodeEdge: buffer with a NodeEdge pointing to a non-existent node

Motivation
These checks defend against crafted .ort files that could cause null-pointer dereferences, excessive memory allocation, or out-of-bounds access during model loading.

Testing

git diff --check passes (no whitespace issues)