前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >未对齐原始内存的加载和存储操作

未对齐原始内存的加载和存储操作

原创
作者头像
DerekYuYi
修改2022-11-08 16:34:15
1.7K0
修改2022-11-08 16:34:15
举报
文章被收录于专栏:Swift-开源分析

提议:SE-0349

swift 目前没有提供从任意字节源(如二进制文件)加载数据的明确方法,这些文件中可以存储数据而不考虑内存中的对齐。当前提议旨在纠正这种情况。

方法 UnsafeRawPointer.load<T>(fromByteOffset offset: Int, as type: T.Type) -> T要求self+offset处的地址正确对齐,才能用来访问类型T的实例。如果尝试使用指针和字节偏移量的组合,但没有对齐T,会导致运行时 crash。一般来说,保存到文件或网络流中的数据与内存中的数据流并不是遵守同样的限制,往往无法对齐。因此,当将数据从这些源(文件或网络流等)复制到内存时,Swift 用户经常会遇到内存对齐不匹配。

举个例子,给定任务数据流,其中 4 个字节值在字节偏移量 3 到 7 之间编码:

代码语言:Swift
复制
let data = Data([0x0, 0x0, 0x0, 0xff, 0xff, 0xff, 0xff, 0x0])

为了把数据流中的所有0xff字节提取转为UInt32, 我们可以使用load(as:), 如下:

代码语言:Swift
复制
let result = data.dropFirst(3).withUnsafeBytes { $0.load(as: UInt32.self) }

当你运行上述代码时,会发生运行时 crash。因为这种情况下,load方法要求基础指针已经正确进行内存对齐,才能访问UInt32。所以这里需要其他解决方案。比如下面列举一种解决方案:

代码语言:Swift
复制
let result = data.dropFirst(3).withUnsafeBytes { buffer -> UInt32 in
  var storage = UInt32.zero
  withUnsafeMutableBytes(of: &storage) {
    $0.copyBytes(from: buffer.prefix(MemoryLayout<UInt32>.size))
  }
  return storage
}

虽然上述解决方案可以达到效果,但是这里存在2个问题。第一,这个解决方案的意图表现不是那么明显,我理解为嵌套过多。第二,上述解决方案使用了2次拷贝,而不是预期的单个拷贝:第一个拷贝到正确对齐的原始缓冲区,然后第二个拷贝到最后正确类型的变量。我们期望可以用一份拷贝完成这项工作。

改善任意内存对齐的加载操作,很重要的类型是它的值是可以进行逐位复制的类型,而不需要引用计数操作。这些类型通常被称为 "POD"(普通旧数据)或普通类型。我们建议将未对齐加载操作的使用限制到这些 POD 类型里。

解决方案

为了支持UnsafeRawPointer, UnsafeRawBufferPointer 以及他们的可变类型(mutable)的内存未对齐加载,我们提议新增 API UnsafeRawPointer.loadUnaligned(fromByteOffset:as:)。当然这些类型将会明确限制为 POD 类型。那么什么情况下加载非 POD 类型?只有当原始内存是另一个活跃对象时,且该对象的内存构造已经正确对齐。原来的 API(load)会继续支持这种情况。新的 API (loadUnaligned) 在返回值类型是 POD 类型时, 将会在 debug 模式下发生断言 (assert) ,中止运行。release 情况下面会讲到。

UnsafeMutableRawPointer.storeBytes(of:toByteOffset:) API 只对 POD 类型起作用,这点在文档已经注释标明。但是在运行时,该 API 会将内存地址存储强制转为与原始类型已经正确对齐的偏移量。这里我们建议删除该对齐限制,并强制执行文档中标明的 POD 限制。这样虽然文档已经更新,但 API 可以保持不变。

UnsafeRawBufferPointerUnsafeMutableRawBufferPointer 类型都会接受相关的修改。

详细设计

代码语言:Swift
复制
extension UnsafeRawPointer {
  /// Returns a new instance of the given type, constructed from the raw memory
  /// at the specified offset.
  ///
  /// This function only supports loading trivial types,
  /// and will trap if this precondition is not met.
  /// A trivial type does not contain any reference-counted property
  /// within its in-memory representation.
  /// The memory at this pointer plus `offset` must be laid out
  /// identically to the in-memory representation of `T`.
  ///
  /// - Note: A trivial type can be copied with just a bit-for-bit copy without
  ///   any indirection or reference-counting operations. Generally, native
  ///   Swift types that do not contain strong or weak references or other
  ///   forms of indirection are trivial, as are imported C structs and enums.
  ///
  /// - Parameters:
  ///   - offset: The offset from this pointer, in bytes. `offset` must be
  ///     nonnegative. The default is zero.
  ///   - type: The type of the instance to create.
  /// - Returns: A new instance of type `T`, read from the raw bytes at
  ///   `offset`. The returned instance isn't associated
  ///   with the value in the range of memory referenced by this pointer.
  public func loadUnaligned<T>(fromByteOffset offset: Int = 0, as type: T.Type) -> T
}
代码语言:Swift
复制
extension UnsafeMutableRawPointer {
  /// Returns a new instance of the given type, constructed from the raw memory
  /// at the specified offset.
  ///
  /// This function only supports loading trivial types,
  /// and will trap if this precondition is not met.
  /// A trivial type does not contain any reference-counted property
  /// within its in-memory representation.
  /// The memory at this pointer plus `offset` must be laid out
  /// identically to the in-memory representation of `T`.
  ///
  /// - Note: A trivial type can be copied with just a bit-for-bit copy without
  ///   any indirection or reference-counting operations. Generally, native
  ///   Swift types that do not contain strong or weak references or other
  ///   forms of indirection are trivial, as are imported C structs and enums.
  ///
  /// - Parameters:
  ///   - offset: The offset from this pointer, in bytes. `offset` must be
  ///     nonnegative. The default is zero.
  ///   - type: The type of the instance to create.
  /// - Returns: A new instance of type `T`, read from the raw bytes at
  ///   `offset`. The returned instance isn't associated
  ///   with the value in the range of memory referenced by this pointer.
  public func loadUnaligned<T>(fromByteOffset offset: Int = 0, as type: T.Type) -> T

  /// Stores the given value's bytes into raw memory at the specified offset.
  ///
  /// The type `T` to be stored must be a trivial type. The memory
  /// must also be uninitialized, initialized to `T`, or initialized to
  /// another trivial type that is layout compatible with `T`.
  ///
  /// After calling `storeBytes(of:toByteOffset:as:)`, the memory is
  /// initialized to the raw bytes of `value`. If the memory is bound to a
  /// type `U` that is layout compatible with `T`, then it contains a value of
  /// type `U`. Calling `storeBytes(of:toByteOffset:as:)` does not change the
  /// bound type of the memory.
  ///
  /// - Note: A trivial type can be copied with just a bit-for-bit copy without
  ///   any indirection or reference-counting operations. Generally, native
  ///   Swift types that do not contain strong or weak references or other
  ///   forms of indirection are trivial, as are imported C structs and enums.
  ///
  /// If you need to store a copy of a value of a type that isn't trivial into memory,
  /// you cannot use the `storeBytes(of:toByteOffset:as:)` method. Instead, you must know
  /// the type of value previously in memory and initialize or assign the
  /// memory. For example, to replace a value stored in a raw pointer `p`,
  /// where `U` is the current type and `T` is the new type, use a typed
  /// pointer to access and deinitialize the current value before initializing
  /// the memory with a new value.
  ///
  ///     let typedPointer = p.bindMemory(to: U.self, capacity: 1)
  ///     typedPointer.deinitialize(count: 1)
  ///     p.initializeMemory(as: T.self, repeating: newValue, count: 1)
  ///
  /// - Parameters:
  ///   - value: The value to store as raw bytes.
  ///   - offset: The offset from this pointer, in bytes. `offset` must be
  ///     nonnegative. The default is zero.
  ///   - type: The type of `value`.
  public func storeBytes<T>(of value: T, toByteOffset offset: Int = 0, as type: T.Type)
}

UnsafeRawBufferPointerUnsafeMutableRawBufferPointer 都有类似的 loadUnaligned 函数。它允许从缓冲区的任意偏移量做加载操作,并遵循BufferPointer类型的通用索引验证规则:在调试模式下编译客户端代码时,将检查索引,而在发布模式下编译客户代码时,则不检查索引。

代码语言:Swift
复制
extension UnsafeMutableRawBufferPointer {
  /// Returns a new instance of the given type, constructed from the raw memory
  /// at the specified offset.
  ///
  /// This function only supports loading trivial types.
  /// A trivial type does not contain any reference-counted property
  /// within its in-memory stored representation.
  /// The memory at `offset` bytes into the buffer must be laid out
  /// identically to the in-memory representation of `T`.
  ///
  /// - Note: A trivial type can be copied with just a bit-for-bit copy without
  ///   any indirection or reference-counting operations. Generally, native
  ///   Swift types that do not contain strong or weak references or other
  ///   forms of indirection are trivial, as are imported C structs and enums.
  ///
  /// You can use this method to create new values from the buffer pointer's
  /// underlying bytes. The following example creates two new `Int32`
  /// instances from the memory referenced by the buffer pointer `someBytes`.
  /// The bytes for `a` are copied from the first four bytes of `someBytes`,
  /// and the bytes for `b` are copied from the next four bytes.
  ///
  ///     let a = someBytes.load(as: Int32.self)
  ///     let b = someBytes.load(fromByteOffset: 4, as: Int32.self)
  ///
  /// The memory to read for the new instance must not extend beyond the buffer
  /// pointer's memory region---that is, `offset + MemoryLayout<T>.size` must
  /// be less than or equal to the buffer pointer's `count`.
  ///
  /// - Parameters:
  ///   - offset: The offset, in bytes, into the buffer pointer's memory at
  ///     which to begin reading data for the new instance. The buffer pointer
  ///     plus `offset` must be properly aligned for accessing an instance of
  ///     type `T`. The default is zero.
  ///   - type: The type to use for the newly constructed instance. The memory
  ///     must be initialized to a value of a type that is layout compatible
  ///     with `type`.
  /// - Returns: A new instance of type `T`, copied from the buffer pointer's
  ///   memory.
  public func loadUnaligned<T>(fromByteOffset offset: Int = 0, as type: T.Type) -> T
}

此外,UnsafeMutableBufferPointer.storeBytes(of:toByteOffset) 方法将像它对应的UnsafeMutablePointer.storeBytes(of:toByteOffset) 方法一样发生更改,不再在运行时强制对齐内存。同样,索引验证行为没有改变:当客户端代码在调试模式(debug)下编译时,将检查索引,而当客户端代码以发布模式(release)编译时,则不检查索引。

代码语言:Swift
复制
extension UnsafeMutableRawBufferPointer {
  /// Stores a value's bytes into the buffer pointer's raw memory at the
  /// specified byte offset.
  ///
  /// The type `T` to be stored must be a trivial type. The memory must also be
  /// uninitialized, initialized to `T`, or initialized to another trivial
  /// type that is layout compatible with `T`.
  ///
  /// The memory written to must not extend beyond the buffer pointer's memory
  /// region---that is, `offset + MemoryLayout<T>.size` must be less than or
  /// equal to the buffer pointer's `count`.
  ///
  /// After calling `storeBytes(of:toByteOffset:as:)`, the memory is
  /// initialized to the raw bytes of `value`. If the memory is bound to a
  /// type `U` that is layout compatible with `T`, then it contains a value of
  /// type `U`. Calling `storeBytes(of:toByteOffset:as:)` does not change the
  /// bound type of the memory.
  ///
  /// - Note: A trivial type can be copied with just a bit-for-bit copy without
  ///   any indirection or reference-counting operations. Generally, native
  ///   Swift types that do not contain strong or weak references or other
  ///   forms of indirection are trivial, as are imported C structs and enums.
  ///
  /// If you need to store a copy of a value of a type that isn't trivial into memory,
  /// you cannot use the `storeBytes(of:toByteOffset:as:)` method. Instead, you must know
  /// the type of value previously in memory and initialize or assign the memory.
  ///
  /// - Parameters:
  ///   - offset: The offset in bytes into the buffer pointer's memory to begin
  ///     reading data for the new instance. The buffer pointer plus `offset`
  ///     must be properly aligned for accessing an instance of type `T`. The
  ///     default is zero.
  ///   - type: The type to use for the newly constructed instance. The memory
  ///     must be initialized to a value of a type that is layout compatible
  ///     with `type`.
  public func storeBytes<T>(of value: T, toByteOffset offset: Int = 0, as: T.Type)
}

原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。

如有侵权,请联系 cloudcommunity@tencent.com 删除。

原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。

如有侵权,请联系 cloudcommunity@tencent.com 删除。

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 解决方案
  • 详细设计
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档