Containerd Deep Dive: Diff (Part 1)

zouyee
Published 2023-02-06 10:42:18
Included in the column: Kubernetes GO

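Containerd manages container processes, images, filesystem snapshots, metadata, and dependencies. For an introduction to Containerd, see the earlier article "Containerd Deep Dive: Runtime". This article analyzes, at the code level, how the Containerd diff service is implemented.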

Editor | zouyee

Difficulty | Intermediate

Note: the Containerd version discussed is v1.7.0-beta.2

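Containerd is implemented as a set of microservice-style components that call each other internally over RPC (ttrpc).

[Figure: Containerd architecture overview]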

The Containerd diff plugin service mainly implements two RPC methods, Diff and Apply:

// Diff service creates and applies diffs
service Diff {
  // Apply applies the content associated with the provided digests onto
  // the provided mounts. Archive content will be extracted and
  // decompressed if necessary.
  rpc Apply(ApplyRequest) returns (ApplyResponse);

  // Diff creates a diff between the given mounts and uploads the result
  // to the content store.
  rpc Diff(DiffRequest) returns (DiffResponse);
}
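  • Diff computes the difference between the provided upper and lower mounts and stores the result in the content store. The result is a tar archive in the OCI changeset format.
  • Apply applies the content referenced by the given descriptor (ocispec.Descriptor) onto the given mounts. Typically the descriptor points to a filesystem diff in tar format.

In short, Diff generates a diff layer and Apply unpacks a diff layer onto the given mounts. This article walks through the Diff implementation.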

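Service registration

The diff service is registered as a plugin of type GRPCPlugin. During initialization it first fetches the registered service plugins, then looks up services.DiffService, and finally returns the instance that implements the Diff RPC service, namely a pointer to a service object (services/diff/service.go:29):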

func init() {
  plugin.Register(&plugin.Registration{
    Type: plugin.GRPCPlugin,
    ID:   "diff",
    Requires: []plugin.Type{
      plugin.ServicePlugin,
    },
    InitFn: func(ic *plugin.InitContext) (interface{}, error) {
      plugins, err := ic.GetByType(plugin.ServicePlugin)
      if err != nil {
        return nil, err
      }
      p, ok := plugins[services.DiffService]
      if !ok {
        return nil, errors.New("diff service not found")
      }
      i, err := p.Instance()
      if err != nil {
        return nil, err
      }
      return &service{local: i.(diffapi.DiffClient)}, nil
    },
  })
}

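The main logic is as follows:

a. Fetch all plugins of type ServicePlugin

b. Look up the plugin registered under the ID services.DiffService

c. Call the plugin's Instance method, which yields the value returned by that plugin's InitFn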
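The three steps above can be sketched with a toy registry. This is only an illustration under stated assumptions: Type, Registration, Register, GetByType, and Lookup are hypothetical, simplified stand-ins written for this example, not containerd's plugin package, and the ID string is illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

// Type and Registration are hypothetical, simplified stand-ins for
// containerd's plugin.Type and plugin.Registration.
type Type string

type Registration struct {
	Type   Type
	ID     string
	InitFn func() (interface{}, error)
}

// registry is a toy in-memory plugin table, not containerd's real one.
var registry []*Registration

// Register adds a registration to the toy table.
func Register(r *Registration) { registry = append(registry, r) }

// GetByType returns all registrations of type t, keyed by ID (step a).
func GetByType(t Type) map[string]*Registration {
	out := map[string]*Registration{}
	for _, r := range registry {
		if r.Type == t {
			out[r.ID] = r
		}
	}
	return out
}

// Lookup filters by type (a), picks one plugin by ID (b), and
// obtains its instance by running InitFn (c).
func Lookup(t Type, id string) (interface{}, error) {
	p, ok := GetByType(t)[id]
	if !ok {
		return nil, errors.New(id + " not found")
	}
	return p.InitFn()
}

func main() {
	Register(&Registration{
		Type:   "service",
		ID:     "diff-service", // illustrative ID
		InitFn: func() (interface{}, error) { return "local differ", nil },
	})
	inst, err := Lookup("service", "diff-service")
	fmt.Println(inst, err)
}
```

One simplification to keep in mind: this sketch runs InitFn at lookup time, whereas in the registration code shown above the instance is obtained through p.Instance() on an already loaded plugin.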

The registration logic for services.DiffService mentioned above lives in services/diff/local.go:51:

func init() {
  plugin.Register(&plugin.Registration{
    Type: plugin.ServicePlugin,
    ID:   services.DiffService,
    Requires: []plugin.Type{
      plugin.DiffPlugin,
    },
    Config: defaultDifferConfig,
    InitFn: func(ic *plugin.InitContext) (interface{}, error) {
      differs, err := ic.GetByType(plugin.DiffPlugin)
      if err != nil {
        return nil, err
      }

      orderedNames := ic.Config.(*config).Order
      ordered := make([]differ, len(orderedNames))
      for i, n := range orderedNames {
        differp, ok := differs[n]
        if !ok {
          return nil, fmt.Errorf("needed differ not loaded: %s", n)
        }
        d, err := differp.Instance()
        if err != nil {
          return nil, fmt.Errorf("could not load required differ due plugin init error: %s: %w", n, err)
        }

        ordered[i], ok = d.(differ)
        if !ok {
          return nil, fmt.Errorf("differ does not implement Comparer and Applier interface: %s", n)
        }
      }

      return &local{
        differs: ordered,
      }, nil
    },
  })
}

A plugin of type DiffPlugin is registered with the ID "walking"; it returns a diffPlugin struct (diff/walking/plugin/plugin.go:28):

func init() {
  plugin.Register(&plugin.Registration{
    Type: plugin.DiffPlugin,
    ID:   "walking",
    Requires: []plugin.Type{
      plugin.MetadataPlugin,
    },
    InitFn: func(ic *plugin.InitContext) (interface{}, error) {
      md, err := ic.Get(plugin.MetadataPlugin)
      if err != nil {
        return nil, err
      }

      ic.Meta.Platforms = append(ic.Meta.Platforms, platforms.DefaultSpec())
      cs := md.(*metadata.DB).ContentStore()

      return diffPlugin{
        Comparer: walking.NewWalkingDiff(cs),
        Applier:  apply.NewFileSystemApplier(cs),
      }, nil
    },
  })
}

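Initialization of the walking plugin depends on the Metadata plugin, which stores metadata and is backed by boltdb. The ContentStore obtained from metadata.DB is used to store diff layers and their metadata. The returned diffPlugin consists of the Comparer and Applier fields.

Source code walkthrough

As shown in the previous section, registering the diff service plugin (services.DiffService) returns a pointer to local. That pointer implements the diffapi.DiffClient interface, which has two methods, Apply() and Diff():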

type DiffClient interface {
  // Apply applies the content associated with the provided digests onto
  // the provided mounts. Archive content will be extracted and
  // decompressed if necessary.
  Apply(ctx context.Context, in *ApplyRequest, opts ...grpc.CallOption) (*ApplyResponse, error)
  // Diff creates a diff between the given mounts and uploads the result
  // to the content store.
  Diff(ctx context.Context, in *DiffRequest, opts ...grpc.CallOption) (*DiffResponse, error)
}

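Because local implements diffapi.DiffClient, it exposes its methods with gRPC client signatures even though it runs in-process. The GRPC diff plugin registration returns a pointer to service, whose local field has the type diffapi.DiffClient; its methods simply delegate to local: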

func (s *service) Apply(ctx context.Context, er *diffapi.ApplyRequest) (*diffapi.ApplyResponse, error) {
  return s.local.Apply(ctx, er)
}

func (s *service) Diff(ctx context.Context, dr *diffapi.DiffRequest) (*diffapi.DiffResponse, error) {
  return s.local.Diff(ctx, dr)
}

DiffClient implementation

a. local.Apply

func (l *local) Apply(ctx context.Context, er *diffapi.ApplyRequest, _ ...grpc.CallOption) (*diffapi.ApplyResponse, error) {
  var (
    ocidesc ocispec.Descriptor
    err     error
    desc    = toDescriptor(er.Diff)
    mounts  = toMounts(er.Mounts)
  )

  var opts []diff.ApplyOpt
  if er.Payloads != nil {
    payloads := make(map[string]typeurl.Any)
    for k, v := range er.Payloads {
      payloads[k] = v
    }
    opts = append(opts, diff.WithPayloads(payloads))
  }
  // Call the differs in their configured order
  for _, differ := range l.differs {
    ocidesc, err = differ.Apply(ctx, desc, mounts, opts...)
    if !errdefs.IsNotImplemented(err) {
      break
    }
  }

  if err != nil {
    return nil, errdefs.ToGRPC(err)
  }

  return &diffapi.ApplyResponse{
    Applied: fromDescriptor(ocidesc),
  }, nil

}

b. local.Diff

func (l *local) Diff(ctx context.Context, dr *diffapi.DiffRequest, _ ...grpc.CallOption) (*diffapi.DiffResponse, error) {
  var (
    ocidesc ocispec.Descriptor
    err     error
    aMounts = toMounts(dr.Left)
    bMounts = toMounts(dr.Right)
  )

  var opts []diff.Opt
  if dr.MediaType != "" {
    opts = append(opts, diff.WithMediaType(dr.MediaType))
  }
  if dr.Ref != "" {
    opts = append(opts, diff.WithReference(dr.Ref))
  }
  if dr.Labels != nil {
    opts = append(opts, diff.WithLabels(dr.Labels))
  }
  if dr.SourceDateEpoch != nil {
    tm := dr.SourceDateEpoch.AsTime()
    opts = append(opts, diff.WithSourceDateEpoch(&tm))
  }
  // Call the differs in their configured order
  for _, d := range l.differs {
    ocidesc, err = d.Compare(ctx, aMounts, bMounts, opts...)
    if !errdefs.IsNotImplemented(err) {
      break
    }
  }
  if err != nil {
    return nil, errdefs.ToGRPC(err)
  }

  return &diffapi.DiffResponse{
    Diff: fromDescriptor(ocidesc),
  }, nil
}

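As shown above, the service layer implements the Apply and Diff methods, which call each differ's Apply and Compare methods. These calls end up in the methods of the two fields of the underlying diffPlugin: walking.walkingDiff and apply.fsApplier.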

diffPlugin{
  Comparer: walking.NewWalkingDiff(cs),
  Applier:  apply.NewFileSystemApplier(cs),
}
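The ordered fallback used by local.Apply and local.Diff can be sketched with the standard library alone. This is a minimal sketch, assuming a reduced comparer interface; ErrNotImplemented, comparer, gzipOnly, anyType, and firstImplemented are hypothetical names introduced for this example, and the media type strings are shortened placeholders.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotImplemented is a local stand-in for errdefs.ErrNotImplemented.
var ErrNotImplemented = errors.New("not implemented")

// comparer is a hypothetical, reduced version of diff.Comparer.
type comparer interface {
	Compare(mediaType string) (string, error)
}

// gzipOnly understands one media type and declines everything else.
type gzipOnly struct{}

func (gzipOnly) Compare(mediaType string) (string, error) {
	if mediaType != "tar+gzip" {
		return "", fmt.Errorf("unsupported media type %q: %w", mediaType, ErrNotImplemented)
	}
	return "gzip-differ", nil
}

// anyType accepts every media type.
type anyType struct{}

func (anyType) Compare(string) (string, error) { return "fallback-differ", nil }

// firstImplemented tries each differ in order and stops at the first one
// whose error is not "not implemented" (success or a real failure).
func firstImplemented(differs []comparer, mediaType string) (res string, err error) {
	for _, d := range differs {
		res, err = d.Compare(mediaType)
		if !errors.Is(err, ErrNotImplemented) {
			break
		}
	}
	return res, err
}

func main() {
	differs := []comparer{gzipOnly{}, anyType{}}
	fmt.Println(firstImplemented(differs, "tar+gzip"))
	fmt.Println(firstImplemented(differs, "tar+zstd"))
}
```

The order of the slice matters: a differ earlier in the list gets the first chance, and a later one is consulted only when every earlier one declines. If all of them decline, the caller sees the last "not implemented" error.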

Comparer interface implementation

The interface and related types for the diff comparer are defined as follows:

// Comparer allows creation of filesystem diffs between mounts
type Comparer interface {
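  // Compare computes the difference between two mounts and returns a
  // descriptor for the computed diff.
  // Optional parameters (opts):
  // ref: reference ID that can be used to locate the content of the created diff
  // media type: determines the format of the created content
  Compare(ctx context.Context, lower, upper []mount.Mount, opts ...Opt) (ocispec.Descriptor, error)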
}

Config is defined as follows:

// Config is used to hold parameters needed for a diff operation
type Config struct {
  // MediaType is the type of diff to generate
  // Default depends on the differ,
  // i.e. application/vnd.oci.image.layer.v1.tar+gzip
  MediaType string

  // Reference is the content upload reference
  // Default will use a random reference string
  Reference string

  // Labels are the labels to apply to the generated content
  Labels map[string]string

  // Compressor is a function to compress the diff stream
  // instead of the default gzip compressor. Differ passes
  // the MediaType of the target diff content to the compressor.
  // When using this config, MediaType must be specified as well.
  Compressor func(dest io.Writer, mediaType string) (io.WriteCloser, error)

  // SourceDateEpoch specifies the SOURCE_DATE_EPOCH without touching the env vars.
  SourceDateEpoch *time.Time
}

// Opt is used to configure a diff operation
type Opt func(*Config) error
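Config and Opt together form the functional options pattern. A minimal sketch of how such options are built and applied follows; Config here is a trimmed copy of the struct above, while withMediaType, withReference, and buildConfig are hypothetical helpers written for this example (the empty-string check is an illustrative choice, not something the original code is known to do).

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a trimmed-down copy of the struct shown above.
type Config struct {
	MediaType string
	Reference string
	Labels    map[string]string
}

// Opt mutates a Config and may reject invalid input.
type Opt func(*Config) error

// withMediaType is a hypothetical option; the validation is illustrative.
func withMediaType(m string) Opt {
	return func(c *Config) error {
		if m == "" {
			return errors.New("empty media type")
		}
		c.MediaType = m
		return nil
	}
}

// withReference is a hypothetical option that sets the upload reference.
func withReference(ref string) Opt {
	return func(c *Config) error {
		c.Reference = ref
		return nil
	}
}

// buildConfig applies the options in order and stops at the first error.
func buildConfig(opts ...Opt) (Config, error) {
	var c Config
	for _, opt := range opts {
		if err := opt(&c); err != nil {
			return Config{}, err
		}
	}
	return c, nil
}

func main() {
	c, err := buildConfig(
		withMediaType("application/vnd.oci.image.layer.v1.tar+gzip"),
		withReference("my-diff"),
	)
	fmt.Printf("%+v %v\n", c, err)
}
```

Because each option is just a function, callers pass only the settings they care about and every unset field keeps its zero value, which the differ can then replace with its own default.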

walking.NewWalkingDiff(cs) returns a pointer to walkingDiff, which implements the diff.Comparer interface. The store field of walkingDiff is a content.Store:

type walkingDiff struct {
  store content.Store
}

// NewWalkingDiff is a generic implementation of diff.Comparer.  The diff is
// calculated by mounting both the upper and lower mount sets and walking the
// mounted directories concurrently. Changes are calculated by comparing files
// against each other or by comparing file existence between directories.
// NewWalkingDiff uses no special characteristics of the mount sets and is
// expected to work with any filesystem.
func NewWalkingDiff(store content.Store) diff.Comparer {
  return &walkingDiff{
    store: store,
  }
}

Compare() computes the difference between the two given mounts to create a diff layer and stores the diff content in the content store. The implementation is as follows:

// Compare creates a diff between the given mounts and uploads the result
// to the content store.
func (s *walkingDiff) Compare(ctx context.Context, lower, upper []mount.Mount, opts ...diff.Opt) (d ocispec.Descriptor, err error) {
  // Apply the options to build the config
  var config diff.Config
  for _, opt := range opts {
    if err := opt(&config); err != nil {
      return emptyDesc, err
    }
  }
  if tm := epoch.FromContext(ctx); tm != nil && config.SourceDateEpoch == nil {
    config.SourceDateEpoch = tm
  }

  var writeDiffOpts []archive.WriteDiffOpt
  if config.SourceDateEpoch != nil {
    writeDiffOpts = append(writeDiffOpts, archive.WithSourceDateEpoch(config.SourceDateEpoch))
  }

  var isCompressed bool
  // Decide whether to compress and settle the media type
  if config.Compressor != nil {
    if config.MediaType == "" {
      return emptyDesc, errors.New("media type must be explicitly specified when using custom compressor")
    }
    isCompressed = true
  } else {
    if config.MediaType == "" {
      config.MediaType = ocispec.MediaTypeImageLayerGzip
    }

    switch config.MediaType {
    case ocispec.MediaTypeImageLayer:
    case ocispec.MediaTypeImageLayerGzip:
      isCompressed = true
    default:
      return emptyDesc, fmt.Errorf("unsupported diff media type: %v: %w", config.MediaType, errdefs.ErrNotImplemented)
    }
  }

  var ocidesc ocispec.Descriptor
  // Temporarily mount lower and upper while the diff is computed
  if err := mount.WithTempMount(ctx, lower, func(lowerRoot string) error {
    return mount.WithTempMount(ctx, upper, func(upperRoot string) error {
      var newReference bool
      if config.Reference == "" {
        newReference = true
        config.Reference = uniqueRef()
      }
      // Open a content store writer
      cw, err := s.store.Writer(ctx,
        content.WithRef(config.Reference),
        content.WithDescriptor(ocispec.Descriptor{
          MediaType: config.MediaType, // most contentstore implementations just ignore this
        }))
      if err != nil {
        return fmt.Errorf("failed to open writer: %w", err)
      }

      // errOpen is set when an error occurs while the content writer has not been
      // committed or closed yet to force a cleanup
      var errOpen error
      defer func() {
        if errOpen != nil {
          cw.Close()
          if newReference {
            // Roll back
            if abortErr := s.store.Abort(ctx, config.Reference); abortErr != nil {
              log.G(ctx).WithError(abortErr).WithField("ref", config.Reference).Warnf("failed to delete diff upload")
            }
          }
        }
      }()
      if !newReference {
        if errOpen = cw.Truncate(0); errOpen != nil {
          return errOpen
        }
      }

      if isCompressed {
        // Compressed path
        dgstr := digest.SHA256.Digester()
        var compressed io.WriteCloser
        if config.Compressor != nil {
          compressed, errOpen = config.Compressor(cw, config.MediaType)
          if errOpen != nil {
            return fmt.Errorf("failed to get compressed stream: %w", errOpen)
          }
        } else {
          // Default to gzip
          compressed, errOpen = compression.CompressStream(cw, compression.Gzip)
          if errOpen != nil {
            return fmt.Errorf("failed to get compressed stream: %w", errOpen)
          }
        }
        // Write the diff
        errOpen = archive.WriteDiff(ctx, io.MultiWriter(compressed, dgstr.Hash()), lowerRoot, upperRoot, writeDiffOpts...)
        compressed.Close()
        if errOpen != nil {
          return fmt.Errorf("failed to write compressed diff: %w", errOpen)
        }

        if config.Labels == nil {
          config.Labels = map[string]string{}
        }
        config.Labels[uncompressed] = dgstr.Digest().String()
      } else {
        if errOpen = archive.WriteDiff(ctx, cw, lowerRoot, upperRoot, writeDiffOpts...); errOpen != nil {
          return fmt.Errorf("failed to write diff: %w", errOpen)
        }
      }

      var commitopts []content.Opt
      if config.Labels != nil {
        commitopts = append(commitopts, content.WithLabels(config.Labels))
      }

      dgst := cw.Digest()
      // Commit the content under the computed digest
      if errOpen = cw.Commit(ctx, 0, dgst, commitopts...); errOpen != nil {
        if !errdefs.IsAlreadyExists(errOpen) {
          return fmt.Errorf("failed to commit: %w", errOpen)
        }
        errOpen = nil
      }
      // Fetch the info for the digest
      info, err := s.store.Info(ctx, dgst)
      if err != nil {
        return fmt.Errorf("failed to get info from content store: %w", err)
      }
      if info.Labels == nil {
        info.Labels = make(map[string]string)
      }
      // Set uncompressed label if digest already existed without label
      // Check whether the uncompressed label is missing and should be added
      if _, ok := info.Labels[uncompressed]; !ok {
        info.Labels[uncompressed] = config.Labels[uncompressed]
        if _, err := s.store.Update(ctx, info, "labels."+uncompressed); err != nil {
          return fmt.Errorf("error setting uncompressed label: %w", err)
        }
      }

      ocidesc = ocispec.Descriptor{
        MediaType: config.MediaType,
        Size:      info.Size,
        Digest:    info.Digest,
      }
      return nil
    })
  }); err != nil {
    return emptyDesc, err
  }

  return ocidesc, nil
}

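The handling in walkingDiff's Compare method can be summarized as follows:

  • upper is the mount of the changed layer and lower is the mount of the baseline layer; the two are compared to find the differences
  • The computed differences are packed as a diff tar that follows the OCI changeset layer specification and stored in the content store
  • Paths, file attributes, and file content bytes are compared to determine the kind of each change
  • The change kinds are: add, delete, modify, unmodified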
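One detail of the compressed path in Compare is worth isolating: the tar stream is written through io.MultiWriter so that the compressor and a SHA-256 digester see the same uncompressed bytes, and the digest ends up in the uncompressed label. The sketch below reproduces only that idea with the standard library; writeCompressed, digestAfterGunzip, and roundTrip are hypothetical helpers, and a plain byte slice stands in for the tar stream.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"fmt"
	"io"
)

// writeCompressed gzips payload into dst while hashing the uncompressed
// bytes, and returns the digest of the uncompressed data.
func writeCompressed(dst io.Writer, payload []byte) (string, error) {
	h := sha256.New()
	zw := gzip.NewWriter(dst)
	if _, err := io.MultiWriter(zw, h).Write(payload); err != nil {
		zw.Close()
		return "", err
	}
	// Close flushes the remaining compressed data into dst.
	if err := zw.Close(); err != nil {
		return "", err
	}
	return fmt.Sprintf("sha256:%x", h.Sum(nil)), nil
}

// digestAfterGunzip decompresses r and hashes the result.
func digestAfterGunzip(r io.Reader) (string, error) {
	zr, err := gzip.NewReader(r)
	if err != nil {
		return "", err
	}
	defer zr.Close()
	h := sha256.New()
	if _, err := io.Copy(h, zr); err != nil {
		return "", err
	}
	return fmt.Sprintf("sha256:%x", h.Sum(nil)), nil
}

// roundTrip reports whether the digest recorded at write time matches
// the digest of the decompressed blob.
func roundTrip(payload []byte) (bool, error) {
	var blob bytes.Buffer
	label, err := writeCompressed(&blob, payload)
	if err != nil {
		return false, err
	}
	got, err := digestAfterGunzip(&blob)
	if err != nil {
		return false, err
	}
	return label == got, nil
}

func main() {
	ok, err := roundTrip([]byte("pretend this is a tar stream"))
	fmt.Println(ok, err)
}
```

Hashing while writing avoids a second pass over the data: the digest of the uncompressed stream is known as soon as the compressed blob is complete.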

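archive.WriteDiff computes the difference between the two provided directories (a and b) and writes it to a tar stream. fs.Changes() computes the differences by walking the directories, and the callback HandleChange packs the changeset into the tar. The resulting tar uses OCI-style file markers: following the AUFS whiteout convention, deleted files are recorded as files whose names carry the ".wh." prefix. See the OCI image specification for details.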

// WriteDiff writes a tar stream of the computed difference between the
// provided paths.
//
// Produces a tar using OCI style file markers for deletions. Deleted
// files will be prepended with the prefix ".wh.". This style is
// based off AUFS whiteouts.
// See https://github.com/opencontainers/image-spec/blob/main/layer.md
func WriteDiff(ctx context.Context, w io.Writer, a, b string, opts ...WriteDiffOpt) error {
  ...
  if options.writeDiffFunc == nil {
    options.writeDiffFunc = writeDiffNaive
  }

  return options.writeDiffFunc(ctx, w, a, b, options)
}


func writeDiffNaive(ctx context.Context, w io.Writer, a, b string, o WriteDiffOptions) error {
  var opts []ChangeWriterOpt
  if o.SourceDateEpoch != nil {
    opts = append(opts,
      WithModTimeUpperBound(*o.SourceDateEpoch),
      WithWhiteoutTime(*o.SourceDateEpoch))
  }
  cw := NewChangeWriter(w, b, opts...)
  err := fs.Changes(ctx, a, b, cw.HandleChange)
  if err != nil {
    return fmt.Errorf("failed to create diff tar stream: %w", err)
  }
  return cw.Close()
}
// NewChangeWriter returns ChangeWriter that writes tar stream of the source directory
// to the provided writer. Change information (add/modify/delete/unmodified) for each
// file needs to be passed through HandleChange method.
func NewChangeWriter(w io.Writer, source string, opts ...ChangeWriterOpt) *ChangeWriter {
  cw := &ChangeWriter{
    tw:        tar.NewWriter(w),
    source:    source,
    whiteoutT: time.Now(), // can be overridden with WithWhiteoutTime(time.Time) ChangeWriterOpt .
    inodeSrc:  map[uint64]string{},
    inodeRefs: map[uint64][]string{},
    addedDirs: map[string]struct{}{},
  }
  for _, o := range opts {
    o(cw)
  }
  return cw
}

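Changes calls changeFn for each computed change, where 'a' is the base directory and 'b' is the changed directory. The change callbacks are called in path-name order and must be applied in that order. Because of this ordering requirement, the following hold:

  • A removed directory tree produces only one change, for the root of the removed tree; the remaining deletions are implied.
  • A directory that becomes a file has no delete entries for its sub-paths; their removal is implied by the removal of the parent directory.

Opaque directories are not treated specially, and every file removed from the base directory shows up as a deletion. File content is compared for files whose timestamps may have been truncated: if either of the two files being compared has a zero nanosecond value, every byte is compared. If two files have the same seconds value but different nanosecond values and one of those values is zero, the files are considered unchanged when their content is the same. This behavior accounts for timestamp truncation during archiving.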

func Changes(ctx context.Context, a, b string, changeFn ChangeFunc) error {
  if a == "" {
    // If a is empty, treat everything under b as added
    ...
    return addDirChanges(ctx, changeFn, b)
  } else if diffOptions := detectDirDiff(b, a); diffOptions != nil {
    ...
    // detectDirDiff currently always returns nil
    return diffDirChanges(ctx, changeFn, a, diffOptions)
  }
  ...
  // Walk both the lower and upper directories
  return doubleWalkDiff(ctx, changeFn, a, b)
}
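The timestamp rule described for Changes can be written down as a small predicate. This sketch follows the prose rule only and is not claimed to match containerd's comparison code line by line; sameByRule and at are hypothetical helpers, and in-memory byte slices stand in for file contents.

```go
package main

import (
	"bytes"
	"fmt"
	"time"
)

// at builds a timestamp from seconds and nanoseconds.
func at(sec, nsec int64) time.Time { return time.Unix(sec, nsec) }

// sameByRule reports whether two files count as unchanged under the
// rule in the text: different seconds means changed; if either side
// has zero nanoseconds the timestamp may have been truncated, so the
// content decides; otherwise the nanoseconds must match exactly.
func sameByRule(aTime, bTime time.Time, aData, bData []byte) bool {
	if aTime.Unix() != bTime.Unix() {
		return false
	}
	if aTime.Nanosecond() == 0 || bTime.Nanosecond() == 0 {
		return bytes.Equal(aData, bData)
	}
	return aTime.Nanosecond() == bTime.Nanosecond()
}

func main() {
	data := []byte("same content")
	// Same second, one side truncated, identical content.
	fmt.Println(sameByRule(at(100, 0), at(100, 500), data, data))
	// Same second, one side truncated, different content.
	fmt.Println(sameByRule(at(100, 0), at(100, 500), data, []byte("other")))
	// Neither side truncated and the nanoseconds differ.
	fmt.Println(sameByRule(at(100, 1), at(100, 2), data, data))
}
```

The byte comparison is the expensive branch, which is why it is reserved for the case where the timestamps alone cannot settle the question.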

addDirChanges reports every path other than the root as an add:

func addDirChanges(ctx context.Context, changeFn ChangeFunc, root string) error {
  return filepath.Walk(root, func(path string, f os.FileInfo, err error) error {
    if err != nil {
      return err
    }

    // Rebase path
    path, err = filepath.Rel(root, path)
    if err != nil {
      return err
    }

    path = filepath.Join(string(os.PathSeparator), path)

    // Skip root
    if path == string(os.PathSeparator) {
      return nil
    }

    return changeFn(ChangeKindAdd, path, f, nil)
  })
}
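To see the path rebasing in addDirChanges in action, the following sketch runs the same walk over a small temporary tree and collects the paths that would be reported as adds. listAdds and makeTree are hypothetical helpers written for this example; the change callback is replaced by a plain slice.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// listAdds walks root and returns each entry's path rebased onto "/",
// skipping the root itself, in the order filepath.Walk visits them.
func listAdds(root string) ([]string, error) {
	var adds []string
	err := filepath.Walk(root, func(path string, _ os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(root, path)
		if err != nil {
			return err
		}
		rel = filepath.Join(string(os.PathSeparator), rel)
		if rel == string(os.PathSeparator) {
			return nil // skip root
		}
		adds = append(adds, rel)
		return nil
	})
	return adds, err
}

// makeTree creates a temporary directory containing etc/hosts.
func makeTree() (string, error) {
	root, err := os.MkdirTemp("", "adds")
	if err != nil {
		return "", err
	}
	if err := os.MkdirAll(filepath.Join(root, "etc"), 0o755); err != nil {
		return "", err
	}
	if err := os.WriteFile(filepath.Join(root, "etc", "hosts"), []byte("x"), 0o644); err != nil {
		return "", err
	}
	return root, nil
}

func main() {
	root, err := makeTree()
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)
	adds, err := listAdds(root)
	fmt.Println(adds, err)
}
```

Since filepath.Walk visits entries in lexical order, a directory is always reported before the files inside it, which fits the path-order requirement that Changes places on its callbacks.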

doubleWalkDiff walks both directories and compares them, producing the changes that ultimately make up a diff layer:

func doubleWalkDiff(ctx context.Context, changeFn ChangeFunc, a, b string) (err error) {
  g, ctx := errgroup.WithContext(ctx)

  var (
    c1 = make(chan *currentPath)
    c2 = make(chan *currentPath)

    f1, f2 *currentPath
    rmdir  string
  )
  g.Go(func() error {
    defer close(c1)
    return pathWalk(ctx, a, c1)
  })
  g.Go(func() error {
    defer close(c2)
    return pathWalk(ctx, b, c2)
  })
  g.Go(func() error {
    // Loop while c1 != nil or c2 != nil
    for c1 != nil || c2 != nil {
      // c1 is still open and nothing is pending from it (f1 is nil)
      if f1 == nil && c1 != nil {
        f1, err = nextPath(ctx, c1)
        if err != nil {
          return err
        }
        if f1 == nil {
          c1 = nil
        }
      }
      // c2 is still open and nothing is pending from it (f2 is nil)
      if f2 == nil && c2 != nil {
        f2, err = nextPath(ctx, c2)
        if err != nil {
          return err
        }
        if f2 == nil {
          c2 = nil
        }
      }
      if f1 == nil && f2 == nil {
        continue
      }

      var f os.FileInfo
      // pathChange returns the change kind: add, modify, or delete
      k, p := pathChange(f1, f2)
      switch k {
      // Add
      case ChangeKindAdd:
        if rmdir != "" {
          rmdir = ""
        }
        f = f2.f
        f2 = nil
      // Delete
      case ChangeKindDelete:
        // Check if this file is already removed by being
        // under of a removed directory
        if rmdir != "" && strings.HasPrefix(f1.path, rmdir) {
          f1 = nil
          continue
        } else if f1.f.IsDir() {
          rmdir = f1.path + string(os.PathSeparator)
        } else if rmdir != "" {
          rmdir = ""
        }
        f1 = nil
      // Modify
      case ChangeKindModify:
        // Check whether the two entries are the same file
        same, err := sameFile(f1, f2)
        if err != nil {
          return err
        }
        if f1.f.IsDir() && !f2.f.IsDir() {
          rmdir = f1.path + string(os.PathSeparator)
        } else if rmdir != "" {
          rmdir = ""
        }
        f = f2.f
        f1 = nil
        f2 = nil
        if same {
          if !isLinked(f) {
            continue
          }
          k = ChangeKindUnmodified
        }
      }
      if err := changeFn(k, p, f, nil); err != nil {
        return err
      }
    }
    return nil
  })

  return g.Wait()
}
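Stripped of channels and goroutines, the core of doubleWalkDiff is a merge of two sorted path sequences. The sketch below shows that merge on plain string slices. It is a simplification under stated assumptions: change and mergeChanges are hypothetical names, paths are compared with ordinary string ordering, and the sameFile check and the rmdir collapsing of deleted subtrees are left out.

```go
package main

import "fmt"

// change is a hypothetical record of one detected change.
type change struct {
	Kind string
	Path string
}

// mergeChanges walks two sorted path lists in lockstep. A path only in
// lower is a delete, a path only in upper is an add, and a path in both
// is a modify candidate that the real code would inspect further.
func mergeChanges(lower, upper []string) []change {
	var out []change
	i, j := 0, 0
	for i < len(lower) || j < len(upper) {
		switch {
		case i == len(lower):
			out = append(out, change{"add", upper[j]})
			j++
		case j == len(upper):
			out = append(out, change{"delete", lower[i]})
			i++
		case lower[i] < upper[j]:
			out = append(out, change{"delete", lower[i]})
			i++
		case lower[i] > upper[j]:
			out = append(out, change{"add", upper[j]})
			j++
		default:
			out = append(out, change{"modify", upper[j]})
			i++
			j++
		}
	}
	return out
}

func main() {
	lower := []string{"/a", "/b", "/c"}
	upper := []string{"/b", "/c", "/d"}
	for _, c := range mergeChanges(lower, upper) {
		fmt.Println(c.Kind, c.Path)
	}
}
```

Both inputs must be sorted with the same ordering for the merge to be correct, which is why the two walkers in the original code traverse their trees in the same path order.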

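HandleChange is passed to Changes as the callback and is invoked for every change. The change kinds are add, delete, modify, and unmodified.

Deletions follow the whiteout convention: for a deleted file or directory, an empty file named ".wh." plus the original name is created. For the other kinds, the diff content is packed into the tar.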

func (cw *ChangeWriter) HandleChange(k fs.ChangeKind, p string, f os.FileInfo, err error) error {
  if err != nil {
    return err
  }
  // Handle deletions
  if k == fs.ChangeKindDelete {
    whiteOutDir := filepath.Dir(p)
    whiteOutBase := filepath.Base(p)
    whiteOut := filepath.Join(whiteOutDir, whiteoutPrefix+whiteOutBase)
    hdr := &tar.Header{
      Typeflag:   tar.TypeReg,
      Name:       whiteOut[1:],
      Size:       0,
      ModTime:    cw.whiteoutT,
      AccessTime: cw.whiteoutT,
      ChangeTime: cw.whiteoutT,
    }
    // includeParents: when the whiteout's parent directory is not the root, the parent is handled as a modify
    if err := cw.includeParents(hdr); err != nil {
      return err
    }
    // Write the whiteout header
    if err := cw.tw.WriteHeader(hdr); err != nil {
      return fmt.Errorf("failed to write whiteout header: %w", err)
    }
  } else {
    var (
      link   string
      err    error
      source = filepath.Join(cw.source, p)
    )
    // Check the file type
    switch {
    case f.Mode()&os.ModeSocket != 0:
      return nil // ignore sockets
    case f.Mode()&os.ModeSymlink != 0:
      if link, err = os.Readlink(source); err != nil {
        return err
      }
    }
    
    hdr, err := tar.FileInfoHeader(f, link)
    if err != nil {
      return err
    }
    // Set the header mode bits
    hdr.Mode = int64(chmodTarEntry(os.FileMode(hdr.Mode)))

    // truncate timestamp for compatibility. without PAX stdlib rounds timestamps instead
    hdr.Format = tar.FormatPAX
    if cw.modTimeUpperBound != nil && hdr.ModTime.After(*cw.modTimeUpperBound) {
      hdr.ModTime = *cw.modTimeUpperBound
    }
    hdr.ModTime = hdr.ModTime.Truncate(time.Second)
    hdr.AccessTime = time.Time{}
    hdr.ChangeTime = time.Time{}

    name := p
    if strings.HasPrefix(name, string(filepath.Separator)) {
      name, err = filepath.Rel(string(filepath.Separator), name)
      if err != nil {
        return fmt.Errorf("failed to make path relative: %w", err)
      }
    }
    // Canonicalize to POSIX-style paths using forward slashes. Directory
    // entries must end with a slash.
    // Set the entry name
    name = filepath.ToSlash(name)
    if f.IsDir() && !strings.HasSuffix(name, "/") {
      name += "/"
    }
    hdr.Name = name

    if err := setHeaderForSpecialDevice(hdr, name, f); err != nil {
      return fmt.Errorf("failed to set device headers: %w", err)
    }

    // additionalLinks stores file names which must be linked to
    // this file when this file is added
    var additionalLinks []string
    inode, isHardlink := fs.GetLinkInfo(f)
    // Hard link handling
    if isHardlink {
      // If the inode has a source, always link to it
      if source, ok := cw.inodeSrc[inode]; ok {
        hdr.Typeflag = tar.TypeLink
        hdr.Linkname = source
        hdr.Size = 0
      } else {
        if k == fs.ChangeKindUnmodified {
          cw.inodeRefs[inode] = append(cw.inodeRefs[inode], name)
          return nil
        }
        cw.inodeSrc[inode] = name
        additionalLinks = cw.inodeRefs[inode]
        delete(cw.inodeRefs, inode)
      }
    } else if k == fs.ChangeKindUnmodified {
      // Unmodified: simply return
      // Nothing to write to diff
      return nil
    }
    // Read the security.capability extended attribute
    if capability, err := getxattr(source, "security.capability"); err != nil {
      return fmt.Errorf("failed to get capabilities xattr: %w", err)
    } else if len(capability) > 0 {
      if hdr.PAXRecords == nil {
        hdr.PAXRecords = map[string]string{}
      }
      hdr.PAXRecords[paxSchilyXattr+"security.capability"] = string(capability)
    }

    if err := cw.includeParents(hdr); err != nil {
      return err
    }
    if err := cw.tw.WriteHeader(hdr); err != nil {
      return fmt.Errorf("failed to write file header: %w", err)
    }

    if hdr.Typeflag == tar.TypeReg && hdr.Size > 0 {
      file, err := open(source)
      if err != nil {
        return fmt.Errorf("failed to open path: %v: %w", source, err)
      }
      defer file.Close()

      n, err := copyBuffered(context.TODO(), cw.tw, file)
      if err != nil {
        return fmt.Errorf("failed to copy: %w", err)
      }
      if n != hdr.Size {
        return errors.New("short write copying file")
      }
    }

    if additionalLinks != nil {
      source = hdr.Name
      for _, extra := range additionalLinks {
        hdr.Name = extra
        hdr.Typeflag = tar.TypeLink
        hdr.Linkname = source
        hdr.Size = 0
        // includeParents: when the entry's parent directory is not the root, the parent is handled as a modify
        if err := cw.includeParents(hdr); err != nil {
          return err
        }
        // Write the header
        if err := cw.tw.WriteHeader(hdr); err != nil {
          return fmt.Errorf("failed to write file header: %w", err)
        }
      }
    }
  }
  return nil
}
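The whiteout naming used for deletions can be tried out in isolation. The sketch below writes empty whiteout entries into an in-memory tar and reads the names back. It is a minimal sketch, assuming absolute slash-separated input paths; whiteoutName and whiteoutEntries are hypothetical helpers, and timestamps and parent directory entries are omitted.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"path"
)

const whiteoutPrefix = ".wh."

// whiteoutName maps an absolute path such as "/etc/hosts" to the tar
// entry name "etc/.wh.hosts" (leading slash removed).
func whiteoutName(p string) string {
	return path.Join(path.Dir(p), whiteoutPrefix+path.Base(p))[1:]
}

// whiteoutEntries writes one empty regular-file entry per deleted path
// and returns the entry names read back from the archive.
func whiteoutEntries(deleted []string) ([]string, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, p := range deleted {
		hdr := &tar.Header{Typeflag: tar.TypeReg, Name: whiteoutName(p), Size: 0}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}

	var names []string
	tr := tar.NewReader(&buf)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
	return names, nil
}

func main() {
	names, err := whiteoutEntries([]string{"/etc/hosts", "/tmp"})
	fmt.Println(names, err)
}
```

A deleted directory gets the same kind of entry as a deleted file: a single zero-length marker named after the directory, with nothing recorded for the paths beneath it.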

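Given the author's limited time, perspective, and knowledge, this article inevitably contains errors and omissions. Corrections and discussion from readers and industry experts are welcome.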

Originally published on 2023-01-09 on the DCOS WeChat official account.