C++ 中文周刊 2024-10-28 第170期

王很水

发布于 2024-10-29 18:51:24

950

发布于 2024-10-29 18:51:24

文章被收录于专栏：C++ 动态新闻推送

文章

On designing Tenseur, A C++ tensor library with lazy evaluation

https://istmarc.github.io/post/2024/10/27/on-designing-tenseur-a-c-tensor-library-with-lazy-evaluation/

这个人写了个类似eigen的库，介绍他的设计原理。其实主要是表达式模版，这里举一个例子

比如你有一个数组相加的场景，但想延迟计算

原型

/// @brief class representing a mathematical 3D vector
class Vec : public std::array<double, 3> {
  public:
    using std::array<double, 3>::array; 
    // inherit constructor (C++11)
    // see https://en.cppreference.com/w/cpp/language/using_declaration
};


/// @brief sum 'u' and 'v' into a new instance of Vec
Vec operator+(Vec const &u, Vec const &v) {
    Vec sum;
    for (size_t i = 0; i < u.size(); i++) {
        sum[i] = u[i] + v[i];
    }
    return sum;
}

Vec x = a + b + c 就有点低效了，中间结果优化不掉

我们的想法就是让operator+推迟，比如a+b生成一个VecSum 他本身不实际计算，直到多个VecSum合并成一个VecSum之后再计算

显然这种转发得用CRTP

template <typename E>
class VecExpression {
  public:
    static constexpr bool is_leaf = false;

    double operator[](size_t i) const {
        // Delegation to the actual expression type. This avoids dynamic polymorphism (a.k.a. virtual functions in C++)
        return static_cast<E const&>(*this)[i];
    }
    size_t size() const { return static_cast<E const&>(*this).size(); }
};

class Vec : public VecExpression<Vec> {
    std::array<double, 3> elems;

  public:
    static constexpr bool is_leaf = true;

    decltype(auto) operator[](size_t i) const { return elems[i]; }
    decltype(auto) &operator[](size_t i)      { return elems[i]; }
    size_t size()               const { return elems.size(); }

    // construct Vec using initializer list 
    Vec(std::initializer_list<double> init) {
        std::copy(init.begin(), init.end(), elems.begin());
    }

    // A Vec can be constructed from any VecExpression, forcing its evaluation.
    template <typename E>
    Vec(VecExpression<E> const& expr) {
        for (size_t i = 0; i != expr.size(); ++i) {
            elems[i] = expr[i];
        }
    }
};
template <typename E1, typename E2>
class VecSum : public VecExpression<VecSum<E1, E2> > {
  // cref if leaf, copy otherwise
  typename std::conditional<E1::is_leaf, const E1&, const E1>::type _u;
  typename std::conditional<E2::is_leaf, const E2&, const E2>::type _v;

  public:
    static constexpr bool is_leaf = false;

    VecSum(E1 const& u, E2 const& v) : _u(u), _v(v) {
        assert(u.size() == v.size());
    }
    decltype(auto) operator[](size_t i) const { return _u[i] + _v[i]; }
    size_t size()               const { return _v.size(); }
};
  
template <typename E1, typename E2>
VecSum<E1, E2>
operator+(VecExpression<E1> const& u, VecExpression<E2> const& v) {
   return VecSum<E1, E2>(*static_cast<const E1*>(&u), *static_cast<const E2*>(&v));
}

这样 a+b+c 的类型是 VecSum<VecSum<Vec, Vec>, Vec>

Vec x = a + b + c 会调用Vec(VecExpression<E> const& expr)

elems[i] = expr[i];会展开成elems[i] = a.elems[i] + b.elems[i] + c.elems[i]

这样就没有临时Vec对象了

教你 require用法

https://www.think-cell.com/en/career/devblog/if-constexpr-requires-requires-requires

基本，require concept

template <typename T>
auto debug_output(const T&) { // default implementation
    return "???";
}

template <typename T>
    requires std::integral<T>
auto debug_output(const T& t) {
    // return a range of characters representing the integer value of t
}

template <typename T>
    requires std::floating_point<T>
auto debug_output(const T& t) {
    // return a range of characters representing the floating point value of t
}

用在constexpr里

template <typename Cont, typename Rng>
void cont_assign(Cont& cont, Rng&& rng) {
    cont.clear();

    if constexpr (requires { cont.reserve(std::ranges::size(rng)); }) {
        cont.reserve(std::ranges::size(rng));
    }
    for (auto&& elem : std::forward<Rng>(rng)) {
        cont.push_back(std::forward<decltype(elem)>(elem));
    }
}

requires requires, requires本身就是concept

template <typename T>
    requires requires(const T& t) { t.debug_output(); }
auto debug_output(const T& t) noexcept(noexcept(t.debug_output())) {
    return t.debug_output();
}

requires { requires } 用在constexpr里

template <std::ranges::forward_range Rng>
bool all_same(Rng&& rng) {
    if constexpr (requires { requires tc::constexpr_size<Rng>() <= 1; }) {
        return true;
    } else {
        … // loop as before
    }
}

Heterogeneous lookup in unordered C++ containers

https://schneide.blog/2024/10/23/heterogeneous-lookup-in-unordered-c-containers/

透明查找，避免复制，默认不开，怎么开？看代码

struct stringly_hash
{
  using is_transparent = void;
  [[nodiscard]] size_t operator()(char const* rhs) const
  {
    return std::hash<std::string_view>{}(rhs);
  }
  [[nodiscard]] size_t operator()(std::string_view rhs) const
  {
    return std::hash<std::string_view>{}(rhs);
  }
  [[nodiscard]] size_t operator()(std::string const& rhs) const
  {
    return std::hash<std::string>{}(rhs);
  }
};

template <typename ValueType>
using unordered_string_map = std::unordered_map<
  std::string,
  ValueType,
  stringly_hash,
  std::equal_to<>
>;

咱们说过挺多次了

Zero or sign extend

https://fgiesen.wordpress.com/2024/10/23/zero-or-sign-extend/

讨论一种场景，无符号数/有符号数的扩展问题，比如给你一个11位的数字，你给扩展到32位

一种最简单的写法

int32 sign_extend(int32 val_11b) {
    int32 t = val_11b << (32 - 11);
    return t >> (32 - 11);
}

举例

// 假设输入的11位数是: 000 0010 1001 (原始值为41)

int32 t = val_11b << (32 - 11); // 左移21位

// 变成: 0010 1001 0000 0000 0000 0000 0000 0000

return t >> (32 - 11); // 算术右移21位

// 变成: 0000 0000 0000 0000 0000 0000 0010 1001 (仍然为41)

// 如果输入是负数，比如 110 0010 1001 (-215)

// 左移后: 0010 1001 0000 0000 0000 0000 0000 0000

// 算术右移后: 1111 1111 1111 1111 1111 1110 0010 1001 (-215)

>> 移动保留符号，所以这么玩也不会存在问题，但可能依赖数字实现(补码反码问题)

考虑位运算

int sign_extend(int val_11b) {
    return (val_11b & 0x3ff) - (val & 0x400);
}

0x3ff和400哪里来的？0x400是11位, 0x3ff是后10位

举例

// 对于正数 0010 1001 (41):

(41 & 0x3ff) - (41 & 0x400)

= 41 - 0 = 41

// 对于负数 1100 1001 (-215):

(0x329 & 0x3ff) - (0x329 & 0x400)

= 809 - 1024 = -215

当然还有更简洁的

int sign_extend(int val_11b) {
    return val - (val & 0x400) * 2;
}

举例 // 对于正数 0010 1001 (41):

41 - (41 & 0x400) * 2

= 41 - (0) * 2

= 41 - 0

= 41

// 对于负数 1100 1001 (原值 809):

809 - (809 & 0x400) * 2

= 809 - (0x400) * 2

= 809 - 2048

= -215

让我们从11位扩展到任意位(小于32)

int zero_or_sign_extend(int val, int sign_bit) {
    return val - (val & sign_bit) * 2;
}

当然也可以异或

int zero_or_sign_extend(int val, int sign_bit) {
    return (val ^ sign_bit) - sign_bit;
}

举例

// 对于11位正数 0010 1001 (41), sign_bit = 0x400:

(41 ^ 0x400) - 0x400

= 1065 - 1024

= 41

// 对于11位负数 1100 1001 (809), sign_bit = 0x400:

(809 ^ 0x400) - 0x400

= 809 - 1024

= -215

// 对于8位数: // 正数 0100 0001 (65), sign_bit = 0x80:

(65 ^ 0x80) - 0x80

= 193 - 128

= 65

// 负数 1100 0001 (193), sign_bit = 0x80:

(193 ^ 0x80) - 0x80

= 65 - 128

= -63

Inserting a 0 bit in the middle of a value

https://fgiesen.wordpress.com/2024/10/24/inserting-a-0-bit-in-the-middle-of-a-value/

这个背景可以不提，简单说就是给一个二进制中间差一个0，很妙的办法，直觉来说怎么写？

首先给位置分两段，低位不变，高位置移动一位，然后拼起来，对不对

uint64 insert_zero_bit(uint64 value, int pos) {
    uint64 bottom_mask = (1u64 << pos) - 1;
    uint64 top_mask = ~bottom_mask;

    uint64 bottom_bits = value & bottom_mask;
    uint64 top_bits = value & top_mask;
    return bottom_bits | (top_bits << 1);
}

代码也很直观，咱们拿一个例子带入一下

假如 1 0 1 1 0 1 0 1 -> 1 0 1 1 0 0 1 0 1

第四位插个0 pos=3

首先拿到bottom_mask 0 0 0 0 0 0 0 1左移3位减一 0 0 0 0 0 1 1 1

top mask就是 1 1 1 1 1 0 0 0

bottom_bits就是 1 0 1 1 0 1 0 1 保留后三位 0 0 0 0 0 1 0 1

top_bits 就是1 0 1 1 0 1 0 1 保留前五位 1 0 1 1 0 0 0 0

然后top bit左移一位组合1 0 1 1 0 0 0 0 0 ｜ 0 0 0 0 0 1 0 1 -> 1 0 1 1 0 0 1 0 1

这里提到了一个优化的写法

我们要做的就是高位移动一位，就不要考虑低位了，还要算来算去

最简单的移动方法就是自己加自己对不对？ 1 + 1 -> 10

那我高位加自己不就解决了？

这也就是优化算法的原理

uint64 insert_zero_bit(uint64 value, int pos) {
    uint64 top_mask = ~0u64 << pos;
    return value + (value & top_mask);
}

首先我们找到高位，然后高位相加等于移动一位，然后低位没加，不变

很巧妙的思路，就是不能一次性加多个0，得一点一点加

当然同理，我们可以弄一个去掉0 的算法

uint64 remove_zero_bit(uint64 value, int pos) {
    uint64 top_mask = ~0u64 << pos;
    return value - ((value & top_mask) >> 1);
}

注意我们知道指定位置是0，所以移动没啥影响，如果指定去除位置的值，就不能这么简单的算了

一切的前提是知道指定pos是0 的前提下展开的

Triaging clang C++ frontend bugs

https://shafik.github.io/c++/llvm/2024/10/17/triaging-clang-fronend-bugs.html

教你处理clang前端bug/修复指引

Implementing Trivial Relocation in Library

https://brevzin.github.io/c++/2024/10/21/trivial-relocation/

介绍trivial relocation基于反射的实现。复杂。看一乐，这个之前提到过，是两个提案实现方法不同，一直在吵架

Placeholder substitution in the preprocessor

https://holyblackcat.github.io/blog/2024/10/22/macro-placeholders.html

用宏实现了magic enum类似手法。这拖代码屎一样 https://gcc.godbolt.org/z/er1ff1T3d

本文参与腾讯云自媒体同步曝光计划，分享自微信公众号。

原始发表：2024-10-29，如有侵权请联系 cloudcommunity@tencent.com 删除

c++

本文分享自 CPP每周推送微信公众号，前往查看

如有侵权，请联系 cloudcommunity@tencent.com 删除。

本文参与腾讯云自媒体同步曝光计划，欢迎热爱写作的你一起参与！

登录后参与评论

0 条评论

热度