https://johnfarrier.com/exploring-c-stdspan-part-4-const-correctness-and-type-safety/?utm_source=rss&utm_medium=rss&utm_campaign=exploring-c-stdspan-part-4-const-correctness-and-type-safety
还是介绍span的优缺点。这里介绍一种API冗余的问题
#include <span>
#include <iostream>
#include <vector>
void processData(int* data, std::size_t size) {
std::cout << "fucked";
}
void processData(std::span<int> data) {
std::cout << "ok";
}
int main() {
std::vector<int> v{ 1,3, 5,7, 9};
processData({v.data(), v.size()});
}
两种接口存在误用可能,保留一个即可
https://devblogs.microsoft.com/oldnewthing/20241115-00/?p=110527
https://devblogs.microsoft.com/oldnewthing/20241118-00/?p=110535
raymood chen这俩文章是连着的,我就放在一起讲了
一个是如何处理optional构造不能复制/移动的对象,简单来说解决方案就是类似std::elide
看代码 godbolt https://godbolt.org/z/vWrzG5Gfj
#include <vector>
#include <string>
#include <iostream>
#include <optional>
struct Region {
int dummy[100];
};
struct Widget
{
Widget(Region const& region) {
}
Widget() = delete;
Widget(Widget const&) = delete;
Widget(Widget &&) = delete;
Widget& operator=(Widget const&) = delete;
Widget& operator=(Widget &&) = delete;
static Widget CreateInside(Region const& region) {
std::cout << "inside\n";
return Widget(region);
}
static Widget CreateOutside(Region const& region) {
std::cout << "outside\n";
return Widget(region);
}
private:
int dummy[100];
};
struct WidgetInsideRegionCreator
{
WidgetInsideRegionCreator(Region const& region) : m_region(region) {}
operator Widget() { return Widget::CreateInside(m_region); }
Region const& m_region;
};
template<typename F>
struct EmplaceHelper
{
EmplaceHelper(F&& f) : m_f(f) {}
operator auto() { return m_f(); }
F& m_f;
};
int main()
{
Region region;
// construct with a Widget value
std::optional<Widget> o0(WidgetInsideRegionCreator(region)); //函数声明我草
std::optional<Widget> o{WidgetInsideRegionCreator(region)};
std::optional<Widget> o2;
// or place a Widget into the optional
o2.emplace(WidgetInsideRegionCreator(region));
// construct with a Widget value
// 为什么这个不被解析成函数声明?
std::optional<Widget> o3(
EmplaceHelper([&] {
return Widget::CreateInside(region);
}));
std::optional<Widget> o4;
// or place a Widget into the optional
o4.emplace(EmplaceHelper([&] {
return Widget::CreateInside(region);
}));
}
简单来说是通过optional的成员构造,从T本身构造,通过wrapper转发给optional隐式构造
第二篇文章是 map插入接口复杂以及带来的构造问题
Operation 操作 | Method 方法 |
---|---|
Read, throw if missing读取,如果缺失则抛出异常 | m.at(key) |
Read, allow missing 阅读,允许缺失 | m.find(key) |
Read, create if missing阅读,如果不存在则创建 | m[key] |
Write, nop if exists, discard value写入,如果存在则 nop,丢弃值 | m.insert({ key, value }) |
m.emplace(key, value) | |
Write, nop if exists写入,如果存在则 nop | m.emplace(std::piecewise_construct[1], ...) |
m.try_emplace(key, params) | |
Write, overwrite if exists编写,如果存在则覆盖 | m.insert_or_assign(key, value) |
这么多接口,想实现如果没有就插入,如果有不要浪费构造这种逻辑,怎么做?
template<typename Map, typename Key, typename... Maker>
auto& ensure(Map&& map, Key&& key, Maker&&... maker)
{
auto lower = map.lower_bound(key);
if (lower == map.end() || map.key_comp()(lower->first, key)) {
lower = map.emplace_hint(lower, std::forward<Key>(key),
std::invoke(std::forward<Maker>(maker)...));
}
return lower->second;
}
auto& ensure_named_widget(std::string const& name)
{
return ensure(widgets, name,
[](auto&& name) { return std::make_shared<Widget>(name); },
name);
}
一坨屎啊,注意到我们转发的这个逻辑,其实就是上面的elide
之前在115期也介绍过优化insert
insert可能不成功,所以需要推迟value构造,惰性构造,也不能说惰性构造
更像是强制RVO,省掉拷贝
template <class F>
struct lazy_call {
F f;
template <class T> operator T() { return f(); }
};
#define LAZY(expr) lazy_call{[&]{ return expr; }}
auto [iter, success] = map.try_emplace(key, LAZY(acquire_value()));
其实就是elide http://www.virjacode.com/papers/p3288r1.htm
应该有机会合入,代码也很简单 godbolt https://godbolt.org/z/rd5qbfE7E
那我们可以用emplacehelper,也就是elide重新实现一下,就用try_emplace就好了
template<typename Map, typename Key, typename... Maker>
auto& ensure(Map&& map, Key&& key, Maker&&... maker)
{
return *map.try_emplace(key, EmplaceHelper([&] {
return std::invoke(std::forward<Maker>(maker)...);
}).first;
}
一行,甚至都不用封装这么丑的代码,直接在用到map的地方直接调用就行,比如
auto& item =
*widgets.try_emplace(name, EmplaceHelper([&] {
return std::make_shared<Widget>(name); })).first;
Some Of My Experience About Linking C/C++ On Linux
https://coyorkdow.github.io/linking/2024/11/17/C++_linking_linux.html
简单来说就是符号查找的问题
比如不同库可能的符号覆盖,符号weak strong之类的说明,不了解的建议翻翻《链接装载库》,虽然旧也够用
引申阅读 问题排查:C++ exception with description “getrandom“ thrown in the test body https://zhuanlan.zhihu.com/p/5392960438
找不到符号 -> 宿主机没提供 编译问题,版本没对上,实现链接了不存在的符号
https://security.googleblog.com/2024/11/retrofitting-spatial-safety-to-hundreds.html
加固详细介绍 https://libcxx.llvm.org/Hardening.html
同事kpi汇报 https://bughunters.google.com/blog/6368559657254912/llvm-s-rfc-c-buffer-hardening-at-google
google 安全问题实践,使用libc++ harden mode,性能损耗不大,但是安全问题显著降低,完成了今年的KPI
他们测试数据性能影响不到1%,非常可观
加固只有1% 3%性能损失?
While these new runtime safety checks improve security, they add additional runtime overhead and can negatively impact performance. We studied the performance degradation for Google workloads and Feedback Direct Optimization (FDO) proved to be effective in minimizing it. As an example, enabling the hardened libc++, without any FDO, in a representative Google fleet workload added a ~0.9% queries per second (QPS) regression and a ~2.5% latency regression. When properly using FDO, we measured a ~65% reduction in QPS overhead and a ~75% reduction in latency overhead.
存在问题
但这个比起使用debug模式或者FORTIFY_SOURCE 之类的安全模式的开销要低的多
现在contracts没有实现之前大方向是向harden mode切换
最近看到个不用typeinfo的map实现,使用类型擦除加模拟,type使用static 变量地址或者类似的typemap技术实现
类型擦除带来的new delete开销不小
但去掉typeinfo/sso通过类型擦除带来的get收益也不小
看godbolt结果很奇怪 https://godbolt.org/z/Kx6hn9ccM
需要调查
https://uploadcare.com/blog/faster-blurhash/
blurhash encode计算存在以下问题
作者根据这俩思路优化了128倍
不过我观察这项目虽然星不少但是大多只管调用没啥性能优化开发的,这个PR作者有心了,不过已经挂哪里快俩月了
pr在这里 https://github.com/woltapp/blurhash-python/pull/25
手把手带你了解ChibiHash的实现原理,他的测试说性能比xxhash好,需要测一下
代码直接贴了
他这个性能好,但是碰撞概率还是会很高的,他自己也说了,感觉不如wyhash
// small, fast 64 bit hash function.
//
// https://github.com/N-R-K/ChibiHash
// https://nrk.neocities.org/articles/chibihash
//
// This is free and unencumbered software released into the public domain.
// For more information, please refer to <https://unlicense.org/>
#pragma once
#include <stdint.h>
#include <stddef.h>
static inline uint64_t
chibihash64__load64le(const uint8_t *p)
{
return (uint64_t)p[0] << 0 | (uint64_t)p[1] << 8 |
(uint64_t)p[2] << 16 | (uint64_t)p[3] << 24 |
(uint64_t)p[4] << 32 | (uint64_t)p[5] << 40 |
(uint64_t)p[6] << 48 | (uint64_t)p[7] << 56;
}
static inline uint64_t
chibihash64(const void *keyIn, ptrdiff_t len, uint64_t seed)
{
const uint8_t *k = (const uint8_t *)keyIn;
ptrdiff_t l = len;
// 写成了十六进制形式的 e(欧拉数)、π(圆周率)和黄金比例,最后一位调整为奇数(因为乘以偶数是不可逆的)
const uint64_t P1 = UINT64_C(0x2B7E151628AED2A5);
const uint64_t P2 = UINT64_C(0x9E3793492EEDC3F7);
const uint64_t P3 = UINT64_C(0x3243F6A8885A308D);
uint64_t h[4] = { P1, P2, P3, seed };
for (; l >= 32; l -= 32) {
for (int i = 0; i < 4; ++i, k += 8) {
uint64_t lane = chibihash64__load64le(k);
h[i] ^= lane;
h[i] *= P1;
h[(i+1)&3] ^= ((lane << 40) | (lane >> 24));
}
}
h[0] += ((uint64_t)len << 32) | ((uint64_t)len >> 32);
if (l & 1) {
h[0] ^= k[0];
--l, ++k;
}
h[0] *= P2; h[0] ^= h[0] >> 31;
for (int i = 1; l >= 8; l -= 8, k += 8, ++i) {
h[i] ^= chibihash64__load64le(k);
h[i] *= P2; h[i] ^= h[i] >> 31;
}
for (int i = 0; l > 0; l -= 2, k += 2, ++i) {
h[i] ^= (k[0] | ((uint64_t)k[1] << 8));
h[i] *= P3; h[i] ^= h[i] >> 31;
}
uint64_t x = seed;
x ^= h[0] * ((h[2] >> 32)|1);
x ^= h[1] * ((h[3] >> 32)|1);
x ^= h[2] * ((h[0] >> 32)|1);
x ^= h[3] * ((h[1] >> 32)|1);
// moremur: https://mostlymangling.blogspot.com/2019/12/stronger-better-morer-moremur-better.html
x ^= x >> 27; x *= UINT64_C(0x3C79AC492BA7B653);
x ^= x >> 33; x *= UINT64_C(0x1C69B3F74AC4AE35);
x ^= x >> 27;
return x;
}
https://johnnysswlab.com/memory-subsystem-optimizations-the-remaining-topics/
这个介绍了一些优化技巧边角料,这个主题我准备全翻译一下,这里标记一个TODO,感兴趣的可以直接看
简单说就是
性能测试就不列举了
https://pvs-studio.com/en/blog/posts/cpp/1174/
代码鉴赏环节
#include <iostream>
#include <string_view>
void print_me(std::string_view s) {
printf("%s\n", s.data());
}
int main() {
char next[] = {'n','e','x','t'};
char hello[] = {'H','e','l','l','o', ' ',
'W','o','r','l','d'};
std::string_view sub(hello, 5);
std::cout << sub << "\n";
print_me(sub);
}
挂sanitize=address会报错。godbolt https://godbolt.org/z/r4YsxvvM8
另外这个代码不挂sanitize可能会炸,我godbolt没复现 https://godbolt.org/z/TWxWcKoTd
当然越界访问是肯定了,比如
#include <cstdio>
#include <cstring>
#include <iostream>
#include <string_view>
void print_me(std::string_view s) {
printf("%s\n", s.data());
}
void paint(void* p, size_t n) {
std::memset(p, 'X', n);
}
auto paint_fp = &paint;
int main() {
{
char buff[4096];
paint_fp(buff, sizeof buff);
}
char hello[] = {'H','e','l','l','o', ' ',
'W','o','r','l','d'};
std::string_view sub(hello, 5);
std::cout << sub << "\n";
print_me(sub);
}
godbolt https://godbolt.org/z/zEYMqEq4Y
使用friend tag绕过,非常猥琐 godbolt https://godbolt.org/z/xhzr6nE73
#include <memory>
struct Arg1 {};
struct Arg2 {};
class MyComponent {
private:
struct private_ctor_token {
friend class MyComponent;
private:
private_ctor_token() = default;
};
public:
static auto make(Arg1 arg1, Arg2 arg2) -> std::shared_ptr<MyComponent> {
return std::make_shared<MyComponent>(private_ctor_token{}, std::move(arg1), std::move(arg2));
}
// Ban copy and move constructors to avoid accidentally copying and moving
// the object data into a stack instance.
MyComponent(const MyComponent&) = delete;
MyComponent(MyComponent&&) = delete;
// Ban those bros too, that's optional.
MyComponent& operator = (const MyComponent&) = delete;
MyComponent& operator = (MyComponent&&) = delete;
MyComponent(private_ctor_token, Arg1, Arg2) { };
};
int main() {
// MyComponent c({}, {}, {}); // Compilation error
}
主要问题,aligned_*_t 和 aligned_* 容易误用,漏掉**_t** 比如 godbolt https://godbolt.org/z/eEx1Pe94G
#include <memory>
#include <xmmintrin.h>
#include <type_traits>
const size_t head = 0;
std::aligned_union<256, __m128i> storage;
//std::aligned_union_t<256, __m128i> storage;
int main() {
__m128i* a = new (&storage) __m128i();
__m128i* b = new ((char*)(&storage) + 0) __m128i();
__m128i* c = new ((char*)&storage + head) __m128i();
}
挺抽象 见 P1413 https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1413r3.pdf
解决方案就是用alignas替换
// To replace std::aligned_storage...
template <typename T>
class MyContainer {
// [...]
private:
- std::aligned_storage_t<sizeof(T), alignof(T)> t_buff;
+ alignas(T) std::byte t_buff[sizeof(T)];
// [...]
};
// To replace std::aligned_union...
template <typename... Ts>
class MyContainer {
// [...]
private:
- std::aligned_union_t<0, Ts...> t_buff;
+ alignas(Ts...) std::byte t_buff[std::max({sizeof(Ts)...})];
// [...]
};
#include <string_view>
#include <string>
#include <concepts>
struct MetricSample{
// To avoid implicit conversions, we immediately
// added 'explicit', as all the best practices recommend.
explicit MetricSample(std::same_as<double> auto val): value {val} {}
private:
double value;
};
class Metrics {
public:
// To make it clear for the user why to allocate memory for string and avoid redundant implicit copies,
// we use a rvalue reference.
// After all, this is a great way to show that the interface intends to take ownership of the string.
// A user should explicitly perform a move operation.
void set(std::string_view metric_name, MetricSample val, std::string&& comment) {};
// void set(std::string_view metric_name, MetricSample val, std::same_as<std::string> auto&& comment) {};
};
int main() {
Metrics m;
auto val = MetricSample(1.0);
m.set("Metric", val, 0);
}
必炸 https://godbolt.org/z/sb5avWo4G
有个talk得看一下 Keynote Meeting C++ 2022. Nico Josuttis. Belle Views on C++ Ranges, their Details and the Devil. https://www.youtube.com/watch?v=O8HndvYNvQ4
看一下这个例子 观察drop take语义区别 godbolt https://godbolt.org/z/Yz7M6zo5s
#include <string>
#include <ranges>
#include <list>
#include <iostream>
#include <format>
void print_range(std::ranges::range auto&& r) {
for (auto&& x: r) {
std::cout << x << " ";
}
std::cout << "\n";
}
void test_drop_print() {
std::list<int> ints = {1, 2, 3 ,4, 5};
auto v = ints | std::views::drop(2);
ints.push_front(-5);
print_range(v);
}
void test_drop_print_before_after() {
std::list<int> ints = {1, 2, 3 ,4, 5};
auto v = ints | std::views::drop(2);
print_range(v);
ints.push_front(-5);
print_range(v);
}
void test_take_print() {
std::list<int> ints = {1, 2, 3 ,4, 5};
auto v = ints | std::views::take(2);
ints.push_front(-5);
print_range(v);
}
void test_take_print_before_after() {
std::list<int> ints = {1, 2, 3 ,4, 5};
auto v = ints | std::views::take(2);
print_range(v);
ints.push_front(-5);
print_range(v);
}
int main() {
std::cout << "drop: \n";
test_drop_print();
std::cout << "------\n";
test_drop_print_before_after();
std::cout << "take: \n";
test_take_print();
std::cout << "------\n";
test_take_print_before_after();
}
打印
drop:
2345
------
345
345
take:
-51
------
12
-51
由于range的惰性,执行operator | drop并没有完成计算,只有range被访问才会真正计算,并计算一次drop
上面例子使用的是&& 同一个对象执行一次,如果改一下,改成传值,又会不一样
同一个view只会计算begin end一次,每次都是新的view,每次都会计算drop。
而take每次都会计算,不会缓存结果
代码就不贴了,godbolt https://godbolt.org/z/8P3MfvcKb
[1]
What's up with std::piecewise_construct and std::forward_as_tuple?:https://devblogs.microsoft.com/oldnewthing/20220428-00/?p=106540