Siyuan Fu

Avater

send mail Github Linkedin orcid Twitter

C++ Tips

Memory Layout

Code (Text) Segment

Store executable instructions. i.e., functions and member functions.

Data Segment

Global and static variables.

Stack

Temporary variables.

Heap

Dynamic memory allocation. Notice that member variables could be both in stack or heap. It depends on how the object is instantiated.

Miscellaneous

| | | | | ——– | ——- | ——- | | const_cast | CPP Reference | | | std::distance | CPP Reference | CPlusPlus.com | | std::remove_reference | CPP Reference | CPlusPlus.com | | std::forward | CPP Reference | CPlusPlus.com |

Substitution failure is not an error

TODO

Multithreading

Atomic vs. Mutex

TODO

Memory Barrier

int x = 0, y = 0;
// thread 1
x = 1;
r1 = y;
cout << r1 << "\n";
// thread 2
y = 1;
r2 = x;
cout << r2 << "\n";

The results might all be zero because the optimization of the compiler, or different CPU core having different caches. To synchronize the order of memory RW, we can use mutex. We can also use memory barriers (faster than mutex).

Memory barriers will guarantee that the order of RW memories will be the same as the order of the source codes. E.g., thread 1 may execute r1 = y before x = 1 because the optimization by the compiler. Using memory barriers can force the execution order.

int x = 0, y = 0;
// thread 1
x = 1;
STORELOAD_FENCE();
r1 = y;
cout << r1 << "\n";
// thread 2
y = 1;
STORELOAD_FENCE();
r2 = x;
cout << r2 << "\n";

In this case the results might be 1 1, 0 1, 1 0 but never 0 0!

Memory Order

Sequence Consistent (strictest)

std::atomic<bool> x { false };
// in thread, write
x.store(true, std::memory_order_deq_cst);
// in thread, read
bool y = x.load(std::memory_order_deq_cst);

Sequence Consistent guarantees the memory RW order is the same as source codes (memory barriers).

It also guarantees if at a certain time t, a thread R/W the atomic variable, all the threads after t will see the results after this R/W. E.g., if thread 1 writes the variable with 1 at time t, then all the threads at t+1 loading the variable will get 1. (think of how it might not be 1 on a multi core CPU)

Relaxed (least strict)

std::memory_order_relaxed. Used in shared_ptr

Relaxed guarantees the R/W is atomic. It doesn’t guarantee the order of executions of R/W is the same as the source codes. (no memory barriers)

Acquire & Release

See Memory Barriers.


```std::memory_order_release```

All the memory R/W before Release will be seen by Acquire.

![](/docs/cheat_sheets/img/acquire_and_release.png)

### Consume

```std::memory_order_consume```.

Consume is weaker than Acquire. It won't carry dependancies from memories that not related to Release.

![](/docs/cheat_sheets/img/consume_and_release.png)

## Memory

- Packing (padding is the size of struct's largest element): 
  ```cpp
  struct A { // size 16
    int a;
    double e;
  };
  struct B { // size 12
    char a; char b; char c;
    int d;
    char e;
  };
  struct C { // size 24.
    // padding is not sizeof(A) but still sizeof(int). nested structures doesn't change padding
    // but sizeof(aa) is still A's size.
    int a;  // 8
    A aa;   // 16
  };
  struct D { // size 24.
    // padding is not sizeof(A) but still sizeof(int). nested structures doesn't change padding
    // but sizeof(aa) is still A's size.
    int a; // 8
    A aa;  // 16
    int b; //8
  };

To disable packing, #pragma pack(1). Without packing, CPU deoes more readings.

     
realloc CPP Reference CPlusPlus.com
calloc CPP Reference CPlusPlus.com
std::allocator CPP Reference CPlusPlus.com

STL

     
std::reduce CPP Reference  
std::max_element CPP Reference CPlusPlus.com
std::upper_bound CPP Reference CPlusPlus.com
std::lower_bound CPP Reference CPlusPlus.com
std::equal_range CPP Reference CPlusPlus.com
std::reverse CPP Reference CPlusPlus.com
std::partition CPP Reference CPlusPlus.com
std::make_heap CPP Reference CPlusPlus.com
std::pop_heap CPP Reference CPlusPlus.com
std::push_heap CPP Reference CPlusPlus.com

Algorithms

Prim

void prim(vector<vector<ii>> const& adj, int s) {
  priority_queue<ii, vector<ii>, greater<ii>> pq;
  parents.resize(N);
  keys.resize(N, INT_MAX);
  keys[s] = 0;
  pq.emplace(0, s);

  while(!pq.empty()) {
    auto [d, u] = pq.top();
    pq.pop();

    if(d > keys[u])
      continue; //lazy deletion

    for(auto &[v,w]: adj[u]) {  
      if(w < keys[v]) { // note: difference here
        keys[v] = w;
        parents[v] = u;
        pq.emplace(keys[v], v);
      }
    }
  }
}

Quick Sort

vector<int> sortArray(vector<int>& nums) {
  quickSort(nums, 0, nums.size()-1);
  return nums;
}
void quickSort(vector<int>& nums, int left, int right)
{
  if (left >= right) return;
  if (left < 0) return;
  if (right >= nums.size()) return;

  int pivot = nums[(left+right) / 2];
  swap(nums[left], nums[(left+right) / 2]);
  int i = left;
  int j = right;
  while (true)
  {
    while (nums[j] >= pivot && j > i) j--;
    if (j == i) break;
    while (nums[i] <= pivot && j > i) i++;
    if (j == i) break;
    swap(nums[i], nums[j]);
  }
  swap(nums[left], nums[i]);
  quickSort(nums, left, i-1);
  quickSort(nums, i+1, right);
}

Quick Select

tags: C++