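A rotation can be represented by the unit quaternion $q = \left(\cos\frac{\theta}{2},\ \mathbf{u}\sin\frac{\theta}{2}\right)$, where
$\theta$ is the angle,
$\mathbf{u}$ is the (unit) axis.
Denote by $q^{-1}$ the inverse of $q$ (for a unit quaternion, its conjugate).
To rotate a vector $\mathbf{v}$ by quaternion $q$, treat $\mathbf{v}$ as a pure quaternion and compute $\mathbf{v}' = q\,\mathbf{v}\,q^{-1}$.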
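Rotating a vector by a unit quaternion is commonly written as the sandwich product $q\,\mathbf{v}\,q^{-1}$. A minimal sketch of that product follows; the `Vec3` and `Quat` types and all function names are made up for illustration and are not from any particular library.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };
struct Quat { double w, x, y, z; };

// Build a unit quaternion from an angle in radians and a unit-length axis.
Quat fromAxisAngle(double angle, Vec3 axis) {
    const double s = std::sin(angle * 0.5);
    return {std::cos(angle * 0.5), axis.x * s, axis.y * s, axis.z * s};
}

// Hamilton product a * b.
Quat mul(Quat a, Quat b) {
    return {
        a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
        a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
        a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
        a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w};
}

// Conjugate; this equals the inverse only when q has unit length.
Quat conjugate(Quat q) { return {q.w, -q.x, -q.y, -q.z}; }

// Rotate v by the unit quaternion q: embed v as (0, v), then compute q * v * q^-1.
Vec3 rotate(Quat q, Vec3 v) {
    const Quat p{0.0, v.x, v.y, v.z};
    const Quat r = mul(mul(q, p), conjugate(q));
    return {r.x, r.y, r.z};
}
```

For example, rotating the x axis by 90 degrees about the z axis should give the y axis.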
When sampling the shadow map, run the depth comparison several times at nearby positions around the shading point and average the results (note that this averages the comparison results; it does not filter the shadow map itself).
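When using PCF, first sample multiple texels of the shadow map to compute the average distance between the shading point and the objects that block the light source. Then use this distance to determine the PCF filter size: the farther the blocker is from the shading point, the larger the filter and the softer the shadow.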
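Spherical Harmonics (SH) are a set of 2D basis functions defined on the sphere.
A function on the sphere can be projected onto the Spherical Harmonics: $f(\omega) \approx \sum_i c_i B_i(\omega)$, where
$c_i = \int_\Omega f(\omega)\,B_i(\omega)\,d\omega$ is the coefficient and
$B_i$ is the SH basis function.
The higher the degree of the SH, the higher the frequency of the information it can encode.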
Consider a rotation R about the origin that sends the unit vector r to r’. Under this operation, a spherical harmonic of degree l and order m transforms into a linear combination of spherical harmonics of the same degree.
Because the SH basis is orthonormal, the integral of the product of two functions equals the dot product of their SH coefficient vectors.
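A quick numerical check of this property, as a sketch under some assumptions: only the real SH basis of degree 0 and 1 is used (with one common sign convention), the sphere is integrated with a simple midpoint rule, and every name in the block is illustrative. The dot product matches the integral exactly only when both functions lie in the span of the chosen basis, which is why linear functions of the direction are a convenient test.

```cpp
#include <array>
#include <cmath>

constexpr double kPi = 3.14159265358979323846;
using SH4 = std::array<double, 4>;

// Real SH basis for degrees 0 and 1, evaluated at the unit direction (x, y, z).
SH4 shBasis(double x, double y, double z) {
    const double c0 = 0.5 * std::sqrt(1.0 / kPi);     // Y_0^0
    const double c1 = std::sqrt(3.0 / (4.0 * kPi));   // scale of the three Y_1^m
    return {c0, c1 * y, c1 * z, c1 * x};
}

// Integrate fn(x, y, z) over the unit sphere with a midpoint rule in (theta, phi).
template <typename Fn>
double integrateSphere(Fn fn, int n = 256) {
    const double dTheta = kPi / n;
    const double dPhi = kPi / n;  // 2n steps cover [0, 2*pi)
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        const double theta = (i + 0.5) * dTheta;
        const double sinTheta = std::sin(theta);
        for (int j = 0; j < 2 * n; ++j) {
            const double phi = (j + 0.5) * dPhi;
            const double x = sinTheta * std::cos(phi);
            const double y = sinTheta * std::sin(phi);
            const double z = std::cos(theta);
            sum += fn(x, y, z) * sinTheta * dTheta * dPhi;  // solid-angle weight
        }
    }
    return sum;
}

// Project fn onto the four basis functions: c_k = integral of fn * B_k.
template <typename Fn>
SH4 project(Fn fn) {
    SH4 c{};
    for (int k = 0; k < 4; ++k)
        c[k] = integrateSphere([&](double x, double y, double z) {
            return fn(x, y, z) * shBasis(x, y, z)[k];
        });
    return c;
}

double dot(const SH4& a, const SH4& b) {
    double s = 0.0;
    for (int k = 0; k < 4; ++k) s += a[k] * b[k];
    return s;
}
```

Projecting $f = 1 + z$ and $g = 2 + 3z$ and taking `dot(project(f), project(g))` should agree with integrating $f \cdot g$ over the sphere directly, up to quadrature error.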
For static scenes, we can project the product of the BRDF, the visibility, and the Lambertian cosine term onto SH. Since the lighting can also be projected onto SH, we can render the scene under different lighting conditions in real time (solving the Rendering Equation reduces to a simple dot product).
Note that the light is allowed to rotate, thanks to the rotational behavior of spherical harmonics (see above).
For glossy materials, the BRDF depends on the view direction, so it also needs to be projected onto SH. In this case, the number of coefficients needed for the scene is squared (a matrix instead of a vector), and shading becomes a vector-matrix multiplication instead of a dot product.
After enabling N× MSAA, the cost of fragment shading remains roughly unchanged; the cost of rasterization is N times higher; and the color buffer, depth buffer, and stencil buffer are N times larger.
When creating the frame buffers, the graphics API needs to know whether MSAA is enabled. E.g., Vulkan’s VkImageCreateInfo requires a VkSampleCountFlagBits value in its samples field.
Is MSAA compatible with Early-Z Testing? (My assumption: after enabling MSAA, the cost of fragment shading with Early-Z testing will increase.)
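// Triangle edges in object space.
float3 e1 = v1 - v0;
float3 e2 = v2 - v0;
// Matching edges in UV space (texture coordinates are 2D).
float2 dUV1 = uv1 - uv0;
float2 dUV2 = uv2 - uv0;
// Inverse determinant of the 2x2 UV matrix; degenerate UVs make this infinite.
float f = 1.0f / (dUV1.x * dUV2.y - dUV2.x * dUV1.y);
float3 t = (e1 * dUV2.y - e2 * dUV1.y) * f; // tangent
float3 b = (e2 * dUV1.x - e1 * dUV2.x) * f; // bitangent
float3 n = normalize(cross(t, b));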
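The Cook-Torrance specular BRDF is $f_r = \frac{D\,F\,G}{4\,(\mathbf{n}\cdot\mathbf{l})\,(\mathbf{n}\cdot\mathbf{v})}$, where $D$ is the Normal Distribution Function,
$F$ is the Fresnel Term, and
$G$ is the Geometry Term.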
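The Normal Distribution Function describes the smoothness of the surface. The smoother the surface is, the more the lobe concentrates around the reflection direction.
The Geometry Term describes the impact of self-occlusion (masking) and shadowing between microfacets.
The Fresnel Term describes how much light is reflected as a function of the viewing angle, which depends on how metallic the material is. Reflectance always rises towards grazing angles; metals already reflect strongly at normal incidence, so the increase at grazing angles is far more pronounced for non-metals.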
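To make the three terms concrete, here is a sketch using one common set of choices: GGX for the distribution, Schlick's approximation for Fresnel, and Smith with Schlick-GGX for geometry. These specific models, the `alpha` roughness parameter, and the `k = alpha / 2` remapping are assumptions for illustration; the notes do not say which variants are intended.

```cpp
#include <algorithm>
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// D: GGX / Trowbridge-Reitz distribution. Smaller alpha gives a tighter lobe.
double distributionGGX(double nDotH, double alpha) {
    const double a2 = alpha * alpha;
    const double d = nDotH * nDotH * (a2 - 1.0) + 1.0;
    return a2 / (kPi * d * d);
}

// F: Schlick's approximation. f0 is the reflectance at normal incidence
// (low for dielectrics, high for metals); the result tends to 1 at grazing angles.
double fresnelSchlick(double vDotH, double f0) {
    return f0 + (1.0 - f0) * std::pow(1.0 - vDotH, 5.0);
}

// G for one direction (Schlick-GGX form).
double geometrySchlickGGX(double nDotX, double k) {
    return nDotX / (nDotX * (1.0 - k) + k);
}

// G: Smith's method combines masking (view) and shadowing (light).
double geometrySmith(double nDotV, double nDotL, double k) {
    return geometrySchlickGGX(nDotV, k) * geometrySchlickGGX(nDotL, k);
}

// Specular term D * F * G / (4 (n.l) (n.v)) for a scalar f0.
double cookTorranceSpecular(double nDotL, double nDotV, double nDotH,
                            double vDotH, double alpha, double f0) {
    const double k = alpha * 0.5;  // assumed remapping for direct lighting
    const double D = distributionGGX(nDotH, alpha);
    const double F = fresnelSchlick(vDotH, f0);
    const double G = geometrySmith(nDotV, nDotL, k);
    return D * F * G / std::max(4.0 * nDotL * nDotV, 1e-6);
}
```

A real shader would use an RGB `f0` and clamp the dot products to be non-negative before calling these.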
Bidirectional path tracing traces paths not only from the eye but also from the light source. One drawback is that naively combining the two can increase the noise: in some regions sampling from the eye does a good job, while in others sampling from the light source does. Simply adding the results carries over the noise of both.
Multiple Importance Sampling (MIS) can solve this issue. In MIS, when a ray intersects a surface, we not only importance-sample the BRDF to generate a new ray, but also directly sample the light source and generate a ray towards it.
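MIS can effectively combine the results of several importance sampling strategies. To estimate $\int f(x)\,g(x)\,dx$ with one sample from each of two distributions $p_f$ and $p_g$:
$$\frac{f(X)\,g(X)}{p_f(X)}\,w_f(X) + \frac{f(Y)\,g(Y)}{p_g(Y)}\,w_g(Y)$$
$X$ and $Y$ are the two samples we generated by sampling the BRDF and by direct light sampling in this iteration.
$w_f$ and $w_g$ are special weighting functions chosen such that the expected value of this estimator is the value of the integral of $f(x)\,g(x)$.
A recommended weighting is the power heuristic: $w_s(x) = \frac{(n_s\,p_s(x))^\beta}{\sum_i (n_i\,p_i(x))^\beta}$, commonly with $\beta = 2$, where $n_s$ is the number of samples taken from $p_s$.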
Path tracing cannot go on forever. We could either set a maximum bounce number, or use Russian Roulette to terminate it:
P_RR = 0.6
sample(pos, wo):
    ksi = uniform_sample(0, 1)
    if (ksi > P_RR) return 0
    Li = ray_tracing(pos, wo)
    return Li / P_RR
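Dividing by P_RR keeps the estimator unbiased: $E = P_{RR} \cdot \frac{L_i}{P_{RR}} + (1 - P_{RR}) \cdot 0 = L_i$. The sketch below checks this numerically for a fixed, known radiance value instead of a real recursive trace; the function names and the fixed seed are assumptions made for the example.

```cpp
#include <random>

// Russian-roulette estimate of a known radiance Li:
// survive with probability pRR and compensate by dividing by pRR.
double rouletteSample(double Li, double pRR, std::mt19937& rng) {
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    if (uniform(rng) > pRR) return 0.0;  // path terminated
    return Li / pRR;                     // path survives, reweighted
}

// Average many roulette estimates; this should approach Li as count grows.
double rouletteMean(double Li, double pRR, int count, unsigned seed) {
    std::mt19937 rng(seed);
    double sum = 0.0;
    for (int i = 0; i < count; ++i) sum += rouletteSample(Li, pRR, rng);
    return sum / count;
}
```

The price of unbiased termination is extra variance: a lower P_RR ends paths sooner but makes the surviving samples larger.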
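A Summed Area Table (SAT) is used for approximating depth of field or glossy reflection, since it gives the sum (and hence the average) over any axis-aligned rectangle in constant time.
To build the SAT, run one inclusive scan (prefix sum) along each row, then one inclusive scan along each column of the result.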
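The row scan followed by the column scan can be sketched as below, together with the four-lookup box query that makes the table useful. The nested-vector image type and the function names are assumptions for illustration; a GPU version would run the scans in parallel.

```cpp
#include <cstddef>
#include <vector>

using Image = std::vector<std::vector<double>>;

// Build a summed-area table: sat[y][x] = sum of img over rows 0..y and columns 0..x.
Image buildSAT(const Image& img) {
    Image sat = img;
    for (auto& row : sat)                                // inclusive scan along each row
        for (std::size_t x = 1; x < row.size(); ++x)
            row[x] += row[x - 1];
    for (std::size_t y = 1; y < sat.size(); ++y)         // inclusive scan along each column
        for (std::size_t x = 0; x < sat[y].size(); ++x)
            sat[y][x] += sat[y - 1][x];
    return sat;
}

// Sum of the original image over the inclusive rectangle [x0, x1] x [y0, y1].
double boxSum(const Image& sat, std::size_t x0, std::size_t y0,
              std::size_t x1, std::size_t y1) {
    double total = sat[y1][x1];
    if (x0 > 0) total -= sat[y1][x0 - 1];
    if (y0 > 0) total -= sat[y0 - 1][x1];
    if (x0 > 0 && y0 > 0) total += sat[y0 - 1][x0 - 1];  // added back: subtracted twice
    return total;
}
```

Dividing `boxSum` by the rectangle's pixel count gives the box-filtered value, whatever the filter size.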
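Based on Cook-Torrance, image-based lighting approximates the integral of the incoming environment radiance against the BRDF with pre-baked lookups.
The diffuse term is $\frac{c_{\text{diff}}}{\pi}\int_\Omega L_i(\mathbf{l})\,(\mathbf{n}\cdot\mathbf{l})\,d\mathbf{l}$, where $c_{\text{diff}}$ is the diffuse color. It depends only on the normal, so we can pre-bake the irradiance map:
To query the irradiance map, sample along the surface normal.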
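Then, use the split-sum approximation for the specular term: $\int_\Omega L_i(\mathbf{l})\,f_r(\mathbf{l},\mathbf{v})\,(\mathbf{n}\cdot\mathbf{l})\,d\mathbf{l} \approx \left(\frac{1}{N}\sum_k L_i(\mathbf{l}_k)\right)\left(\frac{1}{N}\sum_k \frac{f_r(\mathbf{l}_k,\mathbf{v})\,(\mathbf{n}\cdot\mathbf{l}_k)}{p(\mathbf{l}_k,\mathbf{v})}\right)$. The second factor is the specular BRDF integrated under a constant white environment (
$L_i$ is 1).
Pre-baked radiance mipmap: the first factor, prefiltered per roughness level and sampled along the reflection direction.
Pre-baked BRDF LUT: the second factor, stored as a scale and a bias applied to the specular color and indexed by $\mathbf{n}\cdot\mathbf{v}$ and roughness.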
vec3 getIBLContribution(PBRInfo pbrInputs, vec3 n, vec3 reflection)
{
    // Rougher surfaces read a blurrier mip level of the prefiltered radiance map.
    float lod = (pbrInputs.perceptualRoughness * uboParams.prefilteredCubeMipLevels);
    // Scale (x) and bias (y) applied to the specular color.
    vec3 brdf = (texture(samplerBRDFLUT, vec2(pbrInputs.NdotV, 1.0 - pbrInputs.perceptualRoughness))).rgb;
    vec3 diffuseLight = SRGBtoLINEAR(tonemap(texture(samplerIrradiance, n))).rgb;
    vec3 specularLight = SRGBtoLINEAR(tonemap(textureLod(prefilteredMap, reflection, lod))).rgb;
    vec3 diffuse = diffuseLight * pbrInputs.diffuseColor;
    vec3 specular = specularLight * (pbrInputs.specularColor * brdf.x + brdf.y);
    return diffuse + specular;
}
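// Requires <numeric> and <vector>.
// Weight each pixel's luminance by the area (solid angle) it covers.
for (size_t pixel_idx = 0; pixel_idx < weighted_lum.size(); ++pixel_idx) {
    weighted_lum[pixel_idx] = area_of_pixel(pixel_idx) * lum_of_pixel(pixel_idx);
}
float inv_integral = 1.0f / std::accumulate(weighted_lum.begin(), weighted_lum.end(), 0.0f);
float inv_avg = (float)weighted_lum.size() * inv_integral;
// Normalize so that the average q is 1; every entry initially aliases itself.
for (size_t i = 0; i < weighted_lum.size(); ++i) {
    accel_struct[i].q = weighted_lum[i] * inv_avg;
    accel_struct[i].alias = i;
}
// Partition the indices: lower-energy (q < 1) from the front, higher-energy from the back.
std::vector<size_t> table(weighted_lum.size());
size_t small = 0; size_t large = weighted_lum.size(); // two pointers
for (size_t i = 0; i < weighted_lum.size(); ++i) {
    if (accel_struct[i].q < 1.0f) table[small++] = i;
    else table[--large] = i;
}
// Pair each lower-energy entry with a higher-energy one that fills the rest of its slot.
for (small = 0; small < large && large < weighted_lum.size(); ++small) {
    auto smallIdx = table[small]; auto largeIdx = table[large];
    auto& smallAccel = accel_struct[smallIdx];
    auto& largeAccel = accel_struct[largeIdx];
    smallAccel.alias = largeIdx;
    largeAccel.q -= (1.0f - smallAccel.q); // calibrate the weight
    if (largeAccel.q < 1.0f) // check if higher-energy has become lower-energy
        ++large;             // if so, it joins the lower-energy range
}
// Store the pdf of each pixel and of its alias for use at sampling time.
for (size_t i = 0; i < weighted_lum.size(); ++i) {
    accel_struct[i].pdf = lum_of_pixel(i) * inv_integral;
}
for (auto& accel : accel_struct) {
    accel.aliasPdf = accel_struct[accel.alias].pdf;
}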
// Pick a slot uniformly, then keep it with probability q or switch to its alias.
const uint idx = min(uint(rand_u01() * float(texSize)), texSize - 1);
const EnvAccel sample_data = accel_struct[idx];
uint env_idx;
if (rand_u01() < sample_data.q) {
    env_idx = idx;
    pdf = sample_data.pdf;
} else {
    env_idx = sample_data.alias;
    pdf = sample_data.aliasPdf;
}
// Cosine-weighted direction around +z; pdf = cos(theta) / pi.
vec3 CosineSampleHemisphere(float r1, float r2, out float pdf) {
    vec3 dir;
    float r = sqrt(r1);
    float phi = TWO_PI * r2;
    dir.x = r * cos(phi);
    dir.y = r * sin(phi);
    dir.z = sqrt(max(0.0, 1.0 - dir.x * dir.x - dir.y * dir.y));
    pdf = dir.z * INV_PI;
    return dir;
}
// Uniform direction around +z; pdf = 1 / (2 * pi).
vec3 UniformSampleHemisphere(float r1, float r2) {
    float r = sqrt(max(0.0, 1.0 - r1 * r1));
    float phi = TWO_PI * r2;
    return vec3(r * cos(phi), r * sin(phi), r1);
}
With Early-Z Testing, the depth test is performed before the fragment shader, and the fragments that fail the test are discarded. This reduces the cost of fragment shading.
DirectX 12 and Vulkan support this feature by default. To enable it in OpenGL, see the OpenGL wiki.
We can also add an explicit depth prepass, so that the depth buffer is already filled when the shading pass runs.
The Surface Area Heuristic (SAH) is an effective algorithm to partition primitives and build a Bounding Volume Hierarchy (BVH).
While computing the bounding volume of each primitive, we also track the bounds of the bounding-volume centroids and pick the axis along which the centroids are spread the farthest. The SAH is then evaluated along this axis.
We divide the centroid range on that axis into multiple buckets (e.g., 32). Then, we tentatively partition at each bucket boundary and select the boundary with the lowest cost. The cost of a candidate weights the number of primitives on each side by the ratio of that side's combined bounding-volume surface area to the surface area of the parent bounding volume.
tags: Graphics