11-05-2023

Math notation handbook

Here are a few bits of mathematical notation I didn’t recognize at first, so I thought I’d save them somewhere.

Understanding new notation

Asking on the Mathematics Discord.

Google/Yandex Images’ reverse image search

Copying the symbol (if it’s rendered with KaTeX) and searching for it

Exploring related content or papers. Some symbols aren’t really “notation” but conventions a field has adopted, and those are harder to pin down through search engine results

| - Pipe, vertical bar

resource
“divides”. $\{x\in\mathbb{Z}:4|x\}$ = set of integers such that x is divisible by 4.
“given”. $P(A|B)$ = probability of A given B, aka conditional probability
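Both readings of the bar can be checked with a small sketch. The finite range and the die-roll events A and B below are made up for illustration; they are not from the linked resource.

```python
from fractions import Fraction

# "Divides": {x in Z : 4 | x}, restricted to a finite range so it can be listed.
multiples_of_4 = {x for x in range(-8, 9) if x % 4 == 0}

# "Given": P(A|B) = P(A and B) / P(B), for one roll of a fair six-sided die.
outcomes = set(range(1, 7))
A = {x for x in outcomes if x % 2 == 0}  # hypothetical event A: the roll is even
B = {x for x in outcomes if x > 3}       # hypothetical event B: the roll is above 3


def prob(event):
    """Probability of an event when every outcome is equally likely."""
    return Fraction(len(event), len(outcomes))


p_a_given_b = prob(A & B) / prob(B)

print(sorted(multiples_of_4))  # [-8, -4, 0, 4, 8]
print(p_a_given_b)             # 2/3
```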

; - Semicolon

resource
$f(x;p)$ - a function of the input argument $x$ that is parameterized by $p$. Each choice of $p$ defines a different function of $x$.
It’s basically like generics in programming.
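One way to picture this in code is fixing a parameter ahead of time. The sketch below assumes a made-up $f(x;p)=x^p$ and uses `functools.partial` to turn each choice of $p$ into its own function of $x$.

```python
from functools import partial


def f(x, p):
    """f(x; p) = x ** p: x is the argument, p is the parameter."""
    return x ** p


# Fixing the parameter p yields a new one-argument function of x.
square = partial(f, p=2)
cube = partial(f, p=3)

print(square(4))  # 16
print(cube(4))    # 64
```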

~ - Distributed over

resource (point 4)
$X\sim N(0,1)$ : the random variable X is distributed according to N, the normal distribution (OR X is sampled according to N). The arguments 0 and 1 are not bounds but the mean and the variance, so this is the standard normal distribution.
$X\sim Y$ : if both are stochastic variables, then X has the same distribution as Y
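To make "sampled according to N" concrete, here is a minimal sketch that draws samples with Python's standard library and checks that they are centered at 0 with a spread of about 1. The seed and sample count are arbitrary choices. Note that `random.gauss` takes a standard deviation, which for the standard normal equals the variance (both are 1).

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Draw 10,000 samples of X ~ N(0, 1): mean 0, standard deviation 1.
samples = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

sample_mean = statistics.fmean(samples)  # should be close to 0
sample_std = statistics.pstdev(samples)  # should be close to 1

print(sample_mean, sample_std)
```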

$\rightarrow$ - Maps to set / return type

resource
Say you see $f:A\rightarrow B$ ; it means that $f$ takes things from A and maps them to B. Basically like defining the return type in programming. Note that we’re using sets, not variables (in this case, set A and B).
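Python's type hints use the same arrow for the same idea. In this sketch, `int` and `str` stand in for the sets A and B; the function itself is a made-up example.

```python
def f(a: int) -> str:
    """f : int -> str. Takes something from int and maps it to something in str."""
    return str(a)


result = f(5)

print(result)                       # 5
print(f.__annotations__["return"])  # <class 'str'>
```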

$\mapsto$ - Maps to variable / return variable

resource
$f:x\mapsto y$ is the same as $f(x)=y$ . Notice that this works with variables and not sets (though, your variable COULD be a set too).
$5\mapsto 3$ is valid, for example, if we have some $f(5)=3$ .
It differs from $\rightarrow$ in that it relates individual elements rather than whole sets. It doesn’t identify a return type, but rather the value that is returned.
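In programming terms, $x\mapsto y$ reads much like an anonymous function, while $\rightarrow$ corresponds to its type. A rough sketch, using the made-up rule $x\mapsto x^2$:

```python
from typing import Callable

# The annotation plays the role of f : int -> int (the sets involved);
# the lambda plays the role of x |-> x * x (the rule itself).
f: Callable[[int], int] = lambda x: x * x

print(f(5))  # 25, i.e. 5 |-> 25
```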

$\times$ - Cartesian product

Set theory: Cartesian product. $A\times B$ is the set of all ordered pairs $(a,b)$ with $a\in A$ and $b\in B$. Laid out as a grid, you get one entry $C_{i,j}$ for every $i\in A$, $j\in B$. Ex.: $R:State\times Action\rightarrow\mathbb{R}$ = function R that takes a (State, Action) pair and returns something in the set of real numbers.
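`itertools.product` builds exactly these pairs. The states, actions, and reward values below are hypothetical, chosen only to mirror the $R$ example.

```python
from itertools import product

states = ["s0", "s1"]        # hypothetical State set
actions = ["left", "right"]  # hypothetical Action set

# State x Action: every ordered pair (state, action).
pairs = list(product(states, actions))


def R(state: str, action: str) -> float:
    """Hypothetical reward function R : State x Action -> real numbers."""
    return 1.0 if (state, action) == ("s1", "right") else 0.0


print(pairs)
# [('s0', 'left'), ('s0', 'right'), ('s1', 'left'), ('s1', 'right')]
print([R(s, a) for s, a in pairs])  # [0.0, 0.0, 0.0, 1.0]
```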

$\nabla$ - Nabla / gradient

resource
When you put it directly in front of a function, you’re taking that function’s gradient. That is, the vector of its partial derivatives with respect to each input:
$\nabla f(x,y)=\begin{bmatrix}\frac{\partial f}{\partial x}\\\frac{\partial f}{\partial y}\end{bmatrix}$
It has a geometric meaning. At each point $(x,y)$, draw a vector that starts at that point and whose components are the partial derivatives there, like so:
$\underbrace{\vec{o}}_{\text{output vector}}=[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}]$
Plot these on a 2D plane, and you get the following (arrows scaled down to fit):

Visualization of gradients over a function. Vector color maps to "steepness".
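When it has a little subscript, e.g. $\nabla_{\vec{v}}$ , that’s the directional derivative: the rate of change of $f$ along the direction $\vec{v}$. It’s like the gradient, but collapsed into one value by taking the dot product with $\vec{v}$ (many texts require $\vec{v}$ to be a unit vector). So in this case $\nabla_{\vec{v}}f=\nabla f\cdot\vec{v}=v_x\frac{\partial f}{\partial x}+ v_y\frac{\partial f}{\partial y}$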
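Both quantities can be approximated numerically. The sketch below assumes a made-up function $f(x,y)=x^2+3y$ and estimates the partial derivatives with central differences. The directional derivative is computed as the dot product of the gradient with $\vec{v}$; with $\vec{v}=(1,1)$ that reduces to the plain sum of the partial derivatives.

```python
def f(x, y):
    """Example function f(x, y) = x**2 + 3*y (an arbitrary choice)."""
    return x ** 2 + 3 * y


def gradient(func, x, y, h=1e-6):
    """Approximate [df/dx, df/dy] at (x, y) with central differences."""
    df_dx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    df_dy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return [df_dx, df_dy]


def directional_derivative(func, x, y, v, h=1e-6):
    """Dot product of the approximate gradient with the direction vector v."""
    g = gradient(func, x, y, h)
    return g[0] * v[0] + g[1] * v[1]


grad = gradient(f, 1.0, 2.0)                              # approximately [2, 3]
along_x = directional_derivative(f, 1.0, 2.0, [1.0, 0.0])  # approximately 2
summed = directional_derivative(f, 1.0, 2.0, [1.0, 1.0])   # approximately 5

print(grad, along_x, summed)
```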

$\widehat{x}$ - Wide hat

In statistics: an estimate of a quantity $x$ (underneath the hat), typically computed from data.
In machine learning: often a bias-corrected estimate, e.g. the bias-corrected moment estimates $\widehat{m},\widehat{v}$ in the Adam optimizer (gradient descent with momentum and adaptive step sizes).
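As a rough illustration of where the hats come from, here is a sketch of the bias-correction step as it appears in the Adam optimizer. The decay rates and the constant gradient are assumed values for the demo. Because $m$ and $v$ start at zero, the raw averages are biased toward zero in early steps; dividing by $1-\beta^t$ undoes that.

```python
beta1, beta2 = 0.9, 0.999  # assumed decay rates (commonly used defaults)
g = 2.0                    # assumed constant gradient, to make the effect visible

m, v = 0.0, 0.0            # raw moving averages, initialized at zero
for t in range(1, 4):
    m = beta1 * m + (1 - beta1) * g      # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * g * g  # second moment (mean of squared gradients)
    m_hat = m / (1 - beta1 ** t)         # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)         # bias-corrected second moment

print(m, m_hat)  # m is still well below 2, while m_hat is approximately 2
print(v, v_hat)  # v is still far below 4, while v_hat is approximately 4
```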

$\mu$ - Mean

In statistics: the mean value. Used in the Gaussian distribution for example.

$\overset{\text{def}}=$ - Define new variable

Used to emphasize that whatever is to the left of the equality is being defined here, rather than having been defined earlier.
Means the same as $:=$ .

$:=$ - Define new variable

$||$ - Vector norm / $L^p$ norm

resource
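Basically a way of measuring the magnitude of an n-dimensional vector: $||x||_p=\left(\sum_{i=1}^{n}|x_i|^p\right)^{1/p}$ . $L^2$ , for instance, is the Euclidean length of the vector (and the Euclidean distance when applied to the difference of two points).
Figure: example of the norm, with the $p$ parameter on the x axis.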
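The $L^p$ norm of a vector $x$ is $\left(\sum_i |x_i|^p\right)^{1/p}$, which is short enough to write out directly. A minimal sketch, with a made-up vector:

```python
def p_norm(vector, p):
    """L^p norm: (sum of |x_i| ** p) ** (1 / p), for p >= 1."""
    return sum(abs(x) ** p for x in vector) ** (1 / p)


v = [3.0, -4.0]

print(p_norm(v, 1))  # 7.0 (sum of absolute values)
print(p_norm(v, 2))  # 5.0 (Euclidean length)
```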

`::=` - Defined as

resource
Not actually a math symbol. It comes from BNF (Backus-Naur form) notation, which is often used for defining the grammar of a language: the symbol to its left is defined as whatever is to its right.

$\odot$ - Hadamard product

In machine learning: $A\odot B$ is the element-wise product of two matrices of the same shape. That is, $(A\odot B)_{i, j}=A_{i, j}\cdot B_{i, j}$
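A minimal sketch with plain nested lists (the matrices are made up, and both are assumed to have the same shape):

```python
def hadamard(A, B):
    """Element-wise product of two equally shaped matrices (lists of rows)."""
    return [[a * b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

print(hadamard(A, B))  # [[5, 12], [21, 32]]
```

Unlike ordinary matrix multiplication, each output entry depends only on the two entries at the same position.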