Self-information

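In information theory, self-information, a concept introduced by Claude Shannon, is the amount of information gained when a message is received. More precisely, the self-information of an event depends on the probability that the event occurs: it measures the information associated with a single event in a probability space, or with a single value of a discrete random variable. It is expressed in units of information such as the bit, nat, or hart, depending on the base of the logarithm used in the calculation. The expected value of self-information is the entropy of information theory, which reflects the average uncertainty in sampling a random variable.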

Definition

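By definition, information is transferred from an entity that holds it to an entity that receives it only if the receiver did not already know it. If the receiver knew the content of the message in advance, the message conveys zero information. A message conveys information only when the receiver's prior knowledge of its content is less than 100%.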

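Consequently, the amount of self-information contained in a randomly occurring event $\omega _{n}$ depends only on the probability of that event. The lower the probability, the greater the self-information received when the event actually occurs.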

The self-information of $\omega _{n}$ is therefore some function $f$ of its probability: $\operatorname {I} (\omega _{n})=f(\operatorname {P} (\omega _{n}))$

If $\operatorname {P} (\omega _{n})=1$, then $\operatorname {I} (\omega _{n})=0$. If $\operatorname {P} (\omega _{n})<1$, then $\operatorname {I} (\omega _{n})>0$.

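In addition, by definition, the measure of self-information is nonnegative and additive. If an event $C$ is the intersection of two independent events $A$ and $B$, then the information gained from announcing that $C$ occurred equals the sum of the information gained from announcing $A$ and announcing $B$ separately: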

$\operatorname {I} (C)=\operatorname {I} (A\cap B)=\operatorname {I} (A)+\operatorname {I} (B)$

Because $A$ and $B$ are independent events, the probability of $C$ is

$\operatorname {P} (C)=\operatorname {P} (A\cap B)=\operatorname {P} (A)\cdot \operatorname {P} (B)$

Applying the function $f(\cdot )$ gives

${\begin{aligned}\operatorname {I} (C)&=\operatorname {I} (A)+\operatorname {I} (B)\\f(\operatorname {P} (C))&=f(\operatorname {P} (A))+f(\operatorname {P} (B))\\&=f{\big (}\operatorname {P} (A)\cdot \operatorname {P} (B){\big )}\\\end{aligned}}$

so the function $f(\cdot )$ has the property

$f(x\cdot y)=f(x)+f(y)$

The logarithm has exactly this property, and logarithms to different bases differ only by a constant factor, so

$f(x)=K\log(x)$

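Since the probability of an event always lies between 0 and 1, $\log(x)\leq 0$; the amount of information must be nonnegative, so $K<0$.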
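The additivity requirement can be checked numerically. The following is a minimal sketch, assuming $f(x)=-\log _{2}(x)$ as one function of the form $K\log(x)$ with $K<0$; the helper name `info_bits` and the coin and die probabilities are illustrative assumptions, not part of the derivation.

```python
import math

def info_bits(p):
    """Hypothetical helper: f(p) = -log2(p), one choice of K*log(p) with K < 0."""
    return -math.log2(p)

p_a = 0.5         # assumed example: a fair coin lands heads
p_b = 1 / 6       # assumed example: a fair die shows a four
p_c = p_a * p_b   # A and B are independent, so P(C) = P(A) * P(B)

joint = info_bits(p_c)                      # f(P(A) * P(B))
separate = info_bits(p_a) + info_bits(p_b)  # f(P(A)) + f(P(B))

# The two quantities agree up to floating-point rounding.
print(math.isclose(joint, separate))
```

Both quantities come out nonnegative as well, which is the reason the constant $K$ has to be negative.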

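With these properties in mind, if the event $\omega _{n}$ occurs with probability $\operatorname {P} (\omega _{n})$, its self-information $\operatorname {I} (\omega _{n})$ is defined as: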

$\operatorname {I} (\omega _{n})=-\log(\operatorname {P} (\omega _{n}))=\log \left({\frac {1}{\operatorname {P} (\omega _{n})}}\right)$

The smaller the probability of the event $\omega _{n}$, the greater its self-information when it occurs.

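This definition satisfies the conditions above. It does not fix the base of the logarithm: with base 2 the unit is the bit, with base e the unit is the nat, and with base 10 the unit is the hart.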
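The effect of the logarithm base on the unit can be illustrated with a short sketch. The function name `self_information` and its `base` parameter are assumptions made for this example; the computation itself is just $-\log(\operatorname {P} (\omega _{n}))$ in the chosen base.

```python
import math

def self_information(p, base=2.0):
    """Return -log_base(p) for an event of probability p, with 0 < p <= 1."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    # Adding 0.0 turns the -0.0 produced for p == 1 into a plain 0.0.
    return -math.log(p, base) + 0.0

p = 0.5  # assumed example: a fair coin lands heads

bits = self_information(p, 2)        # base 2  -> bits
nats = self_information(p, math.e)   # base e  -> nats
harts = self_information(p, 10)      # base 10 -> harts

print(f"{bits:.3f} bit, {nats:.3f} nat, {harts:.3f} hart")
```

For a probability of 0.5 this gives about 1 bit, 0.693 nat, and 0.301 hart: the same amount of information in three different units, which differ only by a constant factor.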

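The amount of information is not the same as the significance of the information; the two are distinct concepts. The amount of information only indicates how much uncertainty is reduced. Whether the information obtained is momentous or trivial to the receiver is a matter of its significance.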

Relationship to entropy

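Entropy is the expected value of the self-information of a discrete random variable. Entropy is nevertheless sometimes itself called the self-information of the random variable, possibly because it satisfies $\operatorname {H} (X)=\operatorname {I} (X;X)$, where $\operatorname {I} (X;X)$ is the mutual information of $X$ with itself.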
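The statement that entropy is the expected self-information can be sketched as follows. The function names and the two coin distributions are assumptions chosen for illustration; the unit is the bit because the logarithm is taken to base 2.

```python
import math

def self_information(p):
    """Self-information in bits of an outcome with probability p > 0."""
    return -math.log2(p)

def entropy(dist):
    """Expected self-information of a discrete distribution.

    dist maps each outcome to its probability; outcomes with
    probability 0 are skipped because they contribute nothing
    to the expectation.
    """
    return sum(p * self_information(p) for p in dist.values() if p > 0)

# Assumed example distributions.
fair_coin = {"heads": 0.5, "tails": 0.5}
biased_coin = {"heads": 0.9, "tails": 0.1}

print(f"fair coin:   {entropy(fair_coin):.3f} bits")
print(f"biased coin: {entropy(biased_coin):.3f} bits")
```

The fair coin yields 1 bit per toss on average, while the biased coin yields about 0.469 bits: the rare outcome carries more self-information, but it occurs too seldom to raise the average above that of the fair coin.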